The problem is that OpenAI doesn't really have the enterprise market at all. Their APIs come closer, in that many companies use them to power features in other software (primarily Microsoft's), but with the APIs OpenAI isn't the one providing end-user value to enterprises.
As for ChatGPT, it's a consumer tool, not an enterprise tool. It's not really integrated into an enterprise's existing toolset, it's not integrated into their authentication, it's not integrated into their internal permissions model, and the IT department can't enforce any policies on how it's used. In almost all ways it doesn't look like enterprise IT.
This reminds me why enterprises don't integrate OpenAI products into their existing toolsets: trust is the root reason.
It's hard to trust OpenAI not to use enterprise data to train the next model, in a market where content is the most valuable asset, compared with office suites, cloud databases, etc.
Microsoft 365 has over 300 million corporate users, who trust it with email, document management, collaboration, and more. It's the de facto standard in larger companies, especially in banking, medicine, and finance, which face more rigorous compliance regulations.
The administrative segments that decide to sell their firstborn to Microsoft all have their heads in the clouds. They'll pay Microsoft to steal their data and resell it, and they'll defend their decision-making beyond their own demise.
As such, Microsoft is making the right choice in outright stealing data for whatever purpose. It will have no real consequences.
A flick of an IT policy switch disables that, as at my organization. The feature was instead intended to snag single, non-corporate user accounts (still horrible, but my point is that MS at no point expected a company's IT department to actually leave that training feature enabled in policy).
It doesn't need to / it already is: most enterprises are already Microsoft/Azure shops. Already approved, already there. What is close to impossible is using anything non-Microsoft, with one exception: open source.
They betrayed their customers in the Storm-0558 attack.
They didn't disclose the full scale, and they charged customers for the advanced logging needed to detect it.
Not to mention that they abolished QA and outsourced it to the customer.
Maybe they aren't, but when you already have all your documents in SharePoint, all your email in Outlook, and all your database VMs in Azure, then Azure OpenAI is trusted in the organization.
For some reason (mainly because Microsoft has orders of magnitude more sales reps than anyone else), companies have been trusting Microsoft with their most critical data for a long time.
For example, when they backed the CEO's coup against the board.
With AI CEOs (https://ai-ceo.org) this would never have happened, because their CEOs have a kill switch and a mobile app that gives the board full observability.
OpenAI's enterprise plan explicitly says that they do not train their models on your data. It's in the contract agreement, and it's also visible at the bottom of every ChatGPT prompt window.
It seems like a damned-if-you-do, damned-if-you-don't situation. How is ChatGPT going to provide relevant answers to company-specific prompts if they don't train on your data?
My personal take is that most companies don't have enough data, at sufficiently high quality, to use LLMs for company-specific tasks.
The model from OpenAI doesn’t need to be directly trained on the company’s data. Instead, they provide a fine-tuning API in a “trusted” environment. Which usually means Microsoft’s “Azure OpenAI” product.
But in practice, most applications use the RAG (retrieval-augmented generation) approach; actually doing fine-tuning is less common.
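Roughly, a minimal sketch of the RAG pattern, assuming the official OpenAI Python SDK; the model names and the in-memory "document store" are stand-ins for a real deployment. The point is that the company's data is only retrieved and passed in as context at query time, so the model's weights never see it:

```python
# Minimal RAG sketch: company data is retrieved at query time and
# injected as context; it is never used to train or fine-tune the model.
import numpy as np
from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Stand-in for a real vector store (pgvector, Pinecone, etc.).
docs = [
    "Basic plan: 100 Mbit/s, no static IP, 29 EUR/month.",
    "Pro plan: 1 Gbit/s, static IP, SLA, 89 EUR/month.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(docs)

def answer(question: str) -> str:
    q = embed([question])[0]
    # Cosine similarity against every stored document.
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    context = docs[int(np.argmax(sims))]
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": f"Answer using only this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("Which subscription includes a static IP?"))
```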
> The model from OpenAI doesn’t need to be directly trained on the company’s data
Wouldn't that depend on what you expect it to do? If you just want, say, Copilot, text summarization, or help writing emails, then you're probably fine. If you want to use ChatGPT to help solve customer issues or debug problems specific to your company, wouldn't you need to feed it your own data? I'm thinking: for "help me find the correct subscription for a customer with these parameters", ChatGPT would need to know your pricing structure.
One idea I've had, from an experience with an ISP, would be to have the LLM tell customer service: hey, this issue is similar to what five of your colleagues just dealt with, in the same area, within 30 minutes; you should consider escalating this to a technician. That would require more or less live feedback to the model, or am I misunderstanding how the current AIs would handle that information?
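Or would something like the sketch below be how it's actually done, with a service querying a live ticket store at answer time instead of retraining anything? (All field names and thresholds here are hypothetical.)

```python
# Sketch: flag a likely incident cluster by querying recent tickets at
# answer time; nothing is fed back into the model's weights.
from datetime import datetime, timedelta, timezone

def similar_recent_tickets(ticket, all_tickets, window_minutes=30):
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=window_minutes)
    return [
        t for t in all_tickets
        if t["created_at"] >= cutoff
        and t["area"] == ticket["area"]
        and t["category"] == ticket["category"]  # or embedding similarity
    ]

def escalation_hint(ticket, all_tickets, threshold=5):
    matches = similar_recent_tickets(ticket, all_tickets)
    if len(matches) >= threshold:
        return (f"{len(matches)} similar tickets in {ticket['area']} in the last "
                f"30 minutes; consider escalating to a technician.")
    return None  # nothing unusual, answer normally
```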
100% this. If they can figure out trust through some paradigm where enterprises can use the models without having to trust OpenAI itself directly, then $200 will be less of an issue.
> It's hard to trust OpenAI not to use enterprise data to train the next model
Bit of a cynical take. A company like OpenAI stands to lose enormously if anyone catches them doing dodgy shit in violation of their agreements with users. And it's very hard to keep dodgy behaviour secret in any decent-sized company, where any embittered employee can blow the whistle. VW only just managed it with Dieselgate by keeping the circle of conspirators very small.
If their terms say they won't use your data now or in the future then you can reasonably assume that's the case for your business planning purposes.
Lawsuits over the legality of using someone's writing as training data aren't the same thing as saying you won't use someone as training data and then doing so anyway. They're different things: one is people being upset that their work was used in a way they didn't anticipate, and wanting additional compensation because a computer reading their work is different from a person reading it; the other is saying you won't do something, doing it anyway, and lying about it.
It's not that anyone suspects OpenAI of doing dodgy shit. Data flowing out of an enterprise is very high risk, no matter what security safeguards you employ. So they want everything inside their cloud perimeter, on servers they control.
IMO no big enterprise will adopt ChatGPT unless it's all hosted in their own cloud. Open-source models lend themselves better to enterprises in this regard.
> IMO no big enterprise will adopt ChatGPT unless it's all hosted in their own cloud
80% of big enterprises already use MS SharePoint hosted in Azure for some of their document management. It's certified for storing medical and financial records.
Cynical? That'd be on brand, especially with the ongoing lawsuits, the exodus of people, and the CEO drama a while back. I'd have a hard time recommending them as a partner over Anthropic or open source.
It's not enough for some companies that need to ensure it won't happen.
I know for a fact that a major corporation I do work for is vehemently against any use of generative AI by its employees (that was just drilled into my head multiple times by their mandatory annual cybersecurity training), although I believe they are working toward getting some fully internal solution running at some point.
Kind of funny that Google includes generative AI answers by default now, so I still see those answers just by doing a Google search.
I've seen the enterprise version with a top-5 consulting company, and it answers from their global knowledgebase, cites references, and doesn't train on their data.
I recently (in the last month) asked ChatGPT to cite its sources for some scientific data. It gave me completely made up, entirely fabricated citations for academic papers that did not exist.
The behavior you're describing sounds like older model behavior. When I ask for links to references these days, it searches the internet and gives me links to real papers that are often actually relevant and helpful.
I don't recall whether it ever mentioned that it searched or not. I don't have the search on hand, but from my browser history I did the prompting on 11/18 (perhaps there's a new model since then?).
I actually repeated the prompt just now, and it gave me the correct, opposite response. For those curious: I had asked ChatGPT what turned on a gene, and it said Protein X turns on Gene Y, as per a fabricated citation. Asking today whether Protein X turns on Gene Y, ChatGPT said there is no evidence, and showed two real citations of factors that may turn on Gene Y.
So sorry to offend your delicate sensibilities by calling out a blatant lie from someone completely unrelated to you. Taking offense on their behalf is pretty bizarre behavior in itself.
As just another example, ChatGPT said that in the Okita paper they switched media on day 3, when if you read the paper they switched the media on day 8. So not only did it fail to generate the correct reference, it also failed to accurately interpret the contents of a specific paper.
I’m a pretty experienced developer and I struggle to get any useful information out of LLMs for any non-trivial task.
At my job (at an LLM-based search company), our CTO uses it on occasion (I can tell by the contortions in his AI code that aren't present in his handwritten code; I rarely need to fix the former).
And I think our interns used it for a demo one week, but I don’t think it’s very common at my company.
Won’t name my company, but we rely on Palantir Foundry for our data lake.
And the only thing everybody wants [including Palantir itself] is to deploy AI capabilities at scale, tied properly to the rest of the toolset/datasets.
The issues at the moment are a mix of IP rights on the data, insurance on the security of private cloud infrastructures, deals between Amazon and Microsoft/OpenAI for proper integration of ChatGPT on AWS, all these kinds of things.
But dismissing enterprise needs is, in my opinion, a [very] wrong assumption.
Very personal feeling, but without a data lake organized the way Foundry is organized, I don't see how you can manage [cold] data at scale in a company [in terms of size, flexibility, semantics, or R&D]. Given that IT services in big companies WILL fail to build and maintain such a horribly complex stack, the walled-garden nature of the Foundry stack is not so stupid.
But all that is the technical part of things. Markets do not bless products. They bless revenues. And from that perspective, I have NO CLUE.
This is what's so brilliant about the Microsoft "partnership": OpenAI gets Microsoft's enterprise legitimacy, while Microsoft can build interfaces on top of ChatGPT that it can swap out later for whatever it wants, when it suits it.
I think this is good for Microsoft, but less good for OpenAI.
Microsoft owns the customer relationship, owns the product experience, and in many ways owns the productionisation of a model into a useful feature. They also happen to own the datacenter side as well.
Because Microsoft is the whole wrapper around OpenAI, they can also negotiate. If they think they can get a better price from Anthropic, Google (in theory), or their own internally created models, then they can pressure OpenAI to reduce prices.
OpenAI doesn't get Microsoft's enterprise legitimacy; Microsoft keeps that. OpenAI just gets preferential treatment as a supplier.
On the way up the hype curve it's the folks selling shovels that make all the money, but in a market of mature productionisation at scale, it's those closest to customers who make the money.
$10B of compute credits on a capped profit deal that they can break as soon as they get AGI (i.e. the $10T invention) seems pretty favorable to OpenAI.
I’d be significantly less surprised if OpenAI never made a single $ in profit than if they somehow invented “AGI” (of course nobody has a clue what that even means so maybe there is a chance just because of that..)
Leaving aside the “AGI on paper” point a sibling correctly made, your point shares the same basic structure as noting that any VC investment is a terrible deal if you only 2x your valuation. You might get $0 if there is a multiple on the liquidation preference!
OpenAI are clearly going for the BHAG. You may or may not believe in AGI-soon, but they do, and they're all in on this bet. So they simply don't care about the failure case (i.e., no AGI within the timeframe for which they can maintain runway).
OAI, through their API, probably does, but I agree that ChatGPT is not really an enterprise product. For the company, the API is the platform play; their enterprise customers are going to be the likes of MSFT, Salesforce, Zendesk, or, say, Apple powering Siri. Those are the ones doing the heavy lifting of selling and building an LLM product that provides value to their enterprise customers, a bit like Stripe/AWS. The question is whether OAI can form a durable platform (vs. their competitors or in-house LLMs), and whether they can offer models at a cost that justifies the upsell of the AI features their customers offer.
That's why Microsoft included OpenAI access in Azure. However, the current offering is quite immature, so companies are adding several pieces of infra to make it usable (rate limiting, better authentication, etc.).
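For example, a thin rate-limiting wrapper in front of the SDK's AzureOpenAI client; a minimal sketch, with the endpoint, deployment name, and limits all made up:

```python
# Sketch: a token-bucket rate limiter wrapped around Azure OpenAI calls.
import time
from openai import AzureOpenAI  # official OpenAI SDK's Azure client

client = AzureOpenAI(
    azure_endpoint="https://example.openai.azure.com",  # hypothetical endpoint
    api_key="...",                                      # from your secret store
    api_version="2024-02-01",
)

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate, self.capacity = rate_per_sec, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def acquire(self):
        # Refill based on elapsed time, then block until a token is free.
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)

bucket = TokenBucket(rate_per_sec=2, burst=5)

def chat(messages):
    bucket.acquire()  # enforce the org-wide rate before every call
    return client.chat.completions.create(
        model="my-gpt4-deployment",  # your Azure deployment name
        messages=messages,
    )
```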
> As for ChatGPT, it's a consumer tool, not an enterprise tool. It's not really integrated into an enterprise's existing toolset, it's not integrated into their authentication, it's not integrated into their internal permissions model, and the IT department can't enforce any policies on how it's used. In almost all ways it doesn't look like enterprise IT.
What, in your view, is the bare minimum it would take for it to be an enterprise tool?
SSO and enforceable privacy and IP protections would be a start. RBAC, queues, caching results, and workflow management would open a lot of doors very quickly.
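To sketch what I mean by the RBAC part: a minimal role check in front of an internal LLM proxy, where every role, permission, and the deny behavior is purely illustrative:

```python
# Sketch: a minimal RBAC gate in front of an internal LLM proxy.
from dataclasses import dataclass

# Role-to-permission mapping; in practice this would live in the IdP
# or a policy engine, not in code.
ROLE_PERMISSIONS = {
    "engineer": {"chat", "code-assist"},
    "analyst": {"chat"},
    "contractor": set(),  # no LLM access by default
}

@dataclass
class User:
    name: str
    role: str  # would come from the SSO/OIDC token in practice

def authorize(user: User, action: str) -> bool:
    return action in ROLE_PERMISSIONS.get(user.role, set())

def handle_prompt(user: User, prompt: str) -> str:
    if not authorize(user, "chat"):
        return "403: your role does not permit LLM access"
    # ...forward the prompt to the model behind the proxy...
    return "(model response)"

print(handle_prompt(User("alice", "analyst"), "Summarize Q3 churn."))
print(handle_prompt(User("bob", "contractor"), "Summarize Q3 churn."))
```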
I've used it internally at 2 different enterprises; the issue is price more than anything. Enterprises definitely do want to self-host, but for frontier tech they want frontier models, for solving complicated unsolved problems or building efficiencies into complicated workflows. One company had to rip it out for a time due to price; I no longer work there, so I can't speak to whether it was reintegrated.
Decision making in enterprise procurement is more about whether it makes the corporation money and whether there is immediate and effective support when it stops making money.
I don't think user-submitted questions/answers are as useful for training as you (and many others) think. They're not useless, but they're certainly not some goldmine either, considering how noisy they are (from the users) and how synthetic they are (the responses). Further, while I wouldn't put it past them to use user data that way, there's certainly a PR/controversy cost to doing so, even if it's outlined in their ToS.
In an enterprise, long content or whole documents will be poured into ChatGPT if the company imposes no policy limits, and that can be meaningful training data.
At the very least, there's a possibility this content can be seen by OpenAI staff when reviewing bad cases, so privacy concerns remain.
No, because a lot of people asking you questions doesn't mean you have the answers to them. It's an opportunity to find the answers by hiring "AI trainers" and putting their responses in the training data.
Yeah, it's a fairly standard clause in the paid business versions of SaaS products that your data isn't used to train the model. The whole thing you're selling is per-company isolation, so you don't want to go back on that.
Whether your data is used for training or not is an approximation of whether you're using the tool for commercial applications, so it's a pretty good way to price-discriminate.