
The problem is that OpenAI doesn't really have the enterprise market at all. Their APIs are closer, in that many companies are using them to power features in other software (primarily via Microsoft), but OpenAI isn't the one providing end-user value to enterprises with those APIs.

As for ChatGPT, it's a consumer tool, not an enterprise tool. It's not really integrated into an enterprise's existing toolset, it's not integrated into their authentication, it's not integrated into their internal permissions model, and the IT department can't enforce any policies on how it's used. In almost all ways it doesn't look like enterprise IT.



This reminds me why enterprises don't integrate OpenAI products into their existing toolsets: trust is the root reason.

It's hard to trust that OpenAI won't take enterprise data to train its next model, in a market where content is the most valuable asset, compared to office suites, cloud databases, etc.


This is what the Azure OpenAI offering is supposed to solve, right?


Sort of?

Then there’s trust that it won’t make up information.

It probably won’t be used for any HR/legal work for fear of false info being generated.


Correct


Why should MS be more trustworthy than OpenAI?

MS failed their customers more than once.


Microsoft 365 has over 300 million corporate users, trusting it with email, document management, collaboration, etc. It's the de facto standard in larger companies, especially in banking, medicine, and finance, which have more rigorous compliance regulations.


And MS has already shown that customers shouldn't trust them.

https://news.ycombinator.com/item?id=37408776

Maybe it's a good idea to spread your data around and not put it all in one place, if you really need to use the cloud.


The administrative segments that decide to sell their firstborn to Microsoft all have their heads in the clouds. They'll pay Microsoft to steal their data and resell it, and they'll defend their decision-making beyond their own demise.

As such, Microsoft is making the right choice in outright stealing data for whatever purpose. It will have no real consequences.


I think the case could be made that “spreading your data” is exactly what you don’t want to do, you’re increasing your attack surface.


Not like most had a choice: they already had Office documents and Windows, so what else were they going to pick?

Your historical pile of millions of MS Office documents is an ocean-sized moat.


Surely MS wouldn't abuse that trust.

https://news.ycombinator.com/item?id=42245124


An IT policy flick of the switch disables that, as at my organization. It was instead intended to snag individual, non-corporate user accounts (still horrible, but I mean to convey that MS at no point in all that expected a company's IT department to actually leave that training feature enabled in policy).


This was debunked within hours, as commented on that thread last week.


It doesn't need to / it already is – most enterprises are already Microsoft/Azure shops. Already approved, already there. What is close to impossible is to use anything non-Microsoft – with one exception: open source.


Because, idk, Windows, AD, Office, and so many more Microsoft products could already betray that customer trust but don't.


They betrayed their customers in the Storm-0558 attack. They didn't disclose the full scale, and they charged customers for the advanced logging needed to detect it.

Not to mention that they abolished QA and outsourced it to the customer.


How do you know they don't?


It is immaterial what they do and what you know. What matters is what enterprise CIOs believe.


Because that would be the biggest story in the world.


Maybe they aren't, but when you already have all your documents in SharePoint, all your emails in Outlook, and all your database VMs in Azure, then Azure OpenAI is trusted in the organization.


For some reason (mainly because Microsoft has orders of magnitude more sales reps than anyone else), companies have been trusting Microsoft with their most critical data for a long time.


They sign business associate agreements. It's good enough for HIPAA compliance.


The devil you know


For example, when they backed the CEO's coup against the board.

With AI CEOs (https://ai-ceo.org) this would never have happened, because their CEOs have a kill switch and a mobile app giving the board full observability.


OpenAI's enterprise plan explicitly says that they do not train their models on your data. It's in the contract agreement, and it's also visible at the bottom of every ChatGPT prompt window.


It seems like damned if you do, damned if you don't. How is ChatGPT going to provide relevant answers to company-specific prompts if they don't train on your data?

My personal take is that most companies don't have enough data, and not in sufficiently high quality, to be able to use LLMs for company specific tasks.


The model from OpenAI doesn’t need to be directly trained on the company’s data. Instead, they provide a fine-tuning API in a “trusted” environment. Which usually means Microsoft’s “Azure OpenAI” product.

But really, in practice, most applications are using the “RAG” (retrieval augmented generation) approach, and actually doing fine tuning is less common.
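To make the distinction concrete, here's a minimal RAG sketch in Python. The search_internal_docs helper and the model name are illustrative assumptions, not anything OpenAI actually ships; the point is that the company's data stays in its own store and is only injected into the prompt at query time:

    # Minimal RAG sketch: the model is never trained on company data;
    # it only sees retrieved passages at inference time.
    # search_internal_docs is a hypothetical stand-in for whatever
    # vector/keyword search the enterprise runs inside its own perimeter.
    from openai import OpenAI

    client = OpenAI()

    def answer(question: str, search_internal_docs) -> str:
        # 1. Retrieve: pull relevant passages from the internal store.
        passages = search_internal_docs(question, top_k=3)

        # 2. Augment: put the retrieved text into the prompt as context.
        context = "\n\n".join(passages)
        prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"

        # 3. Generate: the base model consumes the data per-request only.
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content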


> The model from OpenAI doesn’t need to be directly trained on the company’s data

Wouldn't that depend on what you expect it to do? If you just want, say, Copilot, text summarization, or help writing emails, then you're probably good. If you want to use ChatGPT to help solve customer issues or debug problems specific to your company, wouldn't you need to feed it your own data? I'm thinking of prompts like "Help me find the correct subscription for a customer with these parameters"; then you'd need ChatGPT to know your pricing structure.

One idea I've had, from an experience with an ISP, would be to have the LLM tell customer service: Hey, this is an issue similar to what five of your colleagues just dealt with, in the same area, within 30 minutes. You should consider escalating this to a technician. That would require more or less live feedback to the model, or am I misunderstanding how the current AIs would handle that information?
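To sketch the second idea (all names and thresholds here are made up): embedding similarity over a live ticket store could flag the cluster without retraining anything, since the knowledge lives in the store rather than in the model weights:

    # Hypothetical sketch: flag clusters of similar, recent tickets via
    # embedding similarity over a live store -- no retraining involved.
    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def embed(text: str) -> np.ndarray:
        resp = client.embeddings.create(
            model="text-embedding-3-small", input=text
        )
        return np.array(resp.data[0].embedding)

    def should_escalate(new_ticket: str, recent: list[np.ndarray],
                        threshold: float = 0.85, min_matches: int = 5) -> bool:
        # recent: embeddings of tickets from the same area in the last
        # 30 minutes, pulled from an ordinary database.
        v = embed(new_ticket)
        sims = [
            float(np.dot(v, r) / (np.linalg.norm(v) * np.linalg.norm(r)))
            for r in recent
        ]
        return sum(s > threshold for s in sims) >= min_matches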


> Instead, they provide a fine-tuning API


Most enterprise use cases also have strong authz requirements.

You can't really maintain authz while fine tuning (unless you do a separate fine-tune for each permission set.) So RAG is the way to go, there.
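A rough sketch of what that looks like (the helpers here are hypothetical): apply the caller's permissions as a filter on retrieved documents before the model ever sees them, which a single shared fine-tune cannot do per request:

    # Permission-aware RAG sketch: authz enforced at retrieval time.
    def retrieve_for_user(query: str, user, vector_search, top_k: int = 5):
        # Over-fetch, then drop anything this caller can't read; the LLM
        # only ever receives documents the user is allowed to see.
        candidates = vector_search(query, top_k=top_k * 4)
        allowed = [d for d in candidates if user.can_read(d.acl)]
        return allowed[:top_k]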


> How is ChatGPT going to provide relevant answers to company specific prompts if they don't train on your data?

Isn't this explicitly what RAG is for?


RAG is worse than training on the target data, but yes it is a mitigation.


That is a MASSIVE game changer!


100% this. If they can figure out trust through some paradigm where enterprises can use the models but not have to trust OpenAI itself directly, then $200 will be less of an issue.


> It's hard to provide trust to OpenAI that they won't steal data of enterprise to train next model

Bit of a cynical take. A company like OpenAI stands to lose enormously if anyone catches them doing dodgy shit in violation of their agreements with users. And it's very hard to keep dodgy behaviour secret in any decent sized company where any embittered employee can blow the whistle. VW only just managed it with Dieselgate by keeping the circle of conspirators very small.

If their terms say they won't use your data now or in the future then you can reasonably assume that's the case for your business planning purposes.


Is it? OpenAI has multiple lawsuits over misuse of data, and it doesn't seem to be slowing them down much.

https://news.bloomberglaw.com/ip-law/openai-to-seek-to-centr...

Just make sure your chat history is off for starters. https://www.threatdown.com/blog/how-to-keep-your-chatgpt-con...


Lawsuits over the legality of using someone's writing as training data aren't the same thing as them saying they won't use you as training data and then doing so. They're different things. One is people being upset that their work was used in a way they didn't anticipate, and wanting additional compensation because a computer reading their work is different from a person reading their work. The other is saying you won't do something, doing it anyway, and lying about it.


It's not that anyone suspects OpenAI of doing dodgy shit. Data flowing out of an enterprise is very high risk, no matter what security safeguards you employ. So they want everything inside their cloud perimeter, on servers they control.

IMO no big enterprise will adopt chatGPT unless it's all hosted in their cloud. Open-source models lend themselves better to enterprises in this regard.


> IMO no big enterprise will adopt chatGPT unless it's all hosted in their cloud

80% of big enterprises already use MS Sharepoint hosted in Azure for some of their document management. It’s certified for storing medical and financial records.


> IMO no big enterprise will adopt chatGPT unless it's all hosted in their cloud

Plenty of big enterprises have been using OpenAI models for a good while now.


Cynical? That’d be on brand… especially with the ongoing lawsuits, the exodus of people and the CEO drama a while back? I’d have a hard time recommending them as a partner over Anthropic or Open Source.


It's not enough for some companies that need to ensure it won't happen.

I know for a fact a major corporation I do work for is vehemently against any use of generative A.I. by its employees (just had that drilled into my head multiple times by their mandatory annual cybersecurity training), although I believe they are working towards getting some fully internal solution working at some point.

Kind of funny that Google includes generative A.I. answers by default now, so I still see those answers just by doing a Google search.


If everyone has the same terms and roughly equivalent models, enterprises will continue choosing Microsoft and Amazon.


This seems like the kind of thing that laws and regulators exist for.


Good luck with that. Fortunately few CTOs/CEOs share your faith in a company already guilty of rampant IP theft, run by a serial liar.


ChatGPT does have an enterprise version.

I've seen the enterprise version with a top-5 consulting company, and it answers from their global knowledgebase, cites references, and doesn't train on their data.


I recently (in the last month) asked ChatGPT to cite its sources for some scientific data. It gave me completely made up, entirely fabricated citations for academic papers that did not exist.


Did the model search the internet?

The behavior you're describing sounds like an older model's. When I ask for links to references these days, it searches the internet and then gives me links to real papers that are often actually relevant and helpful.


I don't recall that it ever mentioned whether it did or not. I don't have the search on hand, but from my browser history I did the prompt engineering on 11/18 (perhaps there's been a new model since then?).

I actually repeated the prompt just now, and it gave me the correct, opposite response. For those curious, I asked ChatGPT what turned on a gene, and it said Protein X turns on Gene Y as per -fake citation-. Asking today if Protein X turns on Gene Y, ChatGPT said there is no evidence, and showed 2 real citations of factors that may turn on Gene Y.

Pretty impressed!


Share a link to the conversation.


Here you go: https://chatgpt.com/share/6754df02-95a8-8002-bc8b-59da11d276...

ChatGPT regularly searches and links to sources.


I was asking for a link to the conversation from the person I was replying to.


What a bizarre thing to request. Do you go around accusing everyone of lying?


So sorry to offend your delicate sensibilities by calling out a blatant lie from someone completely unrelated to yourself. Pretty bizarre behavior in itself to do so.


Except there are news stories of this happening to people


I suspect there being a shred of plausibility is why there are so many people lying about it for attention.

It’s as simple as copying and pasting a link to prove it. If it is actually happening, it would benefit us all to know the facts surrounding it.


Sure, here's a link to a conversation from today, 12/9/24, which has multiple incorrect references, links, papers, journal titles, DOIs, and authors.

https://chatgpt.com/share/6757804f-3a6c-800b-b48c-ffbf144d73...

As just another example, ChatGPT said that in the Okita paper they switched media on day 3, when if you read the paper they switched the media on day 8. So not only did it fail to generate the correct reference, it also failed to accurately interpret the contents of a specific paper.


I assume top-5 consulting companies are buying to be on the bandwagon, but are the rank and file using it?


YMMV wrt your experience and luck.

I’m a pretty experienced developer and I struggle to get any useful information out of LLMs for any non-trivial task.

At my job (at an LLM-based search company) our CTO uses it on occasion (I can tell by the contortions in his AI code that aren't present in his handwritten code; I rarely need to fix the former).

And I think our interns used it for a demo one week, but I don’t think it’s very common at my company.


Yes, daily. It's extremely useful, superior to internal search while combining the internal knowledge base with ChatGPT's


In my experience, consultants are using an absolute ton of ChatGPT.


Do you mean Azure OpenAI? That would be a Microsoft product.


Won't name my company, but we rely on Palantir Foundry for our data lake. And the only thing everybody wants [including Palantir itself] is to deploy AI capabilities at scale, tied properly to the rest of the toolset/datasets.

The issues at the moment are a mix of IP rights on the data, insurance on the security of private cloud infrastructure, deals between Amazon and Microsoft/OpenAI for proper integration of ChatGPT on AWS, all these kinds of things.

But discarding the enterprise needs is in my opinion a [very] wrong assumption.


Is the Foundry business the reason for the run up of PLTR this year?

https://www.cnbc.com/quotes/PLTR


Very personal feeling, but without a data lake organized the way Foundry is organized, I don't see how you can manage [cold] data at scale in a company [both in terms of size, flexibility, semantics, and R&D]. Given that IT services in big companies WILL fail to build and maintain such a horribly complex stack, the walled-garden nature of the Foundry stack is not so stupid.

But all that is the technical part of things. Markets do not bless products. They bless revenues. And from that perspective, I have NO CLUE.


This is what's so brilliant about the Microsoft "partnership". OpenAI gets the Microsoft enterprise legitimacy, meanwhile Microsoft can build interfaces on top of ChatGPT that they can swap out later for whatever they want when it suits them.


I think this is good for Microsoft, but less good for OpenAI.

Microsoft owns the customer relationship, owns the product experience, and in many ways owns the productionisation of a model into a useful feature. They also happen to own the datacenter side.

Because Microsoft is the whole wrapper around OpenAI, they can also negotiate. If they think they can get a better price from Anthropic, Google (in theory), or their own internally created models, then they can pressure OpenAI to reduce prices.

OpenAI doesn't get Microsoft's enterprise legitimacy; Microsoft keeps that. OpenAI just gets preferential treatment as a supplier.

On the way up the hype curve it's the folks selling shovels that make all the money, but in a market of mature productionisation at scale, it's those closest to customers who make the money.


$10B of compute credits on a capped profit deal that they can break as soon as they get AGI (i.e. the $10T invention) seems pretty favorable to OpenAI.


I’d be significantly less surprised if OpenAI never made a single $ in profit than if they somehow invented “AGI” (of course nobody has a clue what that even means so maybe there is a chance just because of that..)


That's a great deal if they reach AGI, and a terrible deal ($10bn of equity given away for vendor-locked credit) if they don't.


Fortunately for OpenAI the contract states that they get to say when they have invented AGI.

Note: they recently announced that they will have invented AGI in precisely 1000 days.


Leaving aside the “AGI on paper” point a sibling correctly made, your point shares the same basic structure as noting that any VC investment is a terrible deal if you only 2x your valuation. You might get $0 if there is a multiple on the liquidation preference!

OpenAI are clearly going for the BHAG. You may or may not believe in AGI-soon but they do, and are all in on this bet. So they simply don’t care about the failure case (ie no AGI in the timeframe that they can maintain runway).


How so?

Still seems like owning the customer relationship like Microsoft is far more valuable.


OAI through their API probably does, but I do agree that ChatGPT is not really an enterprise product. For the company, the API is the platform play; their enterprise customers are going to be the likes of MSFT, Salesforce, Zendesk, or, say, Apple to power Siri. These are the ones doing the heavy lifting of selling and making an LLM product that provides value to their enterprise customers, a bit like Stripe/AWS. Whether OAI can form a durable platform (vs. their competitors or in-house LLMs) is the question here, or whether they can offer models at a cost that justifies the upsell of the AI features their customers offer.


That's why Microsoft included OpenAI access in Azure. However, their current offering is quite immature, so companies are using several pieces of infra to make it usable (for rate limiting, better authentication, etc.).
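As one example of that extra infra, here's a hedged sketch of the kind of client-side wrapper teams add for rate limiting. call_model is a stand-in for whatever deployment is being called; RateLimitError is the openai client's exception for 429 responses:

    # Illustrative retry-with-backoff wrapper around a rate-limited API.
    import time
    from openai import RateLimitError

    def call_with_backoff(call_model, prompt: str, max_retries: int = 5):
        delay = 1.0
        for _ in range(max_retries):
            try:
                return call_model(prompt)
            except RateLimitError:
                time.sleep(delay)   # exponential backoff before retrying
                delay *= 2
        raise RuntimeError("still rate limited after retries")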


> As for ChatGPT, it's a consumer tool, not an enterprise tool. It's not really integrated into an enterprises' existing toolset, it's not integrated into their authentication, it's not integrated into their internal permissions model, the IT department can't enforce any policies on how it's used. In almost all ways it doesn't look like enterprise IT.

What, according to you, is the bare minimum it would take for it to be an enterprise tool?


SSO and enforceable privacy and IP protections would be a start. RBAC, queues, caching results, and workflow management would open a lot of doors very quickly.
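As a sketch of the shape of thing IT departments want (all names here are invented, not any shipping product): an internal gateway that takes SSO-derived identity, checks a role, and caches results before anything reaches the model:

    # Hypothetical internal LLM gateway: RBAC check plus result caching.
    import hashlib

    _cache: dict[str, str] = {}

    def gateway(user, prompt: str, call_model) -> str:
        # RBAC: only roles approved by IT may reach the model at all.
        if "llm-user" not in user.roles:
            raise PermissionError("role 'llm-user' required")

        # Cache identical prompts to control spend and ease auditing.
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key not in _cache:
            _cache[key] = call_model(prompt)  # e.g. an Azure deployment
        return _cache[key]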


It seems that ChatGPT Enterprise already has many of these:

https://openai.com/enterprise-privacy/


OpenAI's enterprise access is probably mostly happening through Azure. Azure has AI Services with access to OpenAI.


I have used it internally at 2 different enterprises; the issue is price more than anything. Enterprises definitely do want to self-host, but for frontier tech they want frontier models, for solving complicated unsolved problems or building efficiencies into complicated workflows. One company had to rip it out for a time due to price; I no longer work there, though, so I can't speak to whether it was reintegrated.


Decision making in enterprise procurement is more about whether it makes the corporation money and whether there is immediate and effective support when it stops making money.


>> internal permissions model

This isn't that big of a deal any more. A company just needs to add the application to Azure AD (now called Entra for some reason).


Is their value proposition self-fulfilling: the more people pipe their queries to OpenAI, the more training data they have to get better?


I don't think user-submitted questions/answers are as useful for training as you (and many others) think. They're not useless, but they're certainly not some goldmine either, considering how noisy they are (from the users) and how synthetic they are (the responses). Further, while I wouldn't put it past them to use user data in that way, there's certainly a PR/controversy cost to doing so, even if it's outlined in their ToS.


In an enterprise, long content or whole documents will be poured into ChatGPT if the company has no policy limiting it, and that can be meaningful training data.

At the very least, there's the possibility that this content can be seen by OpenAI staff when flagged as a bad case, so privacy concerns remain.


No, because a lot of people asking you questions doesn't mean you have the answers to them. It's an opportunity to find the answers by hiring "AI trainers" and putting their responses in the training data.


Not for enterprise: the standard terms forbid training on queries.


Not sure how valuation comes into play here, but I doubt enterprise clients would agree to have their queries used for training.


Yeah, it's a fairly standard clause in the paid business versions of SaaS products that your data isn't used to train the model. The whole thing you're selling is per-company isolation, so you don't want to go back on that.

Whether your data is used for training or not is an approximation of whether you're using a tool for commercial applications, so a pretty good way to price discriminate.


Also, a replacement for search



