NVIDIA GeForce driver deployment in datacenters is forbidden now (nvidia.com)
358 points by f2n on Dec 25, 2017 | 200 comments


This is a great time to remind everyone that AMD does not have this restriction for their Radeon graphics cards. Also, AMD has always been very supportive of the community and they've always respected their customers.

After decades of being the principled underdog, hopefully everyone can rally around them and make sure their open source projects work well with AMD products and contribute to their new open source initiatives.


That would have been hard to write and genuinely support a couple years ago, when NVidia ruled the top of the line as well as the best value-for-money most of the way down the chain. It was a hefty hit to your framerate or your wallet to support AMD.

But these days AMD is a serious contender at all levels! Where the performance or value isn't actually in AMD's favor, it's close enough to be undetectable outside of a benchmark, making it very easy to support the good guys.


> very easy to support the good guys

But AMD doesn’t run CUDA? Is it very easy to switch a stack relying on CUDA to OpenCL? I don’t think so.


AMD's ROCm suite has a tool for converting CUDA to C++ code: https://github.com/ROCm-Developer-Tools/HIP

How well it works is likely codebase and application dependent.
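Roughly, the hipify tools do source-to-source renaming of the CUDA runtime API into HIP equivalents (the real tools are clang- and perl-based and handle much more, including kernel launch syntax). A toy Python sketch of just the mechanical renaming part, with an illustrative subset of the mappings:

```python
import re

# A few of the mechanical CUDA -> HIP renames hipify performs.
# (Illustrative subset only; the real tool covers the whole runtime API.)
RENAMES = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
}

def toy_hipify(source: str) -> str:
    """Naively rewrite CUDA runtime calls to their HIP equivalents.
    Longest names first so cudaMemcpyHostToDevice wins over cudaMemcpy."""
    pattern = re.compile("|".join(sorted(RENAMES, key=len, reverse=True)))
    return pattern.sub(lambda m: RENAMES[m.group(0)], source)

snippet = "cudaMalloc(&d_x, n); cudaMemcpy(d_x, x, n, cudaMemcpyHostToDevice);"
print(toy_hipify(snippet))
# hipMalloc(&d_x, n); hipMemcpy(d_x, x, n, hipMemcpyHostToDevice);
```

The resulting HIP source can then be compiled for either vendor, which is the point of the intermediary.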


I’ve been using OpenCL for some neural-net image optimisation recently, and I’ve actually found it very efficient (citation obviously needed). I personally welcome the competition from AMD, and it does seem like Nvidia is digging its own grave (not that I think it’ll die).


I don't think it's that hard, but it may depend a bit on how long your codebase has evolved around CUDA.


Here was my experience. Everything in OpenCL ended up slower, with 30%+ longer walltimes on nvidia hardware. Also some problems with floating point precision. The real problem was that there was no business case to switch, because we could always make customers eat the cost of taking nvidia gpus.

The added code complexity was not worth it - we originally did it as NRE for a big client who wanted OEM capabilities to our IP.


OpenCL is slower than CUDA, but it's not the end of the world.


got a reference?


That is simply not true.


Please be more specific. What is not true? That it is slower or that this is not the end of the world?


Wow, people are butthurt. It's a widely known fact in the industry. A simple google search for benchmarks, or a PDF like this, would have satisfied them. But they didn't want the truth, they wanted to be butthurt.

https://arxiv.org/ftp/arxiv/papers/1005/1005.2581.pdf

In basically every chart and graph, OpenCL is slower.

OpenCL has a higher level of abstraction, so it's got a higher penalty for it. But, you get cross-platform support. nVidia doesn't optimize for OpenCL either because they don't want you to use it over their own competing framework.

But 'dats cool, keep downvoting, guys. Sources aren't real if you can downvote 'em enough. God, I love Hacker News.


Would you kindly raise the level of discourse here? You have useful information, so thanks for sharing it. But please leave the swipes and complaints about downvotes and HN members out of it. It worsens discussion and is against the guidelines. You'll likely receive downvotes for that, more than anything else.


This reddit-style, highly emotional commenting is not terribly welcome on HN.


There are zero DL frameworks that support AMD cards as a primary target.

Most have some kind of branch or patchset with OpenCL support. The problem is that they aren't great. If you need any new layers there is no support. There is nothing like CuDNN so you don't get the high speed convolutional kernels.

It's great to blame developers for supporting NVidia, but the thing is NVidia are great to work with. They dedicate large teams to deep learning support (not like the two or three part-time devs AMD assigns), and they publish good research and tutorials. AMD does nothing like this.


AMD has always had first-class support for OpenCL; CUDA is Nvidia proprietary, and although Nvidia "supports" OpenCL, its support is quite bad. Issue #22 on tensorflow is regarding OpenCL support.

AMD has done some interesting work on HCC (a more proper gpgpu compiler approach with llvm base) and that is showing promise. See here: https://instinct.radeon.com/wp-content/uploads/sites/4/2017/...

Additionally, they support converting CUDA to HIP, an intermediary that can be built to target Nvidia (via nvcc) or AMD (via HCC).

Nvidia has built a lot of tooling for DL, such as cudnn, and the new Tesla cards have dedicated silicon for tensor calculation. AMD does have a cudnn equivalent called MIOpen. They have also ported caffe via HIP and it works well. Work is being done by AMD right now on torch, mxnet and tensorflow to add support for AMD hardware with minimal burden to the maintainers of these projects.

You can read about some of the DL toolkits available here: https://instinct.radeon.com/en/6-deep-learning-projects-amd-...

I think it's particularly bad form on the part of everyone in the DL framework and library world to cater only to Nvidia and cuda, and that they very much walked into this shakedown with open arms.

The original comment is correct in that contributing support for OpenCL (which works on mobile too) will alleviate this to a fair degree. It's one of those things where the more momentum is behind it, the more device manufacturers will focus on ensuring their OpenCL compiler is building properly optimized kernels for their hardware.

Start contributing to OpenCL or adding hip support to existing projects and we'll see some viable alternatives pop up from not only AMD, but players like Qualcomm and Samsung.
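For a sense of what's involved, an OpenCL kernel itself is small and portable; the per-device work lives in the vendor's compiler. A minimal vector-add kernel, held in a Python string for illustration (the host-side code to build and submit it, e.g. via pyopencl, is omitted), plus a pure-Python reference of what each work-item computes:

```python
# Minimal OpenCL C kernel: one work-item per output element.
# Portable across vendors; the device compiler does the optimization.
VECTOR_ADD_KERNEL = """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out) {
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
"""

def vadd_reference(a, b):
    """What the kernel computes, expressed serially in Python."""
    return [x + y for x, y in zip(a, b)]

print(vadd_reference([1.0, 2.0], [3.0, 4.0]))  # [4.0, 6.0]
```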


Until quite recently OpenCL was a C-only game, so blame Khronos for not embracing all HPC-relevant languages the way CUDA does.


OpenCL doesn't have an equivalent to CuDNN.

I'm not saying it's a great situation, I'm saying that NVidia has always had better libraries, tools and performance, and it isn't surprising that developers use them.

Deep learning is hard and slow enough without using second class tools.


Much respect to NVIDIA and their software team but the situation is changing. PlaidML is like cuDNN for every GPU. Fully open source, faster in many cases than TF+cuDNN on NVIDIA, beats vendor tools on other architectures, Linux/Mac/Win. Supports Keras currently but more frameworks are not difficult (patches welcome).


The PlaidML benchmarks are suspect. They compare to Keras + Tensorflow, which is a really unfair comparison since 1) Tensorflow is probably the slowest of the big deep learning frameworks out there (compared to PyTorch, MXNet, etc.), and 2) Keras itself is quite slow. Keras is optimized more for ease of use, introduces lots of abstractions, and often doesn't take advantage of many TF optimizations (for just one example, until very recently Keras did not use TF's fused batch norm, which the TF docs claim provides a 10-30% speedup in overall network performance — that alone could be enough to account for many of the benchmarks showing PlaidML ahead).
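On the fused batch norm point: unfused batch norm executes as several separate passes over the activations, each paying a memory round trip, while a fused kernel does the same arithmetic in one pass. A pure-Python sketch of the unfused arithmetic (names and structure are mine for illustration, not TF's):

```python
import math

def batch_norm_unfused(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization as separate passes, the way an unfused graph
    executes it: mean, variance, normalize, then scale/shift.
    A fused kernel produces the same result in a single pass over x."""
    n = len(x)
    mean = sum(x) / n                                  # pass 1
    var = sum((v - mean) ** 2 for v in x) / n          # pass 2
    normed = [(v - mean) / math.sqrt(var + eps) for v in x]  # pass 3
    return [gamma * v + beta for v in normed]          # pass 4

out = batch_norm_unfused([1.0, 2.0, 3.0, 4.0])
print(out)  # roughly zero-mean, unit-variance
```

The speedup from fusion comes from eliminating the intermediate reads and writes, not from changing the math.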


In my opinion it's extremely fair. The benchmarks are Keras+PlaidML compared to Keras+TensorFlow, it allows running exactly the same nets (just imported from the Keras included applications) and whatever penalty Keras might impose is equal in the two cases. Having one very direct comparison is actually why we constructed the tests that way (none of the other frameworks run on our high priority platforms).

That said we'd be pretty excited if someone wanted to add support for TF, PyTorch, MXNet, etc. We like Keras but are happy to have integrations for all frameworks. With work you could pair it with Docker and containerize GPU-accelerated workloads without the guests even needing to know what hardware it's running on. Lots of possibilities.


No, no, no.

> whatever penalty Keras might impose is equal in the two cases.

The penalty Keras imposes when using Tensorflow depends on its Tensorflow implementation. The penalty Keras imposes when using MXNet depends on its MXNet implementation. The penalty Keras imposes when using PlaidML depends on whatever the PlaidML devs implemented. When you build a Keras layer, it's calling different Keras code for each backend.

The comparison would be fair if Plaid claimed to be the fastest Keras backend, not if it were actually claiming to be faster than Tensorflow.


There was someone on reddit/ml who posted some pretty interesting numbers for training.

I think they have a lot of challenges ahead of them, but I’m still more optimistic about Plaid than AMD’s own efforts.

AMD says that they don’t care about ML[1], and their actions back that up.

Edit: and to be clear, I think comparing Keras+Plaid vs Keras+TF is an entirely valid thing to do. Lots of people work in Keras, and if you download a random NN code off github it's likely to be Keras (or Pytorch now, of course).

[1] https://www.reddit.com/r/MachineLearning/comments/66bgmf/com...


PlaidML seems entirely geared towards inference (I only see batch size 1 anywhere). Training is important.


Batch 1 inference on convnets is key for us internally but training does work pretty well. The underlying machinery can do much more. Here's a blog post that talks about how it works with some links to more detailed docs & the actual implementations:

http://vertex.ai/blog/tile-a-new-language-for-machine-learni...

Two of the big motivators for opening the code were 1) giving students taking the popular courses a way to get started with GPU in whatever machine they've got (recent Intel GPUs in say a MacBook Air are enough) and 2) giving researchers a platform where it's simple to add efficient GPU-accelerated ops.

For scale on #2 check out the entire implementation of convolution:

https://github.com/plaidml/plaidml/blob/master/plaidml/keras...
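As a rough reference for what a convolution op has to compute (this is not the Tile implementation itself, just a naive pure-Python 'valid' convolution — the hard part frameworks tackle is making this fast on GPUs):

```python
def conv2d_valid(image, kernel):
    """Naive 'valid' 2D convolution (really cross-correlation, as in most
    DL frameworks). image and kernel are lists of lists of floats."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(acc)
        out.append(row)
    return out

img = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0],
       [7.0, 8.0, 9.0]]
k = [[1.0, 0.0],
     [0.0, 1.0]]
print(conv2d_valid(img, k))  # [[6.0, 8.0], [12.0, 14.0]]
```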


PlaidML is very promising I agree.


> AMD does have a cudnn equivalent called MIOpen.


MIOpen could be great one day. ATM you still get random problems like this: https://github.com/ROCmSoftwarePlatform/MIOpen/issues/19

That's an order of magnitude worse performance than NVidia on ResNet 52 for 2 months with no real reason.

Great idea, but no one can use it reliably yet.


apparently Caffe has support for AMD cards through ROCm

https://rocm.github.io/dl.html


That's exactly what I mean.

It's a patchset (note that it isn't upstreamed in their chart), and it's caffe. That was great in 2015.


For most Deep Learning developers, using AMD is out of question until DL frameworks like TensorFlow start natively supporting them. I am not really sure if/when this is going to happen. AMD needs to really step up their game.


One would think that this announcement from Nvidia would speed up this development. Doesn't this new policy render tensorflow + Nvidia pretty much useless ?


No. Tesla and Quadro are definitely exempt from this potential update to the EULA, and you can have a lot of GeForce cards outside of a datacenter.


Then this is again the same kind of money grab as they have done with active stereo rendering - it used to work even on consumer-level hardware.

But then it got popular outside of research labs too, so Nvidia has decided to milk the cow and restrict this feature only to their Quadro line-up, with ATI following suit. The result has been that they have pretty much killed the market - apart from those research labs nobody is going to buy a $3-4k Quadro card that has otherwise the same performance as a $300 GeForce only for the stereo support.

I think this sort of artificial crippling/restricting of usage to force industrial customers to use the more expensive hw they don't need otherwise will only lead to a proliferation of task-specific ASICs and Nvidia will hurt only itself with it. The reason why people use Nvidia GPUs for parallel computing is cost and ubiquitous availability, not because there aren't other options. This move will only accelerate their development - for which there were no reasons until now.



Hip caffe is also quite well underway.


Why is it on AMD to support TensorFlow? Pretty sure it is on the machine learning people to use the standard APIs like OpenCL or Vulkan etc instead of CUDA...


They’re not running a non-profit. It’s on them to respond to the market.


Goes the other way around too. This whole debacle demonstrates how dumb it is to base your product on a proprietary API with just one vendor.



Companies doing ML do not and should not base their decisions on hypothetical considerations about long-term effects on market health, or just present/future hardware cost. They work with what's available. It is the market's job to correct. And it surely will now, as the current situation becomes more of a problem.


Nobody wanted to do this, but the lousy state of OpenCL compared to CUDA left them no choice. It really is so far ahead that OpenCL was never an option. It’s on AMD to fix that, since NVIDIA certainly won’t.


No, because CUDA has first-class C++ support, while Khronos only added C++ support when it became obvious that no one wants to keep using C when better options are available in CUDA via PTX.


As with many things: it depends on who wants it more. In this case, I think AMD probably cares more.


That’s all fine and great, but CUDA. It’s a huge moat, and enterprises will likely just foot the bill and pass on the costs.


Apparently tensorflow is quite close to supporting opencl, and now that nvidia put a 10x multiple on their cards for ML use you can be sure that there will be an army of people making sure that opencl becomes the standard.


Just going to say: tensorflow supporting it won't mean much on its own. The ecosystem as a whole is still largely cuda. This includes the database tech (kinetica, mapd), the resource managers (mesos/yarn), not to mention HPC.

There's also just not much incentive for people to move to AMD here. There's a reason NVIDIA gives out these graphics cards like candy to academia. It's because they want to make their margins in data center while keeping the broader developer community locked in to cuda.

People on HN make broad and sweeping comments about "if you open it they will come and everything will be magically better".

It's a lot more complicated than that. The market incentives just aren't quite there yet. Could it happen one day? Yes. Will it happen today? No. It's going to take a lot more than this for other vendors to start supporting AMD.

What I will say: There will be competition and the space is heating up. AMD is one player.

Now let me put my commercial hat on here: What would it take for me as a deep learning vendor to support/care about AMD?

1. Show me the money. I need a clear revenue stream. AMD has some hope here. Customers don't care about which gpu they use as long as it fulfills a use case they care about.

2. Code: Show me something that's actually robust, not a one-off fork of a deep learning framework (see the random one-off caffe forks by nvidia, intel, and amd that aren't actually caffe).

3. Share the burden. Put the code out there and support the broader community. Actually maintain and follow up with the latest innovations (hint: this is hard. throwing code over the wall once doesn't mean anything)

4. Amazon will be key here - get them on board with some AMD cards. (Look, Google Cloud is great, but they aren't the leading cloud player by a long shot and likely won't be anytime soon.)

5. (Disclaimer I know the founders) - make opencl not suck. https://vertex.ai/ is an interesting player in this space.

So look, I won't say it's impossible. Let's just not ignore the actual commercial market forces that are also at play here.


There used to be this gigantic company that ruled networking world. Its name was Cisco. No one could touch it. Costs for a basic series XX with 4 slots was N, they charged X, for 6 slots, it was N+20, they charged 4X, for 16 slots it was N+200, they charged 30X.

What is Cisco is the current question.


> What is Cisco is the current question.

A $190 billion company earning 4x what nVidia does, a mere $9-$10 billion in net income, with 12x the cash of nVidia. A gigantic money printing machine, is what it sounds like they are.


Have you heard about the companies called Juniper? Brocade? Arista? The only reason they got anywhere is because of Cisco's "We are Cisco. Eff you!" attitude.


Nowhere did I say it's never going to happen. These things take time to evolve. I even talked about the market forces and what's currently plausible.

There are always big companies that get disrupted because they kill goodwill. That's the start of something. It takes more than one move. Will NVIDIA make this mistake over time? Likely.

Will it be AMD and Tensorflow support that does it? Not by a long shot. It will be a series of players providing the right incentives. Neither my response (nor any response to this) should be treated as binary. Ultimately, in order for a company to be "disrupted" you need to actually analyze and exploit the market forces involved here. Just writing this off as "an open ecosystem is all we need" is dangerous.


I have to say I disagree with you on the AMD/TensorFlow point here, Adam. Your previous points are completely valid. Nothing of note will happen in 2018, I expect. But by early 2019, maybe - if AMD (or Intel) gets their act together on the software side of things. I don't think the "community" will do it for them. But maybe a big player like Amazon will have had enough of Nvidia and support an open alternative.

Although we support Cuda in our version of YARN (Hops Hadoop), I expect we'd add ROCm or OpenCL or whatever - if it was a serious alternative. That would happen quickly. The problem, of course, is that we would want great support in TensorFlow first before we do that. Data scientists need a seamless transition - including from a performance perspective.

For us, that also means support for distributed deep learning (Ring AllReduce over infiniband). I don't expect that will happen in 2018, and it could take until 2020, if I'm being realistic. That means that when AMD finally gets some good DL libraries, Nvidia will still have one-up on them with distributed DL (reduce training time linearly with more GPUs).
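The Ring AllReduce mentioned above sums each worker's gradient vector across N workers by passing chunks around a ring, so per-worker bandwidth stays constant as N grows. A toy Python simulation of the chunk schedule (no real networking; this is a sketch of the algorithm, not the Hops implementation):

```python
def ring_allreduce(worker_grads):
    """Toy ring all-reduce: N workers, each vector split into N chunks.
    Phase 1 (reduce-scatter) sums chunks around the ring; phase 2
    (all-gather) circulates the completed chunks. Every worker ends with
    the elementwise sum, and per-worker traffic is independent of N."""
    n = len(worker_grads)
    size = len(worker_grads[0])
    assert size % n == 0, "toy version: vector length must be divisible by N"
    c = size // n

    def sl(i):  # slice covering chunk i (indices taken mod n)
        return slice((i % n) * c, (i % n) * c + c)

    bufs = [list(g) for g in worker_grads]

    # Reduce-scatter: at step s, worker w sends chunk (w - s) to worker w+1,
    # which adds it in. Sends are gathered first to model simultaneity.
    for s in range(n - 1):
        sends = [(w, (w - s) % n, bufs[w][sl(w - s)]) for w in range(n)]
        for w, i, data in sends:
            dst = (w + 1) % n
            bufs[dst][sl(i)] = [a + b for a, b in zip(bufs[dst][sl(i)], data)]

    # All-gather: at step s, worker w passes its completed chunk (w + 1 - s)
    # to worker w+1, which overwrites its stale copy.
    for s in range(n - 1):
        sends = [(w, (w + 1 - s) % n, bufs[w][sl(w + 1 - s)]) for w in range(n)]
        for w, i, data in sends:
            bufs[(w + 1) % n][sl(i)] = data
    return bufs

grads = [[1.0] * 6, [2.0] * 6, [3.0] * 6]  # 3 workers, 6-element gradients
print(ring_allreduce(grads)[0])  # [6.0, 6.0, 6.0, 6.0, 6.0, 6.0]
```

Each worker transfers 2*(N-1) chunks of size/N elements, which is why training time can scale nearly linearly with more GPUs.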

The other wildcard i haven't seen people mention here is the Neural Network Processor (NNP) from Intel Nervana. The hardware has potential. As long as the software doesn't force us to use BigDL, it has potential.


I'd expect competition to heat up in 2018.

Re: AMD/Intel. We've been waiting for them to get their act together for years. Nervana could be great but I'm going to wait on that one. So far their "launches" have been nothing more than marketing fluff.

As for your projections about tensorflow. It won't be tensorflow. Tensorflow will be 1 of many frameworks. Look I like HOPS but you guys push tensorflow explicitly. A startup running its own hadoop distro that happens to push tensorflow isn't going to move the needle. You guys are great middleware I'm sure but I haven't seen the customers where it might be viable. I hope you guys continue to grow though! While you're doing that, most hadoop vendors are focused on moving up the stack. It will take people with actual resources to move the needle in terms of enterprise adoption.

Amazon is doing this with mxnet and EMR, MapR is pushing tensorflow in their serving. CNTK is being pushed in SQL server and HDInsights. There's some competition there.

What I'm getting at here is: it will take multiple vendors and competition. I'm going to place my bet on the bigger players already involved with the foundations first though. Open standards (addressed below) where it commoditizes the chip will be key. The storage infra will follow from that. It should be something that doesn't displace current infra but allows interop.

Things like nnvm from the mxnet folks, onnx (where the framework doesn't matter anymore!) being pushed by the various hardware vendors, etc. will move the needle. You need buy-in from the actual big players who can front the development time to make these things viable alternatives.

For your "seamless transition" I'm not sure that would be that hard done right. Supporting "great tensorflow" can come in multiple flavors. As a separate issue, tensorflow's production story is horrible. That's another topic I could rant about all day though. It ultimately comes from abstracting it away though. That by itself is a hard problem (Disclosure: I have my own solution for this that I won't talk about here just know I'm biased :D)

Lastly, I question whether opencl can even be a viable alternative. It's a fragmented, inconsistent standard with a worse API than cuda. One reason cuda "won" is that it's in general cleaner and a clear leader in the space.


Yeah, ROCm is the most viable candidate as of today. In general, Nvidia are not good for middleware vendors. They want to be one, but don't offer a platform that integrates with anything. Licensing costs for the DGX-1 are insanely high. My problem is mostly from a data scientist perspective - teams don't need a few high-performance GPUs, like a couple of DGX-1 boxes. They need a hundred 1080Tis (which would cost the same as 2 DGX-1s), maybe complemented by a DGX-1. That way they can do lots of parallel experiments (hyperparam optimization), and distributed training. Making GPUs a scarce resource just reinforces the lead of the hyperscale AI companies, who have thousands of GPUs available for their data scientists.


Oh I agree that GPUs should be more of a commodity. We might see alternative ASICs rather than GPUs come out, though. I'm personally more interested in seeing that succeed than in confining the solution space to GPUs and discrete-GPU competition. I'm just not keen on trying to predict what will win (I really don't know). I just have criteria I would be looking for before trying to implement support for something, either in my deep learning framework or for customers.


A dick move like this by Nvidia will only spur the development of alternatives to CUDA. CUDA isn't the only game in town but until now nobody had much reason to invest in improving the alternatives - CUDA was free, worked and the hw was cheap and ubiquitous.


Right so like my other replies, I don't claim that alternatives won't come up eventually. I also don't claim that this is exactly how nvidia can unseat themselves.

What's wrong here with many of the guesses people are making though: They think simple open standards are enough.

It's a lot more than that. Don't alienate the market forces that also need to move to make this work. Too many coders write off the real business incentives side of this.

Viable competition will come from multiple vendors and possibly backwards compatible/interchangeable standards that don't require many changes in code. Give me a path forward as a vendor. Show me the money. Put code out there and maintain it/keep it up. Show me it's going to stick around.

It's like any SAAS you'd pay for or an IDE you invest in, vendors want to see an ecosystem that fulfills a set of requirements and allows them to get work done for their customers.

Right now cuda is still the best tool. I covered what will likely need to be considered in my parent post, so I won't reiterate it here.

I only ask that folks don't assume that "open source linux drivers and open standards and 1 open source framework with opencl support" are enough to unseat nvidia. The market won't move for that.


btw, since you know the founders of vertex.ai, you should tell them their cert is messed up; they probably want to do something like https://hackernoon.com/set-up-ssl-on-github-pages-with-custo...


Thanks will pass it on!


Just curious, what cards are you comparing when referring to the 10x multiple?


is this a serious reply? do you know how bad AMD's OpenCL support is? how bad the attitude was?

have a look, people had to make a petition to ask AMD to fix their crap.

https://www.phoronix.com/forums/forum/linux-graphics-x-org-d...


How would an open www.change.org petition be any indicator of the current state of things? Or even an indication of the 2013 state of things? For Christ's sake, anyone can start a petition about anything.

Obviously the petitioner is also clueless, or else how could they write "We feel that Radeon hardware is vastly superior to competition", when their hardware in 2013 was nowhere near NVIDIA. (See the review the petition links to for an example...).

You sir are a typical phoronix reader. Uninformed, biased and thick.


Shameless plug for the tvm project: http://tvmlang.org/2017/10/30/Bringing-AMDGPUs-to-TVM-Stack-... which is an effort towards broad (many vendors) hardware support and optimization for deep learning.


tvm was the first thing I thought of when I heard about this.

If you're involved in the tvm project - are you aware of any major deep learning libraries working on utilizing it? I'd imagine MXNet would be the first one to try it, but I haven't seen anything.


I can't speak for AMD's support towards the Radeon/OpenCL community, but their GPU and CPU support towards the Linux community has been less than stellar.

And while slightly off topic, their CPUs contain just as much Evil as Intel's CPUs (PSP/ME). I wouldn't be so quick to call them a "principled" underdog.


It doesn't seem to be enforceable anyway; what is the point of this change?


We already have software licenses which grant you copyright usage only when certain conditions are met, like:

1. Restrict the software to deployment on one physical or virtual computer.

2. Similar to 1, but restrict deployment to specific vendor hardware.

3. Non-commercial usage only.

4. Commercial usage allowed, but only for small companies with fewer than 50 employees; buy a more expensive license beyond that.

What's the difference?

If those existing, ridiculously consumer-unfriendly licenses are valid, I think NVIDIA's stupid no-datacenter-deployment-except-for-blockchain-processing license is also valid.


Who says that those are valid too?


> AMD does not have this restriction

But AMD/ATI cards are heavily bought by crypto miners, so it's almost impossible to buy one, even at an insanely high price. And NVidia cards are compatible with all games made between 2000 and 2017, while AMD drivers were very bad several years ago. So if you play GoG games or older games occasionally, there is only NVidia. Yes, both suck: NVidia with their spyware driver and now this, and AMD with their drivers. For CPUs everyone now suggests AMD, which are better than the crippled Intel CPUs with their plastic pads inside, so that a new Intel CPU will last only about 3 years.


As a holder of AMD stock, I completely agree


Sadly I did find that with the latest-gen $130 card (can't remember the model) there is no driver support in Debian stable, or the driver was broken (I can't recall which). I have to run Debian unstable to use my card, which unfortunately has a broken ZFS driver...

I read that AMD cards were friendly to Linux, so after removing my NVIDIA card for stability reasons, I switched to AMD only to have other issues.

I’m still looking for an easy Debian compatible graphics card so I can do CAD (Onshape - OpenGL). The AMD card works now, but I don’t know what to do next time around.


Yes, AMD has an open-source version of their drivers for Linux, so they are generally more stable/friendly than Nvidia's.


I was using the open source drivers, but the version in Debian stretch does not support my card. I had to go to buster to get it working after tracing a long list of forum posts of people having trouble with this card.


So, my summary:

Unsourced Japanese news outlet published an article claiming this clause was added to the EULA. (Edit: located here: https://wirelesswire.jp/2017/12/62708/ )

This version of the EULA, located at: http://www.nvidia.com/content/DriverDownload-March2009/licen..., has the no-data center clause. Note the "2009".

The version linked from the actual driver download page (at https://www.geforce.com/drivers/license ), has no such clause.

I think I'll postpone my outrage until the clause appears on the EULA that I actually have to agree to when I download GeForce drivers.

Alright, edit: For me, downloading drivers through https://www.geforce.com/drivers gets me the second EULA that I linked to. However, downloading drivers through http://www.nvidia.com/Download/index.aspx?lang=en-us gets me a EULA with this data center "limitation". This seems to me to be pretty problematic and an ineffective update.


The "2009" license is the correct license. If you try to download a GeForce driver from NVIDIA's website today, that is the license you must accept in order to download the driver.

On Windows, it's also the license you must accept at installation time before you can use the driver, even if you did not accept it during the download.

Interestingly, the license inside the Linux package does not include the data centre clause at this point in time.

https://www.geforce.com/drivers/license/geforce

http://www.nvidia.com/content/DriverDownload-March2009/licen...


I went to https://www.geforce.com/drivers and searched for 1060 on Linux.

Then clicked through to https://www.geforce.com/drivers/results/126577. This page has the standard "*By clicking the "Agree & Download" button, you are confirming that you have read and agree to be bound by the License For Customer Use of NVIDIA Software..." That sentence links to the EULA that I linked to.

EDIT: Alright, it matters which site you download the driver from. See the edit to my original comment.


Try downloading drivers for a Titan V - it goes to that very specific 2009 license (complete with the absurd exception for the blockchain). This is very specific and targeted at crushing small system vendors who were selling to scientists across many domains who preferred GeForce over Tesla because of the huge price difference. They have already pursued vendors and they have tried to shut them down. If you don't believe me or you're OK with that behavior, be my guest to continue enabling it.


I just tried for Titan V on Linux through geforce.com and got a EULA without this clause. Downloading through nvidia.com probably gets the no data center EULA.


Windows 64 got me the datacenter-limiting license, Ubuntu 16.04 as well. No idea how you're getting past this:

"2.1.3 Limitations.

    No Modification or Reverse Engineering. Customer may not modify (except as provided in Section 2.1.2), reverse engineer, decompile, or disassemble the SOFTWARE, nor attempt in any other manner to obtain the source code.

    No Separation of Components. The SOFTWARE is licensed as a single product. Its component parts may not be separated for use on more than one computer, nor otherwise used separately from the other parts.

    No Sublicensing or Distribution. Customer may not sell, rent, sublicense, distribute or transfer the SOFTWARE; or use the SOFTWARE for public performance or broadcast; or provide commercial hosting services with the SOFTWARE.

    No Datacenter Deployment. The SOFTWARE is not licensed for datacenter deployment, except that blockchain processing in a datacenter is permitted.
"


Does it matter when it was added? Also, not worrying about their stance on things until it gets to the EULA closest to you seems too laissez-faire for me. Rewarding this behavior early on is exactly how you get that clause into your own EULA.

Restricting innovation is an anti-competitive and frankly a dumb move -- who knows how many innovations were discovered in data centers that made their way to the mainstream.

That said, It's a little easier for me to be outraged at this, I'm already running Radeon, so I've already voted with my wallet.


Of course it matters which version we are looking at. It was not initially obvious to me that this clause was added, not removed between 2009 and now. See my edit: Two different driver download pages link to different EULAs.


The second link is "License For Customer Use of NVIDIA GeForce Software", the third link is "License For Customer Use of NVIDIA Software".

The second link is what is presented, and what you agree to, when you try to download the GeForce or Titan driver software from the NVIDIA website.

At least, try to read the title.

Oh, and the original Japanese version of the article does have a source (it links to the second link): https://wirelesswire.jp/2017/12/62658/


Frankly, I don't care if the page is titled "FUBAR license XYZ123", since I was asked to agree to the EULA at the third link alone. The third link is what is presented, and what you agree to, when you try to download the GeForce or Titan driver from the GeForce website (geforce.com). I understand that nvidia.com presents the second link, but geforce.com presents the third link. I said as much in my edit. Please read the entirety of my post before correcting it.

Thanks for pointing out that the original Japanese article did have a link.


The URL is misleading. That isn't the actual EULA from 2009. You can verify that with archive.org. Even if you don't trust archive.org, the reference to "blockchain" would be anachronistic. There were no publicly available bitcoin GPU miners in 2009, and blockchain was typically written as two words.


Isn't it the cuDNN and CUDA licenses which matter?


CUDA requires the kernel driver. All three licenses matter.


Yes, of course. Some other coverage indicated it was the CUDA and cuDNN downloads.


My apologies, I misunderstood. The CUDA toolkit license online doesn't appear to have this clause at this time.


https://github.com/ROCmSoftwarePlatform/hiptensorflow

Everyone on HN knows to always go with free, open software. Without competition any company (Intel, nvidia, etc.) starts to exploit the consumer.


1. Write a Tensorflow wrapper that mints a new private cryptocurrency where the proof of work is training your deep learning model.

2. Sell it to companies who bought racks full of GeForce GPUs for deep learning.

3. Profit!


Alternatively, you could simply have your GPUs calculate a few hashes every hour. So technically, you're doing blockchain processing...very slowly...


They don't care. But more importantly, courts won't care. Your compute is all tainted and has the wrong colour, due to your intent of working around the licensing restrictions.


You couldn't do that, because you're deploying the GPUs for other purposes besides blockchain processing 99% of the time. What you could do is write a very trivial blockchain and have it require a couple of ledger entries that are DL "input data/gradient". It's hard to argue what exactly constitutes "processing". This way, every cycle of the software is dedicated in some way to the blockchain.
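A tongue-in-cheek sketch of what that might look like (entirely hypothetical, and obviously not legal advice): every ledger entry carries a gradient payload, so each training step is also, technically, extending a hash-linked chain.

```python
import hashlib
import json

def append_block(chain, gradients):
    """Add a ledger entry whose payload is DL training data (gradients)."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps({"prev": prev_hash, "gradients": gradients},
                         sort_keys=True)
    block = {"prev": prev_hash,
             "gradients": gradients,
             "hash": hashlib.sha256(payload.encode()).hexdigest()}
    chain.append(block)
    return block

chain = []
append_block(chain, [0.0])                # genesis "training step"
append_block(chain, [0.12, -0.03, 0.07])  # every step extends the ledger
```

Whether hash-linking your gradients counts as "blockchain processing" is exactly the kind of definitional argument the clause invites.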


4. NVidia changes the EULA again to forbid $X ¯\_(ツ)_/¯


That's how we do stuff :D


Is this even at all legal? Wouldn't the First Sale Doctrine and the extremely, extremely limited rights to use software that courts have defended (the implied license to use software necessary for a purchased device HAS been defended in court) protect the purchaser? I really don't think nVidia would be able to go after anyone legally for deploying the software in a datacenter. Companies can put whatever they want in software licenses and most of it is totally unenforceable bunk that just hasn't been tested in court. And if it comes to it, they will normally drop the case or do whatever they can to avoid it ever being tested.


Not first sale, as this isn't the sale of a copy. This is a contract.


This may be a case of "you can put the cards wherever you want, but you can't use our drivers (and associated card firmware) in datacenters."



That "NVIDIA, fuck you!" from Linus comes at a price now that NVIDIA cares about Linux. They were jerks about Linux before deep learning became famous, and that's why Linus was so French about them.


Nothing's changed, still jerks. No hardware docs, no cooperation with the open drivers beyond ensuring distribution install media do work (i.e. basic modesetting).


There are docs for some things, including beyond basic modesetting: http://download.nvidia.com/open-gpu-doc/

They at least used to be responsive to questions a couple of years back when they started publishing the docs, but I'm not up-to-date on the present situation.


>> No Datacenter Deployment. The SOFTWARE is not licensed for datacenter deployment, except that blockchain processing in a datacenter is permitted.

I thought they were trying to mitigate the diversion of gaming GPUs toward mining; apparently not.


it's for deep learning


Quick question: Why? What does NVIDIA have to gain by forbidding people from using GeForce for ML in datacenters?


Price discrimination. Because they aren't getting effective competition from AMD right now they can sell to multiple tiers of customer for different prices. AMD is competitive in crypto, so they exempted crypto from the ban.


No, it's only because the GP106/GP104 purpose-built mining cards use GeForce drivers.


Seems like there would just be an exemption for them otherwise.


The Titan V is $3000 (GeForce)

The Tesla V100 is $17,000: http://www.nextwarehouse.com/item/?2782569_g10e

We're talking a difference of roughly 5.7x the price here for otherwise similar cards.


The V100 is $8000 retail. The site you linked is incorrect.


It probably depends a lot. There is the SXM vs. PCIe distinction plus discounts for bulk and all kinds of negotiations.


No, it's $8000 if you want to buy one now. https://www.thinkmate.com/product/nvidia/900-2g500-0000-000


When I was running my P2P startup I wanted to try capitalizing on this. It's the reason why GPU instances cost an arm and a leg.


But you can already not buy a datacenter's worth of GeForce; they have limitations on the quantity per customer. That seems like a far more effective measure than this clause.


And yet, many datacenters run hundreds or thousands of GeForce cards. Hetzner, for example, rents out an i7-6700 with a GTX 1080 and 64 GB RAM for 117€/month.


I've spent some time training TensorFlow CNNs on an NVIDIA GTX 1080.

It works pretty well, but I couldn't reliably train production models for work on it; it's just too flaky. I mentioned this to our 'trustworthy' Dell rep, who suggested that a good V100 suite would surely solve all the reliability problems, what with its greatly increased memory bus, or something...


You mean your algorithm is too flaky. The hardware is fine. A V100 won't fix your convergence problems.


Yes, salesman bullshit. I think it was TensorFlow that was flaky.


CNN models being "flaky" on GeForce hardware isn't something I've heard of? NVIDIA has made some deliberate decisions to make GeForce cards less attractive for deep learning in terms of performance, but I don't think making them produce incorrect results is in their best interest. What hardware did you test this against?


NVIDIA is planning on standing up its own deep learning datacenters to compete with amazon, GCP, Azure, because instead of selling a commodity, they'd rather sell a service (better margins).


They want to push their expensive enterprise cards (the Tesla series), which are exempt from this rule (it applies only to "GeForce" cards), and they are probably getting ready to push their own ML service with their own datacenters.


If you want to know what Nvidia are afraid of, look at the last figure in this O'Reilly blog on distributed tensorflow.

https://www.oreilly.com/ideas/distributed-tensorflow

On a DeepLearning11 server (cost $15K), you get about 60-75% DL training performance compared to a DGX-1 (cost $150k)
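Back-of-the-envelope from those numbers: at a tenth of the price, even the low end of that performance range is a 6x advantage in training throughput per dollar.

```python
dl11_cost, dgx1_cost = 15_000, 150_000

# Relative training performance of the DeepLearning11 box vs. a DGX-1,
# using the 60-75% range from the blog post.
for rel_perf in (0.60, 0.75):
    perf_per_dollar_ratio = rel_perf / (dl11_cost / dgx1_cost)
    print(f"{rel_perf:.0%} of the performance -> "
          f"{perf_per_dollar_ratio:.1f}x the perf/$")
# 60% of the performance -> 6.0x the perf/$
# 75% of the performance -> 7.5x the perf/$
```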


What surprised me in the first place is why TensorFlow, an "open source" initiative by Google, chose proprietary CUDA over "open source" OpenCL.

Check Tensorflow issue 22 for more info.

Just sayin.


CUDA is superior to OpenCL. Also NVIDIA provides CuDNN, a proprietary library with very efficient implementation of deep learning primitives. If you want to train models faster, you have to use them.




thanks


What is a "datacenter"? I try to come to these things with an open mind but EULAs are really the dregs of legal writing :/


If you label your place "datacenter".


Then I’ll just call mine “data processing farm”.


That’s not how contract interpretation works. The vast majority of things are clearly datacenters or clearly not datacenters. Assuming the contract is enforceable for other reasons, if you use it in something that is clearly understood to be a datacenter, you would be in breach.

Do you really think a judge would be swayed by your little word game?


In the US? Depending on the district sued: absolutely. That shouldn't be the case, but it 100% is.

Unless there is a rigid legal definition for "datacenter", lawyers are allowed to argue that because the EULA specifically used the phrasing it does, that means nVidia has a definition that they have not made plain in the same document, which is grounds to void the clause. They are also allowed to argue that nVidia must produce a definition of "datacenter" which thanks to case law means that then becomes the official definition used in future court cases, something that nVidia would really not want to be responsible for.

It's an amazing bit of "did you actually think this through?" when you actually look at the ramification of an EULA clause like this.


I doubt it would come to trial.


My point is that you’re mistaken in saying above that something is a datacenter only if you call it such. Words must have meanings if we are to have agreements.

Whether this particular contract will result in a court case is beside the point. And I wouldn’t be so sure it wouldn’t. Nvidia’s lawyers might have something to say if you start a business reselling GPU time for ML tasks and run it off thousands of GeForce cards with these drivers. And if they don’t get what they want, they would have good reason to go to court to defend their market segmentation strategy.


Because that is how such things work: people and companies trying to find loopholes, and renaming your stuff is exactly that. On a side note, you can just go to countries where you are not at the mercy of corporations.

There are other examples too: software that is not licensed for more than one CPU socket; just create a VM with as many cores as you like. The loss in performance is often not an issue; problem solved. Standard practice.
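With libvirt/KVM, for example, the guest CPU topology is whatever you declare, so a multi-socket host can present all its cores to the guest as a single socket. A sketch of the relevant domain XML (the core count is illustrative):

```xml
<cpu mode='host-passthrough'>
  <!-- Present 32 cores as one socket to a per-socket-licensed guest -->
  <topology sockets='1' cores='32' threads='1'/>
</cpu>
```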


For that interpretation of a contract to fly, you would have to argue that a reasonable person would believe that the meaning of datacenter is nothing more than “a thing that is called a datacenter”.

I should make a contract to sell you gold, take your money, then give you a pile of rotting wood and argue that I call it “gold”, so you have no remedy.


Well, yes. The definition of a datacenter (which is lacking from Nvidia) is very wide; my room in the house could be a datacenter. Besides that, since the contract wasn't made at the point of purchase (the datacenter clause), I can ignore it anyway.

Why should I give you money for rotting wood in the first place?


Have you never paid for something before receiving it? The whole point of the example is that you pay for a contract for gold, then I deliver rotting wood.

Playing dumb about a definition doesn’t do much for you. Barring the other party having good reason to expect you to be confused, you are held to the objective meaning of a contract. And for all of the situations Nvidia cares about, it is obviously a datacenter.


I have. If the quality of the product isn't the correct one, it goes back.

I don't have a contract with Nvidia to begin with when buying cards. The driver EULA is hardly a contract.


This is one way to help the competition immensely. The premium Nvidia charges for their Tesla cards just isn't worth it. It's not enough of a performance advantage to warrant the price increase, and reliability-wise you can buy 6-10 1080 Tis for the cost of one P100.


So it will be profitable for a third-party company to develop a GeForce driver.

This will also not be enforceable in all countries. It is probably not enforceable at sea where maritime law rules.


Firmware needed by modern NVIDIA cards is cryptographically signed.


The keys will leak eventually.


Couldn't Nvidia add the same clause to their product's license (instead of the driver's) and thus have it enforceable even if you use some 3rd-party driver?


What product license? As the producer of some physical item, you don't get a say in what I can do with it after I bought it. Especially not if I never did any deal with you in the first place, but bought the thing through a long distribution chain of middlemen.

It's only with software that people consider it acceptable for the vendor to tell you what you can and can't do.


Deep learning ship!


The NVIDIA website has a whole section on Data Center graphics cards: https://www.nvidia.com/en-us/data-center/

Won't this just be hurting their own sales possibilities? I can only guess they'll be announcing some datacenter-specific version of the software soon, which is essentially the same but mysteriously more expensive.


Those aren't GeForce cards.

They don't want their consumer line (GeForce) competing with their datacenter line (Tesla).

They previously reduced the FP64/FP16 performance on GeForce cards (making them much worse for neural networks), but it's easier to limit the losses legally than by reducing hardware. Modern games don't want artificially limited ALUs!


This is not technically correct. I think you're referring to cases of Pascal cards where the GeForce version had poor FP64 performance. This is because it's a different ASIC rev. Same reason the more expensive P100 Tesla didn't have INT8 support -- certainly not because they did it in software. There's no reason a GeForce card for gaming needs high FP64 support. The reason you're seeing it on the Titan V is because it's literally the same ASIC, but now you're also paying a premium and it's not made for gamers.


I'm referring to the GTX 1080 running FP16 at 1/64th the rate of FP32: https://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-...

"GeForce GTX 1080, on the other hand, is not faster at FP16. In fact it’s downright slow. For their consumer cards, NVIDIA has severely limited FP16 CUDA performance. GTX 1080’s FP16 instruction rate is 1/128th its FP32 instruction rate, or after you factor in vec2 packing, the resulting theoretical performance (in FLOPs) is 1/64th the FP32 rate, or about 138 GFLOPs."
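The arithmetic in that quote checks out. Assuming the GTX 1080's roughly 8.9 TFLOPs of FP32 boost throughput from public spec sheets (an assumption, not a figure from the article):

```python
fp32_gflops = 8873   # GTX 1080 boost FP32 throughput, approx.
fp16_rate = 1 / 128  # FP16 instruction rate relative to FP32
vec2_packing = 2     # two FP16 values packed per instruction

fp16_gflops = fp32_gflops * fp16_rate * vec2_packing
print(f"{fp16_gflops:.0f} GFLOPs")  # ~139, matching the quoted "about 138"
```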


Right, and I'm saying it uses the GP104 ASIC, which has that disabled so that other features can be enabled (higher clocks, for example). I don't think you can find an example where the Tesla shares the same chip as the GeForce and something like FP16 is different. I'm sure they do a lot of shady things, but I believe part of this is just chip-design tradeoffs.


Just to prove the point, the Tesla P40 has the same chip as the GeForce you cite, and it too has slower FP16. This is for a $5k card.


Ah sorry, I see, so there's already that split in consumer vs. datacenter and they're now forcing the line to be drawn.


This is about the GeForce products, which are consumer/gaming targeted GPUs. Pro-style Quadro, Tesla, etc. cost quite a bit more for similar hardware due to the firmware, drivers and licensing.


I searched the site for high-end Quadro drivers and they all link to the same "March2009" license.

EDIT: I was wrong. Apparently they don't include this "datacenter" provision: http://www.nvidia.com/content/DriverDownload-March2009/licen...


> quite a bit more for similar hardware due to the firmware, drivers and licensing

The first two criteria in your list are irrelevant, it's purely due to licensing.


i.e. they cost quite a bit more because NVidia likes it that way -- it's tiered pricing.


Was this actually an executive-level decision? The announcement of this change (if you can call it that) was so bungled it creates the impression some low-level person in legal did this without coordinating with the rest of nVidia.


Obligatory Linus Torvalds response, re: NVIDIA: https://www.youtube.com/watch?v=iYWzMvlj2RQ


If I understand the underlying issue here, it's that Nvidia presumably doesn't want its consumer graphics cards used in datacenters, and instead wants to sell more expensive cards.

Isn't this similar to how movie studios used to require Video Rental Stores pay more money for a film, vs the retail cost?


Somebody should write an adapter that hooks into their GPL shim and forwards requests over the network to a machine not in the data center that runs their binary blob. Trying to limit use of certain hardware via stupid legal means just begs for stupid technical solutions.


Did you mean 'GPU' instead of 'GPL'?


No. Since the Linux kernel is GPL, as far as I understand, there must be some part of their driver that is also available in source form to interact with the kernel API. I haven't investigated this at all, but it feels like it may be possible to hook that. It's a terrible amount of work, of course, but it would be kinda fun to try. Perhaps an easier thing would be to run something like QEMU and have it forward activity on the PCIe bus over the network to a host system. Not sure if that's something people have done before.


Eeh, I have quite a few of these in datacenters for deep learning work. What am I supposed to do now? Swap them out for Tesla/Quadro cards? Switch to AMD or Xeon Phi? Or just ignore it and possibly violate the ToS? Nice move, Nvidia ...



First step is to disable automated driver updates. The drivers you currently have installed came without this ToS change.

This gives you a while to continue using the cards, without violating the ToS.

For all new servers you set up, you will have to look at other solutions, though.


Anyone else bothered by the 2009 in the url?

Edit: Some driver download pages currently link to this page.


While these EULAs are head-bangingly dumb, generally untested (and sometimes totally unenforceable), I just wanted to say that I think that's just a CMS artifact. I can confirm that this is still linked as the "software license agreement" throughout the site for current driver downloads.


“No Datacenter Deployment. The SOFTWARE is not licensed for datacenter deployment, except that blockchain processing in a datacenter is permitted.”


Just laugh and do it anyway.


This is why open source drivers are important.


"Datacenter" undefined.


That's a great example of why we need open source hardware. 10-20 years ago this was the norm in the software field. Now, you can license world-class products with complete source for free.


They're nuts and I'd love to see them try to enforce this. Coming soon to a DC near you: The Nvidia inspectors, picking locks on cages and racks looking for humping.


Sounds like one of those hilarious EULA clauses that is void in many countries with laws around companies not having the right to control the use of their product after sale.


Between this and the high Tesla prices, they really want to move people to use Google's TPUs...


And THAT'S WHY it's good for corporations to have closed non-libre drivers :) They can make an easy buck by selling 'more expensive' licenses. You are the buyer, you decided for a closed solution, so pay more for it!


Nowhere in the document does it define "datacenter," a very nebulous term. Perhaps too nebulous to enforce.

I have 2 racks in my basement, if I had an NVIDIA GPU in one of my servers for playing games, am I violating their EULA?


The law is not so obtuse. If otherwise enforceable, the inability to draw an exact line where something starts being a datacenter will not void a contract. The vast majority of things are on either side of that line.


Is there some legal reason why AMD can't implement a CUDA compiler for their architecture? Seems like the obvious move right now.


AMD has a CUDA compiler. It requires two-step compilation, though: first into GPU-neutral code, and from there you can use AMD's actual compiler.
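The two steps with ROCm's HIP tooling look roughly like this (a sketch; exact tool names and flags depend on the ROCm version installed, and `kernel.cu` is a placeholder source file):

```shell
# Step 1: translate CUDA API calls into portable, GPU-neutral HIP C++
hipify-perl kernel.cu > kernel.hip.cpp

# Step 2: compile the HIP source for the installed AMD GPU
hipcc kernel.hip.cpp -o kernel
```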


I think the card itself takes more power and wears out faster compared to the data center versions. Is this true in practice?


I think this is designed to deal with the low prices offered by OVH and Hetzner, and in favor of Google and Amazon.


Does that blockchain clause target miners? Aren't they called Mining Farms, not Data Centers?


How does NVIDIA define a datacenter exactly?


It's beyond sickening to see one more product you buy attempt to control you via its license. When will this shit end for hardware, or is it just the start?


This is value based pricing. It is just the start, it isn't new, and it isn't limited to computer hardware. *aaS (anything as a service) is the newest flavor of this. Expect it to explode as IoT becomes more mature.


It's not a start - we're way into that situation now. But things are getting worse: this is legal-level restriction. If they wanted to be complete assholes, they'd tie the driver to an online service instead, like almost everyone on the IoT market does these days.

Not sure what the way out is. The market doesn't and won't care (that's exactly why they're doing it), and you can't legislate things as vague as "don't be a greedy asshole", which is what all of this boils down to.


This is EXACTLY the kind of thing that NVidia do. This is the kind of thing they ALWAYS do.

Don't be surprised. NVidia are assholes.


What are some other examples?


If you want to pass through a consumer Nvidia card to a VM, you need to hide from the guest OS that you are running a VM, otherwise the Nvidia drivers will fail with some obscure errors.

https://github.com/sk1080/nvidia-kvm-patcher/blob/master/REA...
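For reference, the commonly cited workaround on the hypervisor side is to mask the hypervisor signature from the guest in the libvirt domain XML. A sketch (the `vendor_id` value is an arbitrary placeholder):

```xml
<features>
  <hyperv>
    <!-- Report a non-KVM vendor id to the guest's CPUID queries -->
    <vendor_id state='on' value='0123456789ab'/>
  </hyperv>
  <kvm>
    <!-- Hide the KVM hypervisor signature entirely -->
    <hidden state='on'/>
  </kvm>
</features>
```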


I've run into this before and gave up and just ran things on physical hardware. It baffles me that we're running into this crap in 2017.


Works fine for me. Now there is even software that doesn't require changing the HDMI input for this.


If you wanted to use a secondary nVidia card for PhysX, all was fine until nVidia changed their driver to disable PhysX when it detected an active AMD card.


NVIDIA GameWorks is probably the most cancerous thing to happen to the gaming industry. Partnering with UE4 to fuck games for AMD at the engine level :(


Here is another interesting read:

https://www.eevblog.com/forum/chat/hacking-nvidia-cards-into...

The difference between their "professional" GPUs and regular ones is literally a few sub-cent resistors.


Sometimes. The GeForce GTX 690 used the same chip as the Quadro K5000 (the GK104). More recent generations typically have different silicon (or bins with defective regions) in the pro and gaming lines. For instance, high end pascal gaming cards use GP104, but Tesla P100 uses the GP100.

For more details, see https://en.m.wikipedia.org/wiki/List_of_Nvidia_graphics_proc... (Notably the "code name" columns).


You used to be able to upgrade AMD CPUs with a pencil.

http://www.overclockersclub.com/guides/tbird/


Maybe so, but please don't post unsubstantive rants to HN. We're looking for thoughtful conversation, not venting.

https://news.ycombinator.com/newsguidelines.html


Is that why Tesla is building their own chip for Autopilot?



