Graceful pure JavaScript fallback when GPU is not available.
This is a bad thing, not a happy thing to be advertised. Software fallback was one of the most frustrating parts of working with OpenGL circa 2010.
If you're making a hardware accelerated library, stop trying to make it "gracefully" fail. Just fail! That way people can address the root problem.
On the other hand, as long as the library is sufficiently noisy about the fact that it's in CPU mode, it seems fine. I just hope they don't follow OpenGL's design decisions.
It's surprisingly hard to figure out the answer from the README: https://github.com/gpujs/gpu.js/#readme There's a "How to check what's supported" section, but it's also "mostly for unit tests."
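For what it's worth, that section seems to boil down to a handful of static flags, roughly like this (untested sketch; the property names are as I read them in the README and may differ by version, though GPU.isGPUSupported is confirmed elsewhere in this thread):

    const { GPU } = require('gpu.js');

    // Static capability flags from the "How to check what's supported" section.
    if (!GPU.isGPUSupported) {
      console.warn('No GPU backend available - gpu.js will silently run kernels on the CPU');
    }
    console.log('WebGL2:', GPU.isWebGL2Supported, 'WebGL:', GPU.isWebGLSupported);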
One of the most annoying things about OpenCL was the automatic CPU fallback. Trying to figure out if I had it actually _working_ or not was a terrible first impression. It gave an output, sure, but was it slow because it was still hitting the CPU fallback? Had I failed to install the GPU driver I had attempted to install? Or was it slow because my kernel wasn't GPU-friendly? Or hit some slow path in OpenCL?
Meanwhile CUDA, being entirely GPU only, was a breeze. Did I get an output? OK then yes I got the basics working. Now on to making it sing.
I really don't understand this - it's very easy to set which device (CPU / GPU) your OpenCL code is running on and to avoid running on the CPU if you don't want that.
On the other hand the ability to test on GPU-less systems is very useful indeed.
OpenCL has lots of faults but the ability to use a CPU with the appropriate drivers when a GPU isn't available surely isn't one of them.
> On the other hand the ability to test on GPU-less systems is very useful indeed.
Ideally OpenCL's support for CPUs shouldn't just be for testing - it should be able to make efficient use of the CPU's SIMD.
Intel still support OpenCL for CPU. [0] AMD used to, but they dropped support for CPU devices, at least on Windows. Perhaps they still support them on Linux. [1][2]
I'm actually using AMD CPU drivers (OpenCL2 on Ubuntu 20.04) in a project that I hope will be in production over the next month or so. Speed is more than acceptable for my purposes.
Tried to use Intel but didn't manage to get it working. AMD's were a breeze by comparison. I'd be very interested in hearing about others' experiences and any AMD vs Intel benchmark comparisons.
Good to hear AMD haven't completely done away with their OpenCL-on-CPU.
Have you compared the performance against alternative technologies like OpenMP?
Years ago I had the Intel OpenCL running on an Intel CPU (the machine ran Windows) and AMD OpenCL running on an AMD CPU (that machine also ran Windows). AMD's Visual Studio plugin for OpenCL was really pretty good. Haven't used OpenCL since though so I can't really comment on the current state of things.
Using some OpenMP but the code is exceptionally well suited to the OpenCL Kernel approach plus the ability to explore use of GPUs at some point in future is helpful.
Thanks for the links - I'll have a look at the Intel drivers again and see if I have any more success!
Historically there was an emulator mode, where you could build your CUDA code to run on CPU. However, it was removed a few years ago. I never used it, so can't really give you more details beyond that.
> "stop trying to make it "gracefully" fail. Just fail! That way people can address the root problem."
So the root problem is not having the right hardware?
Personally, I would hate to use software which tells me to upgrade my GPU, especially when there's a software fallback which could have been implemented... albeit with poorer performance...
Adding a not supported message is a design choice. The parent did not say that the library should not make a CPU implementation, just that the fallback should not be automatic.
For some applications, it might be appropriate to fall back to the CPU but change the problem resolution, for some applications it might be appropriate to not even attempt to run the software, and for some applications it might be appropriate to just run in CPU mode if GPU mode is not available.
But I completely agree with the sentiment, that this should solely be the developers choice.
I think the developer should be able to explicitly disable the fallback, but I'd really like it to be enabled by default. I don't trust developers to think about weird hardware configurations, and I'd like the software to do what it can to (hopefully) keep working instead of just locking me out.
> "Personally, I would hate to use software which tells me to upgrade my GPU..."
I think it's a design decision for the underlying library. Is the behavior configurable, or "hidden" from the user? Is the fallback configurable on a library-wide basis, or a per-kernel basis? If you're writing or testing code and trying to make it work, you want it to explode in an elegant-ish manner, and tell you what to fix. If you're shipping code, you want control over how it fails, and you might want different things.
What are my options?
There's probably cases where a graceful or semi-graceful fallback would work. However, there's cases where the time spent in CPU mode would be embarrassingly long - to the point where your options really are "show a message" or "disable the function."
I'm with the OP here - the caller should have a say in the situation which will make the behavior predictable as software or hardware changes.
Yeah, GGP's take is not that interesting. It just depends on how much extra perf/value you get from the GPU vs the CPU.
If the GPU gives you some nice fancy shading, but your game works fine without it, then GGP's suggestion to "just make it fail" is of course silly.
Slight tangent: I think a lot of "tech hot takes" (and snarky SO comments) come from a lack of imagination outside our own experience. It's easy to forget that there is often a huge diversity of applications/use-cases for the languages/libraries/etc that we use.
> If the GPU gives you some nice fancy shading, but your game works fine without it, then GGP's suggestion to "just make it fail" is of course silly.
Not really, that's actually the point being made: if you try to use the GPU with your fancy shaders, you'd rather have it fail noisily so that you can explicitly fallback to the CPU implementation without your fancy shaders, than the library silently falling back to rendering everything (including your expensive shaders) on the CPU.
Then, so far as I can see, it's not even an un-interesting take, since you can easily detect WebGL support before running GPU.js. Perhaps this was difficult to do with OpenCL in 2010, but it's a non-issue here.
I just tried a WebGL demo, and got wildly different results depending on the browser, and which GPU is active at the time the demo started. Results varied between:
- Smooth and good looking
- Good looking but <1fps. Turning off GPU features would have been the better thing to do.
- Noisy, corrupt-looking pixels over half the image because calculations were producing incorrect numerical results; it looked like overflow or something.
At least it's possible to detect a low frame rate and, eventually, compensate by reducing features and/or resolution. But that's still annoyingly slow (it will take a few seconds to detect).
Bad calculations are a bigger problem, as it's hard to anticipate what kind of bad to look for and how to detect it.
> I just tried a WebGL demo, and got wildly different results depending on the browser
That's not a problem with the API, that's a problem with one or more of your browsers.
Benchmarking your code during init to choose the best algorithm is a very sane thing to do if you know your users' machines/browsers may have different capabilities. We're talking about extra fractions of a second of load time. The alternative is to expose a bunch of details about what's underneath the browser, which would probably not fit with the goals of the web - it being a sort of abstraction layer over all the operating systems and devices.
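To make that concrete, the init-time probe can be as dumb as timing one warmed-up run of each path and picking the winner (rough browser-side sketch; runOnGpu and runOnCpu stand in for your two implementations):

    // Time one warmed-up run of each implementation and keep the faster one.
    function pickBackend(runOnGpu, runOnCpu, sampleInput) {
      const time = (fn) => {
        const t0 = performance.now();
        fn(sampleInput);
        return performance.now() - t0;
      };
      runOnGpu(sampleInput); // warm-up runs so compilation/upload costs
      runOnCpu(sampleInput); // don't skew the measurement
      return time(runOnGpu) <= time(runOnCpu) ? runOnGpu : runOnCpu;
    }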
I agree when it comes to benchmarking for speed in WebGL.
It's much more of a pain when calculations come out wrong though. It's hard to anticipate all the ways that might happen, check for them, and then find workarounds when you don't have the hardware where they go wrong.
Bad calculations can ruin a game, not just due to visual glitches, but by revealing things that should be hidden, for example seeing through holes in walls caused by glitches.
I have to test for SwiftShader in my web game. Whenever Chrome gets a bit upset with the GPU it will silently fall back and make the game 10x as slow and I get a bunch of support requests. Putting in a check and telling people to restart Chrome has reduced my support workload a ton.
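For anyone who wants to do the same: one way to spot SwiftShader is the WebGL debug renderer string (sketch only; WEBGL_debug_renderer_info isn't exposed in every browser, so treat a missing extension as "unknown"):

    // Returns true if Chrome has fallen back to its software rasterizer.
    function isSwiftShader() {
      const gl = document.createElement('canvas').getContext('webgl');
      if (!gl) return false;
      const ext = gl.getExtension('WEBGL_debug_renderer_info');
      if (!ext) return false; // renderer string hidden; can't tell
      return /SwiftShader/i.test(gl.getParameter(ext.UNMASKED_RENDERER_WEBGL));
    }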
> Failing seemed reasonable to me, because my context around software fallback translates to “1 or 2 fps if you are lucky”.
Arguably I prefer to have the option to use software that underperforms than to not be able to use it at all, especially if performance is not that critical.
Handling this gracefully would mean the developer can make that decision, explicitly. Ideally, the last piece of code I was working on would:
1. Detect GPU acceleration
2. With acceleration, enable fancy bells-and-whistles, with animations and effects
3. Without acceleration, fall back to good-old-fashioned mode, with flat HTML
A lot of this depends on what YOUR software does. Is it speeding up a spreadsheet (which will probably work fine in CPU)? Is it adding fancier shaders to a video game?
Explicit beats implicit. The library should support CPU and GPU, without a doubt, but the user of the library should make the decision of what to do if GPU isn't found.
That's fair assuming that developers put in the effort to write the less fancy version. But if they don't I'd prefer to still have the option to run the software as written with worse performance.
No one's talking about taking that option away. All I'm saying is it should be an explicit option. When you initialize the library, you can pass a parameter. Or set a feature flag. Or whatnot.
If you're writing a spreadsheet, use a fallback to CPU. If you're writing a glitzy animation, turn it off.
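With gpu.js that could look roughly like this (assuming the mode option and the GPU.isGPUSupported flag behave as the README describes; showBanner is just a stand-in for your own UI):

    const { GPU } = require('gpu.js');

    // Spreadsheet case: accept the CPU fallback, but say so.
    const gpu = new GPU({ mode: GPU.isGPUSupported ? 'gpu' : 'cpu' });
    if (!GPU.isGPUSupported) showBanner('Running in CPU mode - large sheets may be slow.');

    // Glitzy-animation case: refuse to limp along on the CPU instead.
    // if (!GPU.isGPUSupported) throw new Error('GPU acceleration required');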
The "2fps" example was an arbitrary hyperbole that by itself is worth nothing as a claim or statement.
Moreover, it's not up to you to decide what's acceptable or not to a 3rd party, especially if your claim is being made to justify choosing between being able to use something that performs below par and not being able to use it at all.
Maybe on today's hardware, but what if I'm running this software 15 years from now? At which time, btw, I'm more likely to be forced to use a VM, or a similar solution where software rendering is the only option.
GPU is used for other things besides rendering animations... AI for instance. In many contexts you want to perform the calculations, even without acceleration.
That's 2005 software fallback performance. We're in 2020; LLVMPipe can easily give you 15-25 fps for simple games, which is very decent for a fallback option.
I'm actually curious, given the massive speed difference between CPU and GPU: can anyone here think of any situation where a fallback to CPU speed would still be acceptable for a real-life application?
The only thing I can think of is simple image operations like a 5x speedup for a Gaussian blur filter... but then I also wonder if you'd even bother writing code to do it on the GPU in the first place.
> can anyone here think of any situation where a fallback to CPU speed would still be acceptable for a real-life application?
Sure, there are valid scenarios where the slower speed fallback is acceptable. Note that I agree with the GP comment; fallback should be explicit not automatic.
- Developer adoption. Not everyone has a GPU, but with a fallback path everyone can try it out and write code that will work on the fast path.
- Application development. Along the same lines as developer adoption, not every customer has a GPU either, so if I write an app I may want to let the user try a new feature (with a warning that it’s slow) rather than tell them they can’t use the feature at all until they upgrade their hardware.
- Heterogeneous compute farms. I’m thinking in particular of render farms for VFX and CG movies, as an example, how a render farm is large and may not be full of GPU nodes due to cost. A fallback path allows time/perf to be balanced against whatever the farm’s budget for GPU nodes is.
- Test machines. Running automated end-to-end integration tests to check for correctness doesn’t depend on speed, and if you’re renting your test machines, it might be cheaper to fallback to CPU than to rent GPU nodes. Think Selenium tests running on a bunch of AWS nodes, for example.
In fact, in practically ALL applications it's better to work than not work at all. The only exception I can think of is directly-interactive software like video games and VR imaging, where minimum fps really does matter.
Many situations for various customers aren't massive datasets but modest sets, so you might be on the left of the J-curve. Requiring 2 seconds, instead of 0.2 seconds, for a result is simply not important in most cases.
I think it's fair to warn the user that "GPU acceleration not working" if the performance is expected to be much better with a GPU.
The GPU isn't a "make code run fast" button. There are all sorts of algorithms that run poorly on a GPU because of its heavy SIMD structure, its programming model requires a programmer to dot their i's and cross their t's, the overhead of uploading/downloading data is not free, and scheduling tasks is not free.
Look at how ISPC had to change the C language to make it parallelizable for actual problems. I've attempted to write compute frustum culling in GPU.js and it came out slower than my CPU approach. A hand-written shader performed much better than either.
> a 5x speedup [...] I also wonder if you'd even bother writing code to do it on the GPU in the first place
A 5x speed-up is very significant in many applications! I've spent a bunch of energy on several projects just aiming for a 2x speed-up on GPUs - battling against SIMD and other optimisations on the CPU side.
As for real-world examples: the tensorflow.js CPU back-end runs something like 10x slower than the WebGL back-end, and I've got some great mileage out of the CPU back-end in simple node.js apps that don't have a GPU attached to their server.
The CPU isn't always too slow. For example rendering a spreadsheet, text editor, web browser or CAD screen might prefer GPU to make it visibly smoother and more satisfying (especially scrolling, compositing and high density text updates), but will still be useful on CPU.
And current CPUs are faster than old GPUs for some things, especially if vector instructions are used.
And GPUs are sometimes slow due to technical issues such as bus traffic, even if the GPU itself is fast.
> I also wonder if you'd even bother writing code to do it on the GPU in the first place.
Some of your customers have large datasets that are only manageable with hardware acceleration, and others don't? Maybe you don't want to maintain two separate codebases?
I mean, a GPU is only faster if it's doing many operations in parallel. So you can definitely have code that is faster on a CPU than on a GPU. And there'd be borderline cases where it's faster on a GPU but only by a bit.
With OpenCL, one of the big (massive) benefits was being able to choose to run a kernel on CPU or GPU. The main case where CPU ran a lot faster was random memory access.
Agree. Same for SIMD instructions, and video decoders.
Too many libraries/frameworks are trying to software emulate what's not supported, and fail miserably because the emulation is not remotely fast enough.
It doesn't realize that H.264 1080p60 works due to HW decoding, and falls back to VP9 480p30 because it ties 720p and up to HFR if it's an HFR video.
Using `youtube-dl -F` will tell you that 480p, 720p30, 720p60, 1080p30, and 1080p60 are available in H.264, while only higher resolutions are gated behind VP9. Some videos (from 50 Hz countries) are p25/p50 instead, but iirc using the same format codes as the p30/p60 brothers.
The difference between the CPU and the GPU in practice will be the speed that your code runs. If you have an app where you literally can't run it on a CPU, then there are probably going to be a few older and under-powered GPUs that also will be too slow. In which case, the correct thing for you to do is to do some very minor profiling of your application at runtime and show an error message or make adjustments based on the actual, real-world speed tests running on the real-world hardware sitting in front of the user.
You shouldn't be policing the specific hardware, you should only be checking whether or not the hardware works. Setting up "allowlists" of what hardware we support is contrary to the philosophy and spirit of the web.
Like you mention, there is a scenario where you might like to tell the user why the application is slow, and in that case, the ``GPU.isGPUSupported: boolean`` property seems sufficient to me.
In general though, if your user wants to run your app on underpowered hardware, other than letting them know what the problem is, there is no reason for you not to get the heck out their way and let them do what they want to do.
And it makes sense for the CPU fallback to be the default behavior, because the web is a general application platform for general, amateur programmers, and their defaults should allow them to fall into a pit of success. The idea that "the core library should fail, and the programmer should go out of their way to specify a custom CPU fallback alongside their original code", is just naive. Developers won't do that, the default setting should be the version that works for the most people.
If you have a GPU accelerated application where it's genuinely preferable to just crash the entire application instead of letting it run slower, and if for some reason you can't do profiling instead of blindly assuming that all GPUs will be fast enough to run your application, then you can go out of your way to specify your own fail conditions. But that shouldn't be the default. By default, apps should work.
The way that 90% of programmers will use this library is to have a normal application that has a few expensive operations that benefit from the GPU. For nearly all of those developers, having a default fallback to a slower implementation is the right choice. The 10% of developers that fall into the exception can manually detect slow hardware and code their own response.
That's the thing. Sometimes it is specifically to use the GPU, in which case a silent fallback isn't desired. It's fine for the software to include a CPU mode, but make it an actual mode and not always fall back, so the user can specify 'what they want to do'.
Maybe you missed the part where the program tells the user it's running in "CPU Mode" and may be slower, "Try running on a GPU" is another message that would make it clear.
So many comments here make it seem like we're dealing with people with a negative IQ.
Just show an appropriate message and let the user run the software. It's GREAT that there is a simple CPU fallback and it's enabled by default.
It's really an edge-case where a user would want to run the code and have it fail if there isn't a GPU - that is mostly an edge case of 1 user, the developer, except in very specialized situations where nobody would want to run it without a GPU, again an edge-case.
What are people going to do in the case of no support though? Fall back to CPU anyway. I don't think there's many problems where it's either that you compute on the GPU or you don't want the result at all. And given that, falling back to the CPU makes sense to me. Having to write that code myself would be annoying. I do agree though that there should be (if there isn't, IDK) an escape hatch for the rare case it may be needed. But I do think this is the right default.
I mean, look at the performance chart. CPU mode performance is a nice J curve with respect to "realistic workloads you'd actually want to GPU accelerate." I agree that having a CPU fallback is nice, but it's also pretty much the opposite of what you'd want to do in the field.
Prediction: someone deploys this and doesn't notice it's incredibly slow for mysterious reasons, and then spends several hours to figure out that the right version of CUDA wasn't installed. ("Fail fast, fail loudly" is a nice way to ensure that doesn't happen.)
Typically you'd bubble this up as a rejected promise in JavaScript. Hopefully with a more detailed error message than "no GPU". "Failed while doing X" or "failed because of Y," where the messages contain actionable info, would be ideal. Then the application developer can decide what to do.
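Something along these lines, say (sketch only; it leans on the GPU.isGPUSupported flag and mode option mentioned elsewhere in the thread, and initAccelerated / buildKernels / run are hypothetical names, not gpu.js API):

    // Fail loudly with an actionable message; let the caller decide what "fallback" means.
    async function initAccelerated(buildKernels) {
      if (!GPU.isGPUSupported) {
        throw new Error('GPU init failed: no WebGL2/WebGL context available. ' +
                        'Check that hardware acceleration is enabled in the browser.');
      }
      return buildKernels(new GPU({ mode: 'gpu' }));
    }

    initAccelerated(buildKernels)
      .then(run)
      .catch((err) => {
        console.error(err);
        run(buildKernels(new GPU({ mode: 'cpu' }))); // explicit, opt-in fallback
      });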
I've never done front end development but I would probably want to request permission to gather telemetry, and, if granted, post the telemetry and the error to an error logging service. Then you could try to diagnose aggregated issues. You could also offer the option to export the telemetry to a zip file (hopefully the same format you send it to the error logger). Then the user could examine it themselves and have the option to provide it to you via another channel (support forum, email, etc.).
>Prediction: someone deploys this and doesn't notice it's incredibly slow for mysterious reasons, and then spends several hours to figure out that the right version of CUDA wasn't installed.
Why not just print out a message "Running in CPU mode" or "Running in GPU mode". There's no reason it has to be a mystery.
The comments here tell me that programmers have a narrow idea of what other programmers should/would/could do with their software.
> I agree that having a CPU fallback is nice, but it's also pretty much the opposite of what you'd want to do in the field.
Do the opinions of potential users matter? Because most people aren't willing to waste money on unnecessary upgrades just because a developer wanted to assume everyone had mid/high-end hardware.
Sure, but in my anecdotal experience, things where you care about acceleration are next to unusable without acceleration, and by silently (it's ok to do it noisily) falling back to CPU you just give the user (who doesn't necessarily realise it's using the fallback) the impression that your software is slow garbage.
A lot of software doesn't have a CPU fallback path, because (1) the no/unsupported GPU case is very rare (2) performance on the CPU will be too bad to be of any use.
> What are people going to do in the case of no support though?
Fix their GPU drivers or OSes. In a cloud, switch to GPU instance type.
Compute shaders were introduced in Direct3D 11.0; on PCs, support for them has been required for a decade now. Mobile devices were a bit later, but their upgrade cycles are faster, and the support is good by now.
I disagree. JavaScript historically has always offered backward compatibility.
> Backwards compatibility is important because of the large number of browsers deployed and the wide variety of versions of those browsers [0]. This may be the reasoning behind offering backwards compatibility.
Also, according to the GitHub page [1] the fallbacks are WebGL2 and WebGL1, and if neither of those works, only then will it use the CPU when in the browser. I find this to be quite acceptable. After all, it is JavaScript!
The web is hard; it is the most hostile hardware environment there is (think Android fragmentation). What you're suggesting will only be feasible in a predictable hardware ecosystem.
So what you're suggesting is that things will just break for users without GPUs.
I will take a side and pick the user: this JavaScript fallback seems to me a good design choice.
>This is a bad thing, not a happy thing to be advertised. Software fallback was one of the most frustrating parts of working with OpenGL circa 2010.
>If you're making a hardware accelerated library, stop trying to make it "gracefully" fail. Just fail! That way people can address the root problem.
No. I want my code to run locally sometimes (possibly with a GPU), and sometimes I run the same code across hundreds of EC2 instances or just 1 instance, and I have good reasons for that. Sometimes a machine doesn't have a GPU, but I still want that code to run on it.
The "bad thing" is imagining you know every possible use of javascript, and then telling other people about how they should be running javascript (or any language).
For what it's worth, HN might be interested to look into how programming a simple WebGPU calculator works. I worked on one some time ago: https://laskin.live/
This way you can use some "better" API like Vulkan to run programs that are written in the new SPIR-V intermediate representation. In general, if you are interested in GPU programming, I would definitely look into WebGPU. The API is much easier to get started with than Vulkan, and it achieves the most basic things required for simple GPGPU tasks (even though WebGPU can use Vulkan as the underlying GPU API).
Note that WebGPU needs a browser flag to be enabled, and is generally very dangerous. It's possible to kernel panic operating systems on web page load, despite the fact that you are using a web browser.
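For anyone curious, just getting a device is about this much code (sketch; the API lives behind a flag and method names have shifted between spec drafts):

    // Minimal WebGPU bootstrap - pipelines, shaders and buffers all hang off `device`.
    async function getDevice() {
      if (!navigator.gpu) throw new Error('WebGPU not enabled in this browser');
      const adapter = await navigator.gpu.requestAdapter();
      return adapter.requestDevice();
    }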
This, along with WebAssembly (threads, SIMD, etc), is going to be a very big deal in the next few years. The obvious use case is AAA games running in your browser, but I think once we've got 80-90% of native speeds on the web, basically all of the "client-heavy" apps (video processing, game dev tools, etc.) are going to start migrating over too. Not really an interesting prediction at this stage, since it already started happening way back in the asm.js days. IIRC photopea.com doesn't even use WASM at all, and it has excellent performance on all sorts of image manipulation/encoding/etc tasks. Still, I think there are exciting times ahead for those hacking on the web platform.
> obvious use case is AAA games running in your browser, but I think once we've got 80-90% of native speeds on the web
The thing about AAA games is that it is not just about the speed but also the memory requirements. Currently there is no way to load the many GB of memory required for many games in a browser.
> The obvious use case is AAA games running in your browser
Er, no? Why would an AAA game, which typically wants every ounce of performance and control it can get, ever jump onto a web platform and lose all of that? And in exchange get... nothing at all? And since modern consoles are the same architecture & ISAs as gaming PCs are, why would they want to figure out a WASM backend with lagging hardware support & high overhead when 100% of their users are x86-64 with the entirely same set of SIMD features? And do they just not run an anti-cheat or DRM system? Good luck with that sales pitch.
It's not like you'll ever be able to click a link and just start playing a AAA game locally in your browser, either, you need to pull several gigs of assets first. And go through a login/paywall unless we're only talking F2P games which are almost never AAA games.
This use case comes up a lot, but if (and that's a big if) it happens, AAA games will be the last thing to switch, not the first. More plausible first users would be things like CAD & 3D modelling applications, which need high performance but also don't have 80GB+ installs.
> The web would be a really interesting medium for indie games.
It already was. Flash gaming was huge (so was embedded Java gaming such as Runescape). It regressed rather heavily over the years, perhaps from mobile, perhaps from regressions in the web's tech stack even, or perhaps just from social changes. But unless it was web technology regressing, I don't see why WASM + WebGPU would suddenly bring it back, either.
> More plausible first users would be things like CAD & 3D modelling applications, which need high performance but also don't have 80GB+ installs.
These sorts of applications are already inching their way onto the web. Photo editors, video editors, 3D modelling software, game dev tools (like Godot and Play Canvas). We're just starting to move past the "toy" stage and into the "native apps which don't have a web plan need to start worrying" stage, I think.
If you think this trend isn't going to continue to progressively heavier applications, including games, well... you're just wrong. You haven't been watching closely enough.
> Why would an AAA game, which typically wants every ounce of performance and control it can get, ever jump onto a web platform and lose all of that?
"Why would an AAA game, which typically wants every ounce of performance and control it can get, ever jump onto smartphones and lose all of that?"
Because they don't care about performance - they care about revenue. Performance is just a means to that end.
Call of Duty Mobile was one of the largest mobile game launches in history - more than a hundred million downloads in the first month. I'm not sure if anything has beaten it yet. They had to trade off performance, resolution, latency, bandwidth, screen real estate, and basically everything else. But the benefits were obvious: Increased accessibility and on-the-go playing being perhaps the main two.
The web offers a similar set of benefits. Arguably a super-set of those, since it encompasses mobiles too.
> And since modern consoles are the same architecture & ISAs as gaming PCs are, why would they want to figure out a WASM backend with lagging hardware support & high overhead when 100% of their users are x86-64 with the entirely same set of SIMD features?
Compilers. Even a 50% perf hit (or even more) is fine for now. The trade-offs are worth it.
> but also don't have 80GB+ installs
Asset streaming. It's the web. We stream 4GB HD movies now while we're watching them. Network speeds are still exponentially improving. This is an engineering problem. You don't actually need the 80GB all at once - game dev tooling will adapt, or be replaced by web-first tooling which has done lazy-loading of assets from day 1.
> This use case comes up a lot, but if (and that's a big if)
All I'll say is that I'd be willing to bet a lot of money that you're wrong. Really-big-budget browser games are going to land well within the next decade. Everything is basically already in place. We're already seeing web-first garage-made games like krunker.io with player counts larger than some AAA first person shooters.
Basically all mass-market consumer applications will end up as web applications. You might not like that (there are certainly down-sides), but you'd need to do a lot of explaining to explain away the very clear trends that we're seeing across the industry.
Smartphones were a new market. The web isn't. The web is the current market, just worse in every way. You haven't addressed any of the motivation here. Why would games go to the web?
And AAA smartphone games are also non-existent anyway. AAA games are widely understood to be big budget, fancy graphics blockbusters. Those aren't on mobile today.
> Asset streaming. It's the web.
Hahahahaha this is a bad joke, right? A game's working set for a level is regularly >4GB. With reliable I/O games currently have 30-40 second load times and use asset streaming.
You can't just wave this away or pretend it's an engineering problem. You want AAA games, you pay the AAA disk sizes. And that's just not compatible with the web. Not today, not next year, and network trends here aren't that great, either. Even fiber would struggle, and next gen consoles are about to raise the requirements here well beyond even that.
And if networks are that robust, then why wouldn't we just all jump on game streaming instead and embrace the thin client life? There doesn't seem to be any room here for needing powerful local computing and needing world class internet quality?
> Basically all mass-market consumer applications will end up as web applications. You might not like that (there are certainly down-sides), but you'd need to do a lot of explaining to explain away the very clear trends that we're seeing across the industry.
Current trends are everything is a native mobile app. Apps are bigger than ever. There's no zero-sum game here as you're claiming anyway. Web isn't eating everything up.
New-ness doesn't matter. What matters is perf/features and audience. The web has traditionally been held back by the former.
> And AAA smartphone games are also non-existent anyway.
I'm talking about big-budget games that push the limits of the devices they run on, and there are lots of them on mobile.
> > Asset streaming. It's the web.
> Hahahahaha this is a bad joke, right?
It's not a bad joke. This sort of condescension does not help persuade people. In fact it probably hurts your case because it's what people do when they're scraping the bottom of their argument barrel.
> Even fiber would struggle
1 Gbit plans just launched one suburb over from me. They're getting speeds of 100 MB/s. That's a 4GB level download in 40 seconds - easily hidden by an intro video/cut-scene. But again, the studios will be happy to make some trade-offs here, so expect asset sizes to be smaller (akin to big smartphone).
> And if networks are that robust, then why wouldn't we just all jump on game streaming instead and embrace the thin client life?
Yep, agreed here. I was going to mention this at the end of my comment but it was getting too long. This definitely could be the "dark horse" which makes AAA web-platform gaming irrelevant. Although, from the consumer's point of view, they'll still end up accessing it from the web.
> I'm talking about big-budget games that push the limits of the devices they run on, and there are lots of them on mobile. [...] But again, the studios will be happy to make some trade-offs here, so expect asset sizes to be smaller (akin to big smartphone).
Then you're not talking about AAA games. The discussion here is, or was, about a specific category of games. You seem desperate to move the goalposts to include non-AAA games.
I think you're just trying to weasel out of this now. By your definition there are no AAA games on Nintendo Switch, or mobiles, or... any handheld device in history. It wouldn't even help your argument to shift the definition to "maximum graphics", because I made it very clear that trade-offs need to be made in the original comment you replied to. You were well aware of the definition I was using the whole time - if you weren't arguing against that, then that's your fault.
I'm not going to continue this discussion further - it has become useless.
I think a couple of studios would disagree, given their franchises on iOS and Android, or Stadia/xCloud/PlayStation Now/GeForce Now style games with most of their engines being cloud based.
Note that WebGPU is not planning to use SPIR-V anymore, but their own shading language, WGSL. If that makes you confused or angry, well, I spent about a year trying to convince them out of it to no avail.
Yet the representations are bijective, no? Doesn't it imply (assuming a complete and consistent translator, a huge assumption I know) that any SPIR-V works as-is in WGSL?
Regardless, I appreciate your efforts. A huge disappointment that a common and shared IR could not be agreed on.
This is cool and I am glad it exists as someone who works with nodejs often, but if you’re doing heavy server side computations where serious parallelization is actually important, I’m a little skeptical that investing in having those computations in JS is likely to be a reasonable choice in many cases outside of small hobby projects. I could be wrong, but I think the performance compared to other alternatives also using parallelization via the GPU would be very unfavorable and you also wouldn’t have rich complementary libraries for this as a result of the low performance ceiling for JS in this domain.
Those criticisms haven't really been true of JS for years. Google has invested many dozens of millions to make JS fast. And GPU acceleration is largely unrelated to the particular language that happens to drive the GPU.
If the critique is "static typing good, JS bad" then that's a fair argument. But personally I'm excited to have something akin to a scripting language for a GPU.
Javascript is now fast in the sense that V8 is good enough at the things you are likely to do in the browser or in a typical node server. The very example in this link of how a matrix of random numbers is to be generated is an absolute joke compared to how this would be achieved in lower level languages. Javascript’s lack of static typing doesn’t just affect DX but what performance enhancements can be made. You don’t have direct access to memory and V8’s optimizations just aren’t THAT helpful for this sort of thing. This is a large part of the appeal of WebAssembly in the browser context. Even asm.js, impressive as it is/was, doesn’t really compare favorably and what you see here is clearly nothing like asmjs. I have a lot of love for JS and there are many domains it does well in but unless you show numbers that prove otherwise, I’m quite sure you are totally wrong here.
> You don’t have direct access to memory and V8’s optimizations just aren’t THAT helpful for this sort of thing.
You do: ArrayBuffer works great.
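To be concrete: a typed-array view over an ArrayBuffer gives you flat, densely packed numeric memory that the JIT is very happy with, e.g. (trivial sketch):

    // A 512x512 matrix stored as one flat Float32Array instead of nested JS arrays.
    const N = 512;
    const m = new Float32Array(new ArrayBuffer(N * N * 4));

    function scale(mat, k) {
      for (let i = 0; i < mat.length; i++) mat[i] *= k; // contiguous, unboxed, monomorphic
    }
    scale(m, 0.5);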
> I have a lot of love for JS and there are many domains it does well in but unless you show numbers that prove otherwise, I’m quite sure you are totally wrong here.
It's fair to say that writing a matrix multiply in pure JS might not be the best way to deploy your production code. But curing that is a matter of "run profiler, see clear hotspot, optimize hotspot." Especially now that you have so many options for dropping down to a lower-level stack.
I always wonder when I see these demos: what is it like in a real world workload?
I remember some years ago everyone was going nuts over how WebGL was able to render a detailed model in the browser, native games are dead! Except it was a single non-animated 3D model, in a simple or non-existent background.
Real games had many animated models, in complex environments, with complex interaction, physics, AI, audio and gameplay logic.
Ever since that comparison, I think, yeah, those demos look cool and indeed silky smooth, but most are a single simple thing. Real workloads tend to be much more complex: how does it fare with those?
(Of course if the vast majority of the processing time is in the GPU, then it really doesn’t matter what language, fast or slow, you use to control or feed it.)
One recent innovation that (IMO) hasn’t got much attention is OffscreenCanvas. Even with WebGL, JS has been sorely limited by the fact that it’s single threaded. OffscreenCanvas lets you pass a drawing context off to a separate thread, or multiple contexts to multiple threads even. There’s a whole lot of potential there that I haven’t seen tapped yet (but I’ll admit I’m not very clued into the kind of scenes you’d see making the most of this).
As ever, browser support matters. Chrome and Firefox: yes. Safari? No.
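The handoff itself is tiny; roughly (sketch - render-worker.js is a hypothetical worker script):

    // Main thread: detach the canvas and transfer it to a worker.
    const canvas = document.querySelector('canvas');
    const offscreen = canvas.transferControlToOffscreen();
    const worker = new Worker('render-worker.js');
    worker.postMessage({ canvas: offscreen }, [offscreen]);

    // render-worker.js: all drawing now happens off the main thread.
    // self.onmessage = ({ data }) => {
    //   const gl = data.canvas.getContext('webgl');
    //   // ...build the scene and run the render loop here...
    // };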
I'm excited by the prospect of OffscreenCanvas, and yet at the same time I'm not convinced that it's going to be massively useful for my canvas library[1]. The one area where it could have a big impact (I think) would be to run the OC in a web worker alongside the bulk of the library's code, leaving the main thread to deal with final canvas display, user interactions, etc. But to implement that I'd need to recode the entire library from scratch.
At the moment I generate additional canvas elements in the code (in a pool, for reuse) and use them without adding them to the DOM. I'm happy with the resulting frame rates, and the flexibility the system gives me for creating responsive, accessible canvas elements in the web page. This is all done using a simple 2d context (even the bendy images code[2] is 2d, rendered in the main thread). I've avoided WebGL and the whole GPU acceleration thing because shaders scare me more than quaternions!
I've been running https://noclip.website for a few years now, which, while not a "real-world" workload, clearly approaches the amount of data that a game has to render. Check out e.g. https://noclip.website/#smg/AstroGalaxy which approaches the number of draw calls I'd expect a modern game to render with.
There are a lot of upsides to running it through the web platform (the big one is that I can give people a link and they can click it, rather than convince them to run an app, but there are others, too).
There are downsides: performance is heavily variable. A new Chrome can launch and my performance can be 5x worse than before, which is why I now always test in Canary to make sure I'm ahead of all issues. In Chrome, the WebGL layer is pretty solid, but I still find plenty of bugs and performance issues (e.g. looking at a profile unearthed this very simple issue causing browser resizing to be way slower than it needed to be https://bugs.chromium.org/p/chromium/issues/detail?id=110347... ).
Firefox is quite a bit slower and has a bit more draw call overhead than Chrome (I'm guessing they run validation on the main thread rather than an IPC process, causing JS to block)
Most of the performance problems I encounter are actually JS-related, it's super easy to be knocked off of a fast path by complete accident, and there aren't good tools to know what I'm doing wrong. Math and ALU is plenty fast -- doing matrix multiplication on the CPU has never been a bottleneck, and you actually want that more on the CPU so that the GPU doesn't have to run a matrix mul for every pixel. Uniform data computation belongs on the CPU.
My big thing that I fight is the garbage collector, and you can see the awkward "manual register allocation" code style I sometimes resort to to make sure my allocations per-frame are super minimal, especially in fast math code. It doesn't look like traditional ergonomic JS code: https://github.com/magcius/noclip.website/blob/master/src/Su...
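The shape of that pattern, boiled down (this is not the actual noclip.website code, just the idea: scratch storage allocated once, hot paths writing into it):

    // Scratch vector allocated once at module load; per-frame math reuses it.
    const scratchVec3 = new Float32Array(3);

    function directionTo(out, target, eye) {
      scratchVec3[0] = target[0] - eye[0];
      scratchVec3[1] = target[1] - eye[1];
      scratchVec3[2] = target[2] - eye[2];
      const len = Math.hypot(scratchVec3[0], scratchVec3[1], scratchVec3[2]);
      out[0] = scratchVec3[0] / len;
      out[1] = scratchVec3[1] / len;
      out[2] = scratchVec3[2] / len;
      return out; // zero allocations per call, so nothing for the GC to collect
    }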
I could go on if you want, but this comment is already getting pretty long.
That sounds like a development nightmare like the one of the IE6 era, and even worse for supporting customers.
Basically you are saying code performance is nondeterministic across your user base and across time too?
I am glad WebGL is a thing for pages that need some 3D widgets where performance is not a big deal. But for general software or games? I will keep my native software.
"JS is really fast, assuming you don't do anything in JS" is not a convincing argument for this.
My point is that if you are using JS as a wrapper for a big black box of gpu computations that are close to the metal then you are not really using JS in any meaningful sense and can wrap anything else that has a much better library ecosystem and performance qualities for anything that you’re not just getting the gpu to crank out in a server context (which is implied by nodejs).
There’s the speed of generating inputs, the speed of transforming and passing data at the input/output boundaries, and the ability to conveniently and performantly work with data in memory natively (i.e. while not outsourcing the computations elsewhere) that matters here. Is JS a great choice for any of these things? Most importantly, the last thing? This library isn’t using ArrayBuffers for setting up all data or working on the data in JS so even if they work great for performance it seems totally irrelevant and let’s be honest, if it were working with ArrayBuffers directly you would be so far away from usual JS and any convenience JS offers, you might as well not be writing JS.
> if it were working with ArrayBuffers directly you would be so far away from usual JS and any convenience JS offers, you might as well not be writing JS
Or you might as well do, since it makes interop generally easy. In the library space I no longer have to worry where my web code begins, where my mobile code ends or where my backend code sits. All libraries can be used anywhere I feel necessary with zero interop pain.
The same argument was applied to writing C in the late 70s (vs writing several kinds of assembly for the various types of hardware at the time).
About the static typing point, I believe the JIT compiler knows what types you're calling your function with, compiles a version for them, and checks types before calling the compiled function.
So in the case of matrix multiplication, it knows that it's a double-vector of ints and runs the whole thing compiled. So it should, theoretically, be just as fast as a compiled language.
Genuine curiosity: How can it check the types at the call side quickly? Let's say I have a function that averages an array of floats. Does the JIT have to check each of the array's inputs to have the right type? That looks like it could become a serious overhead.
Usually how it works is it does that a couple of times, while gathering some kind of stats.
If those stats prove that it is always an array of floats, then machine code gets generated for an array of floats, with a trap handler for when the code fails.
The trap handler, if called, will throw away the code and restart the analysis.
So if you are nice to the compiler and keep your types regular, the machine code will not be thrown away and there are no speed bumps.
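A tiny example of what "keep your types regular" means in practice (the exact deopt behaviour is engine-specific; this just shows keeping a call site monomorphic):

    // Always call with a Float64Array and the JIT can specialize for raw float loads.
    function mean(xs) {
      let sum = 0;
      for (let i = 0; i < xs.length; i++) sum += xs[i];
      return sum / xs.length;
    }

    const data = new Float64Array(1e6).fill(1.5);
    mean(data);             // stays on the fast, specialized machine code
    // mean([1, 'two', 3]); // a mixed array would hit the trap and force a deopt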
(Answering a bit late, sorry I missed your reply!)
I still don't get it... How do you know when to activate the trap handler if you're not checking every time? It's not enough to wait for an error condition (e.g. an invalid pointer). If it's an array of integers, and I sneak in a string in there, what's stopping the JITed version from treating that pointer as an integer?
The performance isn't an issue, but the language might be. I'm still waiting for a javascript derivative/platform that has operator overloading so we can finally see a numpy like library for js. Once that hits I'll pick up js and never look back. Python is fine but I always loved javascript's functional lite style over python's imperative style, but the language itself doesn't really allow for a library that would make computational science feasible in javascript.
As noted by others, instead of the handcuffs of lowest-common-denominator, we do GPU JS with best-of-class in node, where we get to play with Apache Arrow, RAPIDS, etc. We now shim nodejs to PyData for GPU tech vs doing node-opencl/cuda to get more GPU lib access, but I'd love to add a more direct numba-equiv and serverless layer here, as V8 is better than cpython in many key ways here, with only a few glaring gaps afaict (gotchas in bigint / large memory space / etc still smoothing, TBD for multigpu and networking streaming).
GPU JS and related frameworks are really 'browser GPU JS', and we find predictability there low enough that we still handroll WebGL 1.0 instead of them. When we shift, I'd hope it'll be for a webgl2+ (opencl2+...), but 10+ years later, I've stopped tracking the politics.
I like how easy it is to write a new GPU kernel function.
I have never done any GPU kernel programming but I might give this a try.
I wonder if TensorFlow.js is implemented in much the same way. TensorFlow.js is pretty awesome, largely because the examples are so well done; getting up to speed is painless.
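For anyone else who hasn't tried it, a kernel really is just a few lines (adapted from the README's matrix-multiply example; a and b are assumed to be 512x512 nested arrays):

    const { GPU } = require('gpu.js');
    const gpu = new GPU();

    // Each invocation computes one output cell, indexed by this.thread.x / this.thread.y.
    const multiplyMatrix = gpu.createKernel(function (a, b) {
      let sum = 0;
      for (let i = 0; i < 512; i++) {
        sum += a[this.thread.y][i] * b[i][this.thread.x];
      }
      return sum;
    }).setOutput([512, 512]);

    const c = multiplyMatrix(a, b);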
This could be useful for embarrassingly parallel workloads but 2 points:
- the backend is OpenGL so it won't be as performant on NVIDIA hardware as CUDA (NVIDIA don't like OpenGL)
- I don't see any explicit GPU memory management, so you might not be able to set up a pipeline of operations which all operate on GPU memory (aka the big pipelines you see in ML). That would be another performance hit.
Having said that it looks like fun and I'm going to check it out!
Nvidia loves OpenGL. They are probably the single biggest supporter of OpenGL. They have the best OpenGL driver in the industry. The president of Khronos is an Nvidia employee, as is the chair of the OpenGL working group.
CUDA is faster than OpenGL because it exposes more Nvidia proprietary features. GPU.js could add a CUDA backend for the Node version, but it wouldn't be portable to the Web or non-Nvidia GPUs.
Kind of, while they do love OpenGL for the graphics workstation cards, they are also the main hardware partner for DirectX design.
NVidia's OpenGL and Vulkan extensions are usually born as hardware support for DirectX capabilities, and then trickle down as extensions for OpenGL/Vulkan.
Also their main rendering products, like Optix, are CUDA based actually.
Speaking from my own personal use, 9 times out of 10, the 10x-30x speedup of going CPU to GPU is what I'm looking for. The extra 1.5-2x speedup at the end is not something I care about.
I think this is the major hold-up to more GPGPU adoption. NVidia does an amazing job tweaking libraries for ultrafast performance for Torch/TensorFlow/video games, where it really does matter. In the process, they miss everyone else who has embarrassingly parallel problems, but where it's just not worth the time to code up different versions for CUDA and OpenCL (or whatever ATI/Intel happen to be swinging that week).
I think if they did this /well/, they'd take over from Intel as the major value-add in computers.
WebGL doesn't specify what the backend is supposed to be, in fact there are platforms with WebGL support, where OpenGL is nowhere to be found, like game consoles.
That benchmark seems a little bit problematic to me. When I click the benchmark button with matrix size 101 and iterations 5 the resulting score for GPU varies between 12k, 20k, 60k and Infinity.
Seems like if it's possible to accelerate JS with a GPU it should be far more possible to do it with a many-core CPU. This bodes well for the likely future of dozens of X64 cores or even hundreds of ARM cores on a higher-end desktop/laptop chip and hundreds to thousands on server chips.
I may be confused as I do not know much javascript, but the installation page mentions a dependency on Mesa; is it relying on that to perform the actual work?
If so, I do not quite understand the debates here around avoiding OpenGL design and fallback to software emulation.
Since this supports doing image convolutions I guess it can be used to do efficient CNN inference. Are there any examples doing that?
I see there are some other libraries to run deep learning networks in the browser (like Tensorflow.js), but it seems they have some limitation regarding the use of the GPU.
It may be interesting to push the boundaries of this project to get an efficient and generic CNN library. They seem to not use CUDA, which may be a limitation...
Really a bummer that they require the function keyword so that the GPU function can access this.thread! Would have been cleaner IMO - and more modern and idiomatic - if the callback had a parameter referring to the GPU context.
Otherwise this stuff is AWESOME. It would be great if we could reduce our dependency on Python.
Do you want exploits? Because exposing more and more bare metal OS functionality to javascript and then running all javascript you receive without a care in the world is how you get exploits.
And then once the exploits appear now the user is the danger for going to those sites, or installing that add-on, so the control must be taken away from the user. And so on with HTTPS only and no more accepting self signed certs so everyone has to be leashed to a cert authority and ask for permission to host a visitable website.
No, making the browser the OS is the path to loss of control and darkness.