more pornel's comments | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit | more pornel's comments

pornel on June 27, 2024 | parent | context | [–] | on: Misconceptions about loops in C

If you're wondering why the paper seems to complicate "simple" constructs by converting them to control flow graphs — that's because many other kinds of static analysis, like tracking when variables are initialized and overwritten, are even more puzzling when trying to analyze very stateful code directly based on the syntax tree. The graph representation allows splitting the code into smaller blocks, and remove a lot of statefulness from them.

pornel on June 27, 2024 | parent | context | | [–] | on: The C Standard charter was updated, now with secur...

Almost every other principle in the charter can be used to shoot down changes for safety, so I'm afraid the safety part is only wishful thinking, hoping for a Sufficiently Smart Compiler/Analyzer to save them.

The charter also calls for codifying known solutions, and not new inventions. But the known solutions for memory safety require things like more advanced type systems, that aren't in "the character of the language", and aren't "small and simple". Safety could be improved by severely restricting pointers, in conflict with the "programming freedom" principle, and of course performance overheads or backwards-incompatible changes are unacceptable.

So the charter says that C should have a cake and eat it.

pornel on June 27, 2024 | parent | context | | [–] | on: 310-mile automated cargo conveyor will replace 25,...

Imagine the savings in construction and maintenance costs if you could simplify the conveyor sections to be just two parallel iron rods.

Closi on June 27, 2024 | | [–]

It probably will be in reality - pallet conveyor is too slow and expensive, so it’s most likely that this would be an electric monorail system or similar.

(I work in logistics automation design, 310 miles of traditional pallet chain conveyor just isn’t feasible imo for loads of reasons)

pornel on June 26, 2024 | parent | context | | [–] | on: Liquid Layers

This is a common and difficult problem when simulating fluids using particles.

It's possible to simulate using a grid instead, and that computes pressure very precisely, but it has a downside of adding inaccurate viscosity.

And there are tricks that combine both approaches to balance out the errors.

kotsoft on June 27, 2024 | | [–]

Yeah agree. The objective was more to make a physics toy that would run on single core on a phone than something for actual scientific or industrial use. I could add additional iterations or do pressure projection but then there would be complaints about it being slow & choppy.

There are also some large density ratios between the materials which further increased the difficulty, and would also increase the number of pressure projection iterations on a grid. I tried to simulate buoyancy without cheating (e.g. giving different materials different acceleration to gravity)

pornel on June 26, 2024 | parent | context | | [–] | on: Liquid Layers

I'm surprised the simulation is run on the CPU! This problem has solutions that fit the GPU well, even for grid<>particle conversions.

troglobite on June 26, 2024 | | [–]

this kind of sims are better suited for CPU ;-), gpu are good to work on meshes, not really on pure particles. At GPU are super good for grid based hydro.

lukan on June 27, 2024 | | | [–]

"gpu are good to work on meshes, not really on pure particles"

Why?

Having thousands of particles, all in need of doing the same operations on them in parallel screams GPU to me. It is just way harder to program a GPU, vs a CPU.

thegeomaster on June 27, 2024 | | | [–]

Collision detection is usually a tree search, and this is a very branching workload. Meaning that by the time you reach the lowest nodes of the tree, your lanes will have diverged significantly and your parallelism will be reduced quite a bit. It would still be faster than CPU, but not enough to justify the added complexity. And the fact remains that you usually want the GPU free for your nice graphics. This is why in most AAA games, physics is CPU-only.

lukan on June 27, 2024 | | | [–]

"Collision detection is usually a tree search"

Yes, because of the very limited numbers of CPU cores. With a GPU you can just assign one core to one particle.

Here is a simple approach to do it with WebGPU:

https://surma.dev/things/webgpu/

It uses the very simple approach, of testing every particle with EVERY other particle. Still very performant (the simulation, the choosen rendering with canvas is very slow)

I currently try to do something like this, but optimised. With the naive approache here and Pixi instead of canvas, I get to 20000 particles 120 fps on an old laptop. I am curious how far I get with an optimized version. But yes, the danger is in calculating and rendering blocking each other. So I have to use the CPU in a smart way, to limit the data being pushed to the GPU. And while I prepare the data on the CPU, the GPU can do the graphic rendering. Like I said, it is way harder to do right this way. When the simulation behaves weird, debugging is pain.

kotsoft on June 27, 2024 | | | [–]

If you use WebGPU, for your acceleration structure, try to use the algorithm here presented in the Diligent Engine repo. This will allow you not to transfer data back and forth between CPU and GPU: https://github.com/DiligentGraphics/DiligentSamples/tree/mas...

Another reason I did it on CPU was because with WebGL you lack certain things like atomics and groupshared memory, which you now have with WGPU. For the Diligent Engine spatial hashing, atomics is required. I'm mainly using WebGL because of compatibility. iOS Safari still doesn't enable WGPU without special feature flags that user has to enable.

lukan on June 27, 2024 | | | [–]

Thanks a lot, that is very interesting! I will check it out in detail.

But currently I will likely proceed with my approach where I do transfer data back and forth between CPU and GPU, so I can make use of the CPU to do all kinds of things. But my initial idea was also to keep it all on the GPU, I will see what works best.

And yes, I also would not recommend WebGPU currently for anything that needs to deploy soon to a wide audience. My project is intended as a long term experiment, so I can live with the limitations for now.

thegeomaster on June 27, 2024 | | | | [–]

This is a 2D simulation with only self-collisions, and not collisions against external geometry. The author suggests a simulation time of 16ms for 14000 particles. State of the art physics engines can do several times more, on the CPU, in 3D, while colliding with complex geometry with hundreds of thousands of triangles. I understand this code is not optimized, but I'd say the workload is not really comparable enough to talk about the benefits of CPU vs GPU for this task.

The O(n^2) approach, I fear, cannot really scale to much beyond this number, and as soon as you introduce optimizations that make it less than O(n^2), you've introduced tree search or spatial caching that makes your single "core" (WG) per particle diverge.

lukan on June 27, 2024 | | | [–]

"that make it less than O(n^2), you've introduced tree search or spatial caching that makes your single "core" (WG) per particle diverge"

Well, like I said, I try to use the CPU side to help with all that. So every particle on the GPU checks maybe the 20 particles around it for collision (and other reactions) and not 14000, like it is currently.

That should give a different result.

Once done with this sideproject, I will post my results here. Maybe you are right and it will not work out, but I think a found a working compromise.

kotsoft on June 27, 2024 | | | | [–]

Yeah, pretty much this, I've experimented with putting on the GPU a bit but I would say particle based is 3x faster than a multithreaded & SIMD CPU implementation. Not 100x like you will see in Nvidia marketing materials, and on mobile, which this demo does run on, GPU often becomes weaker than CPU. Wasm SIMD only has 4 wide but the standard is 8 or 16 wide on most CPUs today.

But yeah, once you need to do graphics on top, that 3x pretty much goes away and is just additional frametime. I think they should work together. On my desktop stuff, I also have things like adaptive resolution and sparse grids to more fully take advantage of things that the CPU can do that are harder on GPU.

The Wasm demo is still in its early stages. The particles are just simple points. I could definitely use the GPU a bit more to do lighting and shading a smooth liquid surface.

thegeomaster on June 27, 2024 | | | [–]

Agree with most of the comment, just to point out (I could be misremembering) 4-wide SIMD ops that are close together often get pipelined "perfectly" onto the same vector unit that would be doing 8- or 16-wide SIMD, so the difference is often not as much as one would expect. (Still a speedup, though!)

troglobite on June 30, 2024 | | | | [–]

The issue is not really parallelism of computation. The issue is locality. Usually a hydro solver need to solve 2 very different problem short and long range interaction. therefore you "split" the problem into a particle-mesh (long range) and a particle to particle (short range).

In this case there is no long range interaction (aka gravity, electrodynamics), therefore you would go for a pure p2p implementation.

Then in a p2p, if you have very strong coupling between particles that will insure the fact that neighbors stay neighbors (that will be the case with solids, or with very high viscosity). But in most case you will need are rebalancing of the tree (and therefor the memory layout) every time steps. This rebalancing can in fact dominate the execution time as usually the computation on a given particle represent just a few (order 100) flop. Then this rebalancing is usually faster to be done on CPU than on GPU. Then evidently, you can do this rebalancing "well" on gpu, but the effort to have a proper implementation will be huge ;-).

pornel on June 25, 2024 | parent | context | | [–] | on: Fixing QuickLook (2023)

Design of macOS since BigSur took such a nosedive.

Buttons are flat text that doesn't look clickable, with the best case of having a very faint border, sometimes only on hover. There are multiple ad-hoc checkbox replacements. There's a jarring cacophony of old macOS and new iPadOS UI elements — old UI elements with small fonts, small padding, and teeensy disclosure indicators share the screen with big fat round blobs lazily transplanted from a touch screen OS. Some elements react to hover, some don't. Some can only be discovered by hovering mouse in a specific location. Menus have varying heights, and varying padding.

Such unpolished inconsistent details used to be a tell-tale of non-native UI toolkits, or skins for other OSes faking a Mac OS X look. Now macOS looks like a hasty unfinished reskin of iPadOS ;(

tr3ntg on June 26, 2024 | | [–]

Agreed. And the poor Settings app. They butchered its soul. I’m sad every time I have to interact with it.

toasterlovin on June 26, 2024 | | | [–]

I much prefer it, fwiw. Since MacOS apps have their settings in-app, I rarely use Settings.app and so I could never remember where to find things in the 2D grid of the old app. Now things are much more discoverable, since it’s a 1D layout that I scroll linearly. Even more so because there’s somewhat of a correspondence with iOS, whose Settings.app I use all the time (since iOS apps don’t have settings in-app). I say this as someone who has used macOS for almost 20 years.

omnimus on June 26, 2024 | | | [–]

I agree. The new settings app is one of few places where new design/reorganisation is actually better and people are annoyed just because of broken habits. Whats really bad though is how individual settings in the app are gradualy disappearing.

kccqzy on June 26, 2024 | | | | [–]

I never even try to find things in the 2D grid of the old System Preferences app. I just use Spotlight, or any number of Spotlight replacements that support searching for prefPanes. I say this as someone who has used macOS for almost but less than 20 years: perhaps my first introduction to OS X was Tiger so I'm more accustomed to Spotlight?

infotainment on June 26, 2024 | | | [–]

I'd say it goes back even further; the Yosemite UI wasn't really good either. The buttons ended up looking so boring with all gradients stripped out.

AnonC on June 26, 2024 | | | [–]

To me, it seems like the spirit of Jony Ive is still around. Seriously, you should email Craig Federighi about this with (any links to) criticisms or images showing comparisons. Sometimes Apple takes action only when people high up are alerted.

grishka on June 26, 2024 | | | [–]

There's an accessibility setting to show button borders in toolbars. I turned it on the day I got my current Mac and now keep forgetting that it's not the default.

pornel on June 25, 2024 | parent | context | | [–] | on: How GCC and Clang handle statically known undefine...

UB is a guaranteed compilation error in constexpr.

However, in regular code UB may only become known after multiple optimization passes (e.g. constant propagation, inlining).

It's difficult for the optimizer to meaningfully err at that time. The code may have been heavily modified by other optimization passes, which can add UB to the code (e.g. UB may have been data-dependent (1/x), but optimizer added a specialized copy of a function for x=0 where it became unconditional). The IR may have been generated by a non-C language that used UB/poison intentionally to guide optimizations.

pornel on June 22, 2024 | parent | context | | [–] | on: Researchers describe how to tell if ChatGPT is con...

LLMs are language models, and I think it's best not to try to extrapolate them to general intelligence. They are universal language translators, and a lossy database of a lot of text. They might be a component of some bigger AI system in the future, but themselves they are not as intelligent as their marketing implies.

pornel on June 22, 2024 | parent | context | | [–] | on: C can be memory safe

This only handles a subset of bounds checks (spatial safety). It has nothing for temporal safety (UAFs), nothing for thread safety, nothing for uninitialized memory, and it doesn't fix C's defaults.

This is very much a "we have memory safety at home" situation.

pornel on June 22, 2024 | parent | context | | [–] | on: Farm: Fast vite compatible build tool written in R...

Languages have more characteristics than what can be reasonably included in a headline.

"Fast to run, but slow to compile and needs very new compiler, and may have a big-ish executable, but OTOH it won't cause much problems with installation of dependencies - vite compatible build tool"

wiseowise on June 22, 2024 | | [–]

Not sure what you’re implying. I’m an end user of a product, I couldn’t care less what time it takes to compile the final binary.

Or this thing exists to attract enthusiastic Rust devs to contribute to the project?

whstl on June 22, 2024 | | | [–]

"Written in Rust" is often a shorthand for "having performance as a goal" when the tool is new and the target audience is mostly technical and made of early adopters, or people willing to try or contribute to new things.

Perhaps you're not the target audience at the current time.

Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10
Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact