More

rudedogg · 2025-09-10T17:00:31 1757523631

This is wishful thinking. It’s the same as other layers we have like auto-vectorization where you don’t know if it’s working without performance analysis. The complexity compounds and reasoning about performance gets harder because the interactions get more complex with abstractions like these.

Also, the more I work with this stuff the more I think trying to avoid memory management is foolish. You end up having to think about it, even at the highest of levels like a React app. It takes some experience, but I’d rather just manage the memory myself and confront the issue from the start. It’s slower at first, but leads to better designs. And it’s simpler, you just have to do more work upfront.

Edit:

> Rust's strategy is problematic for code reuse just as C/C++'s strategy is problematic. Without garbage collection a library has to know how it fits into the memory allocation strategies of the application as a whole. In general a library doesn't know if the application still needs a buffer and the application doesn't know if the library needs it, but... the garbage collector does.

Should have noted that Zig solves this by making the convention be to pass an allocator in to any function that allocates. So the boundaries/responsibilities become very clear.

bob1029 · 2025-09-10T17:07:07 1757524027

Use of a GC does not imply we are trying to avoid memory management or no longer have a say in how memory is utilized. Getting sweaty chasing around esoteric memory management strategies leads to poor designs, not good ones.

rudedogg · 2025-09-10T17:28:47 1757525327

> Getting sweaty chasing around esoteric memory management strategies

I’m advocating learning about, and understanding a couple different allocation strategies and simplifying everything by doing away with the GC and minimizing the abstractions you need.

My guess is this stuff used to be harder, but it’s now much easier with the languages and knowledge we have available. Even for application development.

See https://www.rfleury.com/p/untangling-lifetimes-the-arena-all...

pron · 2025-09-10T19:11:18 1757531478

Arenas are fantastic when they work; when they don't, you're in a place that's neither simple nor particularly efficient.

Generational tracing garbage collectors automatically work in a manner similar to arenas (sometimes worse; sometimes better) in the young-gen, but they also automatically promote the non-arena-friendly objects to the old-gen. Modern GCs - which are constantly evolving at a pretty fast pace - use algorithms that reprensent a lot of expertise gathered in the memory management space that's hard to beat unless arenas fully solve your needs.

PaulHoule · 2025-09-10T18:13:14 1757527994

For a lot of everyday programming arenas are "all you need".

pron · 2025-09-10T18:27:35 1757528855

> It’s the same as other layers we have like auto-vectorization where you don’t know if it’s working without performance analysis. The complexity compounds and reasoning about performance gets harder because the interactions get more complex with abstractions like these.

Reasoning about performance is hard as it is, given nondeterministic optimisations by the CPU. Furthermore, a program that's optimal for one implementation of an Aarch64 architecture can be far from optimal for a different implementation of the same architecture. Because of that, reasoning deeply about micro-optimisations can be counterproductive, as your analysis today could be outdated tomorrow (or on a different vendor's chip). Full low-level control is helpful when you have full knowledge of the exact environment, including hardware details, and may be harmful otherwise.

What is meant by "performance" is also subjective. Improving average performance and improving worst-case performance are not the same thing. Also, improving the performance of the most efficient program possible and improving the performance of the program you are likely to write given your budget aren't the same thing.

For example, it may be the case that using a low-level language would yield a faster program given virtually unlimited resources, yet a higher-level language with less deterministic optimisation would yield a faster program if you have a more limited budget. Put another way, it may be cheaper to get to 100% of the maximal possible performance in language A, but cheaper to get to 97% with language B. If you don't need more than 97%, language B is the "faster language" from your perspective, as the programs you can actually afford to write will be faster.

> Also, the more I work with this stuff the more I think trying to avoid memory management is foolish.

It's not about avoiding thinking about memory management but about finding good memory management algorithms for your target definition of "good". Tracing garbage collectors offer a set of very attractive algorithms that aren't always easy to match (when it comes to throughput, at least, and in some situations even latency) and offer a knowb that allows you to trade footprint for speed. More manual memory management, as well as refcounting collectors often tend to miss the sweet spot, as they have a tendency for optimising for footprint over throughput. See this great talk about the RAM/CPU tradeoff - https://youtu.be/mLNFVNXbw7I from this year's ISMM (International Symposium on Memory Management); it focuses on tracing collectors, but the point applies to all memory management solutions.

> Should have noted that Zig solves this by making the convention be to pass an allocator in to any function that allocates. So the boundaries/responsibilities become very clear.

Yes, and arenas may give such usage patterns a similar CPU/RAM knob to tracing collectors, but this level of control isn't free. In the end you have to ask yourself if what you're gaining is worth the added effort.

rudedogg · 2025-09-11T02:00:31 1757556031

I enjoy reading your comments here. Thanks for sharing your knowledge, I'll watch the talk.

> Yes, and arenas may give such usage patterns a similar CPU/RAM knob to tracing collectors, but this level of control isn't free. In the end you have to ask yourself if what you're gaining is worth the added effort.

For me using them has been very easy/convenient. My earlier attempts with Zig used alloc/defer free everywhere and it required a lot of thought to not make mistakes. But on my latest project I'm using arenas and it's much more straightforward.

pron · 2025-09-11T16:45:40 1757609140

Sure, using arenas is very often straightforward, but it also very often isn't. For example, say you have a server. It's very natural to have an arena for the duration of some request. But then things could get complicated. Say that in the course of handling the transaction, you need to make multiple outgoing calls to services. They have to be concurrent to keep latency reasonable. Now arenas start posing some challenges. You could use async/coroutine IO to keep everything on the same thread, but that imposes some limitations on what you can do. If you use multiple threads, then either you need to synchronise the arena (which is no longer as efficient) or use "cactus stacks" of arenas and figure out a way to communicate values from the "child" tasks to the parent one, which isn't always simple (and may not even be super efficient).

In lots of common cases, arenas work great; in lots of common cases they don't.

There are also other advantages unrelated to memory management. In this talk by Andrew Kelley (https://youtu.be/f30PceqQWko) he shows how Zig, despite its truly spectacular partial evaluation, still runs into an abstraction/performance tradeoff (when he talks about what should go "above" or "below" the vtable). When you have a really good JIT, as Java does, this tradeoff is gone (instead, you trade off warmup time) as the "runtime knowns" are known at compile time (since compilation is done at runtime).

chpill · 2025-09-12T16:36:10 1757694970

  When you have a really good JIT, as Java does, this tradeoff is gone

Is there a way to visualize the machine code generated by the JVM when optimizing the same kind of code as the examples shown in the talk you mention? I tried putting the following into godbolt.org, but i'm not sure I'm doing it right:

  public class DontForgetToFlush {
      public static void example(java.io.BufferedWriter w) throws java.io.IOException {
          w.write("a");
          w.write("b");
          w.write("c");
          w.write("d");
          w.write("e");
          w.write("f");
          w.write("g");
          w.flush();
      }
      public static void main(String... args) throws java.io.IOException {
          var os = new java.io.OutputStreamWriter(System.out);
          var writer = new java.io.BufferedWriter(os, 100);
          example(writer);
      }
  }

pron · 2025-09-13T13:57:20 1757771840

You can ask the JVM to dump the actual instructions (https://javanexus.com/blog/printing-assembly-code-hotspot-ji...).

Just note that there may be differences between the very old APIs (as in your example), and the newer NIO (https://docs.oracle.com/en/java/javase/24/docs/api/java.base...), and you need to pay attention to text output that undergoes characeter set encoding (as in your example) vs binary output.

vips7L · 2025-09-14T17:30:51 1757871051

You can use jit watch: https://github.com/AdoptOpenJDK/jitwatch

rkagerer · 2025-09-10T17:57:32 1757527052

Convention (as you report Zig does) seems to be a sensible way to deal with the problem.

> Also, the more I work with this stuff the more I think trying to avoid memory management is foolish ... It takes some experience, but I’d rather just manage the memory myself and confront the issue from the start.

Not sure why you're getting downvoted, this is a reasonable take on the matter.

rudedogg · 2025-09-09T19:54:50 1757447690

The tagging mechanism reminds me of generational indices like https://floooh.github.io/2018/06/17/handles-vs-pointers.html, but I’m a bit out of my depth

rudedogg · 2025-09-08T16:10:32 1757347832

This is the answer IMO. The number of targets and noise would be a lot less if JS had a decent stdlib or if we had access to a better language in the browser.

I have no hope of this ever happening and am abandoning the web as a platform for interactive applications in my own projects. I’d rather build native applications using SDL3 or anything else.

mrguyorama · 2025-09-08T19:44:30 1757360670

But this can't be the whole story. In the Java world, it's pretty common to import a couple huge libraries full of utility functions, but those are each one import, that you can track and version and pay attention to.

Apache Commons helper libraries don't import sub libraries for every little thing, they collect a large toolbox into a single library/jar.

Why instead do people in the javascript ecosystem insist on separating every function into it's own library that STILL has to import helper libraries? Why do they insist on making imports fractally complex for zero gain?

crabmusket · 2025-09-08T21:36:48 1757367408

Bundle size optimisation. See my comment upthread for more detailed explanation. Bundle size is one of the historical factors that makes JS ecosystem a unique culture, and I'd argue uniquely paranoid.

xd1936 · 2025-09-08T20:45:44 1757364344

I didn't used to be. It's just become less trendy to import a big giant Lodash, Underscore, Sugar, or even jQuery.

flomo · 2025-09-08T21:04:59 1757365499

Originally I think it was to avoid the applet experience of downloading a large util.jar or etc. (Not that most js devs really care.) However, I suspect the motivation is often social status on GitHub & their resume.

imiric · 2025-09-08T16:26:15 1757348775

To be fair, this is not a problem with the web itself, but with the Node ecosystem.

It's perfectly possible to build web apps without relying on npm at all, or by being very selective and conservative about the packages you choose as your direct and transitive dependencies. If not by reviewing every line of code, then certainly by vendoring them.

Yes, this is more inconvenient and labor intensive, but the alternative is far riskier and worse for users.

The problem is with web developers themselves, who are often lazy, and prioritize their own development experience over their users'.

palmfacehn · 2025-09-08T16:59:09 1757350749

I'm often surprised at the number of JS experts who struggle with the basics of the browser API. Instead of reasoning through the problem, many will reach for a framework or library.

PeterisP · 2025-09-08T21:10:52 1757365852

At least historically it used to be the case that you don't ever want to use the browser API directly for compatibility reasons but always through some library that will be a do-nothing-wrapper in some cases but do a bunch of weird stuff for older browsers. And traditions are sticky.

skydhash · 2025-09-08T18:29:27 1757356167

Especially with the MDN, an amazing resource.

rudedogg · 2025-09-07T06:12:03 1757225523

https://angelscript.hazelight.se/ for others.

Any thoughts on Verse? I’m not experienced with Unreal or in the ecosystem, but it looked like it might be too foreign to me. But Tim Sweeney is no dummy, so it’s probably good and just requires some effort if you’re not already a functional programming nerd?

rudedogg · 2025-09-07T02:43:53 1757213033

Do you still want to do this for SDF font textures? Or is the lossiness an issue?

masonremaley · 2025-09-07T03:28:20 1757215700

Good question. I don’t have any authored SDF content right now so take this with a grain of salt, but my thoughts are:

1. Fonts are a very small percent of most games’ storage and frame time, so there’s less motivation to compress them than other textures

2. Every pixel in a font is pretty intentional (unlike in, say, a brick texture) so I’d be hesitant to do anything lossy to it

I suspect that a single channel SDF for something like a UI shape would compress decently, but you could also just store it at a lower resolution instead since it’s a SDF. For SDF fonts I’d probably put them through the same asset pipeline but turn off the compression.

(Of course, if you try it out and find that in practice they look better compressed than downscaled, then you may as well go for it!)

[EDIT] a slightly higher level answer—you probably wouldn’t compress them, but you’d probably still use this workflow to go from my_font.ttf -> my_font_sdf.ktx2 or such.

xeonmc · 2025-09-07T06:20:48 1757226048

Tangential: where does pixel art textures fall under in this consideration for asset compression?

masonremaley · 2025-09-07T06:52:14 1757227934

That’s also a good question.

I personally wouldn’t compress pixel art—the artist presumably placed each pixel pretty intentionally so I wouldn’t wanna do anything to mess with that. By pixel art’s nature it’s gonna be low resolution anyway, so storage and sample time are unlikely to be a concern.

Pixel art is also a special case in that it’s very unlikely you need to do a bake step where you downsize or generate mipmaps or such. As a result, using an interchange format here could actually be reasonable.

If I was shipping a pixel art title I’d probably decide based on load times. If the game loads instantly with whichever approach you implement first then it doesn’t matter. If it’s taking time to load the textures, then I’d check which approach loads faster. It’s not obvious a priori which that would be without measuring—it depends on whether the bottleneck is decoding or reading from the filesystem.

rudedogg · 2025-09-07T05:45:53 1757223953

Makes sense, thanks for the insight.

exDM69 · 2025-09-08T08:09:09 1757318949

Not for multi-channel SDF at least. Texture compression works terribly badly with "uncorrelated" RGB values as they work in chroma/luminance rather than RGB. For uncorrelated values like normal maps, there are texture compression formats specifically for that (RGTC).

However, your typical MSDF font texture has three uncorrelated color channels and afaik there isn't a texture compression format with three uncorrelated channels.

TinkersW · 2025-09-07T15:46:38 1757259998

A single channel SDF can be encoded to BC4 with fairly good quality, and it can actually represent a wider range of values than a u8 texture... but with the downside of only having 8 values per 4x4 block.

So if the texture is small I'd use u8, for a very large texture BC4 isn't a bad idea.

rudedogg · 2025-09-05T21:50:45 1757109045

> The use of a ZIP archive to encapsulate XML files plus resources is an elegant approach to an application file format. It is clearly superior to a custom binary file format.

Can anyone expand on this? Why would it be better than a binary format?

I was watching a talk Andrew Kelley gave about a simple binary format he’s using in Zig: https://www.hytradboi.com/2025/05c72e39-c07e-41bc-ac40-85e83...

Having to map between SQLite and the application language seems like it’d add lots of complexity, but I don’t have any experience with custom file formats so would love some advice.

rudedogg · 2025-08-29T18:07:14 1756490834

I’m doing offline-first apps at work and want to emphasize that you’re constraining yourself a lot trying to do this.

As mentioned, everything fast(ish) is using SQLite under the hood. If you don’t already know, SQLite has a limited set of types, and some funky defaults. How are you going to take this loosey-goosey typed data and store it in a backend database when you sync? What about foreign key constraints, etc., can you live without those? Some of the sync solutions don’t support enforcing them on the client.

Also, the SQLite query planner isn’t great in my experience, even when you’re only joining on ids/indexes.

Document databases seem more friendly/natural, but as mentioned indexeddb is slow.

I wish this looked at https://rxdb.info/ more. They have some posts that lead me to believe they have a good grasp on the issues in this space at least

Also, OPFS is a newish thing everyone is using to store SQLite directly instead of wrapping IndexedDB for better performance.

jitl · 2025-08-29T18:48:54 1756493334

I've been a bit put off by rxdb's lack of transactions (see https://rxdb.info/transactions-conflicts-revisions.html) and the sometimes self-congratulatory tone in their docs.

Notion is a very async collaborative application and we rely on a form of transactions. When you make a change in Notion like moving a bunch of blocks from one page to another, we compose the transaction client-side given the client's in-memory snapshot view of the universe, and send the transaction to the server. If the transaction turns out to violate some server-side validation (like a permissions issue), we reject the change as a unit and roll back the client.

I'm not sure how we'd do this kind of thing with RxDb. If we model it as a delete in one document and an insert into another document, we'd get data loss. Maybe they'd tell us our app shouldn't have that feature.

nchmy · 2025-08-30T11:38:28 1756553908

Yeah, their tone can be a bit weird sometimes, but its by far the best browser db that I've found.

You could always ask your question in their discord - I've always gotten prompt and helpful responses

nchmy · 2025-08-30T11:36:53 1756553813

I am continually bewildered how no one ever gives RxDB, which has been around for many years longer than the rest of these tools, any love.

It has so many optimizations and features that the others dont. And is even better when you use the premium addons. I compared it to pretty much everything, and its not even close.

rudedogg · 2025-08-26T17:11:35 1756228295

Siri is dumb for an edge model though. It’s like ~2 years behind. It’s just bad and shouldn’t be excused, everything about it is bad.

mcpeepants · 2025-08-26T17:22:10 1756228930

Only ~2 years behind is generous, I think. It's unbelievably bad even when compared to Google Assistant circa 2018

rudedogg · 2025-08-21T13:58:16 1755784696

Metaverse people: “We’re so back!”

rudedogg · 2025-08-13T14:45:07 1755096307

All that feels like specialized stunts like IBM’s Watson beating Ken Jennings at Jeopardy.

The rate of improvement has slowed significantly. And chasing benchmarks is making everything worse IMO. Opus 4.1 is worse than Sonnet 3.7 to me :/.

I think the future will be:

1. Ads and quantization/routing to chase profits

2. Local models start taking over. New companies will slide in without the huge losses and provide what Claude/OpenAI do today at reasonable margins

3. Apple/Google eat up lots of the market by shipping good-enough models with iOS/Android