I see this as the perfect moment to get into consulting, either development or security. People were not sure what jobs AI would create: "GenAI babysitting" is one of them.
"Make one Ubuntu package 90% faster by rebuilding it and switching the memory allocator"
i wish i could slap people in the face over standard tcp/ip for clickbait. it was ONE package, and some of the gains didn't even come from recompilation.
i have to give it to him, i have preloaded jemalloc into one program to swap the malloc implementation and the results have been very pleasant. not in terms of performance (did not measure) but in stabilizing said application's memory usage. it actually fixed a problem that looked like a memory leak but probably wasn't the fault of the app itself (likely memory fragmentation with the standard malloc)
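for anyone who wants to try it, it's just an LD_PRELOAD away, no rebuild needed. the library path below is what debian/ubuntu's libjemalloc2 package installs, so adjust it for your distro:

    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ./your-app

since malloc/free are resolved through the dynamic linker, the whole allocator gets swapped out from under the unmodified binary.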
I did some research into the glibc memory allocator. Turns out this is not memory fragmentation, but per-thread caches (arenas) that are never freed back to the kernel! A free() call does not actually return the memory to the OS except in rare circumstances. The more threads and CPU cores you have, the worse this problem becomes.
One easy solution is setting the "magic" environment variable MALLOC_ARENA_MAX=2, which limits the number of caches.
Another solution is having the application call malloc_trim() regularly, which purges the caches. But this requires application source changes.
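For reference, a minimal glibc-specific sketch of both knobs from inside the program. mallopt(M_ARENA_MAX, ...) is the in-process equivalent of the environment variable, and the 60-second trim interval is an arbitrary choice:

    /* build: gcc -pthread trim.c */
    #include <malloc.h>   /* glibc: mallopt, malloc_trim, M_ARENA_MAX */
    #include <pthread.h>
    #include <unistd.h>

    /* Background thread that periodically asks glibc to return
       cached free memory to the kernel. */
    static void *trim_loop(void *arg) {
        (void)arg;
        for (;;) {
            sleep(60);
            malloc_trim(0);   /* release free()d memory held in the arenas */
        }
        return NULL;
    }

    int main(void) {
        /* Equivalent of MALLOC_ARENA_MAX=2, but set in code. */
        mallopt(M_ARENA_MAX, 2);

        pthread_t t;
        pthread_create(&t, NULL, trim_loop, NULL);

        /* ... rest of the application ... */
        pause();
        return 0;
    }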
FWIW i hit this with icinga2. so now they actually preload jemalloc in the service file to mitigate the issue; this may very well be what you're talking about
True, I also believed it for a second. But it's also easy to blame Ubuntu for errors. IMHO they are doing quite a decent job of assembling their packages. In fact the packages are also compiled with stack fortification. On the other hand I'm glad they are not compiled with the possibly buggy -O3. It can be nice for something performance-critical, but I definitely don't want a whole system compiled with -O3.
To me it's obviously a scam, because there's no way such an improvement can be achieved across the board by following a single blog post. 90% faster is a micro-benchmark number.
This is neither a micro-benchmark nor a scam, but it is click-bait for not mentioning jq specifically.
Micro-benchmarks would be testing e.g. a single library function or syscall rather than the whole application. This is the whole application, just not one whose performance you might care that much about.
Other applications will of course see different results, but stuff like enabling LTO, tuning THP and picking a suitable allocator are good, universal recommendations.
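Concretely, that advice boils down to something like this. An autoconf-style project is assumed, and the THP path is the standard sysfs knob; whether madvise or always is the right policy depends on the workload:

    # link-time optimization
    ./configure CFLAGS="-O2 -flto" LDFLAGS="-flto"
    make

    # transparent huge pages: check and set the system-wide policy
    cat /sys/kernel/mm/transparent_hugepage/enabled
    echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled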
True that. It is still interesting that, if you have a narrow task, you might achieve a significant speed-up by rebuilding the relevant packages. But this is a very niche application.
true, i saw a thread on reddit recently where a guy hand-tuned compilation flags and did PGO for a video encoder app that he uses on a video encoding farm.
In his case, even a gain of ~20% was significant. It translated into enough extra throughput to encode a few thousand more video files per year.
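For anyone curious, the basic gcc PGO loop is roughly this; the encoder name and input file are made up, the flags are gcc's:

    # 1. build an instrumented binary
    gcc -O3 -march=native -fprofile-generate -o encoder src/*.c

    # 2. run it on a representative workload to collect profile data
    ./encoder sample-input.y4m

    # 3. rebuild, letting the compiler use the collected profiles
    gcc -O3 -march=native -fprofile-use -o encoder src/*.c

The quality of the result depends almost entirely on how representative that middle step is of the real encode jobs.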
I wonder how many prepackaged binary distributions are built with the safest options for the OS/hardware and therefore don't achieve the best possible performance.
I bet most of them, tbh.
Many years ago I started building Mozilla and my own linux kernels to my preferences, usually realizing modest performance gains.
The entire purpose of the Gentoo Linux distribution, for example, is the performance gain made possible by compiling everything from source with optimized flags.
the title is clickbait, but it's good to encourage app developers to rebuild, esp. when you are cpu-bound on a few common utilities, e.g. jq, grep, ffmpeg, ocrmypdf -- common unix utils are built for general use rather than for a specific application
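on debian/ubuntu, rebuilding one of those with tuned flags looks roughly like this (assumes deb-src entries are enabled and that the package's build honors dpkg-buildflags; jq is just the example):

    sudo apt-get build-dep jq
    apt-get source jq
    cd jq-*/
    DEB_CFLAGS_APPEND="-O3 -march=native" dpkg-buildpackage -b -uc
    sudo dpkg -i ../jq_*.deb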
Or, if I understand TFA correctly, don't release debug builds in your release packages.
Reminds me of back in the day, when I was messing around with blender's cmake config files quite a bit. I noticed the fedora package was using the wrong flag -- some sort of debug-only flag intended for developers instead of whatever they thought it was. I mentioned this to the package maintainer, it was confirmed by a package sub-maintainer (or whoever), and the maintainer absolutely refused to change it, because the spelling of the two flags was close enough that they could just say "go away, contributing blender dev, you have no idea what you're talking about." I wouldn't doubt the fedora package still has the same mistaken flag to this day, and all this occurred something like 15 years ago.
So, yeah, don't release debug builds if you're a distro package maintainer.
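For cmake-based projects like blender, that mostly comes down to the build type. A generic out-of-tree build is shown here, not fedora's actual spec file:

    # what a developer uses: -g, assertions, little or no optimization
    cmake -S . -B build-debug -DCMAKE_BUILD_TYPE=Debug

    # what a distro package should normally ship
    cmake -S . -B build-release -DCMAKE_BUILD_TYPE=Release
    cmake --build build-release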
Vector instruction sets like AVX-512 will not magically make common software faster. The number of applications that do regular operations on large blocks of data is pretty much limited to graphics, neural networks and bulk cryptography. Even audio processing doesn't benefit that much from vector operations, because a codec's variable-size packets do not allow for efficient vectorization (the main exception being multi-channel effects processing as used in a DAW).
Thanks for the correction. I hadn't considered bulk memory operations to be part of SIMD, but it makes sense -- they operate at a larger grain than word size, so they can do the same work with less micro-op overhead.
in case you don't know, some Gameboy games were required to include the Nintendo logo in the game data as part of the copy protection. allegedly that served as legal protection against bootlegs.
for some extra nostalgia, check out the game "one finger death punch 2" (and its prequel). i bet it's sort of an homage to those animations.