
I never said the CPU reads one key at a time. I said it can decode hundreds of keys in the time it takes to load one from memory. That's completely independent of how memory reads are batched. It's about a ratio, like 100:1, get it? It seems like you felt your ego was attacked and you just had to respond in a patronizing way about something, but you didn't know about what.

Hashing a string as you read it from memory and jumping to a hash bucket is not an expensive operation. This entire argument sounds like some kindergarten understanding of compute efficiency. This is not a 6502.



I'll ignore you this time )


Excellent work. Don't drop to my level where I made a simple factual statement.


Let's sync on numbers, maybe? My engine processes 600 MB/s (megabytes, not megabits) of data per core (and I have very many cores) for my wire format; the current bottleneck is that the Linux virtual page system can't allocate/deallocate pages fast enough when reading from an NVMe RAID.

What are your numbers for your cool JSON serializer?


I think you may have forgotten what our argument was. I said the bottleneck is memory, not processing/hashing the keys to match them to the symbol you want to populate.

And you're currently telling me the bottleneck is memory.

I also said you don't need to parse a JSON object into a hashmap or a B-tree. The format suggests nothing of the sort. You can hash the key and fill it into a symbol slot in a tuple, which literally only takes the amount of RAM you need for the value, while the key is "free" because it just resolves to a pointer address.
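
A minimal sketch of that idea (my illustration, not the parent's engine; the key names and the FNV-1a hash choice are assumptions): hash the key bytes as they stream past and map the hash straight to a fixed slot, so no per-object hashmap is ever built.

    /* Minimal sketch, not the parent's code: resolve JSON keys to fixed tuple
     * slots by hashing the key bytes as they are scanned, instead of building
     * a hashmap per object. Key names and the FNV-1a choice are assumptions. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    typedef struct { double price; long qty; long ts; } Row;   /* the "tuple" */

    /* FNV-1a over the key bytes -- cheap enough to compute while the key is read */
    static uint64_t fnv1a(const char *s, size_t n) {
        uint64_t h = 14695981039346656037ULL;
        for (size_t i = 0; i < n; i++) { h ^= (uint8_t)s[i]; h *= 1099511628211ULL; }
        return h;
    }

    /* Map a key hash to a slot index; -1 means "unknown key, skip the value". */
    static int slot_for(uint64_t h) {
        if (h == fnv1a("price", 5)) return 0;
        if (h == fnv1a("qty",   3)) return 1;
        if (h == fnv1a("ts",    2)) return 2;
        return -1;
    }

    int main(void) {
        /* Pretend the scanner just pulled this key out of the byte stream. */
        const char *key = "qty";
        Row row = {0};
        switch (slot_for(fnv1a(key, strlen(key)))) {
            case 0: row.price = 9.99;       break;   /* value parsing elided */
            case 1: row.qty   = 42;         break;
            case 2: row.ts    = 1700000000; break;
            default: break;                          /* unknown key: skip it */
        }
        printf("qty=%ld\n", row.qty);
        return 0;
    }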

Additionally, if you have a fixed tuple format, you can encode it as a JSON array, skipping the keys entirely. None of that is against JSON. You decide what you need and what you don't. The keyvals are there when you need keyvals; no one is forcing you at gunpoint to use them where you don't.
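
And a sketch of the keyless variant (assumed field order, hypothetical field names): the same tuple serialized as a JSON array, where position replaces the key.

    /* Minimal sketch, assumed field order and hypothetical field names:
     * the fixed tuple emitted as a JSON array -- positional, no keys --
     * versus the keyed object form.
     *   object form: {"price":9.99,"qty":42,"ts":1700000000}
     *   array form:  [9.99,42,1700000000]
     * Both are valid JSON; the array form just relies on an agreed field order. */
    #include <stdio.h>

    typedef struct { double price; long qty; long ts; } Row;

    static int emit_row_array(char *buf, size_t cap, const Row *r) {
        /* The field order IS the schema; the receiver must use the same order. */
        return snprintf(buf, cap, "[%g,%ld,%ld]", r->price, r->qty, r->ts);
    }

    int main(void) {
        Row r = { 9.99, 42, 1700000000 };
        char buf[64];
        emit_row_array(buf, sizeof buf, &r);
        puts(buf);   /* prints: [9.99,42,1700000000] */
        return 0;
    }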

I have a message format for a platform I'm working on; it has a JSON option just for compatibility. It doesn't use objects at all yet (but it DOES transfer object states). Nested arrays are astonishingly powerful on their own with the right mindset.


> And you're currently telling me the bottleneck is memory.

Not memory, but the virtual page implementation in Linux, which is apparently single-threaded and doesn't scale to high throughput. There was a patch to fix this, but it didn't make it to mainline: https://lore.kernel.org/lkml/[email protected]...


"High throughput" seems like an odd problem to have. You don't have to throw away pages and allocate new ones all the time. You can reuse a page.


That's not up to me. That's just how the Linux virtual FS is implemented.



