Hacker News new | past | comments | ask | show | jobs | submit login

These functions are essential to a hash tables (other related names are hash maps, hash sets).

A hash table is a remarkable data structure that can be used to implement many algorithms simply and efficiently.

This efficiency depends on being able to create small (say 32- or 64-bit) and nearly-unique “hashes” for your data. For example, to hash user names, it wouldn’t work very well to just use the ASCII code for the first letter of the names, because then a lot of usernames would map to the same number. That’s called a collision. If there are a lot of collisions, hash tables become very inefficient.

A better approach would be to grab bits from the entire username, and smash them together somehow so that “throwaway_1237” and “throwaway_12373” still username different numbers.

Hash functions do this mapping. The property of “avalanche” describes how good of a job they do at avoiding collisions.

There’s generally a tradeoff between (1) how fast the actual hash function is, and (2) how good of a job it does of avoiding collisions.

World-class hash functions look remarkably weird, multiplying and xoring and shifting by strange amounts. It’s very hard for humans to look at these cryptic functions and guess how well they will do.

So this code is trying out a bunch of hash functions at random and baking them off against each other. This is cool because success here could improve real-world performance of a core data structure used across many languages and libraries.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: