Show HN: SHAllenge – Compete to get the lowest hash (quirino.net)
123 points by quirino on June 18, 2024 | 133 comments
I've always had an appreciation for the properties of hashing. I especially like the concept of Proof of Work, so I tried to make a little competition out of it.
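For anyone skimming: entries are strings of the form "username/nonce", and lower SHA-256 digests rank higher. A minimal brute-force sketch in Python (the username is a placeholder; Python is far too slow to actually compete, as the rest of this thread shows):

    # Minimal sketch of a miner for the challenge: hash "username/nonce"
    # strings and keep whatever beats the best digest seen so far.
    import hashlib
    from itertools import count

    USERNAME = "example"  # placeholder

    best = "f" * 64
    for nonce in count():  # runs until interrupted
        msg = f"{USERNAME}/{nonce}".encode()
        digest = hashlib.sha256(msg).hexdigest()
        if digest < best:
            best = digest
            print(f"{msg.decode()} -> {digest}")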


This is essentially how Bitcoin's proof-of-work algorithm works. The network maintains the difficulty by requiring more leading 0s as computers get faster.


Yep. So much compute and energy thrown away because proof of stake wasn't figured out sooner.


You need some proof of work to prime the pump for proof of stake. Otherwise, you have no security because there's nothing at stake.

Now that we have plenty of proof of work completed, we don't really need any more. You can bootstrap your new coin by burning Bitcoins as a provable value sink to get it started on the path to POS.


Did Ethereum end up moving to PoS? How did that work out?


PoS is a superior mechanism design to PoW because validators acting in bad faith under PoS have their stake burned, so they can't keep acting badly. To mess with PoW, an attacker only needs to muster temporary excess hash power; look up "Bitcoin Cash: 51% Attack" for examples.

Like everything in tech, the devil is in the details (algorithm details, in this case). So far it seems to be working out just fine.


PoS at best makes a different tradeoff, it's certainly not strictly superior.

See criticism here: https://news.ycombinator.com/item?id=25007874


Yes ... And it's spawned even more innovation. Right now the L2s are building momentum!


You should also check out current L1s. Most stablecoin transactions aren’t happening on an Eth L2 - see “Total Transaction Size by Blockchain, Last 30 Days”

https://visaonchainanalytics.com/transactions


Huh, very nice! I'll see if the price of gas changed, thanks!


They should really be using renewables by now /s


You could stake fiat in a bank account I guess?


That's pretty much Tether. You could probably bootstrap a POS coin by staking Tether. There would be a lot of details in getting it to eventually grow independent of Tether. I don't claim to know how that could happen. Especially since Tether is not a highly-regarded/trusted token in the community.


lol, what is this nonsense. Tether's regarded as the safest because their business model is providing entities USDT for USD without offering interest on the cash.

Now why would any entity want to deposit USD into a private bank and receive USDT without any yield on their cash? If you think you understand Tether so well, maybe think about it for a few minutes.


Bitcoin still refuses to move to proof of stake. It's obscene https://digiconomist.net/bitcoin-energy-consumption


Ah yes, because explicitly admitting who controls the financial system is more efficient than using proof of waste to measure it indirectly.

I wish cryptocurrency folks would stop pretending it is about decentralization or distributed systems, and that anything that allows them to bypass financial regulation is good enough.


Isn't the reason more about limiting the currency supply? Most of the crypto bros hate the money printers.

The fact that faster hashers will come about just adds some natural competition, but it's still about limiting money supplies.


The hashing is about consensus. It's related to how a low hash stands as proof of computational power; you can't just fake it.

The "adding more zeros" is done automatically, according to an algorithm, so that the time between blocks is more or less the same.

There's this incredible MIT course about all of this, it's where I got the idea for the site from: https://youtube.com/playlist?list=PLUl4u3cNGP61KHzhg3JIJdK08...
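If you're curious what that adjustment looks like concretely, here's a simplified sketch of the Bitcoin-style retargeting rule (not the exact consensus code, which has more edge cases): every 2016 blocks, the target is rescaled by how long the window actually took versus the two weeks it should have taken, clamped to a 4x swing.

    # Simplified Bitcoin-style difficulty retargeting. A higher target means
    # an easier-to-hit hash; scaling it by actual/expected time keeps blocks
    # averaging ~10 minutes.
    EXPECTED = 2016 * 10 * 60  # seconds per 2016-block window

    def retarget(old_target: int, actual_seconds: int) -> int:
        ratio = max(0.25, min(4.0, actual_seconds / EXPECTED))
        return int(old_target * ratio)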


Thanks for the playlist! I've already been going through the later videos.


That's incorrect. The money supply is fixed irrespective of the hash rate.


But it halves every so often.


Yes, the block reward halves every 210K blocks, or roughly every four years, but that's independent of the hash rate. Bitcoin adapts to the hash rate so that blocks are produced approximately every 10 minutes. Eventually the total supply of 21M BTC will be reached, after which new bitcoin issuance will cease.
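The schedule is simple enough to write down. A sketch of the subsidy rule (floats for readability; real nodes count integer satoshis):

    # Block subsidy: 50 BTC at launch, halved every 210,000 blocks.
    def block_subsidy(height: int) -> float:
        halvings = height // 210_000
        return 50.0 / (2 ** halvings) if halvings < 64 else 0.0

    # Geometric series: 210_000 * 50 * (1 + 1/2 + 1/4 + ...) ~= 21M BTC.
    total = sum(block_subsidy(k * 210_000) * 210_000 for k in range(64))
    print(f"{total:,.0f}")  # ~21,000,000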


But "it's still about limiting money supplies" is correct, no?

One's investment in mining equipment becomes half as profitable every so often.


No, mining profitability and money supply are distinct concepts. In fact, mining profitability constantly fluctuates depending on many factors aside from the block reward (e.g. electricity cost, total hash rate, equipment depreciation, BTC price, etc.).

Bitcoin could have been designed with a different money supply mechanism (e.g. no halvings) but it would still have required mining. The hash rate has basically no impact on the money supply. That's because mining doesn't solve the problem of limiting the money supply, it solves the problem of decentralized consensus, aka the double spending problem.


I understand the purpose of mining and how the hash rate adjusts but I'm trying to understand about the profitability.

Imagine you are a miner and a halving is next week. One week you earn 1btc. None of those other factors you mentioned have changed.

Then the halving happens so the next week you make 0.5btc. However yours and everyone else's held bitcoins are still worth the same as they were before.

Hasn't your mining profitability halved?


Given the scenario you proposed, yes it has.


It's about preventing double spending of the same bitcoin.


For comparison with Bitcoin:

    00000000 000003da 5849ade5 e2112447 73478c46 5fcdc744 9e247df4 11e97b28 (#1 of leaderboard)

    00000000 00000000 0002a2fa b0d010ce 270927e8 c9a698a3 4f5e3d6c 4dbcffd3 (latest mined block)
Doesn't seem that impressive but keep in mind that it gets exponentially harder to find additional leading 0s and that the Bitcoin network currently finds such hashes every ~10 minutes.


Could someone scan all the mined blocks, and find one that has a plaintext conforming to the submission requirements?

Edit: Looking at the data structure for a bitcoin block, for multiple reasons such as magic numbers and length, none of them would be a valid submission.


This would be roughly equivalent to finding a SHA256 collision then, right?


Before looking into the actual bitcoin block structure, and realizing this wasn't possible, I thought that maybe hashes were a hash of a hash.

A hash is essentially just some random bytes, so there is a small chance that the first hash bytes would randomly all be printable characters fitting the submission format. Given that the Bitcoin network has already done most of the work finding the small hashes, scanning through the blockchain for a hash whose plaintext fits the submission criteria should be much less work.


They are hashes of hashes, and in fact I already did scan all* the blocks to check if any of them meet the constraints, just in case (they don't, and it would be extremely unlikely but not entirely out of the question).

*as of a year or so ago (my archive is stale and I'm too lazy to update it)


No, you're looking for a suitable hash, not an exact match to anything. Some of the hashes are premined. But you do need the hash input to fit a specific format. That's still more likely and more accessible than a collision (scanning all the available values is surely computable in far less time than finding arbitrary collisions with a specific hash).


And compared to lowest:

    00000000 000003da 5849ade5 e2112447 73478c46 5fcdc744 9e247df4 11e97b28 (#1 of leaderboard)

    00000000 00000000 0000fac0 39d91fc6 1db532f3 879ec244 15a2bff9 cd1d4efb (latest mined block right now -- another zero added since your post)

    00000000 00000000 00000000 5d6f0615 4c868514 6aa7bc3d c9843876 c9cefd0f (lowest hash ever?)[0]
[0] https://crypto.stackexchange.com/questions/110956/what-s-the...


At some point this problem of finding a hash with a number of leading zeros flips from being increasingly hard to brute-force to being increasingly easy to search: http://jheusser.github.io/2013/02/03/satcoin.html https://github.com/jheusser/satcoin


Ah this brings back fond memories of the EngineYard challenge that consumed HackerNews in 2009[1][2][3][4].

From what I recall, it was a bunch of YC founders writing janky CUDA code vs djb, who proceeded to smoke us all.

[1] https://news.ycombinator.com/item?id=704182

[2] archive link to the dead link in [1] https://web.archive.org/web/20090717104634/http://www.engine...

[3] https://news.ycombinator.com/item?id=716851

[4] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...


> Win an iPhone 3GS

Got a 404, but the title was a nice throwback. $2k in cloud credits too!


Anyone got a guesstimate as to the energy used to make that leaderboard?


ChatGPT says a desktop CPU draws about 105W maxed out, so 105Wh per hour of computation. I think a hash with 8 leading zeros takes around an hour, and every additional 0 takes 16x that. I'll ignore anything less than 8 zeros and disregard how close the hashes are to gaining or losing a 0.

There are 13 hashes with 8 zeros, so that's 1365Wh; 5 with 9 zeros is 8400Wh; 2 with 10 zeros is 53,760Wh; and 3 with 11 zeros is 1,290,240Wh.

Total that's ~1.35MWh, which might cost $100 to $180. Very weak estimate, though.
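The same back-of-envelope in code, with the same rough assumptions (105W, one hour per 8-zero hash, 16x per extra zero):

    # Leaderboard counts by leading zeros, from the estimate above.
    POWER_W = 105
    HOURS_FOR_8 = 1.0
    counts = {8: 13, 9: 5, 10: 2, 11: 3}

    wh = sum(n * POWER_W * HOURS_FOR_8 * 16 ** (zeros - 8)
             for zeros, n in counts.items())
    print(f"{wh / 1e6:.2f} MWh")  # ~1.35 MWh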


FWIW, getting 8 zeros took me 30 seconds and 9 zeros definitely less than an hour, but after that I've only found lower 9-zero hashes, and 10 zeros are nowhere in sight.


One hour later, it seems like you've got your 10th zero! :)

Luck plays a tiny part too: I'm testing sequentially, and I got my 9th zero after only ~7e8 hashes, a whopping 100x earlier than expected.


Oh, I didn't notice; I've written something that automatically submits if it finds a better value haha.

Yep, crazy how long it took, but currently I'm also running multiple 'variants': all the same program except for the nonce generation, one with numbers, one with a 32-character 'base64' nonce and one with a 64-character 'base64' nonce. I was inspired by seletskiy, who instead of numbers used the whole allowed character set for the nonce.


Why would the selection of characters matter? (Other than making sure you try distinct input strings)


It doesn't, but I'm going sequential, so I needed different character sets for each instance I launched.


I guess either my computer or my code is really bad! I only got a mid-8-zeros hash after multiple hours.


How many hashes does your code go through per second?

I just tried your HN username, and it took me these times:

19 seconds for 7 0s - nonce: 293576344

28 seconds for 8 0s - nonce: 436316829


I added some instrumentation and it was doing around 3.3 million hashes per second. Mine isn't sequential; it's generating random strings. Changing from a 64-byte string to a 26-byte string (with a 14-character nonce) brought it up to around 6 MH/s.

Given the nonces you provided, I guess you're going sequential, and you're hashing 2-3 times faster than me. I just got a 7-zero hash in 70 seconds and an 8-zero hash in 164 seconds. You're right; I was overestimating how long I waited around last night.


After experimenting more with this, it seems like with random generation the bottleneck is the input generation and not the hashing itself; that's why I went sequential.

I replaced my nonce generation with a static string and my hashrate went to the moon, so I invested more time in optimizing my input generation.
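The cheap version of sequential generation looks something like this (a sketch, not my actual miner): keep one preallocated buffer and bump the nonce in place, odometer-style, so there's no RNG and no string formatting between hashes.

    import hashlib

    ALPHABET = b"0123456789"
    PREFIX = b"example/"  # placeholder username
    buf = bytearray(PREFIX + ALPHABET[:1] * 12)

    def bump(buf: bytearray) -> None:
        # Increment the nonce like an odometer, rightmost digit first.
        for i in range(len(buf) - 1, len(PREFIX) - 1, -1):
            j = ALPHABET.index(buf[i]) + 1
            if j < len(ALPHABET):
                buf[i] = ALPHABET[j]
                return
            buf[i] = ALPHABET[0]

    for _ in range(1_000_000):
        digest = hashlib.sha256(buf).digest()
        bump(buf)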


I profiled my code and you're right that it was spending a big chunk of CPU on generating the input: approximately 40-50% of the CPU time was taken up by Go's rand.IntN function.

I was able to optimize it so that I only call rand.Int63 twice per string generation, bringing the CPU time of that part down to 10-15%. I'm getting 11.3 MH/s now!


There is really no point randomizing at all; any cycles you spend on it are a total waste. One of the most important properties of a good hash algorithm is that a one-bit change in the input results in an output that's completely unrelated to the previous one.

Your random number generator is pretty much equivalent to feeding a counter into SHA256, so using it as input is pretty much equivalent to hashing a counter twice. There's no point!


Yea, I know. It makes me feel better this way, though.


BTW, I changed it so it's iterative, and I'm getting 12.5 MH/s.


Curious. I tried with a C program that ChatGPT spat out, using 12 threads of my CPU, and I got the 8 zeroes almost instantly. It calculates 1 billion hashes in about 13 seconds, apparently.

Start time: 1718742386

1718742387 SHA-256 hash of "NAHWheatCracker/486005487" is: 00000000cbed8e14c5e52b0c9bef443f017564aa6870da37f85ec92dd01b544d

1718742394 SHA-256 hash of "NAHWheatCracker/993155254" is: 0000000031fadb4805db8036ec66d872bdd9f04a507e0bbfea9f60d1d00b86f2

1718742395 SHA-256 hash of "NAHWheatCracker/436316829" is: 00000000e20d059e8a7dc687b85bd6b33e891f7d4b27e98aaf0c991d0264066e

End time: 1718742403

EDIT: To be clear, I'm not submitting anything to the leaderboard as I appreciate the spirit of the challenge is writing your own code. I'm just having some fun.

EDIT 2: Added some better timing. Interestingly, I modified it to use 64 bit integers and it was dramatically slower. Like, orders of magnitude slower.


Would you please post the code?


Sure: https://pastebin.com/tFPadBVZ

Again, I didn't write it, but did add some timestamps.


The top competitors are very likely using GPUs, I guess that would make this cost much much lower...


At least I'm not (second place currently); I'm using a single-threaded Rust program that ChatGPT wrote for me. I'm getting 10-15 million hashes per second.

But seletskiy is, as you can see in his nonce, which tells us that he uses an RTX 4090 and gets 18 gigahashes per second. I'm really curious how he wrote that program, as I have an RX 7900 XTX and would love to use it for this competition haha


Hey. It's nothing very fancy. About 150 lines of C/CUDA code with no deps, including args parsing and logging.

The code runs at a steady rate of 18.00-18.40 GH/s on a cloud GPU. In fact it's not hashes per second but actually messages per second checked.

It launches 64⁶ kernels in a loop, where each launch checks the first two bytes of the SHA of a message concatenated with a unique 6-byte nonce per kernel plus a 6-byte incremental nonce per launch. There is only one block, so the SHA algorithm is heavily trimmed. Also, most of the message is hard-coded, so a pre-calculated SHA state is used; it's 10 loops fewer than needed to encode a whole block. Since we only need 2 bytes of the hash for the check, the last two loops are also unrolled by hand to exclude parts we don't need. All code is in big-endian since SHA is, so the message is hardcoded in big-endian as well.

Base64 encoding is pretty time-consuming, so I've optimized it a bit by reordering the alphabet to be in ASCII-ascending order. I've gotten to the point where a single binary-op optimization can earn a 100-500 MH/s speed-up, and I don't really know what else remains.

I don't have an RTX 4090, so I just rented a 4090 GPU to run the code on. It's less than $0.3 per hour.

I've tried GPT-4 to get some directions for optimization, but every single proposal was useless or straight wrong.

I'm by no means a C/GPU programmer, so it can probably be optimized much more by someone more knowledgeable about CUDA.

GPUs are ridiculously fast. It freaks me out that I can compute >18,000,000,000 non-trivial function calls per second.

Anyways, if you want to chat, my e-mail is in the profile.


I think we've done something similar in our kernels, because I've likewise struggled to squeeze more than ~18GH/sec from a rented 4090. I think the design of SHA256 makes it hard to avoid many of the loop iterations. It's possible there are some fun GPU specific instructions to parallelize some of the adds or memory accesses to make the kernel go faster.

If you limit your variable portion to a base16 alphabet like A-P, radix encoding would just be `nibble + 0x41`. Sure you're encoding 4x as many values, but with full branch predictability. You'd have to experimentally determine if that performs any better.

You could also do something theoretically clever with bitmasking the 4 nibbles then adding a constant like 0x4141 or 0x00410041 or something (I'm too tired to think this through) and then you can encode twice as many per op assuming 32-bit bitwise ops are similar in speed to 16-bit bitwise ops.
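Concretely, the single-nibble version of that mapping is just an add; a Python sketch for clarity (the payoff would of course be inside the kernel):

    # Branch-free base16 nonce encoding: each 4-bit nibble maps to 'A'..'P'
    # by adding 0x41 ('A'), so no table lookup or branching is needed.
    def encode_a_to_p(counter: int, width: int = 8) -> bytes:
        return bytes(0x41 + ((counter >> (4 * i)) & 0xF)
                     for i in reversed(range(width)))

    print(encode_a_to_p(0xCAFE))  # b'AAAAMKPO'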

Anyways, this has been a cool challenge. You might also enjoy hunting for amulets, short poems whose sha256 hash contains a substring that's all 8s: https://news.ycombinator.com/item?id=26960729


SHA256 is designed such that the maximum amount of data that can be contained within a single block is 440 bits (55 bytes).

If you carefully organize the nonce at the end and use all 55 bytes, you can pre-hash the first ~20 of 64 rounds of state and the first several rounds of W generation, and base further iterations off of that static value (this is known as a "midstate optimization").

> If you limit your variable portion to a base16 alphabet like A-P

The more nonce bits you decide to use, the less you can statically pre-hash.

In FPGA, I am using 64-deep, 8-bit-wide memories to do the alphabet expansion. I am guessing in CUDA you could do something similar with `LOP3.LUT`.
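On the CPU side you can at least get the block-level cousin of this from a stock library: Python's hashlib, for example, only reuses state at 64-byte block granularity via .copy(), so the round-level midstate trick above needs a hand-rolled SHA-256. A sketch with a hypothetical block-aligned prefix:

    import hashlib

    PREFIX = b"x" * 64             # hypothetical 64-byte fixed prefix
    base = hashlib.sha256(PREFIX)  # prefix compressed exactly once

    def hash_with_suffix(suffix: bytes) -> str:
        h = base.copy()            # cheap state copy, no prefix re-hash
        h.update(suffix)
        return h.hexdigest()

    print(hash_with_suffix(b"12345"))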


Thanks, I sent you an e-mail; you might need to check the spam folder, as Gmail doesn't like my mail server.


I've also spent today building a miner. Started in JS, then JS with worker threads (~2.2MH/s), then C++ with OpenSSL (~3.7MH/s), and now I'm attempting CUDA.

Also have been using ChatGPT to do the code translations.

I'm currently stuck yak shaving: I can't compile CUDA programs here yet, as on Fedora 40 I need an earlier gcc (13.2), which I'm now also having to compile from source.


I'm also yak shaving currently, as I'm using NixOS... which means spending more time looking for the correct packages and creating environments.


FWIW I got it working with Rust and OpenCL; most of it was written by ChatGPT, as I have no clue about OpenCL. GPU usage is only 50-60% and I get 100MH/s.

With hashcat and OpenCL I could get 12GH/s, but I couldn't find a way to use hashcat for this use case.


I've parked CUDA for now, I don't understand it enough to care to fix the tooling.

Got an optimised C++ version with no deps averaging about ~24 MH/s on an i7-11800H.

I've got 9 zeros; if I get a result that ranks top 10 I think I'll submit and call it a day.


Wow, 24MH/s on an i7 with 8 cores sounds really good!

I don't know how I got it working, but I'm now at 3GH/s with my OpenCL implementation. I basically converted 90% of my Rust logic to OpenCL, and now my GPU is at 100% usage. I also needed to switch to a tty, as my window manager became unresponsive haha

I'm kind of glad about this HN post, as I had absolutely no clue about how sha256 and opencl worked before this challenge.

Thanks @quirino


I'm glad you had some fun! This experiment went about as well as I could hope!

If anyone's curious, I'm getting 4.5MH/s single-threaded and 12.2MH/s multi-threaded on a slightly old i7 with 4 cores.

It's my own C++ implementation, which I've made about 20% faster than the fastest one I found online (Zig/stdlib, also tried Go/stdlib, C++/cgminer, rust/ring, C++/VanitySearch and Python/stdlib).

I think it might be faster just because I was able to skip some steps, since all inputs are short and of the same length.

I've just finished testing 10^12 inputs. I think I'll stop with 10 zeroes, which is very likely to happen in the next couple of days, according to my calculations. I might revisit it later to learn some GPU programming.


Very cool!

But same: I was also able to skip some steps, since I can make assumptions about the input, which is fixed in length and doesn't need padding.

I got lucky and found a hash with 12 zeroes in 60s with my OpenCL implementation, and now I got my second spot back.

GPUs are so crazy


I'm using a janky CUDA miner that's getting around 6GH/s on a 3090. My code is kind of bad though since I hacked it together in a couple of hours last night, so I wouldn't be surprised if a proper/competent implementation got 2-3x the hashrate mine does.


Mind sharing it? I have an AMD card, so it's not of much use to me, but I'd still be interested.


I'm using a single 4090, which nvidia-smi says is consuming 450W while hashing at 21GH/s. Let's round that up to 667W for incidentals and to account for power supply inefficiency.

Running this for 1.5 hours would consume 1kWh of energy and compute 1.134e14 hashes (or 2^46.7 if you prefer).

Assuming SHA-256 hashes are essentially uniformly distributed, finding a hash with 13 leading hex zeroes should be a 1-in-16^13 (aka 1-in-2^52) event.

This gives 2^(52-46.7), or 39.7kWh on average, to generate a single 13-zero hash, and ~16x that (635kWh) for a 14-zero hash. I got very lucky finding a 14-zero hash quickly, vastly earlier than the average expectation, so my energy usage is quite a bit lower than that.
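Checking that arithmetic (same assumptions, ~21GH/s at ~667W):

    rate_hps, watts = 21e9, 667.0
    hashes_per_kwh = rate_hps * 3600 * (1000 / watts)  # ~1.13e14
    print(f"{16**13 / hashes_per_kwh:.1f} kWh")  # ~39.7 for 13 zeros
    print(f"{16**14 / hashes_per_kwh:.1f} kWh")  # ~635.8 for 14 zeros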


it's just for fun? :)


Fun or not, I think the parent means electricity, i.e. how many kWh were wasted/spent on this.


I like that you linked to a javascript miner. Makes it really easy to try instead of needing to write up some code.

Design/colour scheme is nice too.


I ran the javascript miner for about an hour, then I asked ChatGPT to write some Go code, which worked perfectly (minus an unused-variable error).

The Go program beat the javascript miner within seconds. Right now it's that easy to get into the top 10.


Maybe I should include a warning that it is pretty slow.

I'm using my own SHA256 implementation in C++, and it is at least 100 times faster than the JS one. I've measured it to be about as fast as the one on the Go standard library.


Anyone who stumbles into your SHAllenge ought to know that JS isn't going to be really competitive, so I think you're good. A warning might encourage them to delve a little deeper. I like that it makes it very accessible.

I assume anyone with knowledge of GPU hashing and the will could steal the top spot very quickly, assuming seletskiy isn't already that guy. I don't have a good grasp of the compute difference between 8-9 zeros and 10 zeros in a reasonable amount of time.


I kept at it mostly as an exercise to test different compiler options. I found that a miner implemented in C runs in 1/5 the time of a Perl script (but with ~10 times more lines of code). Also, compiling with "-march=native" reduces the run time by ~10%, and concatenating all C files together to compile and link in a single step saves another ~0.5%.

I felt like I got enough out of this exercise and don't need to burn more CPU cycles on it :)


Assuming hashes are uniformly random (a reasonable assumption for SHA-256), every extra leading zero bit cuts the probability of success in half, meaning double the candidates searched on average. Each additional leading hex 0 is four bits, so it takes on average 16x longer than the last.
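In numbers, for the range this leaderboard lives in:

    # Expected candidates to try for k leading hex zeros: 16**k on average.
    for k in range(8, 14):
        print(k, f"{16 ** k:.2e}")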


Thank you!

Curiously most of the code for that JS miner was written on my phone, with the help of ChatGPT.

I didn't have my laptop with me and was curious how fast a phone browser would be able to mine.


Coding and testing JS with the help of ChatGPT is several multitasking levels above what I’d be willing to even try on a phone. Android, I assume? Big screen? BT keyboard even?


It was actually much easier than I expected. Just asked for it as a single HTML page, copied the output to a file and opened it on Chrome. Did very few adjustments.


Can you share that ChatGPT conversation, please? (There is a share link in ChatGPT.) Thank you.


I also find this concept fascinating and made something similar a while back [1][2] but came from a slightly different angle, where people can compete for the title text on the page, for colors and for pixels in a grid (like a small r/place).

It uses Peer-2-Peer in the browser and I haven't touched it in a while, but it looks like it still works. :)

[1] https://tropical.pages.dev/pow/

[2] https://news.ycombinator.com/item?id=38934607


Thanks for sharing! It's similar to mine in many ways.


Might as well ask, as it's tangentially (ever so slightly) related: does anyone know of a way to get/generate "vanity" (arbitrary or close) Bitcoin addresses?


Same way as vanity onion domains: brute force. Anything else would be a security flaw. If you're asking how to actually get one yourself, just search for "bitcoin vanity address" on github. There are plenty of projects to choose from.


Yes, vanitygen.


I've made a few changes to my software to use Base64 instead of numerical values, and now I'm getting the following error: "Nonce must be 1-64 characters long and consist only of Base64 characters"

Which is really odd, since my string is only 51 chars and all chars are valid Base64 characters.

If I remove the padding ("=") then it's good to go; however, there is a string on the scoreboard with "=" in it (garethgeorge/AHQAAAHPe0Q=)

Did the user bypass the javascript check using curl or something?
Did the user bypassed the javascript check using curl or something?

Also, this could use some adjustments:

    const nonceRegex = /^[A-Za-z0-9+/]{1,64}$/;
    if (!nonceRegex.test(nonce)) {
        alert('Nonce must be 1-64 characters long and consist only of Base64 characters');
        return false;
    }
Personally, I would use const nonceRegex = /^[A-Za-z0-9+/]{1,64}(={0,2})$/;


The site owner changed equals from accidentally-allowed to forbidden, but left any existing solutions using it. See comment and reply here: https://news.ycombinator.com/item?id=40724707


Just got to 11 zeros with some beginner Rust code! Gonna keep it running for a while to see if I can get higher than 24, but this was a ton of fun! And it was my first time writing parallel Rust code!

I couldn't get WGSL to work properly, as it would need a sha256 implementation, and that's a bit heavy for a buzzed coding session. So I used the SIMD instructions instead, and that got me up to around 7MH/s on the M1, and a whole lot more on my desktop.

I will share the code once I get all the other optimizations I want in!


You can probably get a 2x speedup by shortening the string so that it fits in one block.
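The cutoff is 55 bytes: SHA-256 padding appends a 0x80 byte plus an 8-byte length, so anything longer spills into a second 64-byte block and costs a second compression call. A quick sanity check:

    # Compression calls needed for a message of n bytes.
    def sha256_blocks(n: int) -> int:
        return (n + 1 + 8 + 63) // 64

    print(sha256_blocks(55), sha256_blocks(56))  # 1 2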


Using SIMD properly (I think) and shortening the nonce text, I went from that 7MH/s to 10MH/s.

I have a bunch of other ideas to get me to my goal of 20, but I am happy to let this run for a bit! Thanks for the suggestion!


Ah, nice! I'll shorten it after I finish the SIMD addition! I didn't even think of that; I was just overcomplicating my thread partitions to avoid collisions.


Wasn't expecting to stay in the top 100 for this long!

My initial C implementation got me to ~70th for a short time, then a very sloppy, racy pthreads version kept me at the tail end for a bit longer, but my (again very sloppy) CUDA program running on my GTX1070 has me at #35 at the time of writing.

A rough calculation shows it doing 28.6 MH/s, and I suspect I could do a lot better than that if I knew anything about CUDA programming.

I didn't read enough to know how to pick good values for blocks/threads per block, so I just benchmarked all the combinations of powers of two to arrive at the best result.

Really fun challenge!

It would be fun to see the total number of valid submissions, and maybe the number of leading zeroes for each hash, for those of us who need a moment to figure it out from hex in our heads :]


Just lucked into 33rd place with 16 lines of single threaded C# doing ~1MH/sec.

Bug: The site says "nonce: 1-64 characters from Base64 (a-zA-Z0-9+/)" but it accepts "=" as well.

Reading the other answers about hand optimised CUDA and parallel Go and Rust, that's probably as high as I'll get.


I wouldn't worry: Go is not as fast as everyone believes it to be, particularly for tasks like these, unlike C#. There's a reason the Nethermind Ethereum client uses pure C# without ever having to touch C or ASM (the performance-critical parts of Geth, another Ethereum client written in Go, are implemented in C and Go's "asm" instead).


I chose to fix that, but I won't delete the submissions that used it. Thanks!


I used a classic cheat code to get a valid entry and a bit higher :D

    jodrellblank/710260141000+idkfa 
Though why limit the filter to all zeroes? Here's a double-0 facade...

    jodrellblank/uhxbvhbaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

    00FACADE F2036CC41D004F2EB0F35B079C589D724D5A49BD796363C9A13C27D8


Would it make sense to have some partial/incremental SHA256 implementation? E.g. one that gives the first character only (in ASCII representation), so that you could have a cheaper miner to narrow down the search space?

Is it even possible to have a cheaper first-character SHA256 algorithm?


SHA256 doesn't generate characters, it generates bytes that are usually rendered in hex form, which gives you the "first character".

There isn't a known way to calculate just the first bytes of a SHA256 hash faster. Each step in SHA256 depends on the value of the previous step, so you have to calculate the whole thing through. SHA256 is meant for cryptography and would be considered vulnerable to attacks if a shortcut were known.
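A quick way to see why: the avalanche property means even the first output byte depends on every input bit.

    # Flip one input bit and count differing output bits; on average half of
    # all 256 bits change, so no prefix of the digest is cheap to predict.
    import hashlib

    a = hashlib.sha256(b"example/1").digest()
    b = hashlib.sha256(b"example/3").digest()  # '1' vs '3': one bit apart
    diff = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    print(f"{diff} of 256 bits differ")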


>Is it even possible to have cheaper first character SHA256 algorithm?

What would be the goal of that? Do you want a cryptographic hash function, or do you want something else?


Well, such an algorithm existing would be a massive breakthrough in cryptography research. The answer should be no, since that would make a preimage attack easier than brute force.


No. Hash functions are specifically designed to make any shortcuts impossible. Essentially, they consist of many steps of scrambling the input bits, each step taking as its input the output of the previous one.

If any such shortcut would be found, it would be considered a vulnerability, and the use of SHA256 for anything involving cryptography would be strongly discouraged.


Huh, I somehow got #32 with my very naive multithreaded thing I quickly wrote in C. One thing I'm kinda amazed about is how quickly the M1 Max goes through 2^32 hashes, it only takes about 80 seconds with 10 threads.

edit: up to #28 ;)


I managed to get #69 and then #42 with some Go code I repurposed since I was messing around with HashCash a while ago.

Including the code below if anyone is curious. I managed to get a 9-zero hash in under an hour on an M2 Mac Mini using it. It's only about ~3.3 MH/s, which is not very impressive, but it was very easy to write.

https://gist.github.com/boyter/8600199cc6f4073dc9da380f3224f...


Here's mine: https://gist.github.com/grishka/ed84e4bbfacfbcf4ddc5e63cfc96...

I suppose the next step for me would be to make use of the GPU, which is a much better fit for this job and I'm sure would increase my hash rate by orders of magnitude, but I researched it for a bit and running code on the GPU on a Mac is cumbersome to say the least.


I have always wanted to learn about GPU programming. Now might be the time.


I couldn't resist, so here's the GPU hasher: https://gist.github.com/grishka/c1be1c035f39564debfd4b195bdf...

It gets around 232.5 MH/s on the M1 Max. There's also an optimization that exploits the fact that SHA256 is incremental: the state for the common prefix can be pre-computed, then copied and updated with only the variable part. This yields around a 30% performance boost.


I'm able to get 250MH/s with this. M1 Pro


I made some improvements and it's now 340MH/s


I put my CPU code up here: https://github.com/neonz80/shallenge-x86 . It requires an x86 CPU with the SHA256 extension.

It manages 1.1 GH/s on an Intel i7-13700K.


60 MH/s with parallel Rust code (rayon) on a 5900X. Got stuck on 8 zeroes, changed my nonce prefix and got lucky with 9 zeroes after 10 seconds! Currently sitting at #73 and pondering whether to write a GPU solution.

https://gist.github.com/zdimension/63d3aa5f04573267f423520e4...


There are still some gains to be had on the CPU. I'm at ~570 MH/s on a 5950x


I'm a bit late to the game, but I managed to squeeze out about 1 GH/s on an i7-13700K.


Fun, no way I'm beating any of the top 10 scores though.


Getting into the top ten is still not too hard at this point, if you wanted to. I hacked together a super basic parallel CPU miner quickly before diving into CUDA, and the CPU one still gets a value that beats #7 within a few minutes.


This is my attempt in Rust. I am not a Rust programmer so it is probably horribly inefficient, but it works!

https://gist.github.com/Chaz6/81523edaed6f31aa609daa919ff6b8...


Also, hi, how's it going, @quirino?! Will you be posting the number of competitors at any point? I would love to see how good my 24 is in the grand scheme!


Hey! All good :)

Sorry for the late reply.

I've added a "/data" endpoint where you can get the full database in json format.


I'm at about ~22GH/s per Xilinx VU9P UltraScale+ FPGA; at least 40GH/s is possible (for around 250W per device).

The nonce alphabet being limited makes the whole thing quite a bit more expensive.


That's a beefy FPGA! Wish I had access to one to revive my 2009 era FPGA coursework knowledge but it seems the dev boards start at 5 figures which is too rich for a side project.


You can get them on capable boards on eBay for $600-1000 - https://www.ebay.com/itm/326181455270


Very fun! I got into the top half of the leaderboard using Rust and ChatGPT.


Just saw you got 12 zeroes! Second place!

Did you start using a GPU or were you just insanely lucky?


Still CPU only.

This is my current solution: https://gist.github.com/lovasoa/9bb6c50feed7bb264c15ae65c622...


I am now second place, still running on CPU only on a laptop :)


I got an amazing one only to find out Zalgo is not supported EXỦMÂ :[


I'm sitting on just under 40MH/s on an M1 Pro with a multithreaded Rust implementation I whipped up. I doubt I can go much faster without using the GPU.


Man, I can't get higher than 8MH/s on my M1 Pro with multithreaded Rust. I already got 11 zeros, but I'm now more interested in beating at least 20MH/s.

Time to comb through this code :D


Yeah, I wonder what you could be doing differently. Feel free to link it and I'll point out anything that could be the issue.

I just started on a Metal implementation and it's at 320MH/s.


Turning on the asm feature of sha2 has me on 230MH/s.


Suggestion: close submissions after it leaves the front page :)


Quirky and fun



