I actually build minimal Redis clones in every new language or runtime, or when I want to explore threading models.
It all started with https://github.com/rcarmo/miniredis (which I forked to add and experiment with pub/sub), and I just found myself doing it again and again because Redis is the quintessential network service:
By implementing it, you learn about socket handling, the event loop of a specific runtime, threading models, data representation, concurrency (if you want to do a multi-threaded version), etc. None of my "ports" are fully functional, but they all helped me sort out some of the above, plus build tools, packaging, dependencies, etc.
It's "hello world" for core cloud native microservices, if you will (and without having to do REST or JSON stuff).
I once built one in Swift 4, in part to see how multithreading via SwiftNIO and copy-on-write data structures would compare. It held up well against the C implementation; it would be worth trying again against Swift 5.x.
https://github.com/NozeIO/redi-s
Shameless plug: I've been playing with a [Redis Server implementation in Go](https://github.com/jan-carreras/ddia) for the past few weeks, mainly as a way to try out things explained in the Designing Data-Intensive Applications book (a favourite of mine!). Those are the [commands implemented](https://github.com/jan-carreras/ddia/blob/master/commands.md), plus TTL, AoF files (for state replication), a config file, ... The "challenge" was to do it without any external dependency other than the Go stdlib.
> I actually build minimal Redis clones in every new language or runtime, or when I want to explore threading models.
100% agree with your advice; I'll definitely try to implement other parts of the Redis service in Go (e.g. pub/sub, replication, clustering...) and probably repeat the same exercise when learning any new language.
I saw this "Build Your Own Text Editor"[0] on HN a month or two ago, everyone was raving about it so I went through it and it really was fantastic. The learning experience was unparalleled. I'm a believer in the idea of "Build Your Own..." guides now, I hope this Redis guide is just as good as the kilo text editor. I'm definitely bookmarking this for a deep dive when the time is right. Any other top notch "Build Your Own" recommendations would be highly appreciated.
While I applaud the initiative and effort, I want to urge the author to consult a good proofreader. I’m only in chapter 2 yet I’ve already encountered enough friction in the form of poor English to put me off. I think the substance looks great, so it’s a bit wasteful to not walk the last mile and make it genuinely enjoyable to read by eliminating blatant errors like this one:
“The Redis is an example of the server/client system.”
Sigh. I had to actually go look this up. Sometimes it's just easier not to talk, because all of the skeletons of old normalized words come to light. I truly was not trying to slur anyone, and I apologize if my (unlucky) attempt at humor rang sour for anyone.
I propose adding some Unicode sign like the double turnstile [1], ⊨ (ALT+8875), which can signify "is a semantic consequence of" or "is stronger than", after a polysemic word to imply that no racial/socially derogatory connotation is to be derived from the word in that phrase, for example, "the text reads too auto-jipped⊨" or "the master⊨ branch is fubar" in a Git context.
Or perhaps an even better sign would be the negated double turnstile, ⊭, "to denote the statement 'does not entail'" [1], making it more explicit. Hence an example would look like "the text reads too auto-jipped(⊭racism)", which can be read "the text reads too auto-jipped and this word, auto-jipped, does not entail racism in this context". Ok, done, racism solved, your move David Guetta [2].
ChatGPT does not know how to proofread. It is not an artificial intelligence. It will take your input and determine the most likely response given the data it was trained on.
I've learned over the last few months that there are certain trigger words that instigate basically pointless discussion around e.g. "what it means to learn".
I can now add “read” to that list. Let’s play word taboo! The rules are we can’t talk about GPT using anthropomorphic terminology.
Does GPT predict less than useful mathematical computations? Yes, and not just less than useful but basically useless.
Does GPT predict less than useful language translations, ranging from English-French, to summaries, in-the-style-of, etc.? No, it's actually quite useful, since when confined to only the information contained in a prompt it doesn't have to hallucinate an answer.
It is not useful to anthropomorphize the functionality of these tools in either a practical or legal context.
And everyone pick up a copy of Philosophical Investigations by Wittgenstein so y’all can learn about how to avoid snake-eating-tail discourse.
For example, here's a section from the Build Your Own Redis book:
Our server will be able to process multiple requests from a client, to do that we need to implement some sort of “protocol”, at least to split requests apart from the TCP byte stream. The easiest way to split requests apart is by declaring how long the request is at the beginning of the request. Let’s use the following scheme.
The protocol consists of 2 parts: a 4-byte little-endian integer indicating the length of the following request, and a variable length request.
Starts from the code from the last chapter, the loop of the server is modified to handle multiple requests:
GPT suggested this instead:
Our server will process multiple requests from a client by implementing a protocol to separate requests from the TCP byte stream. The simplest method for separating requests is to include the length of each request at the start. The protocol consists of two parts: a 4-byte little-endian integer indicating the length of the request and a variable-length request. The server code from the previous chapter has been modified to handle multiple requests in the following manner:
There are no hallucinated facts because the most probable continuation of the given prompts is one that can gather all required information from the original text itself.
It's sort of like the difference between the truthfulness of analytic and synthetic claims. An analytic claim would be like "It is raining and you're outside, naked, and unsheltered, therefore water is falling on your skin from the sky." A synthetic claim would be like "It is raining outside".
Synthetic claims are said to be contingent on facts outside the text itself. These are the cases where GPT is completely useless.
The error rate for analytic claims is much lower although anyone who is writing anything should do a lot of review before publishing. Think of it like you asked your assistant to write something. You're gonna wanna read it over before you slap your name on it.
I mean, I actually don't care if you use these tools or not but your explanation of how it works will guide other readers in the wrong direction so I feel the need to correct the narrative you've presented.
Me: Hey, I need some help cleaning up some language with a lot of grammatical errors, can you help?
GPT: Yes, I'd be happy to help you clean up language with grammatical errors. Please provide the text for me to review.
Me: Our server will be able to process multiple requests from a client, to do that we need to implement some sort of “protocol”, at least to split requests apart from the TCP byte stream. The easiest way to split requests apart is by declaring how long the request is at the beginning of the request. Let’s use the following scheme.
The protocol consists of 2 parts: a 4-byte little-endian integer indicating the length of the following request, and a variable length request.
Starts from the code from the last chapter, the loop of the server is modified to handle multiple requests:
GPT: "Our server will process multiple requests from a client by implementing a protocol to separate requests from the TCP byte stream. The simplest method for separating requests is to include the length of each request at the start. The protocol consists of two parts: a 4-byte little-endian integer indicating the length of the request and a variable-length request. The server code from the previous chapter has been modified to handle multiple requests in the following manner:"
---
That's the entirety of the interaction!
I haven't tested much but for the last day or so I've been thinking a lot about Kant, Frege, Quine and Wittgenstein!
GPT opens the door for some kind of empirical philosophy... like, what are the error rates for various kinds of tasks? Can we use a Kantian framework? How about Frege? How about Quine?
I mean, Quine is actually my favorite of the analytic philosophers because of his indeterminacy of translation argument and the notion that there really is no analytic/synthetic divide when you get down to it resonates well with me.
Death to metaphysics!
But there seems to be some use in differentiating between "All bachelors are unmarried" and "All bachelors are unhappy" if only because I'm now seeing how making a distinction can have a profound impact on the usefulness of GPT completions.
Briefly and half-assed: Quine's argument is that because you would have to be familiar with language and culture in order to understand "All bachelors are unmarried", the meanings of those words are fact-like and outside the scope of the proposition.
If GPT is able to do some Frege-like substitution of synonyms it is because it has this compressed language model which seems to lend credence to Quine's arguments.
I find the quality of answers you receive out of GPT drastically changes with the way you phrase questions.
I don't think I would ever have come up with asking the question in the way you did.
As someone who has English as a second language, I found GPT often produced incorrect and low-quality answers while most of my native-English-speaking colleagues were getting high-quality answers. Looking at their prompts compared to mine, it's all down to differences in how questions are phrased.
When I see poorly written prose, it makes me distrust the technology imparted. If you can't take time to get the grammar right, why should I believe that you got the code right?
There are many signals you can use to help you judge the value of a written piece of text. In this case, it reads to me much more as the writing of someone for whom English is not their native language than of someone who doesn't know what they're talking about technically.
If you want to pre-judge all technical content coming from people who have not spent huge portions of their life living in an English speaking country as being of no value, then I'm sure that will protect you from some bad content, but you're going to be missing out on an awful lot of good stuff too.
Potentially, but on the other hand, James Smith is exactly the sort of name you'd choose to use in the Anglosphere if your birth name was difficult to pronounce for English speakers.
I am absolutely with you. There's a certain large company I work for whose public documentation is full of stuff like this. It makes me insane when I find it, it's so glaring. There's an internal program to proofread and review, but quantity is incentivized, not quality. I've tried to shove that program in a better direction, to no avail.
I am very happy to see more content like this (attempting to teach how to create a real system) becoming available.
A few months ago there was a launch announcement of a YC-backed startup that sells walk-through labs guiding you through building Redis, Docker, Git, and others. (Can't find the link now.)
What is still missing, in my opinion, and is badly needed, is content, or even an idea of how to teach, taking such projects from a toy prototype to a production-quality one.
You're probably thinking of https://codecrafters.io/. I've been meaning to check them out with my education stipend from work, just haven't had the time to devote to it.
$83/mo, or around $994 annually, is way too much for me. I wish them the best because I love this kind of content and want it to be more popular, but I will wait until a better-priced option comes along.
There's* a free plan allowing one stage daily. With the current content (all languages, all stages), someone could do everything in under 9 months. But the challenges are the same across languages, so if you only want to do each challenge ("build X clone") once, in one language, you can do it in under 1.5 months.
*Saw in a sibling comment that this is a recent addition. Without it, yeah, it may be a bit expensive for one person.
Please consider a pricing structure similar to JetBrains': if you pay for it yourself, the price is X, but if your company pays for it, or you're somehow reimbursed for it, the price is 5X.
Correct. For many of our users, doing one stage per day is just the right amount of exercise (especially on the later stages, as it gets more challenging).
Not to be hypercritical, but that is just not enough to support learning for more ambitious people. There are too many great learning communities that are way cheaper. I really hope you guys revise the pricing structure. At a $20/mo price point, I know a ton of people who would buy it. At the current price point, well, it's just people with a learning stipend at a big corp who don't like to buy books.
I think this is a bit much. I wouldn't buy it, but this is deep technical content that you might not stick with forever, and they need to pick a price for it. If it's too high or low they'll find out soon. No need for any other inputs than that.
The idea is very good: the test-based approach is effective and stimulating, and they provide a range of products to study and experiment with. However, I think the approach to the learning material is "love it or hate it".
By design, the service doesn't provide any documentation; it provides references to existing technical documentation (of any kind, including blog posts).
Those who expect a focused introduction to each topic will find it very tedious or hard to proceed (for example, the SQLite exercise has important details buried in a very large and confusing webpage), and likely hate it; those who like the challenge of understanding loads of raw documentation will love it.
It's more complex than that. People may need stimulation in order to follow through a certain learning process.
Providing a structured path and (automated) test means can be stimulating, and can make the difference between (deciding to) learning something or not.
Some people are certainly entirely autonomous, but at the very least, there is a spectrum of need for stimulation when approaching a topic of study.
AFAIK there are no other services that provide multiple languages, automated testing and team features, but if you know of any, it's certainly useful to report them in this thread.
Exercism is the closest service I can think of, but it's based on simple exercises, not real-world projects.
There are a few books that have a similar target (build X), but much narrower in scope (either a single language or pseudocode, and certainly no automated testing/team features).
I have finished their build-your-own-Redis exercise. It is very well organized, and there are hints and Discord groups where you get sufficient help if you make the effort.
This model is great for someone who loses patience with all the groundwork setup.
However, I do agree that the next leg is taking this MVP/prototype level to production, and ideally selling it as a real alternative to the commercial version of Redis.
Sorry for being a little off topic, but the Mini Redis tutorial [1] was really fun when I did it a few years ago. It has you implement a server and client in Rust using the Tokio library.
I think Redis is a great server to “build yourself” as you don’t need to start with much to get it going.
If anyone is looking at this stuff and thinking it's intimidating, then think again: Redis is mostly a lot of fuss around a hash map. There are cool features, sure, but the idea is mainly "what if a hash map were accessible from two or more servers". If you only have one or two servers, you only want a cache, and you don't anticipate scaling up soon (be honest), you can just use a hash map with expiry checking and not have to learn how to deploy another piece of software.
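For the one-or-two-server case, that really can be tiny. A minimal Go sketch of the idea (hypothetical names, no size limit, not production code): a mutex-guarded map with lazy expiry checks on read.

```go
package cache

import (
	"sync"
	"time"
)

type entry struct {
	value     string
	expiresAt time.Time // zero value means "never expires"
}

// ExpiringMap is a tiny in-process cache: a mutex-guarded map
// that lazily evicts stale entries when they are read.
type ExpiringMap struct {
	mu   sync.Mutex
	data map[string]entry
}

func NewExpiringMap() *ExpiringMap {
	return &ExpiringMap{data: make(map[string]entry)}
}

// Set stores a value; ttl <= 0 means the key never expires.
func (m *ExpiringMap) Set(key, value string, ttl time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	e := entry{value: value}
	if ttl > 0 {
		e.expiresAt = time.Now().Add(ttl)
	}
	m.data[key] = e
}

// Get returns the value if present and not expired.
func (m *ExpiringMap) Get(key string) (string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	e, ok := m.data[key]
	if !ok {
		return "", false
	}
	if !e.expiresAt.IsZero() && time.Now().After(e.expiresAt) {
		delete(m.data, key) // lazy expiry on access
		return "", false
	}
	return e.value, true
}
```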
Redis offers a lot more than just a key-value store. The fuss is the data structures it provides.
And just using a hash map in memory isn't sufficient. You'll have unbounded growth. You need a max size and need to evict. Then you want to do that efficiently so you're not wasting space on useless keys...
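A rough Go sketch of the bounded cache the parent describes, a strict LRU built on the standard library's container/list (illustrative only; note that real Redis uses approximate LRU via random sampling precisely to avoid this per-access bookkeeping):

```go
package cache

import "container/list"

type lruEntry struct {
	key, value string
}

// LRU is a size-bounded cache: inserting beyond capacity evicts
// the least recently used key.
type LRU struct {
	cap   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> node in order
}

func NewLRU(capacity int) *LRU {
	return &LRU{
		cap:   capacity,
		order: list.New(),
		items: make(map[string]*list.Element),
	}
}

func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el) // mark as recently used
	return el.Value.(*lruEntry).value, true
}

func (c *LRU) Set(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*lruEntry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&lruEntry{key, value})
	if c.order.Len() > c.cap {
		oldest := c.order.Back() // least recently used
		delete(c.items, oldest.Value.(*lruEntry).key)
		c.order.Remove(oldest)
	}
}
```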
IIRC there’s a nice abstraction in .NET Core for caching, which has a swappable backend that means you can upgrade from MemoryCache to Redis if you ever need it. If only all software was this good. One of the least-leaky abstractions of all time.
It is nice to see more on "build your own database" topics.
Although people constantly bug me about attempts to build their own ClickHouse. Someone is trying to do it with Apache Arrow and DataFusion. Folks from DuckDB are trying to build their own crippled version of ClickHouse. Friends from China are doing it with their Apache Doris. InfluxDB is being rewritten to be closer to ClickHouse in an attempt to make it better, and so on...
In case you're unaware, InfluxDB IOx (the rewrite) is built on Apache Arrow and DataFusion, and is the primary driver behind that project.
I also think calling DuckDB a crippled version of ClickHouse is disingenuous. That's like calling SQLite a crippled version of Postgres. They have very different goals.
That's also a bit like calling Clickhouse a build-your-own vectorwise/MonetDB because they did it first.
Maybe I'm missing something but the server application resulting from the source code in this book does not actually work with redis-cli or any Redis clients. It appears to be using a custom binary protocol, not valid RESP.
I feel that the RESP protocol itself is the most challenging part of building one's own Redis.
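For anyone who hasn't looked at it: RESP frames everything as type-prefixed, CRLF-terminated segments. A minimal Go sketch of the client-to-server command encoding (arrays of bulk strings only; integers, errors, and the other reply types are left out):

```go
package resp

import (
	"fmt"
	"strings"
)

// EncodeCommand renders a command as a RESP array of bulk strings,
// e.g. SET foo bar -> "*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n".
func EncodeCommand(args ...string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "*%d\r\n", len(args)) // array header: element count
	for _, arg := range args {
		// bulk string: "$" + byte length, CRLF, payload, CRLF
		fmt.Fprintf(&b, "$%d\r\n%s\r\n", len(arg), arg)
	}
	return b.String()
}
```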
> The end result is a mini Redis alike with only about 1200 lines of code.

1200 LoC seems low, but it illustrates many important aspects the book attempts to cover.
The techniques and approaches used in the book are not exactly the same as the real Redis. Some are intentionally simplified, and some are chosen to illustrate a general topic. Readers can learn even more by comparing different approaches.
I wouldn't emphasize the importance of understanding Redis per se, but the ideas around a system like Redis.
What? I genuinely think RESP is the easy part. I used to use your redcon library to get bootstrapped, but then rewrote it from scratch and it was really easy. Unfortunately, Golang doesn't permit good abstraction, so I have to write the same thing over and over again.
I am a big fan of Build Your Own X educational projects. Recently, I released a Go version of my build-your-own-KV-store project. I have set up the project in TDD fashion with tests: you start with simple functions, pass the tests, and the difficulty level goes up. There are hints if you get stuck (e.g. link). When all the tests pass, you will have written a persistent key-value store.
Is there a public repo we could send PRs/MRs/patches to? The prose has room for improvement and since the book is free, I'd love to send a couple of PRs your way.
This take on "building blocks of modern computing" is fairly opinionated though.
They mention "NGINX, SQLite, PostgreSQL, Kafka, Linux kernel, etc." - of which I'd only consider NGINX and the Linux kernel (and Redis) as "building blocks". The others might be part of their own preferred stack, but if you mention Postgres, why not MySQL? If Kafka, why not RabbitMQ?
But yeah, NGINX, Redis, and the Linux kernel are basically beyond dispute.
They are all arbitrary (including nginx) in the sense that there are many stacks out there that don't use any of them. But I don't think it's a controversial selection as they are all extremely popular.
I love this! Is there a book like this to learn Rust with? It would be amazing to have a working system at the end. (Not looking for a beginner book, but say, an intermediate-level one.)
Purchased this via Gumroad, but the epub version seems to not be formatted correctly. After about page 3 it decides the book is finished on my Kobo. Hopefully the author will get a fix out, I'm pretty excited to read it and pdfs are pretty hard to read on my device.
k2pdfopt actually has an option for my model, cropping some of the excess margins helps a lot, but I will say epubcheck found some invalid markup in there, so it could be related to that.
Looks neat and relatively self contained. I’m adding this to my list of projects to use to learn a new language. I might give it a shot in Kotlin and Zig (I know it and use it, but am not as familiar with its event loop implementation as I’d like to be) over the next few weeks.
One of my pet peeves when I was learning programming was that I couldn't find any books that would teach how to write tests in the real world.
There was a chicken-and-egg problem then: potential employers would skip my CV because I didn't have any TDD experience, and I couldn't find anywhere to learn it.
Before I even learned that something like testing exists, I was so confident: "oh, I could write something like that over the weekend, how come they needed months to do that?". Then my software would crash the first time someone other than me used it.
Anyway, what I want to say is that while this book sounds like a great idea, without showing TDD and how to write code so that it can be proven to work as intended and to handle unhappy paths and edge cases, it won't teach someone trying to learn programming much, and it doesn't actually stand out from the million other books about programming that just scratch the surface and don't show how to write production-ready code.
That's what is still missing on the market. It's almost like a well-kept secret that only developers working at large corporations know.
While I have many things I could say about your idea of what development means and the "only path" you seem to subscribe to, I think you're missing the point of the book.
> it won't teach someone trying to learn programming much, and it doesn't actually stand out from the million other books about programming that just scratch the surface and don't show how to write production-ready code.
It's not a book to teach people how to program; in fact, the author goes out of their way to mention that only C and a little C++ have been used, and that it may be beneficial, for learning how to build such a POC of Redis, to DIY your own.
This is not a book to hand over to an outsourcing company and expect production ready work. Nor was it described as such.
It’s right there in the name TDD. You write a test and then you change the code to make the test pass. There are no secrets, that’s all there is to it. If you can do that you can do TDD. That’s why everyone else is putting it on their CV and you aren’t: you think there’s something more to it. There isn’t. It’s a very basic technique.
> how to write code so that it can be proven it works the way intended and that it can handle unhappy paths and edge cases
That is not what TDD promises. All TDD does is ensure you have some tests at all, which is a good base on which to add more tests, so that when you find edge cases you can do more TDD, and TDD is a nice way to work.

Nobody in history has come up with a way to make testing comprehensive. You can try quickcheck and fuzzing to generate lots and lots of test cases, but they are still sampling from the input space rather than covering it. You can cover the input space for a single 32-bit number input, but that's about it. The only way to do better is to formally prove your software is correct using mathematical logic. TDD does not provide this and never will.

You might have seen "100% test coverage", but that claim is close to meaningless with respect to the range of possible inputs that your code is meant to handle. All it says is that every edge case that you did think of has a test that exercises it. A function consisting only of "return 0" will have 100% test coverage with a single test in the suite. Doesn't mean it works. See SQLite, which still finds bugs despite 100% test coverage.
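To make the "return 0" point concrete, a minimal Go sketch (hypothetical Abs function and test): the test passes and go test -cover reports 100% line coverage, yet the function is wrong for every nonzero input.

```go
// mathx.go
package mathx

// Abs is supposed to return the absolute value of x.
// It is wrong for every input except 0.
func Abs(x int) int {
	return 0
}
```

```go
// mathx_test.go
package mathx

import "testing"

func TestAbs(t *testing.T) {
	// One test, 100% line coverage, and it says nothing
	// about Abs(-3) or Abs(7).
	if got := Abs(0); got != 0 {
		t.Fatalf("Abs(0) = %d, want 0", got)
	}
}
```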
What you seem to be looking for is someone to teach you how to think of edge cases. That’s a skill that can’t really be taught. Just have a go. When you find an edge case later that you didn’t think of, great, maybe you will think of similar ones next time.
A long time ago I found the chicken and egg problem to be more general than TDD. It was more along the lines of, "how do you write a program when you don't have the code (subroutines, other functions) it needs to run?" Where do you start - at main() or say, opening a file that doesn't yet exist to read/write some data? It was always a struggle.
Then I found Structure and Interpretation of Computer Programs (SICP)[0] and the video lectures from MIT[1]. I had an epiphany when Sussman talked about "wishful thinking" in video for lecture 1b[0] (around 48:00 in). The lesson was something along the lines of start naming functions that would do what you needed done, and write them later. Just pretend they exist and eventually bring them into existence. SICP has so many gems. It really made a difference for me.
If applied to TDD, write a test that won't even compile because the function under test doesn't even exist. Then iterate by writing something that compiles, but will probably fail, and then improve it until the test passes.
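A minimal Go sketch of that loop, with a hypothetical Reverse function: step one is a test that doesn't compile because Reverse doesn't exist yet; step two writes the wished-for function until the test goes green.

```go
package kata

import "testing"

// Step 1: write the test first. At this point the file won't
// compile, because Reverse hasn't been written -- that's the point.
func TestReverse(t *testing.T) {
	if got := Reverse("redis"); got != "sider" {
		t.Fatalf("Reverse(%q) = %q, want %q", "redis", got, "sider")
	}
}

// Step 2: bring the function into existence and improve it until
// the test passes.
func Reverse(s string) string {
	runes := []rune(s)
	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
		runes[i], runes[j] = runes[j], runes[i]
	}
	return string(runes)
}
```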
Edit: How to Design Programs and How to Design Worlds are two other resources I enjoyed. I don't write programs in Lisp or Scheme anymore, but the experience of just running through the exercises in these and other books was enlightening.
I remember how tdd (together with scrum) was basically a religion in 'large corporations'. How a lot of people talked about 'the right way to do things'. Your comment made me smile nostalgically...
But back to the topic. I'm pretty sure there are enough books about how to write tests, and more or less all engineers understand the value of testability and coverage (not necessarily TDD!). At least, I wouldn't need a book for this. But books about building something closer to complex real-world systems, that's good stuff engineers would enjoy.
I commented this above, but Distributed Services in Go does testing and low-level kafka-like implementation. It taught me so much, I'm probably going to reread at some point and implement in Rust.
Redis rules and fixes a lot of problems, especially when you probably already have a server/cluster for something else. But "a building block of modern computing" is a bit overzealous to me.
I did something similar by building Redis in Rust a few years ago. It's a really good learning exercise for a lot of topics, like network programming. https://github.com/huangjiahua/reredis
Kudos to the author for a very succinct, practical book. It would be nice to see a few chapters on high availability though; it seems to me that failover and consistency guarantees are among the really hard bits.
Is anyone aware of an in-depth explanation about how key-value store persistence, such as Redis RDB, is built? A guide on compaction would be really interesting and useful.
Highly recommend Martin Kleppmann's "Designing Data-Intensive Applications" (O'Reilly). It covers virtually every topic around database implementation. Well written, a big source of knowledge. It introduces concepts gradually and with many examples, with lots of references to actual software out there and how things are done in practice.
KV persistence is covered in Chapter 3, right from the start. Redis is also mentioned as an example of an in-memory store with "weak durability by writing to disk asynchronously".
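For a flavor of the mechanics behind that asynchronous persistence, here's a rough Go sketch of log compaction over a toy append-only format of one "SET key value" line per write (hypothetical names and format, not how Redis actually encodes RDB/AOF): replay the log into a map, rewrite only the latest value per key, and atomically swap the file into place.

```go
package aof

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Compact replays an append-only log of "SET key value" lines and
// writes a new log containing only the latest value per key.
func Compact(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	// Replay: later writes to the same key overwrite earlier ones.
	latest := make(map[string]string)
	var order []string // first-seen key order, for stable output
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		parts := strings.SplitN(scanner.Text(), " ", 3)
		if len(parts) != 3 || parts[0] != "SET" {
			continue // toy format: skip anything malformed
		}
		if _, seen := latest[parts[1]]; !seen {
			order = append(order, parts[1])
		}
		latest[parts[1]] = parts[2]
	}
	if err := scanner.Err(); err != nil {
		return err
	}

	// Write the compacted log to a temp file in the same directory,
	// then rename it over the original so readers never observe a
	// half-written file.
	tmp, err := os.CreateTemp(filepath.Dir(path), "aof-compact-*")
	if err != nil {
		return err
	}
	for _, key := range order {
		fmt.Fprintf(tmp, "SET %s %s\n", key, latest[key])
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}
```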
The book seems really interesting but I am quite poor so thanks for making a free web version available.
I'll think about buying the epub when I earn a wage <3
The Redis wire protocol is simple enough that it's not very difficult. It's instructive to write it in C because there's a lot of knowledge that translates to other languages: read in a buffer, cast that buffer to the protocol-specific struct, read the various fields from memory, etc. Basic operations for a systems-level project like this.
I agree it's instructive, but on the other hand, the time might be right to start teaching why not to do this stuff. Modeling a protocol parser in a high-level language that can spit out correct and highly-optimized C code would be just as instructive and perhaps even more fun.
Yes, "read in a buffer, cast that buffer to the protocol-specific struct, read the various fields from memory, etc" are all operations one should generally NOT do. This was instructive maybe three decades ago, when the internet was relatively safe.
Knowing how to do this RIGHT is important.
The place this is still helpful is low-level programming, and embedded is a far better place to learn. If one part of your microwave is talking to a sensor, security isn't really an issue, since you control both ends.
Yeah, and I said the above being myself a big fan of C. I write C in my hobby projects. Yet if I were doing anything non-trivial that talks over the network, that protocol parsing would for sure not be hand-written.
I really enjoyed writing low-level C code, maybe three decades ago. I grew out of it, eventually, but I remember the joy. Similar for assembly, actually.
(And rereading, I don't mean "growing out" in a derogatory way -- I grow into and out of a lot of things -- and low-level programming is something anyone can enjoy for a few years)
You don't have to learn raw systems coding yourself, but if nobody bothers, the code of 2050 is going to be built on a lot of stale, brittle magic that nobody can reproduce or understand.
This is exactly the sort of playground effort where those who do want to learn can get their hands dirty. If this is their first time doing this kind of project, they shouldn't use any of this code in production. There will be all sorts of cheated corners and vulnerabilities, and not just in obvious high-risk places like this. That's how hands-on learning goes.
At a cursory glance, they assert((size_t)rv <= n), but if that's the assert.h assert(), it's compiled out entirely when NDEBUG is defined, and cmake seems to define that when building a Release target (gcc itself does not define it by default). So that's a bit of an overflow there.
Kafka is a good fit when you design a modern system. As your system grows larger, the necessity of a message queue becomes obvious. And Kafka is probably the most widely used general-purpose, high-performance distributed queue.
Redis is great, but it isn't always the best solution.
We almost configured it, but instead implemented our own cache web service and used its built-in memory/cache management. Yes, it's only accessible via HTTP, but it's given us a lot of flexibility. We are primarily using it for caching large datasets (hundreds of MBs). When the service has to be restarted, it makes a call to get all the items it needs.