
I've done a moderate amount of compiler work, and the resources I'd recommend are:

- "Compilers: Principles, Techniques, and Tools" (very popular, generally referred to as the 'dragon book' on account of its cover): http://ce.sharif.edu/courses/94-95/1/ce414-2/resources/root/...

- "Parsing Techniques": https://staff.polito.it/silvano.rivoira/LingTrad/ParsingTech...

- Not quite about compilers, but the website 'Crafting Interpreters' covers the front-end parts of a compiler (scanning/lexing and parsing) and touches on optimisation: https://www.craftinginterpreters.com/

- If you're compiling a typed language, this blog post is a good intro to type inference algorithms, especially with a view to implementing a Hindley-Milner (unification-based) type system: https://eli.thegreenplace.net/2018/type-inference/

I highly, highly recommend at least trying to write a compiler or interpreter. It's a fantastic way to understand more about PL theory, and improve your proficiency with data structures and algorithms.

As for functional programming, it's basically the art of removing the word 'self' from your programs.



Wow thank you very much!


No probs, best of luck!

If I have any advice, it's:

- Don't try to write a scannerless parser. It's a fool's errand. Lex first into tokens (e.g. LEFT_PAREN, KEYWORD(for), STRING_LITERAL("foo"), NUM_LITERAL(7), etc), then parse that.

- By the same, uh, token: keep everything modular and loosely coupled. Don't mix up parsing state into your lexer (unless you're lexing C or Python, famously), don't mix up lexing code into your interpreter, etc.

- For parsing, start with a simpler grammar. Parsing mathematical expressions is a classic example. Avoid complex grammars that require lots of backtracking or lookahead. If you want a real language, Go is a good example - a large part of its famous compilation speed is due to the simplicity of its grammar (due to the fact that its authors are old men who live in a counterfactual version of the 90s imagined in the 70s).

- YMMV, but I find the best approach to parsing is a packrat-inspired bottom-up parser. Iterate over the tokens, and for each token filter your list of rules ('productions') to those which match. 'Reduce' the simpler expressions as you go, and build the more complex expressions out of them (e.g. functions will typically have several statements/expressions, expressions several operations, etc).

- For the compilation step, unless you specifically want to learn about writing object code, target an existing backend such as LLVM (via LLVM IR) or GCC. You'll benefit from their optimisations, and the vast number of platforms they support.

- Choose a language you're familiar with - ideally a simple one - to write it in. You don't want to be learning a new language as you're doing this, trust me.
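
To make the "lex first, then parse" advice concrete, here's a minimal sketch in Python for the classic arithmetic-expression grammar. The names (Token, lex, parse, evaluate) are made up for illustration. The parser is a plain operator-precedence shift-reduce loop with two stacks, which is one simple bottom-up technique, not the packrat-inspired approach I described. It only handles binary + - * / and parentheses (no unary minus), and it doesn't reject every malformed input.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str  # "NUM", "OP", "LEFT_PAREN" or "RIGHT_PAREN"
    text: str


# One named group per token kind; whitespace is matched and then dropped.
TOKEN_RE = re.compile(
    r"(?P<WS>\s+)|(?P<NUM>\d+(?:\.\d+)?)|(?P<OP>[-+*/])"
    r"|(?P<LEFT_PAREN>\()|(?P<RIGHT_PAREN>\))"
)

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def lex(source):
    """Turn a string into a flat list of tokens. Knows nothing about grammar."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse(tokens):
    """Shift tokens onto two stacks and reduce them into a nested-tuple AST."""
    operands = []   # already-reduced sub-expressions
    operators = []  # pending operators and "(" markers

    def reduce():
        if len(operands) < 2:
            raise SyntaxError("operator is missing an operand")
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for tok in tokens:
        if tok.kind == "NUM":
            operands.append(float(tok.text))
        elif tok.kind == "OP":
            # Reduce anything that binds at least as tightly (left-associative).
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok.text]):
                reduce()
            operators.append(tok.text)
        elif tok.kind == "LEFT_PAREN":
            operators.append("(")
        elif tok.kind == "RIGHT_PAREN":
            while operators and operators[-1] != "(":
                reduce()
            if not operators:
                raise SyntaxError("unbalanced ')'")
            operators.pop()  # discard the matching "("

    while operators:
        if operators[-1] == "(":
            raise SyntaxError("unbalanced '('")
        reduce()
    if len(operands) != 1:
        raise SyntaxError("malformed expression")
    return operands[0]


def evaluate(node):
    """Walk the AST. Kept separate so it never touches tokens or parser state."""
    if isinstance(node, float):
        return node
    op, left, right = node
    lhs, rhs = evaluate(left), evaluate(right)
    if op == "+":
        return lhs + rhs
    if op == "-":
        return lhs - rhs
    if op == "*":
        return lhs * rhs
    return lhs / rhs
```

With this, evaluate(parse(lex("1 + 2 * (3 - 4)"))) gives -1.0. The point is less the algorithm than the layering: each stage only sees the previous stage's output, so you can swap the parser for something fancier later without touching the lexer or the evaluator.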



