OCaml has been pretty common tool to write parsers for many years. Not a bad cho...

miningape · on Nov 8, 2024

Thanks for your comment, you've given me a lot to chew on and I think I'll need to bookmark this page.

> I've written parsers professionally with Rust for two companies now

If you don't mind me asking, which companies? Or how do you get into this industry within an industry? I'd really love to work on some programming language implementations professionally (although maybe that's just because I've built them non-professionally until now),

> Especially if you need to do runtime parsing in a network service serving large traffic.

I almost expected something like this, it just makes sense with how the language is positioned. I'm not sure if you've been following cloudflare's pingora blogs but I've found them very interesting because of how they are able to really optimise parts of their networking without looking like a fast-inverse-sqrt.

> There's also no reference counting needed anywhere, so you don't need to play the dance with Arc/Weak.

I really like the sound of this, it wasn't necessarily confusing to work with Rc and Weak but more I had to put in a lot of extra thought up front (which is also valuable don't get me wrong).

> I would say an amazing strategy is to first write a prototype with Ocaml, it's really good for that.

Thanks! Maybe then the Rust code I have so far won't be thrown in the bin just yet.

pimeys · on Nov 8, 2024

> If you don't mind me asking, which companies? Or how do you get into this industry within an industry? I'd really love to work on some programming language implementations professionally (although maybe that's just because I've built them non-professionally until now),

You do not need to write programming languages to need parsers and lexers. My last company was Prisma (https://prisma.io) where we had our own schema definition language, which needed a parser. The first implementation was nested structures and reference counting, which was very buggy and hard to fix. We rewrote it with the index/walker strategy described in my previous comment and got a significant speed boost and the whole codebase became much more stable.

The company I'm working for now is called Grafbase (https://grafbase.com). We aim to be the fastest GraphQL federation platform, which we are in many cases already due to the same design principles. We need to be able to parse GraphQL schemas, and one of our devs wrote a pretty fast library for that (also uses Logos):

https://crates.io/crates/cynic-parser

And we also need to parse and plan the operation for every request. Here, again, the ID-based model works miracles. It's fast and easy to work with.

> I really like the sound of this, it wasn't necessarily confusing to work with Rc and Weak but more I had to put in a lot of extra thought up front (which is also valuable don't get me wrong).

These are suddenly _very annoying_ to work with. If you come from the `Weak` side to a model, you need to upgrade it first (and unwrap), which makes passing references either hard or impossible depending on what you want to do. It's also not great for CPU caches if your data is too nested. Keep everything flat and sorted. In the beginning it's a bit more work and thinking, but it scales much better when your project grows.

> Thanks! Maybe then the Rust code I have so far won't be thrown in the bin just yet.

You're already on the right path if you're interested in Ocaml. Keep going.

miningape · on Nov 8, 2024

I should've expected prisma! It's actually my main "orm" for my TS web projects, so thanks for that! Also grafbase seems interesting, I've had my fair share of issues with federated apollo servers so it'd be interesting to check out.

> If you come from the `Weak` side to a model, you need to upgrade it first (and unwrap), which makes passing references either hard or impossible depending on what you want to do.

You're literally describing my variable environment, eventually I just said fuggit and added a bunch of unsafe code to the core of it just to move past these issues.

packetlost · on Nov 8, 2024

There's also the phenomenal pest library. It probably wouldn't be as fast, but I've found that usually parsing a performance critical part of a system. If it is, a manually writing the parser is definitely the way to go.

marcosdumay · on Nov 8, 2024

> Especially if you need to do runtime parsing in a network service serving large traffic

Yeah, that's the focus of it, and the thing you can use Rust well.

All the popular Rust parsing libraries aren't even focused on the use that most people use "parser" to name. They can't support language-parsing at all, but you only discover that after you spent weeks fighting with the type-system to get to the real PL problems.

Rust itself is parsed by a set of specialized libraries that won't generalize to other languages. Everything else is aimed at parsing data structures.

pimeys · on Nov 8, 2024

There is also the rust-analyzer which is a separate binary. Should compile with a stable rust compiler. I remember reading it's source together with the zig compiler. Both are quite impressive codebases.