
Interesting read, especially as someone poking w/ writing a parser in zig for fun :)

One area for improvement worth mentioning is that zig errors currently cannot be tagged with extra information, which tends to be important for surfacing things like line numbers in parsing errors. There was a proposal[0] to improve this, and it's not the end of the world for me anyway (as a workaround, I'm toying with returning error tokens, similar to what tree-sitter does).

On a different note, there's another cool zig thing I found recently that is mildly related to parsing: a common thing in parsers is mapping a parsed string to some pre-specified token type (e.g. the string `"if"` maps to an enum value `.if` or some such so you can later pattern match tokens efficiently). The normal way to avoid an O(n) linear search over the keyword space is to use a hashmap (naively, one would use a runtime std.StringHashMap in zig). But I found an article from Andrew[1] about a comptime hashmap where a perfect hashing strategy is computed at comptime, since we already know the search space ahead of time! Really neat stuff.

[0] https://github.com/ziglang/zig/issues/2647

[1] https://andrewkelley.me/post/string-matching-comptime-perfec...



The comptime switch idea has been expanded into a full fledged implementation in the standard library!

std.ComptimeStringMap

https://github.com/ziglang/zig/blob/master/lib/std/comptime_...
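For anyone who wants to see the shape of it, here is a minimal sketch of a keyword lookup built on it. It assumes a Zig release where `std.ComptimeStringMap` still exists under that name (newer releases renamed it to `std.StaticStringMap` with a slightly different constructor). `Kind`, `keywords` and `classify` are made-up names for illustration, not anything from the stdlib tokenizer.

```zig
const std = @import("std");

// Hypothetical token kinds for a toy language.
const Kind = enum { keyword_if, keyword_else, keyword_while, identifier };

// The key/value list is known at compile time, so the lookup
// structure is built at comptime rather than at program start.
const keywords = std.ComptimeStringMap(Kind, .{
    .{ "if", .keyword_if },
    .{ "else", .keyword_else },
    .{ "while", .keyword_while },
});

// Anything that is not a keyword is treated as an identifier.
fn classify(word: []const u8) Kind {
    return keywords.get(word) orelse .identifier;
}

pub fn main() void {
    std.debug.print("{s} {s}\n", .{
        @tagName(classify("while")),
        @tagName(classify("foo")),
    });
}
```

`get` returns an optional, so `orelse` is a natural way to handle the "not a keyword" case.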


Oh, very cool, I somehow missed that! Thanks!


The zig standard library also has a ComptimeStringMap type for this use case which is used by the self hosted tokenizer for example.

https://github.com/ziglang/zig/blob/master/lib/std/comptime_...


Beat you by 3 full minutes :P


I'd be curious to hear your thoughts on Zig so far. I have a lot of respect for your design taste based on mithril.js, particularly when it comes to tradeoffs between functionality and simplicity.


I'll get the bads out of the way first: there are areas where the language isn't quite there yet (e.g. the error thing I mentioned). I also ran into an issue where you can't do async recursive trampolines yet (think implementing client-side http redirect handling in terms of a recursive call).

The io_mode global switch plus colorblind async combo is something I'm a bit wary of since it's a fairly novel/unproven approach and there are meaningful distinctions between the modes (e.g. whether naked recursion is a compile error).

Another big thing (for me, as a web person) is the lack of https in stdlib, which means realistically that you'd have to set up cross compilation of C source for curl or bearssl or whatever. There's a project called iguana-tls making strides in this area though.

With all that said, there's a pretty high ratio of "huh, that's a cool approach" moments. There are neat data structures that take advantage of comptime elegantly. There's a thing called copy elision to avoid returning pointers. The noreturn expression type lets you write expressive control flows such as early returning an optional none (called null in zig lingo) from the middle of a switch expression. Catch and its cousin orelse feel like golang error tuples done right. Treeshaking granularity is insanely good ("Methods" are treeshaken by default; so are enums' string representations, etc). The lead dev has a strong YAGNI philosophy wrt language features, which is something I really value.

Overall there are a lot of decisions that gel with my preferences for what an ideal language should do (as well as what it should avoid).
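To make the noreturn, catch and orelse points a bit more concrete, here is a rough sketch. The function names (`hexValue`, `hexValueOrZero`, `parsePort`) are invented for the example; the only stdlib call is `std.fmt.parseInt`.

```zig
const std = @import("std");

// `return null` has type noreturn, which coerces to any type,
// so it can sit in a switch prong that otherwise yields a u8.
fn hexValue(c: u8) ?u8 {
    const offset: u8 = switch (c) {
        '0'...'9' => '0',
        'a'...'f' => 'a' - 10,
        else => return null, // bail out of the whole function
    };
    return c - offset;
}

// `orelse` unwraps an optional, supplying a default for null.
fn hexValueOrZero(c: u8) u8 {
    return hexValue(c) orelse 0;
}

// `catch` does the same for an error union.
fn parsePort(text: []const u8, fallback: u16) u16 {
    return std.fmt.parseInt(u16, text, 10) catch fallback;
}

pub fn main() void {
    std.debug.print("{d} {d} {d}\n", .{
        hexValueOrZero('f'),
        hexValueOrZero('z'),
        parsePort("8080", 80),
    });
}
```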


> Another big thing (for me, as a web person) is the lack of https in stdlib, which means realistically that you'd have to set up cross compilation of C source for curl or bearssl or whatever.

But that should be very easy in zig, as zig can compile c?

https://andrewkelley.me/post/zig-cc-powerful-drop-in-replace...


Linking to the system curl is indeed very easy, just pass `--library curl`, but cross-compiling means you can't just do that (since e.g. a windows dll is not going to work on macos). Instead you need either source code or a `.o` file.

Compiling C source is "easy" in the sense that the compiler can do it without huffing and puffing, but it comes with a bit of yak shaving (namely, setting up build.zig or fiddling w/ the respective CLI flags, and obviously you also need the actual source code files to be in the right place, etc.). It also means that maybe the memory allocation scheme will not be quite what you want (e.g. you can't pass your arena allocator to curl_easy_cleanup).


Awesome, thanks!


Super interesting, thanks for sharing! I'd be curious to learn more about how you work around the lack of extra info in error types in practice. Are you just returning e.g. a struct with additional info?


Yes, e.g. `const Token = struct { kind: Kind, otherStuff: ... }`, where Kind is an enum where one of the values is Kind.error (in real code you'd have to spell it `.@"error"` or pick another name, since `error` is a reserved word). Then since switch is exhaustive, you can just pattern match on kind as you iterate over tokens to handle the error case at whatever syntactic context is most appropriate.

The nice side-effect about this approach is that rather than following the standard error flow of bailing early and unwinding stack, you can keep parsing and collecting errors as you go.
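Roughly, the idea looks like the sketch below. This is a toy, not my actual parser: the grammar (digits and `+`), the `start`/`end` offsets and all the names are invented for illustration, and the error kind is called `invalid` to sidestep the reserved word.

```zig
const std = @import("std");

const Kind = enum { number, plus, invalid, eof };

const Token = struct {
    kind: Kind,
    // Byte offsets into the source; enough to derive line/column later.
    start: usize,
    end: usize,
};

const Tokenizer = struct {
    source: []const u8,
    pos: usize = 0,

    fn next(self: *Tokenizer) Token {
        // Skip spaces between tokens.
        while (self.pos < self.source.len and self.source[self.pos] == ' ') self.pos += 1;
        const start = self.pos;
        if (start >= self.source.len) return .{ .kind = .eof, .start = start, .end = start };

        const c = self.source[start];
        self.pos += 1;
        if (c == '+') return .{ .kind = .plus, .start = start, .end = self.pos };
        if (std.ascii.isDigit(c)) {
            while (self.pos < self.source.len and std.ascii.isDigit(self.source[self.pos])) self.pos += 1;
            return .{ .kind = .number, .start = start, .end = self.pos };
        }
        // Unknown byte: emit an error token instead of returning a zig error.
        return .{ .kind = .invalid, .start = start, .end = self.pos };
    }
};

// Keeps scanning after a bad token and reports how many were seen.
fn countInvalid(source: []const u8) usize {
    var tokenizer = Tokenizer{ .source = source };
    var errors: usize = 0;
    while (true) {
        const token = tokenizer.next();
        switch (token.kind) {
            .eof => return errors,
            .invalid => errors += 1,
            .number, .plus => {},
        }
    }
}

pub fn main() void {
    std.debug.print("{d} invalid tokens\n", .{countInvalid("1 + ? 2 $")});
}
```

Because the switch has no `else` prong, adding a new Kind later forces every consumer to decide how to handle it.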


There are some features exclusive to errors though (errdefer, stack traces, implicit error unions). Did you find yourself missing any of these by doing it this way? I'm partially asking because I was just making this decision the other day, and I went with errors for now.


For parsing specifically, I haven't felt the need for them (errdefer is not really relevant since I don't typically need to clean up resources half-way through parsing, and likewise, zig stack traces aren't necessarily as useful to an end user as contextual parsing metadata).

I do use errors for other stuff, and I wish I could, for example, attach actionable error messages to errors, to be dispatched to whatever logging mechanism is set up. Bubbling up an error from stdlib and printing it from main makes for a poor end-user experience, and pattern matching an entire application's worth of an error union in order to map an error to a descriptive message is not as ideal as writing the messages where they come from, as you would w/ e.g. golang's `errors.New(message)`.
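To illustrate what I mean by the pattern matching approach, here is a small sketch. The `LoadError` set, the `describe` function, the messages and the `--config` flag are all hypothetical; the point is only that the message ends up far away from where the error was produced.

```zig
const std = @import("std");

// A hypothetical error set for loading a config file.
const LoadError = error{ FileNotFound, AccessDenied, OutOfMemory };

// The mapping from error to message has to live in one central place,
// and it grows with every error the application can produce.
fn describe(err: LoadError) []const u8 {
    return switch (err) {
        error.FileNotFound => "config file not found; pass --config <path>",
        error.AccessDenied => "config file is not readable; check its permissions",
        error.OutOfMemory => "ran out of memory while loading the config",
    };
}

pub fn main() void {
    std.debug.print("{s}\n", .{describe(error.FileNotFound)});
}
```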



