One thing people don't often bring up is how great Nim is for writing DSLs.
DSLs really matter in the hardware and digital design space - a space that
seems to be completely devoid of innovation when it comes to tooling.
A couple languages that try to bring RTL into the 21st century
are nMigen (Python) and Chisel (Scala).
I'm currently writing an RTL DSL in Nim.
Nim's macro system allows you to do really cool things like instantiate
named RTL wires and create custom operators like `:=` for Signal assignment.
Really excited about tier 1 BigNum support coming soon, which will make
it easier to simulate arbitrary-width integer signal arithmetic.
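To give a flavor, here's a minimal sketch of the idea (a toy `Signal` type and `wire` macro, not the real library):

import std/macros

type
  Signal = ref object
    name: string
    width: int
    value: BiggestInt

proc `:=`(dst, src: Signal) =
  # custom operator: RTL-style signal assignment
  dst.value = src.value

macro wire(ident: untyped, width: untyped): untyped =
  # instantiate a Signal whose runtime name matches the Nim identifier
  let nameLit = newLit(ident.strVal)
  result = quote do:
    let `ident` = Signal(name: `nameLit`, width: `width`)

wire(a, 8)
wire(b, 8)
b.value = 42
a := b
echo a.name, " = ", a.value  # prints: a = 42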
This sounds cool; I would be very interested if you had any public code. It’s _very_ early days but I’m working on a similar project in Swift: https://github.com/circt/Edith
It's great to see reference to hardware design in this thread.
Not hardware, but I have been using Nim for the DPI-C interface with SystemVerilog for quite some time. A few weeks back, I started tinkering with Nim macros and started working on a project to make the VPI interface more approachable.
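The DPI-C side is pleasantly small. A minimal sketch (toy proc, built with --app:lib):

# Nim side: exported with the C ABI so SystemVerilog can import it over DPI-C
proc addOne(x: cint): cint {.exportc, dynlib.} =
  x + 1

# SystemVerilog side would then declare (assumed):
#   import "DPI-C" function int addOne(int x);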
This language is kind of a hidden gem that not many people are aware of in the HDL/HVL space.
Please consider creating a write up about this. As someone that had very little knowledge about the hardware design space I'd love to learn more about it and see how Nim solves some problems there :)
Do you have anything public? I’ve been following LLHD for a while; I like the idea of a shared middle end. LLHD’s front end is Verilog/VHDL which are … not great.
Are you translating, or building your own sim runtime?
Simulator is written internally in Nim. There isn't really a good FOSS IR for RTL sims on the market currently. I absolutely refuse to use MLIR or anything based on LLVM for that matter. LLVM takes up to 40 minutes to rebuild on my Apple silicon. This is completely unacceptable in terms of rapid iteration.
LLHD is greenfield code, so it’s not based on LLVM. I’m familiar with the authors both academically (15 years ago), and on an engineering level (now). I think it is definitely worth reviewing what they’re bringing to the table.
Anyways, I think the RTL market is ripe for disruption. There was a lot of great language work done by, for instance, Ronald Garcia, directly aimed at fixing pain points in HDLs, like parametrization.
Nim really has fantastic potential for RTL and other hardware-interaction languages. Best of luck making progress with the RTL! I’d be really interested if it could become something for programming the CPLDs or FPGAs included in some microcontrollers now.
Yep, one of the main things I did with nim for the month or 2 I was really into it was write a duktape wrapper[0] (in retrospect, I should have made the name a pun related to wrapping with tape...). It was pretty interesting given the stack-based nature of almost every duktape operation.
I started using Nim recently and have been very impressed. It feels like a better Rust than Rust, and a better Go than Go. Why doesn’t Nim get more attention?
I think one reason is that there's no big corporation behind it; for Rust it was Mozilla (and now the members of the Rust Foundation), for Go it is Google.
But I totally agree, Nim is an awesome language and definitely deserves more attention!
"I think one reason is that there's no big corporation behind it"
I agree. The popularity of some programming languages is undoubtedly buoyed by corporate sponsorship or the association with a company. This is not a bad thing, but it means other languages struggle to generate as much interest.
Also, without a generous benefactor, open source languages have to scrape funding together piecemeal from different sources. For example, both Rust and Go have had (or still have) dedicated staff writing documentation for the language. This is a luxury that other languages cannot fund or afford.
I actually posted an Ask HN question on this very topic recently: "Can new programming languages attract developers without funding?"
Python did have a couple of smallish killer applications like Zope, being used as a scripting language in 3D applications, and gaining adoption among Perl users after the Perl 6 never-ending story.
Additionally, Guido has always worked at well-known companies, and Microsoft and IBM have toyed with Python in their products as well.
I think that's a possible path for Nim to develop: win over a trickle of Python devs who've learned the problems arising once you hit a certain performance ceiling / scale. This is being accentuated by hardware & cloud developments - startup time & compute efficiency becoming increasingly important (containerization/serverless driving one, I/O performance outgrowing CPU perf. driving the other). If those trends continue, Python will increasingly be ill-fitted. And I'm saying this as a Python dev.
Except that Guido went to Microsoft to work on their new JIT project, while Instagram is also pursuing their own, and there is Cython, which follows the same compile-to-native-code workflow (C-based backend) as Nim.
The biggest problem with Python's slowness isn't that solutions don't exist; rather, the community at large tends not to embrace them the way other ecosystems do.
My experience with optimizing Python code after ~10y is that past a certain point it's just not worth it, and you'd be better off rewriting in a more performant runtime once you hit that. That includes experiments I did with PyPy, Cython and Numba for various projects.
It may be better than Go but I don't see how it is comparable to Rust. Rust's biggest selling point is compile-time-checked memory and thread safety without garbage collection. Nim instead uses a garbage collector; or a reference counter similar to Swift's (but deterministic), which can leak memory on reference cycles; or manual memory management.
I really don't understand what your requirements are. Again, Karax is a well-known and better alternative to all of these (i.e. loops and other control flow work as expected).
Understandable; I don't think I explained it all that well.
If I look at the Karax example, it seems more like a way of building the HTML in Nim, but I want to write HTML. If you look at Jinja2, then it's HTML, but with a bit of embedded syntax to do loops, conditionals and string replacement, plus a few handy functions.
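E.g., roughly what the Karax style looks like (HTML as Nim code; assumes the karax package):

include karax/prelude

proc render(items: seq[kstring]): VNode =
  buildHtml(tdiv):
    ul:
      for item in items:
        li: text item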
The built in templates in Go is a bit simplistic, but it's enough for my needs. So something like it would be nice.
As you pointed out, Karax has type safety, which my suggestion doesn't have. It's also not required, because HTML/text templates don't need type safety; it's all just text.
Basically I want nwt, with loops and control statements implemented.
How is the compiler, speed-wise? One thing I really like about Go and really don't like about Rust is their relative compiler speed. I think it makes a big difference for the language, and Nim has a chance to get this right as they didn't make the mistake of using LLVM and its bad compiler UX.
This is really cool. I really want a language that has GC, but lets me be explicit about memory usage as well / leverage move semantics to optimize. I use Rust, and I suspect I'll always have it as a driver, but I wouldn't mind another higher level language to take the place of Python, which I use for prototyping.
Yes garbage-collected, with a caveat. From the website:
> Nim's memory management is deterministic and customizable with destructors and move semantics, inspired by C++ and Rust. It is well-suited for embedded, hard-realtime systems.
The reference counter with "destructors and move semantics..." is not presently the default, though it will be in the future. The OP mentions that.
The current default GC involves a somewhat unusual memory model that can make certain kinds of multi-threaded programming difficult: each thread has its own heap and independent GC; memory subject to GC cannot be read/written across threads and locks can't be used to remedy the situation; if you force the compiler to accept code that violates the "no shared" rule, the ref count of the GC on one thread or another will eventually get corrupted and the program will crash with SIGSEGV.
However, by allocating on and copying to/from the shared heap (which is never subject to garbage collection) you can do message passing across threads.
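For example, a minimal sketch with the built-in channels (compile with --threads:on); the message is deep-copied through the shared heap rather than shared:

var chan: Channel[string]

proc worker() {.thread.} =
  chan.send("hello from worker")

chan.open()
var t: Thread[void]
createThread(t, worker)
echo chan.recv()  # receives a deep copy of the sent string
joinThread(t)
chan.close()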
Note I’m using Nim with the new GC (both ARC & ORC) on dual-core ESP32s.
Locks do work with ARC, though some care needs to be taken. Both the Nim and FreeRTOS multi-threaded queues work well with the correct move/sink annotations.
The design of ARC allows you to move entire memory graphs between separate heaps. It works fine, but what’s not working is the compile time verification that a given memory graph is only owned/aliased by the top pointer. So there’s a lot of work to do in making it “safer and easy” to use.
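A tiny sketch of the kind of annotation involved (toy Msg type):

type Msg = object
  payload: seq[byte]

proc push(q: var seq[Msg], m: sink Msg) =
  q.add m            # `m` is moved in: no copy, no refcount traffic

var queue: seq[Msg]
var m = Msg(payload: @[1'u8, 2, 3])
queue.push(move m)   # explicit move; `m` must not be used afterwards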
There’s a few of us working in this area, well rather Nim on embedded microcontrollers. :-)
It works fantastically, though most of us aren’t traditional hard real-time folks. Nevertheless Nim’s new ARC is excellent for this area. The overhead of the GC is basically just an increment/decrement on an integer (no atomics). That means it’s fast and predictable for modern MCUs and plays well with C code. If you want a tight interrupt function you can stick an object in a global variable and know it won’t incur GC overhead.
It’s relatively easy to reason about where and when you want memory to be freed (actually easier than Rust). Also, allocator pools are easy to utilize.
On a side note, as the Nim compiler improves its move and sink analysis, the runtime overhead of ARC approaches that of Rust’s compile-time memory management, especially compared with Rust code that uses lots of Rc wrappers.
P.S. I’m planning to write a Nim wrapper for Zephyr RTOS when I get time later in the summer, and improve the existing FreeRTOS support too.
First, I like that you said "multicores" and not "multithreads". Default-sharing of all memory is overrated. Sometimes it is useful/necessary, but people reach for threads too readily, IMO.
For multiprocessing (like Python's module of that name), you can roll your own little system in probably 100 lines of code or use something like this [1] with an example program [2]. For me, that toy program runs 1.5-2x faster than ripgrep on the same (admittedly limited) problem on Linux. { I suspect this perf diff is due to mmap IO being faster than syscall IO due to SIMD register use as discussed here [3], but this deserves deeper investigation than I have time for right now. If my hunch is right, that may constitute further argumentative support for not leaping to threads even if the programming language "makes them 'easy'", though. }
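If you don't want a framework at all, even std/osproc gets you surprisingly far. A minimal sketch (toy command line, not grl itself):

import std/[osproc, os, sequtils]

let files = toSeq(walkFiles("*.txt"))
let cmds = files.mapIt("grep -c needle " & quoteShell(it))
# one OS process per core, no shared memory
discard execProcesses(cmds, n = countProcessors(),
                      options = {poStdErrToStdOut, poParentStreams})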
As for threads/parallelism with shared memory in Nim... honestly, there is probably too much to recap. Weave [4] would be a good place to start reading, though, or searching the Nim Forum.
Is it possible to share the commands/inputs you used to run a benchmark with your code against ripgrep? (Including commands to compile your Nim program.)
This has some Zsh-isms but should get you going, assuming it is even reproducible across machines.
$ cd cligen-github-root/examples
# default system nim.cfg
$ nim c -d:danger --gc:arc -d:useMalloc --passC:-flto --passL:-flto grl
# best to fetch this while you can as they get updated regularly
# oh yeah, just unpacked; not configured/anything.
$ cd /dev/shm/linux-5.12.5
# Zsh-ism to eliminate file tree traversal variation
$ fs=(**.[ch])
# On an Intel i7-6700K
$ repeat 5 utime ~1/grl -sburntsushi $fs
0.143836198 0.23 0.21 305.9%
0.137177744 0.22 0.22 320.8%
0.136714417 0.22 0.22 321.8%
0.149867097 0.23 0.22 300.3%
0.139042282 0.22 0.22 316.5%
# make sure there is no ripgreprc file.
$ repeat 5 utime rg burntsushi $fs
0.204383740 0.33 0.26 288.7%
0.200495993 0.34 0.25 294.3%
0.205767854 0.37 0.22 286.7%
0.200485116 0.33 0.25 289.3%
0.201692769 0.34 0.24 287.6%
$ rg --version
ripgrep 12.1.1
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)
200.5/136.7 =~ 1.47x
OS for the above is Linux 5.12 running on bare metal. When I do a gcc PGO build I can squeeze about another 7 ms off that 136.7 time, but this is probably not very informative. Honestly, even 1.5x is kinda small, too.
Anyway, were I to study this in more detail, as it sounds like you may be wanting to do, my recommendation would be to just factor out the substring/regex search (rg's vs glibc memmem, etc.) and study a more "pure IO" case. E.g., just switch to memchr(non-existent char) or such, or a very fast hash (e.g. Daniel Lemire's "Faster 64-bit universal hashing using carry-less multiplications", Journal of Cryptographic Engineering), or maybe even a SIMD summation, to force all the IO but do other things SIMD-efficiently/at the highest GB/sec easily achievable. Then try that same simplified program with ripgrep's file size heuristics with threads vs. the grl N-kid processes + always mmap. Just fewer moving parts/better isolation of behaviors.
This also might lead to more general/re-applicable knowledge about the best way to do read-only multi-core file IO, not only tuning ripgrep or only Rust or only any PL. If it all holds up, that same "test setup" could be applied to differing OS contexts like OSX/Win/etc., and then maybe it might be worthwhile re-jiggering ripgrep IO. (It is not a foregone conclusion that the answer will be the same across OSes or CPUs, or even that my hunches/hypotheses are correct. I am trying to help here without making strong, general claims to be cross-examined upon, as is the Internet's way...)
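The Nim side of such a "pure IO" probe can be tiny; a sketch (assumes byte 0xFF never occurs in the data):

import std/memfiles

proc probe(path: string): int =
  # mmap the file and scan for an absent byte, forcing all the IO
  # with no real search work
  var mf = memfiles.open(path)
  defer: mf.close()
  let p = cast[ptr UncheckedArray[byte]](mf.mem)
  for i in 0 ..< mf.size:
    if p[i] == 0xFF: inc result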
I used only the .h files, since including the C files caused the argument list to be too long on my system.
My CPU is a bit dated, but it is an i7-6900K @ 3.2 GHz.
My Nim version:
$ nim --version
Nim Compiler Version 1.4.6 [Linux: amd64]
Compiled at 2021-04-26
Copyright (c) 2006-2020 by Andreas Rumpf
active boot switches: -d:release -d:nativeStackTrace
With those caveats, this is what I get on my system:
I think on the one hand, I'm confused as to why I can't reproduce your result. But the more interesting thing to me is why ripgrep is so slow when it uses memory maps, but your program is not.
The strace output for ripgrep also shows more syscalls than I would expect, so I'll be investigating that as well.
There are also lots of 'pselect6' syscalls in your program. Do you know what those are from?
Anyway, thanks for the interesting benchmark! Some interesting bits to investigate!
Sorry. In my kernel build script I patch _STK_LIM in include/uapi/linux/resource.h and MAX_ARG_PAGES in include/linux/binfmts.h to boost my command line lengths. There was a short time when Rob Pike had some patch that fixed this with a fun comment like "dynamic memory allocation is a done deal, guys". It could be that .h files are smaller than .c files (or just fewer/less total data) making it harder to reproduce. Another element to add to my suggested research program above besides eliminating string search algos would be to standardize file sizes to all 100 KiB or something. Generate them with random data or some such. Play around with file sizes in your experiments, etc. Besides that, your CPU has twice the cores as mine and we may have pretty different memory bandwidths as mine has very low latency and high BW RAM. Reproduction is actually not generally very easy which is why I said 1.5x was not so big. I often find it hard to reproduce ratios <2..3x, but it does look like a number of cases of the min of 5 `grl` beating `ripgrep` in your tests. Anyway, this is all really just the start of some project, not the end.
`grl` is really just a demo program for the 8-bit clean message passing variant in `procpool` (similar to Python's multiprocessing module). `procpool` uses parent-kid pipes for its communication. The pselect6's come from using select on those N pipes. And, yeah, a lot of selects are expected since the filename passing-answer receiving happens a lot with a lot of files. { Yes, yes...I know past 1024 fd's select will be a problem, but TSMC only just hit that 1nm mark. So, we probably still have a few years before I personally can access 1024 core-thread machines.. ;-) ;-) }
mmaps and threads were discussed already in [3], linked in my first post of this thread, with a brief summary. To recap briefly: since threads share all memory and since mmap alters those page tables, it is plausible the kernel just locks the whole process out of simplicity, blocking execution of other threads. It is also possible some OSes can use devious tricks/semantics to avoid that locking, but I am unaware of an exhaustive survey or proof of impossibility. Because in `procpool`/`grl` the kids, which are what do the per-file mmaps, are their own processes, there is no suspension of the other kids when they mmap. So they can realize the faster IO (from SIMD register use, done by Linux for context switch optimization or "performance in the large/system-wide"). This is (perhaps) why your last test has better performance without any parallelism. I had meant to test this the last time we discussed mmap & threads, but didn't at the time. `grl` is just a very preliminary test along those lines.
This is all at least partly theoretical. Someone ought to study it and write a nice blog article about it. I do not have that kind of time right now (or a blog). Or, for all I know, nice academic paper(s) already study this somewhere. I have not looked. Sadly, as old as this thread is, the general HN pool will probably not see any of this to crowd source such wisdom with direct pointers for you. I almost missed your question in my old threads checking, and I have kind of used up my time budget for this right now. You can email if you want. I get better notifies that way.
Oh, I think I missed the fact that grl is spawning processes and not threads. That's very interesting.
With respect to extra syscalls, ripgrep is actually doing a lot more stat'ing than you might expect because it is "recursive" by default. So for each file it gets, it has to at least check if it's a directory. And there's some symlink behavior to handle. I think I can get rid of one of them per file given, and others are harder because of abstraction boundaries. Sigh. Anyway, that could be impacting an I/O-only comparison here as well.
It would be interesting to see a compilation of all of the times people asked that question. Like literally every time anything about Nim is posted, someone asks that.
Including inheritance in nim was a design mistake and is enough to make me pass on it. Go and Rust were correct in avoiding inheritance in the language.
I think you should include “thoughts” (as it says in the forum post) in the title or in some way disclose that this is just a design/planning/brainstorming session for Nim 2.0 and not actually a major new release.
The restructuring of the system namespaces comes as a very welcome addition. --threads:on by default is awesome too!
I love this language so much, I am so happy to see this much momentum coming from Araq and the gang. They are very kind people that help you in the Discord server.
For all the help I've been given for free time and time again by these guys, I'm working on a series of exercises to hopefully minimize the onboarding time for people looking to explore Nim. https://github.com/sergiotapia/nimlings
Does Nim have comprehensions? Even if you're striving for equivalent signatures the inner body of the function can be expressed as a comprehension in Python:
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

def odd_numbers(a: Iterable[T]) -> Iterator[T]:
    yield from (n for n in a if n % 2 == 1)
It's not a list comprehension, it's a generator comprehension, which is lazily computed. However, it's still an unnecessary layer of iteration when you could just return the comprehension instead.
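For what it's worth, Nim's lazy analogue is an inline iterator, and std/sugar's collect gives an eager comprehension-like syntax; a quick sketch:

import std/sugar

iterator oddNumbers[T](a: openArray[T]): T =
  for n in a:
    if n mod 2 == 1:
      yield n

let odds = collect(newSeq):   # eager, comprehension-style
  for n in [1, 2, 3, 4, 5]:
    if n mod 2 == 1:
      n

for n in oddNumbers([1, 2, 3, 4, 5]):
  echo n  # 1, 3, 5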
I can't get used to the fact that for builtin types to grow methods, you need to import the corresponding modules everywhere.
You import module X, which gets you objects as defined in module Y. To use methods defined for that type, you need to import Y yourself. Python, of course, by virtue of binding methods to objects, doesn't need this.
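A sketch of the pattern with toy modules:

# shapes.nim (module Y): defines the type and its procs
type Circle* = object
  r*: float
proc area*(c: Circle): float = 3.14159 * c.r * c.r

# factory.nim (module X): hands out shapes.Circle values
import shapes
proc unitCircle*(): Circle = Circle(r: 1.0)

# main.nim
import factory   # gives us unitCircle, which returns a Y-defined object
import shapes    # still needed: without it, `c.area` below won't compile
let c = unitCircle()
echo c.area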
There are some more namespacing quirks that may require one to give up on that sweet syntactic sugar to disambiguate things (two unrelated modules defining methods on a single type, with the same signature but different behaviors, and you need both modules imported for some reason?), but this is, again, something one would have to get used to. It's not a different world like Rust or Prolog.
It's not a complete showstopper, just something I keep bumping my head into now and then.
I love python as well, and I literally just started exploring Nim yesterday, so this is very timely.
The one thing that stuck out to me like a sore thumb is the camelCase convention. I know it's a minor thing (and a personal preference), but I'm a bit biased against languages that use camelCase (Java, JavaScript, etc). There's something uneasy about it IMHO.
my_foo, myfoo, and myFoo are interchangeable. MyFoo is different because the first letter is capitalized.
If it's not clear, what I mean is that if you've imported a module that exports myFoo, you can refer to it in your code as my_foo, if you prefer.
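Concretely:

proc myFoo(x: int): int = x + 1
echo my_foo(41)  # same identifier: underscores and non-leading case are ignored
echo myfoo(41)   # also fine; MyFoo would be a different identifier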
Some people love this aspect of Nim, others loathe it. I don't have strong feelings about it, though one has to be mindful of it when searching through a codebase with e.g. the_silver_searcher.
Good to know. Although I'd argue that this is something that I'd count _against_ the language. I'm also biased against languages that allow you to do the same thing in multiple ways (hint: Scala). Code bases tend to vary according to the team's own conventions, so you have to keep adapting to whatever code base you happen to be reading.
Your bias is your own problem. There's clearly a solution that allows you to use snake_case and those that potentially import your library to seamlessly use all exports as camelCase, yet you're still bothered by that. It's nonsensical, Nim literally came up with the absolute most elegant solution to this.
That's unfounded; please try it at least once in the playground (play.nim-lang.org). To me it's very cool that wrapped C libs with identifiers like ATOMIC_RELEASE can be written as AtomicRelease without any trouble.
I get what you're getting at with the multiple ways to do stuff, but in Nim it actually makes things more consistent (imo).
If my codebase enforces camelCase and a library uses snake_case, then I can still use that library using camelCase, and Nim has nimgrep to make this easier all the way.
Me too. I do a lot of grepping on codebases and this seems like it could bite me at some point. But it is a very interesting approach to solving the naming convention problem.
recursion vs. iteration, inheritance vs. composition, interfaces vs. abstract classes, etc. are design choices that depend on the problem being solved. Language syntax is different; it's an opinion of the language designer which gives the language its distinctive style, and IMO shouldn't have much leeway in expressing the same thing in multiple ways. It's what gives the language its identity.
For example, here's how you can iterate over a list in Scala to print each element:
> recursion vs. iteration, inheritance vs. composition, interfaces vs. abstract classes, etc. are design choices that depend on the problem being solved.
Recursion and iteration are interchangeable (in fact, either can be implemented as syntax sugar over the other); while some people find one or the other more natural for a particular problem, there is considerable disagreement among people over which is more natural for which problems. In practice people choose between them based on personal preference and what the language they are using favors (e.g., you probably [0] don't use recursion unless it is of fairly tightly bounded depth in a language like Python that doesn't optimize tail calls).
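E.g., the same function both ways (sketch in Nim):

proc sumRec(n: int): int =
  # recursive form
  if n == 0: 0 else: n + sumRec(n - 1)

proc sumIter(n: int): int =
  # iterative form; no stack growth
  for i in 1 .. n:
    result += i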
> Language syntax is different; it's an opinion of the language designer which gives the language its distinctive style, and IMO shouldn't have much leeway in expressing the same thing in multiple ways.
As noted, recursion vs. iteration is exactly a syntax-preference decision, and one on which some languages are highly opinionated (Python) forming an important part of their distinctive style, while others are not.
[0] though you could use a tail call optimization decorator
One thing that’s always bugged me about snake case is that people are so averse to the additional length of identifiers that they just omit the underscores. So you end up with methods like array.tolist() and itertools.zip_longest(*lists, fillvalue=None) — that last one isn’t even internally consistent. In an ideal world I would find snake case more readable, but people actually adhere to camel case, which makes it more readable than snake case overall.
Even though it's not according to the style guide, you can just use snake_case whenever and wherever you want; consumers of your libraries, or the other way around, will never notice!
You can literally do whatever you want with any library you use in your own codebase and have a consistent convention in your own code regardless of what your dependencies' authors preferences are. In other words, if the library authors used camelCase but you want to use snake_case in your code, you just do it.
It's very strange at first, and I had the same reaction as you, but it's actually quite nice. (I say that as somebody who strongly dislikes snake_case :) )
You mean the language is case insensitive and drops _'s or something? Doesn't that break searchability with command line tools (or make for much more complicated regexes)?
Or do you just mean use a mixture just like you could in most other languages?
Same here, at first making camelCase and snake_case interchangeable seemed like a terrible idea. But in practice I can read and grok libraries that use either and in your code call it using your preference. It just kinda works.
If you want a statically typed language with Python-alike syntax but Nim is too "noisy" for you, then Lobster (http://strlen.com/lobster/) is likely spot-on.
Do you have examples of what you mean by unnecessary noise?
For me it's the JSON operators. `@[]` for sequences could also be a thing, but it's really minor (for me).
Completely fair, syntax is something that resonates differently from person to person. I love the idea of Rust, its speed, capabilities, its politics and its independence, but I cannot stand its syntax.
That’s fine, I can use other languages and it doesn’t take anything away from Rust or the people who love the syntax.
I started using it for parsing heavy files, just for the speed. Slowly I replaced Python with Nim for almost everything, even when it doesn't make much sense (e.g. for controlling Selenium), and only use Python when I need graphics (Matplotlib) or some other libraries that I cannot replace.
IMO the main case is when you need C speed but you don't want to code in C.
Nim as a replacement for Python makes sense to me. We often hear that many Python developers went on to choose Go when needing more performance, but Nim feels like an easier move.
Nim is not only faster than Go but has a very similar syntax to Python. It makes more sense for Python developers who need more speed to use Nim instead of Go.
I didn’t measure, but I also don’t have any experience saying otherwise. Both are great languages and depending on the problem I’d be happy to use either.
When you want the performance of a low level compiled language with the ease of use and powerful constructs of a high level language - and don't want all the bondage and discipline of rust.
The biggest thing that Nim needs isn't a language change, it is to make it easy (or more expected) to put .dlls in Nimble packages. Wheels in Python are one of the major reasons I'm using it. Who needs to figure out how to compile Qt? I can just pip install PySide6, so why can't I do that with Nim? Yes, it's a lot more work vetting licenses and testing/patching things to make them work on various platforms (about the same amount of work as a Linux distro), but that's the best road I can see for adoption.
> The bundler takes specific commits of Nimble packages and makes it part of the official Nim distribution (that is, the zips and tarballs you can download from our website). Documentation of these modules is included, there is CI integration and to bring a new commit into the distribution, the changes need to have been reviewed, exactly like stdlib PRs are handled. I hope this will keep the benefits of today's monorepo but with more flexibility -- we can then more easily place modules
I wish more languages would take this approach. Modules that are officially endorsed and shipped with the compiler, but versioned rather than perma-stable like the standard library.
It would work fantastically for lots of Rust's de facto standard crates.