
I thought one of the major points of pride for Rust was eschewing garbage collectors. What am I not understanding?


I think it would be more accurate to say that Rust doesn't want to need a garbage collector. Many more recent languages have chosen a GC as the default from day one. (And others haven't, but it remains a popular choice.)

It has always been an option if you want it, though.

The nice thing about it in Rust, too, is that you get the benefit of the garbage collector only and exactly where you want it, rather than having it imposed on the entire language.


Rust doesn't want to depend on a garbage collector. I'm not aware of anyone being against garbage collection on an opt-in basis. That said, the article presents a GC type specifically designed for the author's Scheme variant (implemented in Rust).


When smart pointers first came to C++, Herb Sutter gave a talk in which he claimed that roughly 80% of the time you should use unique_ptr, 15% shared_ptr, and for the remaining 5% you need a mark-and-sweep garbage collector (numbers from memory, as I don't want to rewatch several hours of CppCon videos to find the real claim). I'm not a Rust guy (I've only done hello world), but my understanding is that it only gives you unique pointers. I've done enough C++ to say that over the past decade unique_ptr has always been enough for everything, though sometimes it took some thought to make it work in weird situations where shared_ptr would have just worked.

There are known real-world data structures, occasionally useful (though I've never encountered a need for them), where circular references force you to use a mark-and-sweep garbage collector. If you happen to need such a thing in your Rust code, then this will be useful.


Rust does give you reference-counted shared pointers, both within a single thread (Rc<>) and across multiple threads (Arc<>). But reference counting does not properly clean up objects in the presence of reference cycles: that's when a proper tracing GC may be required.
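
For illustration (a minimal sketch of my own, not from the article): two nodes that point at each other through strong Rc references would never be freed, while downgrading the back-reference to Weak breaks the cycle.

  use std::cell::RefCell;
  use std::rc::{Rc, Weak};

  struct Node {
      // A Weak back-reference breaks the cycle; a strong Rc here would keep
      // parent and child alive forever (a silent leak, not a crash).
      parent: RefCell<Weak<Node>>,
      children: RefCell<Vec<Rc<Node>>>,
  }

  fn main() {
      let parent = Rc::new(Node {
          parent: RefCell::new(Weak::new()),
          children: RefCell::new(Vec::new()),
      });
      let child = Rc::new(Node {
          parent: RefCell::new(Rc::downgrade(&parent)),
          children: RefCell::new(Vec::new()),
      });
      parent.children.borrow_mut().push(Rc::clone(&child));

      // The child only holds a Weak, so the strong count on `parent` stays
      // at 1 and both nodes are dropped normally at the end of scope.
      assert_eq!(Rc::strong_count(&parent), 1);
      assert_eq!(Rc::strong_count(&child), 2);
  }

A tracing GC is for the cases where you genuinely cannot decide up front which direction should be weak.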


To complete your post and the parent's: most Rust applications don't need a GC. Lifetimes, Box, and Rc or Arc will generally cover all your needs.

For a few applications, though (e.g. a JavaScript VM, or something that deals with complex graph structures with unpredictable lifetimes), a GC is a lifesaver. Even in these applications, most of the data structures won't need GC pointers. Rust has a few GCs at hand, and lets you opt-in to GC specific pointers. In theory, this means that the GC can be extremely fast, because the graphs it needs to walk will be much smaller than in languages with a general-purpose GC. The cost, of course, is that you, as a developer, need to pay attention to what needs a GC and what doesn't.


> The cost, of course, is that you, as a developer, need to pay attention to what needs a GC and what doesn't.

But even there, lifetimes make your life easier. If lifetimes work out on their own, you know the object will be dropped at some point (at the end of the lifetime). Only when you opt out of unique_ptr-like behavior do you have to consider whether Rc/Arc will work (90% of the cases), will work with some special care using weak references, or whether you will need a GC.


Tracing GCs have much better throughput than reference-counting GCs (yes, both are GC algorithms), especially compared with Arc<>, which can get quite expensive when many objects use it, due to the synchronization overhead. (Note: I'm not saying that Rust would be slow or anything, as you can go precisely as low-level as you wish and circumvent the price of any form of GC. But if you have a lot of Arc objects around, e.g. because you use them to represent things the user can also manipulate and which thus have completely opaque lifetimes, the overhead can become non-negligible.)

Tracing GCs can do most of their work completely independently of the user code, and in parallel. Also, only live objects are touched, which can also be beneficial for certain kinds of applications.


> Note: I'm not saying that Rust would be slow or anything, as you can precisely go as low-level as you wish and circumvent the price of any form of GC.

Rust can also circumvent some of the overhead of Arc<>, since it can update the reference count only when a true "owner" is added or removed, as opposed to on every reference to the object. But yes, you're right that relying on automated reference counting as the default memory management strategy involves some overhead, and tracing GC may be preferable to that. This is a big issue with Swift.
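
A minimal sketch of that idea (my own example, not from the thread): clone the Arc only when handing out a new owner, and pass plain borrows everywhere else, so the atomic counter is touched far less often.

  use std::sync::Arc;
  use std::thread;

  // Borrowing touches no reference count at all; it's just a pointer.
  fn read_only(data: &[u64]) -> u64 {
      data.iter().sum()
  }

  // Handing out a new owner is the only place the atomic counter is bumped.
  fn spawn_owner(data: Arc<Vec<u64>>) -> thread::JoinHandle<u64> {
      thread::spawn(move || data.iter().sum())
  }

  fn main() {
      let data = Arc::new(vec![1, 2, 3, 4]);
      let local = read_only(&data); // deref-coerces to &[u64], no clone
      let handle = spawn_owner(Arc::clone(&data)); // one atomic increment
      assert_eq!(local, handle.join().unwrap());
  }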


> I'm not a rust guy (I've only done hello world), but my understanding is it only gives you unique pointers

Some very rough analogies:

  C++        : Rust
  unique_ptr : Box
  shared_ptr : Rc or Arc
  references : references


One key difference between Rust smart pointers and C++ ones is that Rust ones can’t be null. So C++ unique_ptr is more like an Option<Box> that automatically calls unwrap_unchecked every time you try to use it.
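
A minimal sketch of the difference (my own example): a Box<T> is always a valid allocation, so "maybe absent" has to be spelled out in the type and handled explicitly, whereas dereferencing a null unique_ptr in C++ is just UB.

  struct Tree {
      value: i32,
      left: Option<Box<Tree>>,  // roughly where C++ would reach for unique_ptr
      right: Option<Box<Tree>>,
  }

  fn sum(node: &Option<Box<Tree>>) -> i32 {
      match node {
          Some(t) => t.value + sum(&t.left) + sum(&t.right),
          None => 0, // the "null" case is a compiler-checked branch
      }
  }

  fn main() {
      let tree = Tree {
          value: 1,
          left: Some(Box::new(Tree { value: 2, left: None, right: None })),
          right: None,
      };
      assert_eq!(sum(&Some(Box::new(tree))), 3);
  }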


Fair enough.


A C++ reference is more like a Rust &UnsafeCell<T> in practice, and there are thorny issues around the possibility of null references in C++ (which are instant UB per the C++ standard but go undiagnosed by most compilers) that introduce additional questions about what the best mapping might be.


Microsoft also made a version of C++ that integrated with the .NET garbage collector. Just because it doesn't fit with how most people use the language doesn't mean someone won't find it useful. Not like this will ever become part of Rust's standard library anyway.


Why shouldn't a high-performance concurrent GC implementation enter the standard library? I think it might be quite useful if it did, for the rare use case where possibly-cyclical graph structures make sense and Arc<> is not quite enough.


This feels unlikely. Rust is trying very, very hard not to have a runtime [1], as opposed to Go (which ships the runtime in every executable) or Java, .NET, JS, Python... (which require the runtime to be installed separately). That's why you need tokio (or some other executor) to execute async code, for instance.

The benefit is that it makes Rust much easier to maintain and port to new architectures, and much more versatile (e.g. you can use Rust on true embedded, unlike most other modern languages), and presumably more future-proof. The drawback is that you, as a developer, need to pick a runtime.

[1] In truth, there's still a runtime, but it's really tiny.
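
To illustrate the point about executors (a minimal sketch, assuming the tokio crate with its macro and runtime features enabled): the async runtime is an ordinary library dependency, not part of the language.

  // Cargo.toml (assumed): tokio = { version = "1", features = ["full"] }
  use tokio::time::{sleep, Duration};

  #[tokio::main] // expands to code that builds a runtime and blocks on main()
  async fn main() {
      sleep(Duration::from_millis(10)).await;
      println!("async code ran on a library-provided executor");
  }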


I think it’s less about the runtime and more about nailing down and stabilizing the implementation details. GCs have a lot of subtle trade-offs, just like async runtimes (e.g. the decision to require Send + Sync versus single-threaded Futures), where there's no obvious sane default. There's too much variation in what the community needs from a GC, compared to something like a HashMap, where a general implementation works fine and the options to change the hasher or allocator cover >90% of use cases.


I think most async runtimes give you the option of launching a single-threaded executor when you explicitly request it.
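
For example (a sketch assuming tokio; other executors expose similar knobs), a single-threaded runtime is an explicit builder option, and the future passed to block_on doesn't even have to be Send:

  use tokio::runtime::Builder;

  fn main() {
      // Cargo.toml (assumed): tokio = { version = "1", features = ["full"] }
      let rt = Builder::new_current_thread()
          .enable_all()
          .build()
          .expect("failed to build runtime");

      rt.block_on(async {
          let local = std::rc::Rc::new(42); // a !Send value is fine here
          println!("value = {}", local);
      });
  }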

AIUI the most esoteric trade-offs wrt. GC involve how to have data representations that can be traced efficiently, and whether to allow e.g. moving data in memory as part of the collection phase. But I think you're somewhat overestimating the variation within the Rust-focused community: given what most Rust code looks like, people will mostly be interested in something straightforward yet efficient that can interoperate with existing Rust code with relative ease while addressing the common issue of cyclical data references.


> or Java, .Net, JS, Python... (which require the runtime to be installed separately).

Modern .NET has modes that can 'include' the specific runtime bits it was built against, as well as AOT compilation modes that don't involve a JIT and can be smaller (heck, with enough effort you can get Snake in about 8 kB [0]).

> e.g. you can use Rust on true embedded, by opposition to most other modern languages, and presumably more future-proof. The drawback is that you, as a developer, need to pick a runtime.

Also, technically possible in .NET, but it may require more finagling. There's been at least one POC where a C# app runs in EFI [1].

The drawback in .NET's case, versus picking a runtime, is that you might have to do the finagling/tweaking/shimming on your own, since there are few docs/templates.

[0] - https://github.com/MichalStrehovsky/SeeSharpSnake

[1] - https://github.com/MichalStrehovsky/zerosharp/tree/master/ef...


You make a good point, but what's discussed in the OP seems to be less complex than a full async runtime, e.g. it does not really depend on low-level OS features. So it seems to be the kind of thing that might become part of the standard library at some point, much like the Rust HashMap. You would still be able to avoid it altogether by just not using Gc<> objects (you "don't pay for what you don't use") so it wouldn't be a true runtime.


As already stated, one of Rust’s explicit goals is not to have a runtime of any kind. If your program happens to require a garbage collector, and you want to write it in Rust, then you can simply depend on a garbage collector provided by a crate. No need to have it built in when 90% of programs don’t need it. And since you can just include one when you do need it, there can easily be multiple options to choose from. As engineering is all about managing tradeoffs, you want to be able to choose the crate that best matches your particular needs. A single option buried in the standard library and forced upon you may not be the best choice.

That said, most programs should not be written in Rust. They should be written in a language that, like Rust, is memory safe, but it is best if they are written in a language which makes the developers more productive than Rust does. Languages with garbage collectors built in are at the top of that list. Common Lisp, Javascript, Python, Raku, etc; there are plenty of choices. These languages are ideal for exploration and productivity. Even Java or C#, if you can stomach it. Save Rust for later when you have to squeeze every drop of performance out of your program, you’ve already written it once so that you already know exactly what you need, and you can justify the expense of the rewrite using operational savings. Optimal performance will probably happen once you can drop the garbage collection entirely, and Rust truly shines when used that way.


> it is best if they are written in a language which makes the developers more productive than Rust does.

AIUI, recent assessments (including by Google) have found that Rust developers' productivity is competitive with GC languages such as Go.

There are also plenty of reasons to think that a program first written in Rust will have fewer defects in the long run than one written in the other languages you mentioned. Of course, alternative languages may still have a role to play for quick-and-dirty, purely exploratory coding, but that's about it.


True, but it is my understanding that in those cases they were reimplementing existing services, where the precise requirements are already known. Exploration isn’t for quick-and-dirty programs, but for finding the right solution in a wide space of possibilities where you don’t even know what all the requirements are yet.

For example, consider programming a game. If your only concern is putting triangles on the screen, then there are some well-known solutions that you can adopt that will work great and you can do that in any language you want. But gameplay mechanics are a different beast entirely; inventing (or discovering) something new and fun is exploratory. You’ll benefit most from a dynamic language where you can change your code rapidly, and features like strict typing and lifetimes are not so helpful. For this a language like Common Lisp, with garbage collection, a REPL, gradual typing, just-in-time compilation, and support for multiple programming paradigms is ideal. Or Javascript, or whatever.


> The benefit is that it makes Rust much easier to maintain and port to new architectures, and much more polyvalent, e.g. you can use Rust on true embedded

"supposedly" Rust's architecture support for all but the most popular archs is horrendous.


Well, to be frank, the unpopular archs aren't really well supported by other languages.

You may be thinking "What about C?" and the answer is that for those unpopular archs you are almost certainly looking at a hand patched version of GCC (3.2... probably) created by the vendor to support C.

You generally won't see good support for unpopular archs mainlined in compilers.


The structure of Rust makes it possible to port it to those architectures, and if Rust continues its present momentum, someday in the future you’ll be just as likely to see some random embedded system support Rust as it is to support C now. We just haven’t had the time for it to catch on enough to get there, but there’s nothing fundamental about the language to prevent it.


> "supposedly" Rust's architecture support for all but the most popular archs is horrendous.

Unpopular architectures are a bit of a pain. But what if you compiled from Rust to WASM and then transpiled the WASM object code to C? Is this a viable approach?


The Rust philosophy is to add as little as possible to the standard library, for many reasons, but particularly because otherwise you have to update your compiler to get library updates (which also means long release cycles).

This is why there are various "de facto standard" crates for basic things like regexes or error handling that would be built-in in other languages.
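
For example (a minimal sketch assuming the de facto standard regex crate as a dependency):

  // Cargo.toml (assumed): regex = "1"
  use regex::Regex;

  fn main() {
      // Regexes live in a crate rather than in std, but the API is the
      // community default and is treated as if it were standard.
      let date = Regex::new(r"^\d{4}-\d{2}-\d{2}$").expect("valid pattern");
      assert!(date.is_match("2024-05-01"));
      assert!(!date.is_match("not a date"));
  }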


To add on to the sibling comments: one of the major motivating use cases for Rust was interacting with an external GC, originally the JS runtime GC for the Servo browser. An on-by-default GC would interfere heavily with that use case. What I could see (and there have been some half-hearted efforts in this direction) is standardizing a `Trace` trait, so that users who want to bring their own GC can do so safely and can allocate standard library objects into the GC heap without concern.
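
A rough sketch of what such a standardized trait might look like (entirely hypothetical; the `Tracer` type and method names are made up for illustration, and none of this is an existing std API):

  // Hypothetical: a std-blessed tracing hook that third-party GCs could share.
  pub trait Tracer {
      fn mark(&mut self, ptr: *const ());
  }

  pub unsafe trait Trace {
      // Report every GC-managed pointer reachable from `self`.
      fn trace(&self, tracer: &mut dyn Tracer);
  }

  // Impls for std containers would let GC'd user types hold ordinary
  // Vecs, HashMaps, etc. without extra ceremony.
  unsafe impl<T: Trace> Trace for Vec<T> {
      fn trace(&self, tracer: &mut dyn Tracer) {
          for item in self {
              item.trace(tracer);
          }
      }
  }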


I guess where I disagree is wrt. viewing something like the OP's implementation as something that's "on by default". To me it seems to be pretty much the opposite of that. You're right though that a standard Trace trait might be interesting.


Rust copied algebraic data types from ML-style languages, yet programming in an ML style is pretty painful in Rust because there is no garbage collector (languages like Haskell would be completely unusable without a garbage collector). This could fix that.
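
To make the friction concrete (a minimal sketch of my own, not from the article): a textbook ML-style list forces indirection and ownership decisions that a GC'd language never asks about.

  use std::rc::Rc;

  // In ML or Haskell this is just `List a = Nil | Cons a (List a)`.
  // In Rust the recursive field needs an explicit pointer type, and you must
  // decide up front whether tails are uniquely owned (Box) or shared (Rc).
  enum List {
      Nil,
      Cons(i32, Rc<List>),
  }

  fn sum(list: &List) -> i32 {
      match list {
          List::Nil => 0,
          List::Cons(head, tail) => *head + sum(tail),
      }
  }

  fn main() {
      let shared_tail = Rc::new(List::Cons(2, Rc::new(List::Nil)));
      let a = List::Cons(1, Rc::clone(&shared_tail));
      let b = List::Cons(10, shared_tail);
      assert_eq!(sum(&a), 3);
      assert_eq!(sum(&b), 12);
  }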


I personally believe that things like automatically calling `PureDeref::pure_deref` (or similar) on patterns (similar to the apply/unapply methods in Scala) would be a better way to improve that experience.


Considering that not needing a garbage collector is a goal of Rust, if programming in an ML style without a garbage collector is bad, then it sounds like Rust picked the wrong style.


I would refine the grandparent's claim a bit: certain kinds of code that are common in ML-style programming are hard or impossible to write without a GC.

E.g. heavily recursive data structures may not be the best fit (they're also not the best fit for the CPU).

Rust has definitely grown its own style, which is more of a "strongly typed FP with locally scoped imperative parts", elegantly combining ML's strengths with C/C++-style performant patterns.


Haskell is a pure FP language with what amounts to imperative EDSLs in the form of the ST, STM, and IO monads. Haskell has its own issues, but is that approach (an FP outer layer with embedded imperative sections) such a bad one? If you wanted to not have a GC, you could use just the imperative fragment.


> If you wanted to not have a GC, you could use just the imperative fragment.

I think it would be quite hard in practice to make Haskell not depend on a GC. What Haskell most obviously lacks at present is a more elegant story for how to interface the "strict" and "lazy" parts of a single program. PLT researchers and logicians have been exploring these questions for some time, coming up with notions such as "polarity" and "focusing", views of some types as "naturally strict" (such as tagged unions) and others as "naturally lazy" (such as functions), with still others coming in both strict and lazy varieties ("product" or record types), and elementary operators to shift between "strict" and "lazy" uses of any type, but the practical implications of these concepts in a real-world programming language are still unclear. Haskell would make an ideal testing ground for them.


I think what they mean is that you can program without a GC in a GC'd language, but it is much harder to do it the other way around.


I think it's just not feasible to totally avoid the GC in a language such as Haskell. It's inherently involved in anything that would require heap-allocated data in C/C++, and more besides. For example, Haskell does not have a borrow checker, so it needs the GC to ensure that closures' environments will stick around for as long as they're needed. (The Rust approach is the opposite by default - the compiler errors out whenever a closure will outlive the established lifetime of its referenced environment. You can always use Rc<> or Arc<> to have multiple owners that will all keep a program object alive as needed, but it's not the default.)
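
A small sketch of that contrast (my own example): in Rust a closure must either borrow within a checked lifetime, take ownership with `move`, or share ownership via Rc/Arc, whereas in Haskell the GC simply keeps every captured environment alive for as long as the closure exists.

  use std::rc::Rc;

  fn make_label() -> impl Fn() -> String {
      let name = String::from("environment");

      // `move` is required here: a closure that merely borrowed `name`
      // would outlive it, and the compiler rejects that at compile time.
      move || format!("{name} is still alive")
  }

  fn main() {
      let label = make_label();
      println!("{}", label());

      // Shared ownership is opt-in: several closures keep one value alive.
      let shared = Rc::new(String::from("shared env"));
      let a = { let s = Rc::clone(&shared); move || s.len() };
      let b = { let s = Rc::clone(&shared); move || s.len() };
      assert_eq!(a(), b());
  }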


It's quite possible to program imperatively with no GC, no borrow checker, and no memory leaks in Haskell or in other languages such as Ada. The main technique is to simply not allocate or free anything, but instead just use fixed buffers, or objects that you have allocated before entering the no-GC region. In Haskell and other GC'd languages, GC generally triggers on allocation, so no allocation = no GC.

Back when memory was scarce this was a well known practice in Lisp. You'd just adopt an ugly style in the parts of the program where performance was important, doing destructive updates instead of consing in order to avoid GC'ing in those parts. Rust's contribution (the borrow checker) is in letting you make more use of dynamic memory allocation while still not using a GC or programming unsafely.

You might look at this presentation for more: https://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced...


I tried writing simple imperative programs in GHC using the IO and IORef types, and the programs ran about 1000 times slower than analogous programs in C.


Did they ever get sibling/tail-call optimization? I remember using it, but since it only worked in release mode I could never rely on it.

I found "loop" pretty horrible.


Hmm. IANALL but aren't types orthogonal to allocation?


This is in the context of implementing a smart pointer that can be used within an interpreter for Scheme.


The OP is implementing a Scheme interpreter in Rust.


Your question is reasonable; I don't think you should have been downvoted.

That's indeed one of the major advantages of Rust but it's not the only reason people like the language. Just to give an example, you can write concurrent code without ever worrying that you accidentally failed to protect something with a mutex; whether some code is thread-safe or not is encoded in the type system and violating thread safety is a compiler error. This is a huge advantage that has little to do with lack of GC.
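
As a concrete (hedged) illustration of that point, my own example:

  use std::sync::{Arc, Mutex};
  use std::thread;

  fn main() {
      // Shared mutable state must be wrapped so the types prove thread safety.
      let hits = Arc::new(Mutex::new(0u32));

      let handles: Vec<_> = (0..4)
          .map(|_| {
              let hits = Arc::clone(&hits);
              // Forgetting the Mutex isn't possible: there is simply no
              // unsynchronized path to the shared counter.
              thread::spawn(move || {
                  *hits.lock().unwrap() += 1;
              })
          })
          .collect();

      for h in handles {
          h.join().unwrap();
      }
      assert_eq!(*hits.lock().unwrap(), 4);

      // By contrast, this would not compile, because Rc<u32> is not Send:
      // let rc = std::rc::Rc::new(0u32);
      // thread::spawn(move || println!("{rc}"));
      // error[E0277]: `Rc<u32>` cannot be sent between threads safely
  }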


I wonder if there is an STM (software transactional memory) implementation that uses the borrow checker that way, similar to how Haskell's STM monad uses Haskell's type system to ensure nothing leaks.



