> C libraries typically return opaque pointers to their data structures, to hide...

rectang · on March 13, 2021

> ABI compatibility

Rust provides ABI compatibility against its C ABI, and if you want you can dynamically link against that. What Rust eschews is the insane fragile ABI compatibility of C++, which is a huge pain to deal with as a user:

https://community.kde.org/Policies/Binary_Compatibility_Issu...

I don't think we'll ever see as comprehensive an ABI out of Rust as we get out of C++, because exposing that much incidental complexity is a bad idea. Maybe we'll get some incremental improvements over time. Or maybe C ABIs are the sweet spot.

anfilt · on March 13, 2021

Rust has yet to standardize an ABI. Yes, you can call or expose a function with C calling conventions. However, you can't pass all native Rust types this way, and you lose some semantics.
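
To make that limitation concrete, here is a minimal sketch of a C-ABI surface written in Rust. `Point`, `point_add` and `name_len` are made-up names for illustration. A plain-data struct can cross the boundary once it is `#[repr(C)]`, but a `String` or `&str` has no C counterpart and has to be flattened to a `char *`.

```rust
use std::ffi::{c_char, CStr};

/// `#[repr(C)]` lays the fields out the way a C compiler would.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Uses the C calling convention. A real export would also carry
/// `#[no_mangle]` (spelled `#[unsafe(no_mangle)]` in the 2024 edition)
/// so that C code can find the symbol by name.
pub extern "C" fn point_add(a: Point, b: Point) -> Point {
    Point { x: a.x + b.x, y: a.y + b.y }
}

/// Rust strings cannot cross the boundary as-is, so text arrives as a
/// NUL-terminated `char *` and is measured on this side.
///
/// # Safety
/// `name` must be null or point to a valid NUL-terminated string.
pub unsafe extern "C" fn name_len(name: *const c_char) -> usize {
    if name.is_null() {
        return 0;
    }
    unsafe { CStr::from_ptr(name) }.to_bytes().len()
}
```

Lifetimes, generics, trait objects and enums with payloads are the kind of semantics that do not survive this translation.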

However, as the parent comment you responded to pointed out, you can enable LTO when compiling C. As Rust is almost always statically linked, it basically always gets LTO optimizations.

johncolanduoni · on March 13, 2021

Even with static linking, Rust produces separate compilation units at least at the crate level (and depending on compiler settings, within crates). You won't get LTO between crates if you don't explicitly request it. It does allow inlining across compilation units without LTO, but only for functions explicitly marked as `#[inline]`.
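
A small sketch of what that looks like in a library crate (the function names are hypothetical). Cross-crate LTO itself is requested in Cargo with `lto = true` under `[profile.release]`; the attribute below is the per-function alternative.

```rust
/// `#[inline]` makes this body available to downstream crates, so they
/// can inline calls to it even when LTO is off.
#[inline]
pub fn clamp_to_byte(v: i32) -> u8 {
    v.clamp(0, 255) as u8
}

/// Generic functions are instantiated in the crate that uses them, so
/// their bodies are visible to the caller's compilation unit as well.
pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}
```

Whether the compiler actually inlines a given call is still its decision; the attribute only makes it possible.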

moonchild · on March 13, 2021

Swift has a stable ABI. It makes different tradeoffs than Rust, but I don't think complexity is the cliff. There is a good overview at https://gankra.github.io/blah/swift-abi/

kelnos · on March 13, 2021

Swift has a stable ABI at the cost of what amounts to runtime reflection, which is expensive. That doesn't really fit with the goals of Rust, I don't think.

saagarjha · on March 13, 2021

This is misleading, especially since Swift binaries do typically ship with actual reflection metadata (unless it is stripped out). The Swift ABI does keep layout information behind a pointer in certain cases, but if you squint at it funny it's basically a vtable but for data. (Actually, even more so than non-fragile ivars are in Objective-C, because I believe actual offsets are not provided, rather you get getter/setter functions…)

I don't disagree that Rust probably would not go this way, but I think that's less "this is spooky reflection" and more "Rust likes static linking and cares less about stable ABIs, plus the general attitude of 'if you're going to make an indirect call the language should make you work for it'".

skohan · on March 13, 2021

Do you have a source on this? I didn't think Swift requires runtime reflection to make calling across module boundaries work - I thought `.swiftmodule` files are essentially IR code to avoid this

kelnos · on March 13, 2021

Pretty sure the link the parent (to my comment) provided explains this.

It's not the same kind of runtime reflection people talk about when they (for example) use reflection in Java. It's hidden from the library-using programmer, but the calling code needs to "communicate" with the library to figure out data layouts and such, and that sounds a lot like reflection to me.

moonchild · on March 13, 2021

Yes, and if you use the C ABI to dynamically link Rust code, you will have exactly the same problem as C: you can't change the layout of your structures without breaking compatibility, unless you use indirecting wrappers.
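
A minimal sketch of such an indirecting wrapper, with hypothetical names (`Counter`, `counter_new`, `counter_bump`, `counter_free`). The caller only ever holds a pointer and goes through functions, so the fields can change in a later version without the caller being recompiled.

```rust
/// Private layout: callers never see the fields, only a pointer.
pub struct Counter {
    count: u64,
    step: u64,
}

/// Allocates on the heap and hands ownership to the caller as a raw pointer.
pub extern "C" fn counter_new(step: u64) -> *mut Counter {
    Box::into_raw(Box::new(Counter { count: 0, step }))
}

/// Every access goes through a function call instead of a field offset.
///
/// # Safety
/// `c` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_bump(c: *mut Counter) -> u64 {
    let c = unsafe { &mut *c };
    c.count += c.step;
    c.count
}

/// Takes ownership back and frees the allocation.
///
/// # Safety
/// `c` must be null or come from `counter_new`, and must not be used afterwards.
pub unsafe extern "C" fn counter_free(c: *mut Counter) {
    if !c.is_null() {
        drop(unsafe { Box::from_raw(c) });
    }
}
```

The price is the one discussed elsewhere in this thread: a heap allocation per object and a function call per access.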

quietbritishjim · on March 13, 2021

That's ABI compatibility of the language, not of a particular API.

If you have an API that allows the caller to instantiate a structure on the stack and pass a reference to it to your function, then the caller must now be recompiled when the size of that structure changes. If that API now resides in a separate dynamic library, then changing the size of the structure is an ABI-breaking change, regardless of the language.
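
As a sketch of why the size matters, take two hypothetical versions of the same public struct. A caller compiled against the first version reserves 8 bytes on its stack, while a library built from the second version expects to read and write 12.

```rust
use std::mem::size_of;

/// The struct as the caller saw it at compile time.
#[repr(C)]
pub struct OptionsV1 {
    pub width: u32,
    pub height: u32,
}

/// The struct after the library added a field.
#[repr(C)]
pub struct OptionsV2 {
    pub width: u32,
    pub height: u32,
    pub depth: u32,
}

/// Bytes an old caller reserves for the struct on its stack.
pub fn caller_reserved_bytes() -> usize {
    size_of::<OptionsV1>()
}

/// Bytes the new library touches behind the reference it is given.
pub fn library_expects_bytes() -> usize {
    size_of::<OptionsV2>()
}
```

With the old caller and the new library loaded together, the library would access 4 bytes past the end of the caller's stack slot, which is the ABI break described above.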

gspr · on March 13, 2021

Rust seems great to me, but aren't we losing a lot by giving up on C's dynamic linking and shared libraries?

rstuart4133 · on March 14, 2021

To give some context to the parent comment:

    $ ls -lh $(which grep) $(which rg)
    -rwxr-xr-x 1 root root 199K Nov 10 06:37 /usr/bin/grep
    -rwxr-xr-x 1 root root 4.2M Jan 19 09:31 /usr/bin/rg

My very unscientific measurement of the startup time of grep vs ripgrep is 10ms when the cache is cold (ie, never run before) and 3ms when the cache is hot (ie, was run seconds prior). For grep even in the cold case libc will already be in memory, of course. The point I'm trying to make is even the worst case, 10ms, is irrelevant to a human using the thing.

However, speaking as a Debian Developer, it makes a huge difference to maintaining the two systems that use the two programs. If a security bug is found in libc, all Debian has to do is ship the fixed version of libc as a security update. If a bug is found in the Rust stdlib crate, Debian has to track down every ripgrep-like program that statically includes it and recompile it. There are currently 21,000 packages that link to libc6 in Debian. If it were statically linked, Debian would have to rebuild and distribute _all_ of them. (As a side note, Debian has a lot of hardware resources donated to it, but if libc wasn't dynamic I wonder if it could get security updates to a series of bugs in libc6 out in a timely fashion.)

I don't know Rust well, but I thought it could dynamically link. The Debian Rust packagers don't, for some reason. (As opposed to libc6's 21,000 reverse dependencies, libstd-rust has 1.) I guess there must be some kink in the Rust toolchain that makes it easier not to. I imagine that would have to change if Rust replaces C.

eeZah7Ux · on March 14, 2021

Thanks!

dr-ando · on March 13, 2021

I am sympathetic to the point you make, but to be accurate, one can consume and create C and C-compatible dynamic libraries with Rust. So, one is not “losing” something, because what you (and I) want - dynamic linking and shared libraries with a stable and safe Rust ABI - was not there to begin with.

hctaw · on March 13, 2021

Some would argue you gain more than you lose.

Also to be pedantic, C doesn't spec anything about linkage. Shared objects and how linkers use them to compose programs are a system detail more than a language one.

shakow · on March 14, 2021

Dynamic linking and shared libraries are an OS feature, not a C one. C worked fine on DOS with no DLLs at the time.

This being said, Rust has no problem using dynamic libraries.

dan-robertson · on March 13, 2021

The reason Common Lisp uses pointers is because it is dynamically typed. It’s not some principled position about ABI compatibility. If I define an RGB struct for colours, it isn’t going to change but it would still need to be passed by reference because the language can’t enforce that the variable which holds the RGBs will only ever hold 3 word values. Similarly, the reason floats are often passed by reference isn’t some principled stance about the float representation maybe changing, it’s that you can’t fit a float and the information that you have a float into a single word[1].

If instead you’re referring to the fact that all the fields of a struct aren’t explicitly obvious when you have such a value, well I don’t really agree that it’s always what you want. A great thing about pattern matching with exhaustiveness checks is that it forces you to acknowledge that you don’t care about new record fields (though the Common Lisp way of dealing with this probably involves CLOS instead).

[1] some implementations may use NaN-boxing to get around this
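
To illustrate the word-size argument, here is a rough sketch in Rust. `Tagged` and `Word` are hypothetical names, and this is not how any particular Lisp implementation lays things out. A value that carries both a 64-bit payload and a type tag needs more than one word, so a one-word representation keeps small integers inline (tagged in the low bit) and puts floats behind a pointer.

```rust
/// Payload plus type tag: this cannot fit in a single 64-bit word.
pub enum Tagged {
    Fixnum(i64),
    Float(f64),
}

/// A one-word value: low bit 1 means inline fixnum, low bit 0 means pointer.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Word(usize);

impl Word {
    /// Stores the integer inline, giving up one bit of range for the tag.
    pub fn fixnum(n: isize) -> Word {
        Word(((n as usize) << 1) | 1)
    }

    pub fn as_fixnum(self) -> Option<isize> {
        if self.0 & 1 == 1 { Some((self.0 as isize) >> 1) } else { None }
    }

    /// A float needs all of its bits, so it is boxed; the aligned pointer
    /// has a zero low bit. The box is deliberately leaked here, where a
    /// real runtime would leave it to the garbage collector.
    pub fn float(x: f64) -> Word {
        Word(Box::into_raw(Box::new(x)) as usize)
    }

    pub fn as_float(self) -> Option<f64> {
        if self.0 & 1 == 0 {
            // Sketch assumption: every untagged word came from `Word::float`.
            Some(unsafe { *(self.0 as *const f64) })
        } else {
            None
        }
    }
}
```

NaN-boxing flips this around by hiding the tag and payload of non-float values inside the unused bit patterns of a NaN.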

kazinator · on March 13, 2021

Lisp uses pointers because of the realization that the entities in a computerized implementation of symbolic processing can be adequately represented by tiny index tokens that fit into machine registers, whose properties are implemented elsewhere, and these tokens can be whipped around inside the program very quickly.

dan-robertson · on March 13, 2021

What you’re describing are symbols, where the properties are much less important than the identity. Most CL implementations will use fixnums rather than pointers when possible because they don’t have some kind of philosophical affinity to pointers. For data structures, pointers aren’t so good with modern hardware. The reason Common Lisp tends to have to use pointers is that the type system cannot provide information about how big objects are. Compare this to arrays, which are often better at packing because they can know how big their elements are.

This is similar in typed languages with polymorphism like Haskell or OCaml where a function like concat (taking a list of lists to a single list) needs to work when the elements are floats (morally 8 bytes each) or bools (morally 1 bit each). The solution is to write the code once and have everything be in one word, either a fixnum or a pointer.

pharmakom · on March 13, 2021

Rust makes building from source and cross compiling so easy that I don’t really care for dynamic linking in my use cases of Rust.

skohan · on March 13, 2021

Dynamic linking is one thing I miss from Swift - I used dynamic linking for hot code reloading for several applications, which resulted in super fast and useful development loops. Given Rust's sometimes long compile times, this is something which would be welcome.

jdright · on March 13, 2021

There are crates for hot reloading in Rust, and they use dynamic linking.

skohan · on March 13, 2021

Do you have to stick to a C-FFI-like interface, or can they handle Rust-native features like closures and traits?

howinteresting · on March 13, 2021

Some stick to C FFI, some enforce that the Rust compiler version is the same, which makes ABI issues irrelevant.

kazinator · on March 13, 2021

> This costs heap allocations and pointer indirections.

Heap allocations, yes; pointer indirections no.

A structure is referenced by pointer no matter what. Remember that the stack is accessed via a stack pointer.

The performance cost is that there are no inline functions for a truly opaque type; everything goes through a function call. Indirect access through functions is the cost, which is worse than a mere pointer indirection.

An API has to be well-designed in this regard; it has to anticipate the likely use cases that are going to be performance critical and avoid perpetrating a design in which the application has to make millions of API calls in an inner loop. Opaqueness is more abstract and so it puts designers on their toes to create good abstractions instead of "oh, the user has all the access to everything, so they have all the rope they need".

Opaque structures don't have to cost heap allocations either. An API can provide a way to ask "what is the size of this opaque type" and the client can then provide the memory, e.g. by using alloca on the stack. This is still future-proof against changes in the size, compared to a compile-time size taken from a "sizeof struct" in some header file. Another alternative is to have some worst-case size represented as a type. An example of this is the POSIX struct sockaddr_storage in the sockets API. Though the individual sockaddrs are not opaque, the concept of providing a non-opaque worst-case storage type for an opaque object would work fine.
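
Both ideas can be sketched in a few lines, written here in Rust behind a C-style interface. All names (`Timer`, `TimerStorage`, `timer_size`, `timer_init`, `timer_tick`) are hypothetical, and the 64-byte worst case is an arbitrary assumption.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Library-private layout; free to change between versions.
struct Timer {
    ticks: u64,
    period: u32,
}

/// Non-opaque worst-case storage in the spirit of `sockaddr_storage`:
/// bigger than today's `Timer` and aligned for it.
#[repr(C, align(8))]
pub struct TimerStorage {
    _bytes: MaybeUninit<[u8; 64]>,
}

impl TimerStorage {
    pub fn new() -> Self {
        TimerStorage { _bytes: MaybeUninit::uninit() }
    }
}

/// Run-time answer to "what is the size of this opaque type?".
pub extern "C" fn timer_size() -> usize {
    size_of::<Timer>()
}

/// Builds a `Timer` in caller-provided memory, refusing buffers that
/// are too small or misaligned.
///
/// # Safety
/// `buf` must be null or valid for writes of `len` bytes.
pub unsafe extern "C" fn timer_init(buf: *mut u8, len: usize, period: u32) -> bool {
    if buf.is_null() || len < size_of::<Timer>() || (buf as usize) % align_of::<Timer>() != 0 {
        return false;
    }
    unsafe { buf.cast::<Timer>().write(Timer { ticks: 0, period }) };
    true
}

/// # Safety
/// `buf` must have been initialized by a successful `timer_init`.
pub unsafe extern "C" fn timer_tick(buf: *mut u8) -> u64 {
    let t = unsafe { &mut *buf.cast::<Timer>() };
    t.ticks += u64::from(t.period);
    t.ticks
}
```

A client can place a `TimerStorage` on its stack and pass its address to `timer_init`, or call `timer_size()` first and supply any sufficiently aligned buffer of that many bytes; neither path touches the heap.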

There can be half-opaque types: part of the structure can be declared (e.g. via some struct type that is documented as "do not use in application code"). Inline functions use that for direct access to some common fields.

pornel · on March 13, 2021

Escape analysis is tough in C, and data returned by pointer may be pessimistically assumed to have escaped, forcing exact memory accesses. OTOH an on-stack struct is more likely to get its fields optimized as if they were local variables. Plus x86 has special treatment for the stack, treating it almost like a register file.

Sure, there are libraries which have `init(&struct, sizeof(struct))`. This adds extra ABI fragility, and doesn't hide fields unless the lib maintains two versions of a struct. Some libraries that started with such an ABI end up adding extra fields behind internal indirection instead of breaking the ABI. This is of course all solvable, and there's no hard limit for C there. But different concerns nudge users towards different solutions. Rust doesn't have a stable ABI, so the laziest good way is to return by value and hope the constructor gets inlined. In C, the solution that is both accepted as decent practice and also the laziest is to return a malloced opaque struct.

spacechild1 · on March 13, 2021

> This costs heap allocations

I'd like to point out that this is not always the case. Some libraries, especially those with embedded systems in mind, allow you to provide your own memory buffer (which might live on the stack), where the object should be constructed. Others allow you to pass your own allocator.