
> I'm still not sure why we even need a message bus in there in the first place.

Because traditional POSIX IPC mechanisms are absolute unworkable dogshit.

> It is one of the worst serialisation decisions we ever made as a society.

There isn't really any alternative. It's either JSON or "JSON but in binary". (Like CBOR.) Anything else is not interoperable.



There is a world of serialization formats that can offer an interoperability story similar to JSON or JSON-but-binary formats. And sure, implementing them in every language someone might want to use takes some effort, but:

- Whatever: people in more niche languages are pretty used to needing to do FFI for things like this anyhow.

- Many of them already have a better ecosystem than D-Bus. e.g. interoperability between Protobuf and Cap'n Proto implementations is good. Protobuf in most (all?) runtimes supports dynamically reading a schema and parsing the binary wire format with it, as well as code generation. You can also maintain backwards compatibility in these formats by following relatively simple rules that can be statically enforced.

- JSON and JSON-but-binary have some annoying downsides. I really don't think field names of composite types belong as part of the ABI. JSON-like formats also often have to deal with the fact that JSON doesn't strictly define all semantics. Some of them differ from JSON in subtle ways, so supporting both JSON and sorta-JSON can lead to nasty side-effects.

Maybe most importantly, since we're not writing software that's speaking to web browsers, JSON isn't even particularly convenient to begin with. A lot of the software will be in C and Rust most likely. It helps a bit for scripting languages like Python, but I'm not convinced it's worth the downsides.
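The underspecified-semantics point above can be made concrete with Python's `json` module; other parsers legitimately behave differently on the same inputs, which is exactly the interop hazard being described:

```python
import json

# Duplicate object keys: RFC 8259 only says names "SHOULD" be unique.
# CPython's parser silently keeps the last one; other parsers may keep
# the first, or reject the document outright.
doc = '{"id": 1, "id": 2}'
print(json.loads(doc))  # {'id': 2}

# Large integers: Python round-trips them exactly, but parsers that map
# JSON numbers to IEEE-754 doubles (e.g. JavaScript) cannot.
n = 2**60 + 1
assert json.loads(json.dumps(n)) == n  # exact round-trip in Python
assert float(n) != n                   # what a double-based parser would see
```

Both behaviours are spec-conformant, which is why "it's just JSON" undersells the compatibility testing you actually need.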


Sorry, but bash isn't a "niche language" and it doesn't have an FFI story.



I don't know how to tell you this, but, you don't need to implement an RPC protocol in bash, nor do you need FFI. You can use CLI tools like `dbus-send`.

I pray to God nothing meaningful is actually doing what you are insinuating in any serious environment.
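As a sketch of the point above: driving D-Bus from a plain shell script needs no FFI at all. The flags below are real `dbus-send` options; the destination is the message bus itself, but any well-known bus name works the same way. Guards are added since neither the tool nor a system bus is guaranteed to be present:

```shell
# Call a D-Bus method from shell -- no bindings, no FFI.
if command -v dbus-send >/dev/null; then
    dbus-send --system --print-reply \
        --dest=org.freedesktop.DBus \
        /org/freedesktop/DBus \
        org.freedesktop.DBus.ListNames || echo "no system bus available"
else
    echo "dbus-send not installed"
fi
```

`--print-reply` dumps the typed reply in a human-readable form, which is also what you'd paste into a bug report.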


I'm trying to tell you that something that isn't straceable and greppable doesn't belong in your system services stack.


Strace/ptrace is awful, and I don't know what "greppable" means here. Nothing stops us from adding DTrace probes.


FFI is the shell's only job.


This is, quite frankly, a ridiculous point. Most of that garbage came from the HPC people, who built loads of stuff on top of it in the first place. It's absolutely fine for this sort of thing: sending the odd little message here and there, not running a complex HPC cluster.

As for JSON, are you really so short-sighted as to think it's the only method of encoding something? Is "oh well, it doesn't fit the primitive types, so just shove it in a string and add another layer of parsing" acceptable? Hell no.


> ...it's the only method of encoding something?

If you want something at the system level that's parsable by anything? Yes, it is.


protobufs / asn.1 / structs ...

Edit: hell even XML is better than this!


Structs are part of C's semantics. They are not an IPC format. You can somewhat use them as one if you take a lot of precautions about how they are laid out in memory, including padding and packing, but it's very brittle.
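The padding problem can be sketched with Python's `struct` module, which exposes the same layout rules a C compiler applies. The field list here stands in for a hypothetical `struct { char flag; int32_t value; }`:

```python
import struct

# Native alignment ("@"): the compiler inserts padding after the 1-byte
# field so the int32 lands on a 4-byte boundary.
native = struct.calcsize("@bi")

# Standard/packed layout ("="): no padding, fields are back to back.
packed = struct.calcsize("=bi")

print(native, packed)  # typically 8 vs 5 on x86-64
```

Two peers that disagree on compiler, ABI, or packing pragmas will silently read each other's bytes at the wrong offsets, which is why "just send the struct" is brittle.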

ASN.1 is both quite complicated and not very efficient.

They could certainly have gone with protobufs or another binary serialisation format but would it really be better than the current choice?

I don’t think the issue they are trying to solve is related to serialisation anyway. Seems to me they are unhappy about the bus part not the message format part.


ASN.1 BER/DER is more or less the same thing as CBOR. The perceived complexity of ASN.1 comes from the schema language and from specifications written in the convoluted telco/ITU-T style (and, well, the '80s type system that has something like eight different types for “human readable string”).


That "convoluted telco/ITU-T style" yields amazingly high quality specifications. I'll take X.68x / X.69x any day over most Internet RFCs (and I've written a number of Internet RFCs). The ITU-T puts a great deal of effort into its specs, or at least the ASN.1 working group did.

ASN.1 is not that complicated. Pity that fools who thought ASN.1 was complicated re-invented the wheel quite poorly (Protocol Buffers I'm looking at you).


For our sins, our industry is doomed to suffer under the unbearable weight of endless reinvented wheels. Of course it would have been better to stick with ASN.1. Of course we didn't, because of inexperience and hubris. We'll never learn.


It sure seems that way. Sad. It's not just hubris nor inexperience -- it's cognitive load. It's often easier to wing something that later grows a lot than it is to go find a suitable technology that already exists.


One thing I liked about a Vernor Vinge sci-fi novel I read once was the concept of the "computer archeologist". Spool the evolution of software forwards a few centuries, and we'll have layers upon layers of software where, instead of solving problems with existing tooling, we just plaster on yet another NIH layer. Rinse and repeat, and soon enough we'll need a separate profession of people capable of digging down into those old layers and figuring out how they work.


> The perceived complexity of ASN.1 comes from the schema language and from specifications written in the convoluted telco/ITU-T style (and, well, the '80s type system that has something like eight different types for “human readable string”).

I can’t resist pointing out that this is basically a longer way of saying quite complicated and not very efficient.


> I can’t resist pointing out that this is basically a longer way of saying quite complicated and not very efficient.

That's very wrong. ASN.1 is complicated because it's quite complete by comparison to other syntaxes, but it's absolutely not inefficient unless you mean BER/DER/CER, but those are just _some_ of the encoding rules available for use with ASN.1.

To give just one example of "complicated", ASN.1 lets you specify default values for optional members (fields) of SEQUENCEs and SETs (structures), whereas Protocol Buffers and XDR (to give some examples) only let you specify optional fields but not default values.
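As a sketch of the DEFAULT mechanism described above, in ASN.1 notation (the module and field names here are made up for illustration):

```asn1
Example DEFINITIONS ::= BEGIN
    Settings ::= SEQUENCE {
        name     UTF8String,
        retries  INTEGER DEFAULT 3,  -- absent on the wire, decoder supplies 3
        verbose  BOOLEAN OPTIONAL    -- absent means absent; no default
    }
END
```

With PER/OER, a field at its DEFAULT value can be omitted entirely, so the schema carries semantics that Protocol Buffers pushes into prose or application code.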

Another example of "complicated" is that ASN.1 has extensibility rules because the whole "oh TLV encodings are inherently extensible" thing turned out to be a Bad Idea (tm) when people decided that TLV encodings were unnecessarily inefficient (true!) so they designed efficient, non-TLV encodings. Well guess what: Protocol Buffers suffers from extensibility issues that ASN.1 does not, and that is a serious problem.

Basically, with a subset of ASN.1 you can do everything that you can do with MSFT RPC's IDL, with XDR, with Protocol Buffers, etc. But if you stick to a simple subset of ASN.1, or to any of those other IDLs, then you end up having to write _normative_ natural language text (typically English) in specifications to cover all the things not stated in the IDL part of the spec. The problem with that is that it's easy to miss things or get them wrong, or to be ambiguous. ASN.1 in its full flower of "complexity" (all of X.680 plus all of X.681, X.682, and X.683) lets you express much more of your protocols in a _formal_ language.

I maintain an ASN.1 compiler. I've implemented parts of X.681, X.682, and X.683 so that I could have the compiler generate code for the sorts of typed holes you see in PKI (all the extensions, all the SANs, and so on), so that the programmer can do much less of the work of having to invoke a codec for each of those extensions.

A lot of the complexity in ASN.1 is optional, but it's very much worth at least knowing about it. Certainly it's worth not repeating mistakes of the past. Protocol Buffers is infuriating. Not only is PB a TLV encoding (why? probably because "extensibility is easy with TLV!!1!", but that's not quite true), but the IDL requires manual assignment of tag values, which makes uses of the PB IDL very ugly. ASN.1 originally also had the manual tag assignment problem, but eventually ASN.1 was extended to not require that anymore.

Cavalier attitudes like "ASN.1 is too complicated" lead to bad results.


> That's very wrong. ASN.1 is complicated because it's quite complete by comparison to other syntaxes

So, it's quite complicated. Yes, which is what I have been saying from the start. If you start the conversation with "you can define a small subset of this terrible piece of technology which is bearable", it's going to be hard to convince people it's a good idea.

> Cavalier attitudes like "ASN.1 is too complicated" lead to bad results.

I merely said quite complicated, not too complicated.

Still, ASN.1 is a telco protocol through and through. It shows everywhere: syntax, tooling. Honestly, I don't see any point in using it unless it's required by law or by contract (I had to once; I will never again).

> but it's absolutely not inefficient unless you mean BER/DER/CER, but those are just _some_ of the encoding rules available for use with ASN.1.

Sorry, I'm glad to learn you can make ASN.1 efficient if you are a specialist and know what you are doing with the myriad available encodings. It's only inefficient in the way everyone uses it.


> So, it's quite complicated.

Subsets of ASN.1 that match the functionality of Protocol Buffers are not "quite complicated" -- they are no more complicated than PB.

> Still, ASN.1 is a telco protocol through and through.

Not really. The ITU-T developed it, so it gets used a lot in telco protocols, but the IETF also makes a lot of use of it. It's just a syntax and set of encoding rules.

And so what if it were "a telco protocol through and through" anyways? Where's the problem?

> It shows everywhere: syntax, tooling.

The syntax is very much a 1980s syntax. It is ugly syntax, and it is hard to write a parser for using LALR(1) because there are cases where the same definition means different things depending on what kinds of things are used in the definition. But this can be fixed by using an alternate syntax, or by not using LALR(1), or by hacking it.

The tooling? There's open source tooling that generates code like any XDR tooling and like PB tooling and like MSFT RPC tooling.

> Sorry, I'm glad to learn you can make ASN.1 efficient if you are a specialist and know what you are doing with the myriad available encodings. It's only inefficient in the way everyone uses it.

No, you don't have to be a specialist. The complaint about inefficiency is about the choice of encoding rules made by whatever protocol spec you're targeting. E.g., PKI uses DER, so a TLV encoding, thus it's inefficient. Ditto Kerberos. These choices are hard to change ex-post, so they don't change.

"[T]he way everyone uses it" is the way the application protocol specs say you have to. But that's not ASN.1 -- that's the application protocol.


> The tooling? There's open source tooling that generates code like any XDR tooling and like PB tooling and like MSFT RPC tooling.

There is no open source tooling that combines support for the schemas actually in use - in 5G this includes parameterized specifications per X.683 - with a decoder able to show a partially decoded message before an error, along with a per-bit explanation of the rules that led to its encoding.

> E.g., PKI uses DER, so a TLV encoding, thus it's inefficient.

Where it is used, a ~5% space saving is never worth the effort people spend diagnosing problems. I firmly vote for this "inefficiency".


> Structs are a part of C semantic.

Uh, no: structs, records, whatever you want to call them, exist in many, if not most, programming languages. "Structs" is not just "C structs" -- it's shorthand for "structured data types" (same as in C!).

> Asn.1 is both quite complicated and not very efficient.

Every rich encoding system is complicated. As for efficiency, ASN.1 has many encoding rules, some of which are quite bad (BER/DER/CER, which are the first family of ERs, so many think ASN.1 == BER/DER/CER, but that's not the case), and some of which are very efficient (PER, OER). Heck, you can use XML and JSON as ASN.1 encoding rules (XER, JER).

> They could certainly have gone with protobufs or another binary serialisation format but would it really be better than the current choice?

Protocol buffers is a tag-length-value (TLV) encoding, same as BER/DER/CER. Having to have a tag and length for every value encoded is very inefficient, both in terms of encoding size as well as in terms of computation.
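To make the per-field overhead concrete, here is a minimal Python sketch of the protobuf key encoding (this is the real wire format: the key is a varint of `field_number << 3 | wire_type`, and wire type 2 adds a varint length prefix):

```python
def varint(n: int) -> bytes:
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        b, n = n & 0x7F, n >> 7
        out.append(b | (0x80 if n else 0))
        if not n:
            return bytes(out)

def tlv_field(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited (wire type 2) protobuf field."""
    key = varint(field_number << 3 | 2)
    return key + varint(len(payload)) + payload

payload = b"hello"
encoded = tlv_field(1, payload)
print(encoded.hex())                 # 0a0568656c6c6f
print(len(encoded) - len(payload))   # 2 bytes of tag+length overhead
```

Every field pays that tag+length tax, and the decoder must parse it, whereas positional encodings like PER/OER/XDR know the layout from the schema and emit only the values.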

The better ASN.1 ERs -PER and OER- are much more akin to XDR and Flat buffers than to protobufs.

> I don’t think the issue they are trying to solve is related to serialisation anyway. Seems to me they are unhappy about the bus part not the message format part.

This definitely seems to be the case.


It seems you posit taglessness as a universally crucial merit of any encoding scheme. That's good in an ideal world, heh.

I have had the misfortune to work on 5G, which is full of PER-encoded protocols. Dealing with discrepancies in them - incompatible changes between 3GPP standard versions, different vendorsʼ errors, combined with the usually low skill level of developers and managers in a typical corporation - was an utter nightmare.

The IETF, in general, has a good policy of combining truly fixed binary protocols where they are unavoidable (the IP/TCP/UDP layers) with flexible, often textual, protocols where there is no substantial overhead to using them. Their early efforts, well, suffered from over-grammaticalization (as in RFC 822). CBOR is nice here because it combines tagging and compactness: a 3-bit major type combined with the value (if it fits) or a length, it is comparable to OER in efficiency but is decodable without a schema - and that is extremely useful in practice.
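The CBOR head described above can be sketched in a few lines of Python (this follows the real RFC 8949 rules for major type 0, unsigned integers; the other major types work the same way):

```python
def cbor_uint(n: int) -> bytes:
    """Encode an unsigned integer as CBOR (RFC 8949, major type 0)."""
    major = 0 << 5                       # major type 0 in the top 3 bits
    if n < 24:
        return bytes([major | n])        # value packed into the initial byte
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):          # additional info 24..27 = 1/2/4/8 bytes
            return bytes([major | ai]) + n.to_bytes(size, "big")
    raise ValueError("needs a bignum tag")

print(cbor_uint(10).hex())    # 0a     -- one byte, self-describing
print(cbor_uint(500).hex())   # 1901f4 -- type byte + 2-byte big-endian value
```

Because the type travels in those top 3 bits, a generic decoder can walk any CBOR document without a schema, which is the diagnosability win being argued for.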


> Uh, no, structs, records, whatever you want to call them

It's plenty clear from the discussion context that the OP is talking about C structs, but yes, replace C with any language that suits you. They will still be part of the language's semantics and not an IPC specification.

The point is that you generally can't use memory layout as an IPC protocol, because you have no guarantee that it will be the same across architectures.


If it's IPC, it's the same architecture (mostly; typically there are at most 3 local architectures). The receiver can always make things right. If there's hidden remoting going on, the proxies can make things right.


> The receiver can always make things right.

Certainly, but that’s hardly structs anymore. You are then implicitly defining a binary format that is aligned with the sender's memory layout.


It's "structs" when the sender and receiver are using the same architecture, and if they're using the same int/long/pointer sizes then the only work to do is swabbing and pointer validation / fixups. That's a lot less work than is needed to do just about any encoding like protocol buffers, but it's not far from flat buffers and capnp.
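The "swabbing" (byte-swapping) step can be sketched with Python's `struct` module, where the `<` and `>` byte-order prefixes stand in for the two architectures:

```python
import struct

# Sender is little-endian; it writes a 32-bit int straight from memory.
raw = struct.pack("<i", 0x01020304)   # b'\x04\x03\x02\x01' on the wire

# A big-endian receiver reading those bytes naively sees a swapped value,
# so its fixup pass must reverse the byte order.
naive = struct.unpack(">i", raw)[0]
print(hex(naive))                     # 0x4030201

fixed = struct.unpack("<i", raw)[0]   # receiver honouring the sender's order
print(hex(fixed))                     # 0x1020304
```

That swap plus pointer fixups really is all the work when sizes match, which is the sense in which this sits closer to Flat Buffers/capnp than to a full re-encode like Protocol Buffers.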


Thank goodness they didn’t pick YAML though.


Yet!


You don't quite understand how this works.

One requirement is being able to strace a misbehaving service and figure out quickly what it's sending and receiving.

This is a system-level protocol, not just a random user app.



