I've worked with many serialization formats (XDR, SOAP XML w/ XML Schema, CORBA IDL and IIOP, JSON with and without schema, pickle, and many more). Protocol buffers remind me of XDR (https://en.wikipedia.org/wiki/External_Data_Representation), which was a great technology in its day; NFS and NIS were implemented using it.
I do agree with some of the schema modelling criticisms in this article, but the ultimate thing to understand is: protocol buffers were invented to allow Google to upgrade servers of different services (ads and search) asynchronously and still be able to pass messages between them that can be (at least partly) decoded. They were then adopted for wide-ranging data modelling and gained a number of needed features, but also evolved fairly poorly.
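To make the "at least partly decoded" point concrete, here is a rough sketch of how a receiver built against an older schema can read a message from a newer sender. It follows the documented protobuf wire format (a varint key holding the field number and wire type), but decode_known, OLD_SCHEMA and the sample bytes are names and data I made up for illustration, not the real library API, and it ignores groups and nested messages.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_known(buf, known_fields):
    """Return {name: raw value} for known field numbers, skipping the rest."""
    out = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (strings, bytes, submessages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        # Unknown field numbers are consumed but not reported.
        if field_number in known_fields:
            out[known_fields[field_number]] = value
    return out


# Hypothetical message from a newer sender:
# field 1 = 150 (varint), field 2 = "hi", field 3 = 7 (added later).
NEW_MESSAGE = bytes([0x08, 0x96, 0x01, 0x12, 0x02, 0x68, 0x69, 0x18, 0x07])
# Hypothetical older receiver that only knows fields 1 and 2.
OLD_SCHEMA = {1: "id", 2: "name"}
decoded = decode_known(NEW_MESSAGE, OLD_SCHEMA)
```

The wire type alone tells the old receiver how many bytes to skip for field 3, which is what lets the two sides be upgraded at different times.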
IIRC you still can't make a 4GB protocol buffer because the Java implementation required signed integers (experts, correct me if I'm wrong) and that wouldn't change. This is a problem when working with large serialized DL models.
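The arithmetic behind that, assuming sizes are tracked in a signed 32-bit integer (as a Java int is): the largest representable length is 2^31 - 1 bytes, just under 2 GiB, so a 4 GiB payload can't even be described. A quick sketch, with fits_in_int32 and FOUR_GIB as made-up names:

```python
import struct

INT32_MAX = 2**31 - 1      # largest value a signed 32-bit int can hold
FOUR_GIB = 4 * 1024**3     # size of a hypothetical 4 GiB serialized model


def fits_in_int32(n):
    """True if n can be packed as a signed 32-bit integer."""
    try:
        struct.pack(">i", n)
        return True
    except struct.error:
        return False
```

Here fits_in_int32(INT32_MAX) holds, while fits_in_int32(FOUR_GIB) does not.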
I was talking to Rob Pike over coffee one morning and he said everything could be built with just nested sequences of key/value pairs and no schema (i.e., placing all the burden for decoding a message semantically on the client), and I don't think he's completely wrong.
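My own reading of what that could look like, not anything he wrote down: a message is a sequence of (key, value) pairs, a value is either a scalar or another such sequence, and the client decides what each key means and what to do when one is missing. The message contents and the lookup helper below are hypothetical.

```python
# A message is a sequence of (key, value) pairs; a value is either a scalar
# or another such sequence. No schema: the reader decides what keys mean.
message = [
    ("query", "protocol buffers"),
    ("user", [("id", 42), ("locale", "en")]),
    ("experiment", [("name", "new_ranker")]),  # a key an old reader ignores
]


def lookup(pairs, *path, default=None):
    """Follow path through nested pairs; return default if any key is missing."""
    current = pairs
    for key in path:
        if not isinstance(current, list):
            return default  # tried to descend into a scalar
        for k, v in current:
            if k == key:
                current = v
                break
        else:
            return default  # key not present at this level
    return current


user_id = lookup(message, "user", "id")
missing = lookup(message, "user", "email", default="")
```

The cost is visible in lookup: every reader has to carry its own defaults and type checks, which is the burden a schema would otherwise take on.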