Tagged PDF Funsies

sam_goody · on Dec 28, 2023

There is a well known comment regarding another Adobe format [1]:

> PSD is not even a bad format. Calling it such would be an insult to other bad formats, such as PCX or JPEG. No, PSD is an abysmal format.

> Having worked on this code for several weeks now, my hate for PSD has grown to a raging fire that burns with the fierce passion of a million suns.

> If there are two different ways of doing something, PSD will do both, in different places....

[1]: https://github.com/zepouet/Xee-xCode-4.5/blob/master/XeePhot...

jjgreen · on Dec 28, 2023

Yet they created PostScript, which is a delight. Strange.

gumby · on Dec 28, 2023

PostScript was a close descendant (basically a V2) of Interpress, the printing language we used for laser printers at (and from) PARC from the 1970s. PostScript was designed by the authors of Interpress when they left PARC to start Adobe. Two nerds, engineers and computer scientists, wrote it, taking into account their experience of what worked and what didn't.

Things like PSD were designed by committees within Adobe, taking into account marketing's concerns and building consensus around a camel when a horse, or perhaps a cheetah, would have been vastly superior.

082349872349872 · on Dec 28, 2023

Conway's law suggests that the products any company creates when it is the gorilla of its niche will fail to have the creative unity of those it made back when it was a tiny startup.

> "In the long run every [technology] becomes rococo - then rubble." — AJP

ursusmaritimus · on Dec 28, 2023

I don't think that it's silly - it's just over-engineered :)

What the author describes is a general data structure called a "number tree": a mapping from integers to arbitrary objects, represented as a tree of PDF objects. It is used in many places in the standard; in some cases, the keys are not consecutive.

The advantage over a plain dictionary or array is that the whole structure need not fit in memory at once. It's questionable whether this is ever needed, but in the early days of the PDF standard machines were much smaller, and the authors decided to plan for documents with millions of pages processed by small machines. This style of thinking permeates the whole standard: see, for example, the page tree.
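To make that concrete, here is a rough Python sketch of a number tree lookup. The /Kids, /Nums and /Limits entries are the ones the PDF spec defines for number trees; everything else is my own stand-in for illustration: nodes are plain dicts, `resolve` is a hypothetical callback that plays the role of loading an indirect object from the file, and `objects`, `root` and `loaded` are made-up sample data. A real reader would likely binary-search the kids rather than scan them.

```python
from typing import Any, Callable, Optional


def number_tree_lookup(
    node: dict,
    key: int,
    resolve: Callable[[Any], dict] = lambda ref: ref,
) -> Optional[Any]:
    """Return the value stored under `key`, or None if the key is absent.

    `resolve` is a hypothetical hook that turns a reference into a node,
    standing in for reading an indirect object from the PDF file.
    """
    while True:
        if "Nums" in node:
            # Leaf: a flat array [key1, value1, key2, value2, ...] sorted by key.
            nums = node["Nums"]
            for i in range(0, len(nums), 2):
                if nums[i] == key:
                    return nums[i + 1]
            return None
        # Intermediate node: descend into the kid whose Limits cover the key.
        for ref in node.get("Kids", []):
            kid = resolve(ref)
            low, high = kid["Limits"]
            if low <= key <= high:
                node = kid
                break
        else:
            # No kid covers the key (or the node has no kids at all).
            return None


# Made-up sample data: object numbers 1 and 2 are the two leaves.
objects = {
    1: {"Limits": [0, 4], "Nums": [0, "a", 4, "b"]},
    2: {"Limits": [10, 20], "Nums": [10, "c", 20, "d"]},
}
root = {"Kids": [1, 2]}

loaded = []  # records which objects the lookup actually touched


def resolve(ref: int) -> dict:
    loaded.append(ref)
    return objects[ref]


print(number_tree_lookup(root, 4, resolve), loaded)
```

Only the nodes visited on the way to the key get loaded, which is the whole point of the design; with non-consecutive keys (0, 4, 10, 20 here) a plain array would not work anyway.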

Still, it's unnecessarily complex for all but a tiny fraction (possibly zero) of uses. I would really appreciate it if dictionaries could be used instead of number trees whenever the tree is small.