I agree with both you and the person you're replying to, but...
My centrist take is that data can be represented wastefully, which is often ignored.
Most "wide" log formats are implemented... naively. Literally just JSON REST APIs or the equivalent.
Years ago I did some experiments where I captured every single metric Windows Server emits every second.
That's about 15K metrics, down to dozens of metrics per process, per disk, per everything!
There is a poorly documented API for grabbing everything ('*') as a binary blob of 64-bit counters. My trick was to keep the previous blob and take the binary difference against it. That set most values to zero, so a trivial run-length encoding (RLE) reduced a few hundred KB to a few hundred bytes.

Collect an hour of that, compress, and you can store per-second metrics for thousands of servers over a month in a few terabytes. Then apply a simple "transpose" transformation to turn it into a bunch of columns and you get 1000:1 compression ratios. The data just... crunches down into gigabytes that can be queried and graphed in real time.
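To make the diff-plus-RLE step concrete, here's a minimal Python sketch. The counter count, change rate, and token format are made up for illustration; this isn't the actual WMI blob layout or my original collector.

```python
import random
import struct
import zlib

def snapshot_diff_rle(prev: bytes, curr: bytes) -> bytes:
    """Difference two equal-length blobs of 64-bit counters, then
    run-length encode the result (most per-second deltas are zero)."""
    n = len(curr) // 8
    prev_vals = struct.unpack(f"<{n}Q", prev)
    curr_vals = struct.unpack(f"<{n}Q", curr)
    deltas = [(c - p) & 0xFFFFFFFFFFFFFFFF for p, c in zip(prev_vals, curr_vals)]

    out = bytearray()
    i = 0
    while i < n:
        if deltas[i] == 0:
            # Count a run of unchanged counters and emit one short token for it.
            run = 1
            while i + run < n and deltas[i + run] == 0 and run < 0xFFFF:
                run += 1
            out += struct.pack("<BH", 0, run)        # token 0: run of zero deltas
            i += run
        else:
            out += struct.pack("<BQ", 1, deltas[i])  # token 1: one literal delta
            i += 1
    return bytes(out)

# Simulate ~15K counters where only a few hundred change in a given second.
prev = [random.getrandbits(48) for _ in range(15_000)]
curr = list(prev)
for idx in random.sample(range(15_000), 300):
    curr[idx] += random.randint(1, 1000)

prev_blob = struct.pack("<15000Q", *prev)
curr_blob = struct.pack("<15000Q", *curr)

encoded = snapshot_diff_rle(prev_blob, curr_blob)
print(len(curr_blob), "->", len(encoded), "->", len(zlib.compress(encoded)))
```

The transpose step then pays off because each counter's own history is smooth, so laying samples out column-by-column puts near-identical values next to each other and a general-purpose compressor does the rest.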
I've experimented with Open Telemetry, and its flagrantly wasteful data representations make me depressed.
I think Prometheus works similarly to this, with some other tricks like compressing metric names.
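If it helps to picture the name-compression part, here's a toy string-interning sketch: store each metric/label string once and refer to it by a small integer everywhere else. This is just the general idea, not Prometheus's actual on-disk index format.

```python
# Toy symbol table: each distinct string is stored once, and series data
# carries small integer IDs instead of repeated name strings.
class SymbolTable:
    def __init__(self) -> None:
        self._ids: dict[str, int] = {}
        self._strings: list[str] = []

    def intern(self, s: str) -> int:
        if s not in self._ids:
            self._ids[s] = len(self._strings)
            self._strings.append(s)
        return self._ids[s]

    def lookup(self, i: int) -> str:
        return self._strings[i]

table = SymbolTable()
ids = [table.intern(s) for s in ("http_requests_total", "method", "GET")]
print(ids, table.lookup(ids[0]))
```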
OTEL can do gRPC, and a storage backend can encode that however it wants. However, I do agree that efficiency doesn't seem to have been at the forefront when OTEL was designed.
These tricks are essential for every database optimized for metrics / logs / traces. For example, you can read about how VictoriaMetrics compresses production metrics to less than a byte per sample (every sample includes the metric name, key=value labels, a numeric metric value, and a metric timestamp with millisecond precision). https://faun.pub/victoriametrics-achieving-better-compressio...
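One of the classic tricks behind sub-byte samples is Gorilla-style delta-of-delta encoding: with a regular scrape interval the second-order timestamp deltas are almost all zero, and a zero can be stored in about one bit. A minimal sketch of the idea (illustrative only, not VictoriaMetrics' actual encoding):

```python
def delta_of_delta(timestamps: list[int]) -> list[int]:
    """Second-order deltas of millisecond timestamps; a decoder needs only
    the first raw timestamp plus this list to reconstruct the series."""
    dods = []
    prev_delta = None
    for prev, curr in zip(timestamps, timestamps[1:]):
        delta = curr - prev
        dods.append(delta if prev_delta is None else delta - prev_delta)
        prev_delta = delta
    return dods

# A perfectly regular 1-second scrape: everything after the first delta is 0.
ts = [1_700_000_000_000 + i * 1000 for i in range(10)]
print(delta_of_delta(ts))  # [1000, 0, 0, 0, 0, 0, 0, 0, 0]
```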
I only ever got it to a proof of concept. The back end worked as advertised; the issue was that WMI has too many bugs, so collecting that many performance counters had weird side effects.
Google was doing something comparable internally and this spawned some fun blog titles like “I have 64 cores but I can’t even move my mouse cursor.”
Why must everything be JSON!?