This is a very narrow definition of a time series DB, really more of a pure metrics store that requires individual time series to be kept separately. I find it odd, as that isn't how I'd define a time series DB in general. Rather, a time series DB is usually simply some sort of DB that supports a time column as the main index. There are tons that fall in that category (including Postgres, which you should absolutely use unless you have a compelling reason not to) that can easily handle many of the situations the article claims won't work. Druid can handle high cardinality, for instance.
I couldn't agree more, and I've talked to the folks at TimescaleDB about exactly this issue; it can be hard for folks familiar with the narrow definition to understand how many more use cases a tool like Timescale can fit.
More broadly I think this is an issue with a narrow definition of "time series", aside from the DB angle. When I was doing more forecasting and predictive modeling, I was constantly stymied by "time series" resources only considering univariate time series, where my problems were always extremely multivariate (and also rarely uniformly sampled...).
When I've asked around for other vocabulary here, the options are slim. Panel data can work, but that has more to do with there being both a time and spatial dimension (e.g. group, cohort, user, etc.) than there being multiple metrics observed over time. It's also an unfamiliar term for data scientist folks without a traditional stats background. "Multivariate time series" might be technically correct, but that works much better in the modelling domain than the database domain.
The main challenge here is that extending relational algebra ("key tuple maps to row") into time ("key tuple maps to function from time to row") essentially cross-products two different dimensions, creating a much bigger domain to process and reason about.
Once you have the theory down, you can express some pretty handy relationships.
For you functional folks, you can consider the TRA as the regular RA lifted into the time monad. Except the composition can also go cross-wise; TRA can also be considered as the narrow timeline lifted into the relational monad. Fun times!
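A rough type sketch (in Python, with entirely made-up names) of that "key tuple maps to a function from time to row" idea, just to make the lifting concrete:

```python
from typing import Callable, Dict, Tuple

# Illustrative types only -- not any particular TRA implementation.
Key = Tuple[str, ...]          # e.g. (device_id, metric_name)
Row = Dict[str, float]         # attribute name -> value
Time = float                   # a timestamp

# Plain relational algebra: a relation maps a key tuple to a row.
Relation = Dict[Key, Row]

# Temporal relational algebra: the same key now maps to a function
# from time to row, i.e. every key owns an entire timeline of rows.
TemporalRelation = Dict[Key, Callable[[Time], Row]]

def snapshot(tr: TemporalRelation, t: Time) -> Relation:
    """Evaluate every timeline at instant t, collapsing the temporal
    relation back into an ordinary relation."""
    return {key: timeline(t) for key, timeline in tr.items()}
```

The cross-product point above shows up as soon as you try to join two of these: you have to line up both the key dimension and the time dimension.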
Can you talk more about why you should use Postgres unless you have a compelling reason not to? I'm currently investigating options for storing high-frequency data. Postgres is looking like a good option.
Postgres has a lot of engineering work put into it to handle all kinds of use cases, plus massive effort by the community building tutorials, videos, wrappers, libraries, etc. If you have a problem with Postgres config, or are trying to get it to do something odd, there is undoubtedly a bunch of Stack Overflow discussions about that thing. For most other databases, the selection of all of those is much thinner.
Additionally, we often believe our application needs feature X, and there is some database tech that purports to excel at X, but the fact is Postgres can probably do X unless you get to some extreme velocity or volume. Furthermore, by going with another database you are almost always giving up features W, Y, and Z that you don't realize you need, which Postgres supports and the "exotic" database you are thinking of doesn't.
In short, Postgres has amazing breadth and depth in features and support and tooling. Be sure you are ok giving that up!
If you like Postgres, you may want to try TimescaleDB, which is a time-series database built on Postgres (packaged as a Postgres extension). Postgres database + time-series database all in one.
This btw is one of the reasons I love Postgres - its extensibility.
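For anyone who hasn't seen it, the setup is pretty minimal. A hedged sketch (connection details, table, and column names are made up; assumes the extension is installed on the server):

```python
import psycopg2

# Illustrative only: local Postgres with the TimescaleDB extension available.
conn = psycopg2.connect("dbname=metrics user=postgres")
cur = conn.cursor()

# Enable the extension, then define an ordinary Postgres table...
cur.execute("CREATE EXTENSION IF NOT EXISTS timescaledb;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS conditions (
        time        TIMESTAMPTZ      NOT NULL,
        device_id   TEXT             NOT NULL,
        temperature DOUBLE PRECISION
    );
""")

# ...and turn it into a hypertable partitioned on the time column.
# Joins, indexes, and the rest of SQL keep working as plain Postgres.
cur.execute("SELECT create_hypertable('conditions', 'time', if_not_exists => TRUE);")
conn.commit()
```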
Did you guys used to call it just ScaleDB? I remember talking to some people about something called ScaleDB that was built on top of Postgres back when I was looking for database solutions for another product, this was in 2015. We ended up going with Druid for that.
What do you mean by high-frequency data? 100 Hz, 1 kHz, 100 kHz? For those kinds of use cases many time-series DBs break apart. We have customers storing multiple millions of high-frequency measurements per second in arrays.
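One way to read "in arrays" is the common trick of batching many samples into a single row via an array column instead of one row per sample; a rough Postgres sketch with made-up names:

```python
import psycopg2

# Illustrative schema: one row per device per second, with that second's
# samples packed into a DOUBLE PRECISION[] column. At 1 kHz this turns
# 1,000 inserts per device per second into a single row.
conn = psycopg2.connect("dbname=metrics user=postgres")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS samples (
        bucket    TIMESTAMPTZ        NOT NULL,  -- start of the 1-second window
        device_id TEXT               NOT NULL,
        readings  DOUBLE PRECISION[] NOT NULL   -- raw samples for that window
    );
""")

batch = [0.1 * i for i in range(1000)]  # stand-in for samples off a sensor
cur.execute(
    "INSERT INTO samples (bucket, device_id, readings) VALUES (now(), %s, %s);",
    ("device-42", batch),
)
conn.commit()
```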
I would say Postgres is not too storage-efficient in itself for large amounts of data, especially if you need any sort of index. Timescale basically mitigates that by automatically creating new tables in the background ("chunks") and keeping the individual tables small.
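You can poke at that directly; assuming the hypothetical conditions hypertable from above, something like:

```python
import psycopg2

conn = psycopg2.connect("dbname=metrics user=postgres")
cur = conn.cursor()

# Keep each background chunk table covering one day of data, then list the
# chunk tables TimescaleDB has created so far.
cur.execute("SELECT set_chunk_time_interval('conditions', INTERVAL '1 day');")
cur.execute("SELECT show_chunks('conditions');")
for (chunk,) in cur.fetchall():
    print(chunk)   # e.g. _timescaledb_internal._hyper_1_1_chunk
conn.commit()
```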
TimescaleDB also implements compression. From the docs:
When compression is enabled, TimescaleDB converts data stored in many rows into an array. This means that instead of using lots of rows to store the data, it stores the same data in a single row. Because a single row takes up less disk space than many rows, it decreases the amount of disk space required, and can also speed up some queries.
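Enabling it looks roughly like this (again using the hypothetical conditions hypertable from earlier; the segment-by/order-by choices depend entirely on your query patterns):

```python
import psycopg2

conn = psycopg2.connect("dbname=metrics user=postgres")
cur = conn.cursor()

# Turn on columnar compression: compressed batches are grouped by device
# and ordered by time within each batch...
cur.execute("""
    ALTER TABLE conditions SET (
        timescaledb.compress,
        timescaledb.compress_segmentby = 'device_id',
        timescaledb.compress_orderby   = 'time DESC'
    );
""")

# ...and compress chunks automatically once they are older than a week.
cur.execute("SELECT add_compression_policy('conditions', INTERVAL '7 days');")
conn.commit()
```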
We tried to use Postgres with the TimescaleDB plugin for high-frequency data several TB in size. It was unusable. We switched to ClickHouse, which was roughly 50-100x faster on the same hardware and used about 10 times less disk space. They use very different storage engines with different functionality, so check the docs to see what fits your use case.
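For comparison, the ClickHouse side of this is a column-oriented MergeTree table; a minimal sketch with made-up names, using the clickhouse-driver package:

```python
from datetime import datetime
from clickhouse_driver import Client  # pip install clickhouse-driver

# Illustrative only: a MergeTree table sorted by (device, ts), which is where
# ClickHouse's scan speed and compression on this kind of workload come from.
client = Client("localhost")
client.execute("""
    CREATE TABLE IF NOT EXISTS measurements (
        ts     DateTime64(3),
        device String,
        value  Float64
    )
    ENGINE = MergeTree()
    ORDER BY (device, ts)
""")

client.execute(
    "INSERT INTO measurements (ts, device, value) VALUES",
    [(datetime(2024, 1, 1), "device-42", 0.5)],
)
```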