
> From my experience, the vast majority of complications in systems is people not realizing they are asking an OLAP question while wanting parts of OLTP semantics.

If you could elaborate on this further, I and others are probably very interested in reading more about it.


As the sibling says, it is easy to think about in terms of what you are doing with the data. Reporting on how many transactions of a type have happened over a duration? Probably OLAP. Really, if the word "report" shows up, at all, probably OLAP. Executing parts of a workflow for an active transaction? OLTP.

Briefly looking, I can't find the books that I thought covered a lot of this in a good way. Will keep looking, apologies.


Designing Data Intensive Applications is a very good book in this space.


Designing Data Intensive Applications by Martin Kleppmann is required reading if you're in this space.

I can't recommend it highly enough.

[1] https://www.amazon.com/Designing-Data-Intensive-Applications...


Indeed, I upvoted that response to me, but I should have said as much. This is the book I couldn't remember the name of for the life of me. Really good book.


The main data-access difference between OLAP systems and OLTP systems is how many records a query needs to access on average:

- OLAP: most queries need most records (aggregations span large swaths of data)

- OLTP: most queries access just a few records

Also, in many OLAP cases you can live with a single-updater model without much trouble, whereas in OLTP the strength is supporting many concurrent (but mostly non-overlapping) updaters.


- OLAP: read-mostly, table-scan heavy, many queries run ad-hoc by users

- OLTP: write-mostly, index-seek heavy, ~all queries pre-defined up front


OLAP - Most queries need an aggregate of records. Generally you do NOT need most records, but simply the records grouped by dimensions per interval (for almost all OLAP reporting). You do not change the data, you observe it. If you change it, you are not dealing with OLAP data.

OLTP - You are dealing with the ins and outs of people using stuff to do things. You buy something, you check out something, you somehow perturb the state of things. This should not require a large number of row lookups in 99.9% of cases.
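To make the two access patterns above concrete, here is a small sketch using Python's built-in sqlite3. The `orders` table and its columns are made up for illustration; the point is the shape of the queries, not the engine:

```python
import sqlite3

# Toy "orders" table; names and data are invented for this example.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 10.0), ("east", 20.0), ("west", 5.0), ("west", 15.0)],
)

# OLAP-style: aggregate grouped by a dimension -- touches every row,
# changes nothing. The word "report" is a good tell.
report = con.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(report)  # [('east', 30.0), ('west', 20.0)]

# OLTP-style: perturb the state of one record via a key lookup.
con.execute("UPDATE orders SET amount = amount + 1 WHERE id = 1")
print(con.execute("SELECT amount FROM orders WHERE id = 1").fetchone())  # (11.0,)
```

The first query is the kind you'd hand to a warehouse; the second is the kind a checkout workflow runs thousands of times a second.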


So the first focuses on analytics and reporting, the second on transactions and performance. They are not meant to replace each other, they serve different purposes. Some teams may need both.


Apart from the above differences, another important one is that OLAP storage is often columnar, as opposed to the typical row-based OLTP storage, so OLAP queries use different kinds of indexes. Snowflake has introduced Hybrid Tables, where the same data is stored and indexed twice: once in OLAP columnar form and once with an OLTP-style row index.
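The memory-layout intuition behind columnar storage can be shown in a few lines of plain Python (this is only an illustration of the idea, not how any particular engine implements it; the data is invented):

```python
# Row store: each record is a tuple; an aggregate over one column still
# has to walk every tuple and pluck out a single field.
rows = [(1, "east", 10.0), (2, "east", 20.0), (3, "west", 5.0)]

# Column store: the values of one column sit contiguously, so an
# aggregate scans only the bytes it actually needs.
amount_col = [10.0, 20.0, 5.0]

row_sum = sum(r[2] for r in rows)  # touches whole rows
col_sum = sum(amount_col)          # touches only the one column
print(row_sum, col_sum)  # 35.0 35.0
```

Both produce the same answer, but on a wide table with billions of rows the column scan reads a tiny fraction of the data, which is why analytic engines favor it.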


> then I'd call that a no-downtime upgrade!

It'd be really convenient (well, not for me, but for others) if we could tell our customers this. However, those of us running a DBaaS do have to offer an actual no-downtime upgrade.


Zero-downtime Postgres upgrades have been kind of normalized, at least in the environments I have been exposed to, with pgcat:

  https://github.com/postgresml/pgcat


Indenting with 2 spaces is for code formatting, which is why the URL isn't a link. Don't indent it if you want a link:

https://github.com/postgresml/pgcat


Is there some resource that explains how to do a major version upgrade with pgcat? Would love to take a look


Probably something like the steps listed in this blog post: https://www.instacart.com/company/how-its-made/zero-downtime...
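The general shape of that kind of cutover, sketched below, is logical replication plus a pause-and-switch at the pooler. This is a hedged outline, not a runbook: the host names (`old-primary`, `new-primary`), publication/subscription names, and the `pgcat.toml` details are all placeholders, and it assumes pgcat's pgbouncer-style admin console (PAUSE/RELOAD/RESUME) on its admin database:

```shell
# 1. Stand up the new-version cluster and replicate into it.
psql "host=old-primary dbname=app" \
  -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"
psql "host=new-primary dbname=app" \
  -c "CREATE SUBSCRIPTION upgrade_sub
      CONNECTION 'host=old-primary dbname=app' PUBLICATION upgrade_pub;"

# 2. Pause client traffic at the pooler (connections queue, not error).
psql "host=pgcat-host port=6432 dbname=pgcat" -c "PAUSE;"

# 3. Wait for the subscriber to fully catch up (compare LSNs), then
#    point pgcat's pool config at new-primary and reload.
psql "host=pgcat-host port=6432 dbname=pgcat" -c "RELOAD;"

# 4. Resume traffic; queued queries drain against the upgraded primary.
psql "host=pgcat-host port=6432 dbname=pgcat" -c "RESUME;"
```

The "zero downtime" property comes from step 2: clients see a brief latency bump while queries queue, rather than dropped connections or errors.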


As far as I know, there is not. I could probably write something up.


This is really cool/useful to know about - thanks for dropping the link!


I do consulting for multiple GPU hosting companies. DeepSeek means more things can be done with less, and the demand for GPU access has only increased.

The analysts, analysts who do not code, analysts who don't run GPU clouds, are wrong.

The anals are wrong.


This is my intuition as well. DeepSeek and the proliferation of local LLMs have shown us just how much can be done with relatively light hardware. It finally has me wanting to build a rig for local use, when before I felt the technology wasn't quite there for me to do anything too useful with consumer hardware. This changed overnight. I imagine there are many folks like me.


There's also a lot you can do with not-brand-new server hardware that doesn't cost that much. The hard part is finding a location to put it where the noise is not bothersome, though many who are "upper middle class" or on low-cost private land can manage to make this happen. I have seen people who don't have a lot of money, mostly young people, do a lot on private land. Small dedicated server buildings are a thing.


Or, we just end up in a Kowloon Walled City cyberpunk hellscape with water cooling pipes and fans running through our entire environment.


I have been consulting with startups in the SF Bay Area, many of them in their B-series and C-series funding rounds, with offices in SF skyscrapers and the general SV / San Jose area. At the C-series company I have been involved with long enough, I can count the closet Trump supporters on more than one hand. At the B-series companies, it's more than just one person.

Personally, I'm not among them, but I don't lean into "Orange Man is Hitler" either. I grew up in a place where I learned what real extremists actually look like.

Treating those with different opinions as enemies is what brews the "fascism" we are hearing so much about but aren't going to see, not in this presidential term at least. We need to try actually talking to people more.

But these people are never going to admit they are Trump supporters. If you think you are someone they should believe they could have a rational conversation with, ask yourself if you still believe the "fine people" hoax, and check what Snopes (yes, Snopes) has to say about that hoax before replying here.


  https://www.nytimes.com/2025/01/24/us/embassy-us-flag-blm-gay-pride.html


You're in good hands.

  https://vanillaos.org/team


A good law would be that if a customer's data is leaked, any and all revenue that was made with/through that customer must be returned to the customer. All of a sudden companies will magically remember how to do half-way sober IT again.


This would be awesome. Few if any companies would be able to take the risk of storing customer info, since they would need very good security, a very good reason for every piece of data they store, and insurance to cover themselves in case they do lose your data. In fact, companies would go out of their way to not store any of your data.


> since they would need very good security

As someone with 20+ years experience in IT/DevOps/Cloud/whatever, I disagree.

They would simply need to actually use the security that is already there. Data leaks that happen due to lack of "very good security" are extremely rare. In almost every case, someone was doing something very stupid that everyone already agrees is a very obvious thing to not do.

.

> In fact companies would go out of their way to not store any of your data.

The companies that already use existing IT systems, as they are designed to be used, have no problem protecting customer data and not leaking it. The companies that cannot properly hire or outsource competent IT people shouldn't be storing data in the first place. Commerce is subject to regulation, due to human nature, and different regulation is needed today.

.

> and insurance to cover themselves in case they do lose your data

I would prefer that this kind of insurance not exist.


No JMAP huh.


I wonder how many of those are bots.

