
Which problem is more serious? 1) your small company has an over-complex system that could have been postgres; 2) your medium-sized company has a postgres that's on fire at the bottom of the ocean every day despite the forty people you hired to stabilize postgres, and your scalable replacement system is still six months away?


#1 is more serious. #2 limits the growth of your already successful company. #1 sinks your struggling small business. You have to be successful to be a victim of your own success, after all. Not to mention the fact that #1 is way more common. Do you know how far Postgres scales? Because it's way past almost any medium-scale business.


Exactly. A lot of us work at #2, so we wish our predecessors had saved us from our current pain. But if they had gone that route, we wouldn't be employed at that company, because it wouldn't exist.


Exactly. If a medium-sized company is struggling with Postgres, either they have very niche requirements or the scalability problems are in their own code.


What about #1b: you have an overly-complex "system", but most of that "system" is serverless (i.e. managed architecture that's Somebody Else's Problem), with your own business-logic only being exposed to a rather simple API?

I'm thinking here of engineering teams who, due to worries about scaling their query IOPS, turn not to running a Hadoop cluster, but rather to using something like Google's BigTable.
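(To make the "rather simple API" bit concrete, here is a hedged sketch with the Cloud Bigtable Python client; the project/instance/table names and the column family are placeholders, not anything from this thread.)

    # Minimal Bigtable read/write sketch; all names below are placeholders.
    from google.cloud import bigtable

    client = bigtable.Client(project="my-project")
    instance = client.instance("my-instance")
    table = instance.table("my-table")

    row = table.direct_row(b"user#123")
    row.set_cell("profile", "name", b"Ada")   # column family, column qualifier, value
    row.commit()

    got = table.read_row(b"user#123")
    print(got.cells["profile"][b"name"][0].value)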


Sounds like a best practice to me?


Probably 3) the system you overengineered too early solved the wrong problem, and your replacement is six months away, but you've paid for it twice.


I have very rarely seen the second scenario, but the first seems more common.


Isn't the second example representative of all tech debt / neglect ever? If so, it's very common.


In the second scenario, they can't do math. They could have bought themselves 6-18 months by getting the most powerful machine available, for probably at most 1-2 salaries' worth of those 40 people.
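Back-of-envelope, with salary and hardware numbers that are purely illustrative assumptions (not from this thread):

    # Rough cost comparison; both figures below are assumed round numbers.
    engineers = 40
    loaded_salary = 200_000      # USD/year per engineer, assumed
    maxed_out_server = 350_000   # USD for the biggest single box you can buy, assumed

    print(engineers * loaded_salary)          # ~$8M/year on people
    print(maxed_out_server / loaded_salary)   # ~1.75 salaries buys the machine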

Less than a single-digit percentage of workloads needs massive, hard-to-use horizontal scale-out (for things that could be solved on a single machine, or a single database).

MR is useful as an ad hoc scheduler over data. Need to OCR 10k files? MR it.
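Something in that spirit, as a local sketch of the same map-only pattern (multiprocessing stands in for the cluster; pytesseract and the scans/ directory are illustrative assumptions):

    # Map-only "MapReduce" used purely as a scheduler: one independent task per file.
    from multiprocessing import Pool
    from pathlib import Path

    def ocr_one(path_str):
        # pytesseract/PIL are stand-ins for whatever the per-file work actually is.
        from PIL import Image
        import pytesseract
        return path_str, pytesseract.image_to_string(Image.open(path_str))

    if __name__ == "__main__":
        files = [str(p) for p in Path("scans/").glob("*.png")]   # the "10k files"
        with Pool() as pool:
            for path_str, text in pool.imap_unordered(ocr_one, files):
                Path(path_str).with_suffix(".txt").write_text(text)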

Hadoop was the worst possible implementation of MR, wasted so much of everything. That was its primary strength.


Very early on in my enterprise career, in a continuation of a discussion where it was mentioned that our customer was contemplating a terabyte disk array (one that would fill an entire server rack, so very fucking early), I learned about the great-grandfather of NVMe drives: battery-backed RAM disks that cost $40k inflation-adjusted.

“Why on earth would you spend the cost of a brand new sedan on a drive like this?” I asked. Answer: to put the Oracle or DB2 WAL data on so you could vertically scale your database just that much higher while you tried to solve the throughput problems you were having another way. It was either the bargaining phase of loss or a Hail Mary you could throw in to help a behind-schedule rearchitecture. Last resort vertical scaling.


Reminds me of when I had a 3-machine Hadoop cluster in my home lab and 2 nodes were turned off, but I was submitting jobs to it and getting results just fine.

I remember all the people pushing erasure-code-based distributed file systems pointing out how crazy it is to have three copies of something, but Hadoop could run in a degraded condition without degraded performance.


I agree. I used Disco MR to do amazing things. Trivial to use, like anyone could be productive in under an hour.

Erasure codes are awesome, but so is just having 3 copies. When you have skin in the game, simplicity is the most important driver of good outcomes. Look at the dimensions that Netezza optimized: they saw a technological window and they took it. Right now we have workstations that can push 100 GB/s from flash. We are talking about being able to sort 1 TB of data in 20 seconds from flash; the same machine could do it from RAM in 10.
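The arithmetic behind those numbers, assuming a sort touches the data about twice (one read pass, one write pass); the RAM bandwidth figure is an assumed round number:

    # Back-of-envelope sort times from raw bandwidth.
    DATA = 1e12        # 1 TB in bytes
    FLASH_BW = 100e9   # 100 GB/s from flash, as claimed above
    RAM_BW = 200e9     # aggregate RAM bandwidth, assumed
    PASSES = 2         # read everything once, write everything once

    print(PASSES * DATA / FLASH_BW)   # ~20 seconds from flash
    print(PASSES * DATA / RAM_BW)     # ~10 seconds from RAM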

https://github.com/discoproject/disco
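For flavor, the classic Disco word-count job looks roughly like this (recalled from the project's docs, so treat the details as approximate rather than authoritative):

    from disco.core import Job, result_iterator

    def fun_map(line, params):
        for word in line.split():
            yield word, 1

    def fun_reduce(iter, params):
        from disco.util import kvgroup
        for word, counts in kvgroup(sorted(iter)):
            yield word, sum(counts)

    if __name__ == '__main__':
        job = Job().run(input=["http://discoproject.org/media/text/chekhov.txt"],
                        map=fun_map, reduce=fun_reduce)
        for word, count in result_iterator(job.wait(show=True)):
            print(word, count)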

I need to give Ray and Dask a try.

I don't know where to put this comment so I'll put it here. DeWitt and Stonebraker are right, but also wrong. Everyone is talking past each other there. Both are geniuses, but this essay wasn't super strong.

If I were their editor, I would say: reframe it as "MapReduce is an implementation detail; we also need these other things for this to be usable by the masses." Their point about indexes proves my point about talking past each other. If you are scanning the data basically once, building an index is a waste.


No, plenty of tech debt is caused by over-engineering or prematurely optimizing for the wrong thing.

I'm not sure if the second outcome is meant to blame Postgres specifically or under-engineering in general, but neither seems to me like it should be a concern for an early-stage startup.


I generally classify tech debt more as a long todo/wish list that we'll never get a chance to work on rather than a server or service being on fire.


I have found that these fires become uncontrollable because of tech debt. While rarely the spark, it's a latent fuel source.

It’s like our modern forests; unless something clears out the brush, we see wildfires start from the smallest spark. Once it starts, it’s almost impossible to do anything but try to limit the extent of the disaster.


This was true in 2009. Since then, multiple PostgreSQL-compatible databases have launched.



