Hacker News | TheSoftwareGuy's comments

>Hard-code some logic to identify cranes and always assume there's a cable dangling from the end.

Probably this one. Even if the drone sees the crane, there's no guarantee the cable won't move faster than the drone can react.


>It's not like there's some secret sauce here in most of these implementation details. If there was, I'd understand not telling us. This is probably less an Apple-style culture of secrecy and more laziness and a belief that important details have been abstracted away from us users because "The Cloud" when in fact, these details do really matter for performance and other design decisions we have to make.

Having worked inside AWS I can tell you one big reason is the attitude/fear that anything we put in our public docs may end up getting relied on by customers. If customers rely on the implementation to work in a specific way, then changing that detail requires a LOT more work to prevent breaking customers' workloads, if it is even possible at that point.


Right now, it is basically impossible to reliably build full applications with things like DynamoDB (among other AWS products), without relying on internal behaviour which isn't explicitly documented.


I've built several DynamoDB apps, and while you might have some expectations of internal behaviour, you can build apps that are pretty resilient to change of the internal behaviour but rely heavily on the documented behaviour. I actually find the extent of the opacity a helpful guide on the limitations of the service.


Agree. TTL 48h SLA comes to mind.


I am also a former AWS employee. What non public information did you need for DDB?


Try ingesting a complete WHOIS dump into DDB sometime. This was before autoscaling worked at all when I tried... but it absolutely wasn't anything one can consider fun.

In the end, after multiple implementations, we finally had to use a Java Spring app on a server with a LOT of RAM just to buffer the CSV reads without blowing up on the pushback from DDB. I think the company spent over $20k in a couple of months on different efforts in a couple of different languages (C#/.Net, Node.js, Java) across a couple of different routes (multiple queues, lambda, etc) just to get the initial data ingestion working a first time.

The Node.js implementation was fastest, but would always blow up a few days in without the ability to catch with a debugger attached. The queues and lambda experiments had throttling issues similar to the DynamoDB ingestion itself, even with the knobs turned all the way up. I don't recall what the issue with the .Net implementation was at the time, but it blew up differently.

I don't recall all the details, and tbh I shouldn't care, but it would have been nice if there was some extra guidance on trying to take in a few GB of CSV into DynamoDB at the time. To this day, I still hate ETL work.
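For anyone hitting the same pushback today, the usual shape of the fix is to read the CSV in small chunks and retry throttled writes with capped exponential backoff plus jitter. A minimal sketch of that idea, not what we actually ran: `Throttled`, `write_batch`, `write_with_backoff` and `chunked` are all hypothetical names, and `write_batch` stands in for whatever client call does the real write.

```python
import random
import time


class Throttled(Exception):
    """Hypothetical signal that the data store pushed back on a write."""


def write_with_backoff(write_batch, batch, max_attempts=8, base=0.05, cap=5.0,
                       sleep=time.sleep, rng=random.random):
    """Call write_batch(batch), retrying on Throttled with capped backoff and jitter.

    write_batch is a hypothetical callable supplied by the caller.
    Returns the number of attempts it took to succeed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            write_batch(batch)
            return attempt
        except Throttled:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            # Full jitter: wait a random fraction of the capped exponential delay.
            delay = min(cap, base * 2 ** (attempt - 1)) * rng()
            sleep(delay)


def chunked(rows, size):
    """Yield lists of at most `size` rows so only one chunk is held in memory."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk
```

Feeding `chunked(csv.reader(f), n)` into `write_with_backoff` keeps memory flat and slows the reader down whenever the store pushes back, instead of buffering everything in RAM.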



Cool... though that would make it difficult to get the hundred or so CSVs into a single table, since it isn't supported. I guess stitching them before processing would be easy enough... also, no idea when that feature became available.


It’s never been a good idea to batch ingest a lot of little single files using any ETL process on AWS, whether it be DDB, Aurora MySQL/Postgres using “load data from S3…”, Redshift batch import from S3, or just using Athena (yeah I’ve done all of them).


These weren't "little" single files... just separated by tld iirc.


Why would you expect an OLTP db like DDB to work for ETL? You'd have the same problems if you used Postgres.

It's not like AWS is short on ETL technologies to use...


Even in an OLTP db, there is often a need to bulk import and export data. AWS has methods in most supported data stores - ElasticSearch, DDB, MySQL, Aurora, Redshift, etc - to bulk insert from S3.


A tool to look at hot partitions, for one thing.



The keyword here is "should" :) Back then DynamoDB also had a problem with scaling: the data can be easily split into partitions, but it's never merged back into fewer partitions.

So if you scaled up and then down, you might have ended with a lot of partitions that got only a few IOPS quota each. It's better now with burst IOPS, but it still is a problem sometimes.
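The arithmetic behind that looks roughly like this. A minimal sketch, assuming the provisioned throughput is split evenly across partitions and that partitions are never merged; the function name and every number here are made up for illustration and are not DynamoDB's actual limits.

```python
def per_partition_iops(provisioned_iops: float, partitions: int) -> float:
    """Even split of a table's provisioned IOPS across its partitions (assumed model)."""
    if partitions <= 0:
        raise ValueError("partitions must be positive")
    return provisioned_iops / partitions


# Hypothetical table: starts small, is scaled up, then scaled back down.
start = per_partition_iops(1_000, partitions=2)
# Scaling up splits the data into more partitions (40 is an invented count).
peak = per_partition_iops(20_000, partitions=40)
# Scaling back down keeps all 40 partitions, so each one gets a thin slice.
after = per_partition_iops(1_000, partitions=40)

print(start, peak, after)
```

Same table-level quota as at the start, but any single hot key now has a much smaller per-partition budget to work with.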


Totally incorrect for Dynamo.

It was probably correct for Cognito 1.0.


And yet "Hyrum's Law" famously says people will come to rely on features of your system anyway, even if they are undocumented. So I'm not convinced this is really customer-centric, it's more AWS being able to say: hey sorry this change broke things for you, but you were relying on an internal detail. I do think there is a better option here where there are important details that are published but with a "this is subject to change at any time" warning slapped on them. Otherwise, like OP says, customers just have to figure it all out on their own.


You're right, people absolutely do rely on internal behavior intentionally and sometimes even unintentionally. And we tried our hardest not to break any of those customers either. But the point is that putting something in the docs is seen as a promise that you can rely on it. And going back on a promise is the exact opposite of the "Earns Trust" leadership principle that everyone is evaluated against.


Sure, but the court isn’t going to consider hyrum’s law in a tort claim, but might consider AWS documentation - even with a disclaimer - with more weight.

Rely on undocumented behavior at your own risk.


Has Amazon ever been taken to court for things like this? I really don't think this is a legal concern.


I don't buy the legal angle. But if I was an overworked Amazon SWE I'd also like to avoid the work of documentation and a proper migration the next time implementation is changed.


Amazon is involved in so many lawsuits right now, I honestly can’t tell. I did some google searches and gave up after 5+ pages.


Thanks for this, that's a really insightful comment.



You have been quoted by Simon Willison on his blog - his blog is popular on HN.

https://simonwillison.net/2025/Sep/8/thesoftwareguy/#atom-ev...


Just add an option to re-enable spacebar heating.


Sounds like your organization isn’t learning from these periods of high bills. What led to the bill creeping up, and what mechanisms could be put in place to prevent them in the first place?


At only 20k a month, the work put into reducing the bill back down probably costs more in man hours than the saving, time which would presumably be better spent building profitable features that more than make up for the incremental cloud cost. Assuming of course the low hanging fruit of things like oversized instances, unconstrained cloudwatch logs and unterminated volumes have all been taken care of.


> what mechanisms could be put in place to prevent them in the first place?

Those mechanisms would lead to a large reduction in their "engineering" staff and the loss of potential future bragging rights in how modern and "cloud-native" their infrastructure is, so nobody wants to implement them.


I'd be wary of draining the battery while the car is off. You don't want to prevent the car from starting.


The ~10 or 20mA or so one of these things draws would take months to do that.
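Back-of-the-envelope version, as a rough sketch: the 50 Ah capacity is an assumed figure for a typical car battery (not from the thread), `days_to_drain` is just an illustrative helper, and this ignores the car's own standby draw and self-discharge.

```python
def days_to_drain(capacity_ah: float, draw_ma: float, usable_fraction: float = 1.0) -> float:
    """Days until a constant draw uses up the usable part of the battery."""
    if draw_ma <= 0:
        raise ValueError("draw_ma must be positive")
    hours = capacity_ah * usable_fraction / (draw_ma / 1000)  # Ah / A = hours
    return hours / 24


ASSUMED_CAPACITY_AH = 50  # assumption, adjust for your battery

for draw_ma in (10, 20):
    print(f"{draw_ma} mA: about {days_to_drain(ASSUMED_CAPACITY_AH, draw_ma):.0f} days")
```

Under those assumptions that is roughly 208 days at 10 mA and 104 days at 20 mA. A car will fail to start well before the battery is completely flat, so set `usable_fraction` below 1.0 for a more conservative number.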


Your lack of empathy is obvious when you say the benefit of these services is that "people are lazy". Many many people simply don't have extra time, and taking one thing off of their plate makes life easier. For many decades, pizza was one of the only meals you could get delivered, these services just expand that to more restaurants.


If you really wanted to look like Amazon codex you would write Java :)


I see I missed an opportunity to make it even funnier.


Is this meant to be used in production systems, or is it just a learning exercise?


Plotting is one task where I find AI coding assistants hugely beneficial. I can ask "make a plot with such and such data, one line per <blank>" etc. Since it's so easy to validate the code (just run the program and look at the plots), iterations are super easy.


That's probably 50% what I use Claude for. But always "use matplotlib's explicit / object-oriented interface and don't add comments".


Technology: Noun

1. The application of science, especially to industrial or commercial objectives.


On the other hand, I can imagine that banning one form of advertising drives those would-be advertisers to other mediums, such as the ones that drive addictive apps and such. This would in turn increase the revenue of those apps, and make that business model more attractive, compared to e.g. apps that are a one-time purchase.

