I mean, focus is a thing that Google has always struggled with. But I kind of doubt that customers who need online marketing (ads) are going to convert en masse to users who rent cloud TPUs instead.
Of course there's the general-purpose RISC-V CPU controller component, but also, each NPU is designed in troikas: one core reads data in, one performs the actual kernel work, and the third forwards data out.
You don't actually need nanosecond latency to trade effectively in futures markets but it does help to be able to evaluate and make decisions in the single-digit milliseconds range. Almost no generative model is able to perform inference at this latency threshold.
A threshold in the single-digit milliseconds range allows the rapid detection of price reversals (signaling the need to exit a position with least loss) in even the most liquid of real futures contracts (not counting rare "flash crash" events).
> The models engage in mid-to-low frequency trading (MLFT) trading, where decisions are spaced by minutes to a few hours, not microseconds. In stark contrast to high-frequency trading, MLFT gets us closer to the question we care about: can a model make good choices with a reasonable amount of time and information?
This is true for some classes of strategies. At the same time there are strategies that can be profitable on longer timeframes. The two worlds are not mutually exclusive.
Yes, but LLMs can barely cope with following complex software tutorials in order. Why would you reasonably expect them, unprompted, to understand time well enough to trade and turn a profit?
> Your database is now handling your normal transactional workload, analytical queries, AND maintaining graph structures in memory for vector search.
No. No one in production is trying to use the same instance for all of these use-cases at scale. The fundamental misunderstanding here is assuming or even "demanding" that one instance should be able to provide OLTP, OLAP and vector ops with no compromises. The workloads are fundamentally different and doing serious work requires architecting the solution much more intelligently.
Please see the famous essay 'The Rise of Worse is Better' by Richard Gabriel[0] for an appreciation of how long this kind of thing has been happening in software.
> Both postgres and redis are used with the out of the box settings
Ugh. I know this gives the illusion of fairness, but it's not how any self-respecting software engineer should approach benchmarks. You have hardware. Perhaps you have virtualized hardware. You tune to the hardware. There simply isn't another way, if you want to be taken seriously.
Some will say that in a container-orchestrated environment, tuning goes out the window since "you never know" where the orchestrator will schedule the service but this is bogus. If you've got time to write a basic deployment config for the service on the orchestrator, you've also got time to at least size the memory usage configs for PostgreSQL and/or Redis. It's just that simple.
This is the kind of thing that is "hard and tedious" for only about five minutes of LLM query or web search time and then you don't need to revisit it again (unless you decide to change the orchestrator deployment config to give the service more/less resources). It doesn't invite controversy to right-size your persistence services, especially if you are going to publish the results.
I disagree. They found that Postgres, without tuning, was easily fast enough on low-level hardware, with the added benefit of not deploying another service. Given that, tuning it isn't really relevant.
If the defaults are fine for a use case then unless I want to tune it for personal interest it’s either a poor use of my fun time or a poor use of my clients funds.
The default shared_buffers is 128MB, not even 1% of the RAM in a typical machine today. A benchmark run with these settings effectively cripples your hardware by making sure postgres ignores 99% of the available memory. It's an invalid benchmark, unless redis is similarly crippled.
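For context, un-crippling it is only a few lines of postgresql.conf. The values below are illustrative for a dedicated 16GB machine, not a recommendation:

```ini
# postgresql.conf - illustrative first-pass sizing for a dedicated 16GB box
shared_buffers = 4GB             # default is 128MB; ~25% of RAM is a common start
effective_cache_size = 12GB      # planner hint about OS cache; allocates nothing
work_mem = 64MB                  # per-sort/per-hash budget; default is 4MB
```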
> If the defaults are fine for a use case then unless I want to tune it for personal interest it’s either a poor use of my fun time or a poor use of my clients funds.
It doesn't matter if you've crippled the benchmark if the performance of both options still exceeds your expectations. Not all of us are trying to eke out every drop of performance.
And, well, if you are then you can ignore the entire post because Redis offers better perf than postgres and you'd use that. It's that simple.
Good point; even crippled, postgres was "good enough", so it doesn't change the overall message. Nonetheless, we should strive to run realistic and valid benchmarks, no?
Sometimes people host software on a server they own or rent, the server is plenty fast, and it costs literally nothing to issue those queries at the scale on which they’re needed.
Yes, that is true, but the original poster said getting rid of caches was always a good idea, when in reality the answer (as usual with engineering) is “it depends.”
I see people downvoting this. Anyone who disagrees with this, we have YAGNI for a reason - if someone said to me my performance was fine and they added caches, I would look at them with a big hairy eyeball because we already know cache invalidation is a PITA, that correctness issues are easy to create, and now you have the performance of two different systems to manage.
Amazon actually moved away from caches for some parts of its system because consistent behavior is a feature, because what happens if your cache has problems and the interaction between that and your normal thing is slow? What if your cache has some bugs or edge case behavior? If you don't need it you are just doing a bunch of extra work to make sure things are in sync.
> "If we don't need performance, we don't need caches" feels like a great broader takeaway here.
I don't think this holds true. Caches are used for reasons other than performance. For example, caches are used in some scenarios for stampede protection to mitigate DoS attacks.
Also, the impact of caches on performance is sometimes negative. With distributed caching, each match and put require a network request. Even when those calls don't leave a data center, they do cost far more than just reading a variable from memory. I already had the displeasure of stumbling upon a few scenarios where cache was prescribed in a cargo cult way and without any data backing up the assertion, and when we took a look at traces it was evident that the bottleneck was actually the cache itself.
Not really. Running out of computational resources to fulfill requests is not a performance issue. Think of things such as exhausting a connection pool. More often than not, some components of a system can't scale horizontally.
> They found that Postgres, without tuning, was easily fast enough on low level hardware
Is that production? When you bucket it into "low level" it sounds like a base case, but it really isn't.
In production you often don't have local storage, RAM is being used for all kinds of other things, your CPU is only available in small slices, there are network effects, and more.
> If the defaults are fine for a use case
Which I hope isn't the developer's edition of it works on my machine.
> copy pasting some configuration I might not really understand
Uh, yea... why would you? Do you do that for configurations you found that weren't from LLMs? I didn't think so.
I see takes like this all the time and I'm really just mind-boggled by it.
There are more than just the "prompt it and use what it gives me" use cases with the LLMs. You don't have to be that rigid. They're incredible learning and teaching tools. I'd argue that the single best use case for these things is as a research and learning tool for those who are curious.
Quite often I will query Claude about things I don't know and it will tell me things. Then I will dig deeper into those things myself. Then I will query further. Then I will ask it details where I'm curious. I won't blindly follow or trust it, just like I wouldn't a professor or anyone or anything else, for that matter. Just like I would when querying a human or the internet in general for information, I'll verify.
You don't have to trust its code or its configurations. But you can sure learn a lot from them, particularly when you know how to ask the right questions. Which, hold onto your chairs, only takes some experience and language skills.
My comment is mainly in opposition to the "five minutes" part from parent.
If you only have 5 minutes, then you can't do as you say:
> Then I will dig deeper into those things myself ...
So my point is I don't care if it's coming from an LLM or a random blog: you won't have time to know if it's really working (ideally you'd want to benchmark the change).
If you can't invest the time, it's better to stay with the defaults, which in most projects the maintainers have spent quite a bit of time making sensible.
Original commenter here. I don't disagree with your larger point. However, it turns out that the default settings for PostgreSQL have been super conservative for years; as a stable piece of infrastructure they seem to prefer defaulting to a constrained environment rather than making assumptions about resources. To their credit, PostgreSQL does ship with sample configs for "medium" and "large" deployments which are well-documented with comments and can be simply copied over the original default config.
I happen to have a good bit of experience with PostgreSQL, so that colored the "5 minutes" part of it. Still, most of the time, you "have" more than 5 minutes to create the orchestrator's deployment config for the service (which never exists by default on any k8s-based orchestrator). I'm simply saying to not be negligent of the service's own config, even though a default exists.
It's crazy how wildly inaccurate "top-of-the-list" LLMs are for straightforward yet slightly nuanced inquiries.
I've asked ChatGPT to summarize Go build constraints, especially in the context of CPU microarchitectures (e.g. mapping "amd64.v2" to GOARCH=amd64 GOAMD64=v2). It repeatedly smashed its head on GORISCV64, claiming all sorts of nonsense such as v1, v2; then G, IMAFD, Zicsr; only arriving at rva20u64 et al under hand-holding. Similar nonsense for GOARM64 and GOWASM. It was all right there in e.g. the docs for [cmd/go].
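For anyone hitting the same wall, the shape of the mapping (per the cmd/go docs) is roughly this; the commands are illustrative:

```shell
# GOAMD64 takes microarchitecture levels; GOAMD64=v3 satisfies the
# build tags amd64.v1, amd64.v2 and amd64.v3:
GOARCH=amd64 GOAMD64=v2 go build ./...

# GORISCV64 takes RVA profile names (rva20u64, rva22u64), not v1/v2
# and not ISA strings like IMAFD/Zicsr:
GOARCH=riscv64 GORISCV64=rva20u64 go build ./...

# GOWASM takes feature names:
GOOS=js GOARCH=wasm GOWASM=satconv,signext go build ./...
```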
This is the future of computer engineering. Brace yourselves.
Isn't that the whole point, to ask it specific tidbits of information? Are we to ask it large, generic pontifications and claim success when we get large, generic pontifications back?
ChatGPT is exceptionally good at using search now, but that's new this year, as of o3 and then GPT-5. I didn't trust GPT-4o and earlier to use the search tool well enough to be useful.
You can see if it's used search in the interface, which helps evaluate how likely it is to get the right answer.
I use it as a tool that understands natural language and the context of the environments I work in well enough to get by, while guiding it to use search, or just facts I know, if I want more one-shot accuracy. Just like I would if I were communicating with a newbie who has their own preconceived notions.
I mean, like most tools they work when they work and don't when they fail. Sometimes I can use an llm to find a specific datum and sometimes I use google and sometimes I use bing.
You might think of it as a cache, worth checking first for speed reasons.
The big downside is not that they sometimes fail, it's that they give zero indication when they do.
How was the LLM accessing the docs? I’m not sure what the best pattern is for this.
You can put the relevant docs in your prompt, add them to a workspace/project, deploy a docs-focused MCP server, or even fine-tune a model for a specific tool or ecosystem.
I've done a lot of experimenting with these various options for how to get the LLM to reference docs. IMO it's almost always best to include in prompt where appropriate.
For a UI lib I use that's rather new (specifically, there's a new version the LLMs aren't aware of yet), I had the LLM write me a quick Python script that crawls the docs site for the lib and feeds each page's content back into itself with a prompt describing what it's supposed to do: generate a .md document with the specifics about that thing, whether it's a component or whatever (properties, variants, etc.), in an extremely brief manner, and also build an index.md containing a short paragraph about what the library is plus a list of each component/page document generated. So in about 60 seconds it spits out a directory full of .md files. I then tell my project-specific LLM (i.e. Claude Code or Opencode within the project) to review those files and update the project's CLAUDE.md to instruct that any time we're building UI elements we should refer to the library's index.md to understand what components are available, and when it's appropriate to use one of them we _must_ review the correlating document first.
Works very very very well. Much better than an MCP server specifically built for that same lib (huge waste of tokens, the LLM doesn't always use it, etc.). Well enough that I just copy/paste this directory of docs into my active projects using that library; if I weren't lazy I'd package it up, but I'm too busy building stuff.
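The per-page step of a script like that is only a few lines. A minimal sketch (the crawl/fetch loop and the real doc URLs are omitted; everything here is illustrative):

```python
# Sketch of the per-page step of the docs-crawling script described above.
# Fetching each URL is omitted; page_to_md() is the piece worth showing:
# strip a docs page down to a terse .md summary for the LLM to consume.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def page_to_md(title: str, html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    body = " ".join(parser.parts)
    return f"# {title}\n\n{body}\n"

# Example: turning one (hypothetical) component page into a summary file.
html = ("<html><body><h1>Button</h1><p>Variants: primary, ghost.</p>"
        "<script>analytics()</script></body></html>")
print(page_to_md("Button", html))
```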
Don't ask LLMs how to do things with a tool without providing the docs for the version you're working with. They're trained on a whole bunch of different versions of things, with different flags and options and parameters, plus stackoverflow questions asked and answered by people who had no idea what they were doing, likely out of date or wrong in the first place. _Especially_ if it's the newest version: even if the model's cutoff date was after that version was released, you have no way to know it was _included_. (And especially for a programming language with ~2% market share.)
The contexts are so big now - feed it the docs. Just copy paste the whole damn thing into it when you prompt it.
So run the LLM in an agent loop: give it a benchmarking tool, let it edit the configuration, and tell it to tweak the settings, measure, and see how much of a performance improvement it can get.
That's what you'd do by hand if you were optimizing, so save some time and point Claude Code or Codex CLI or GitHub Copilot at it and see what happens.
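The skeleton of that loop is tiny; the agent just automates the tedious parts. A toy sketch, with run_benchmark() as a stand-in for pgbench or whatever real measurement you'd use:

```python
# Toy sketch of the tune-measure-keep-best loop an agent would run.
# run_benchmark() is a stand-in: in reality you'd apply the setting
# (e.g. ALTER SYSTEM + restart) and then run pgbench or your real workload.
def run_benchmark(shared_buffers_mb: int) -> float:
    # Pretend throughput improves with buffer size, then plateaus.
    return min(shared_buffers_mb, 4096) / 10.0

candidates = [128, 512, 1024, 4096, 8192]
best = max(candidates, key=run_benchmark)
print(f"best shared_buffers: {best}MB at {run_benchmark(best)} tps")
```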
Probably about 10 cents, if you're even paying for tokens. Plenty of these tools have generous free tiers or allowances included in your subscription.
I run a pricing calculator here - for 50,000 input tokens, 5,000 output tokens (which I estimate would be about right for a PostgreSQL optimization loop) GPT-5 would cost 11.25 cents: https://www.llm-prices.com/#it=50000&ot=5000&ic=1.25&oc=10
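The arithmetic behind that estimate, for anyone checking, using those GPT-5 list prices ($1.25/M input, $10/M output):

```python
# 50k input + 5k output tokens at GPT-5 list prices
input_tokens, output_tokens = 50_000, 5_000
cost = input_tokens / 1e6 * 1.25 + output_tokens / 1e6 * 10
print(f"${cost:.4f}")  # → $0.1125, i.e. 11.25 cents
```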
I use Codex CLI with my $20/month ChatGPT account and so far I've not hit the limit with it, despite running things like this multiple times a day.
Anyone can learn to unblock a sink by watching YouTube videos these days, and yet most people still hire a professional to do it for them.
I don't think end users want to "optimize their PostgreSQL servers" even if they DID know that's a thing they can do. They want to hire experts who know how to make "that tech stuff" work.
My analogy holds up. Anyone could type "optimize my PostgreSQL database by editing the configuration file" into an LLM, but most people won't - same as most people won't watch YouTube to figure out how to unblock a sink.
If you don't like the sink analogy what analogy would you use instead for this? I'm confident there's a "people could learn X from YouTube but chose to pay someone else instead" that's more effective than the sink one.
You're exactly right (original commenter here). I began my career in professional software engineering in 1998. I've despaired that trained monkeys could probably wreck this part of the economy for over 25 years. But we're still here. :D
Personally I'd like to hire a DB expert who also knows how to drive an agentic coding system to help them accelerate their work. AI tools, used correctly, act as an amplifier of existing knowledge and experience.
This is like some years ago, when everybody here gave their anecdotal evidence about how Bitcoin and blockchain were the future and how they used them every day. You were a fool if you did not jump on the bandwagon.
If the personal opinions on this site were true, half of the code in the world would be functional, Lisp would be one of the most-used languages, and Microsoft would not have bought Dropbox.
I really think the HN hive mind's opinions mean nothing. Too much money here to be real.
You can become a DB expert by reading books, forums and practicing hard.
These days you can replace those books and forums with a top tier LLM, but you still need to put in the practice yourself. Even with AI assistance that's still a lot of work.
I don't appreciate how you accuse me of "making statements that are just not true" without providing a solid argument (as opposed to your own opinion) as to why what I'm saying isn't true.
IME very, very few people tune the underlying host. Orgs like Uber or Google do, but outside of that, few people really know what they're doing or care that much. Easier to "increase the EC2 size" or whatever.
Defaults have all sorts of assumptions built into them. So if you compare different programs with their respective defaults, you are actually comparing the assumptions that the developers of those programs have in mind.
For example, if you keep adding data to a Redis server under default config, it will eat up all of your RAM and suddenly stop working. Postgres won't do the same, because its default buffer size is quite small by modern standards. It will happily accept INSERTs until you run out of disk, albeit more slowly as your index size grows.
The two programs behave differently because Redis was conceived as an in-memory database with optional persistence, whereas Postgres puts persistence first. When you use either of them with their default config, you are trusting that the developers' assumptions will match your expectations. If not, you're in for a nasty surprise.
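Concretely, the Redis half of that bargain is visible right in the shipped config:

```conf
# redis.conf defaults (illustrative excerpt)
maxmemory 0                   # 0 = no limit: grow until the OS runs out of RAM
maxmemory-policy noeviction   # at the limit, error on writes instead of evicting
```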
For someone so enthusiastic about giving feedback, you don't seem to have invested a lot of effort into figuring out how to give it effectively. Your tone and demeanor diminish the value of your comment.
Yep. I worked in a famous-big-company that had a 15 years old service that was dogslow, systemd restarts would take multiple hours.
Everyone was talking about C++ optimizations, mutex everywhere etc - which was in fact a problem.
However.. I seemed to be the first person to actually try to debug what the database was doing, and it was going to disk all the time with a very small cache.. weird..
I looked at the MySQL settings on a machine with 1TB of RAM and they were... out-of-the-box settings.
With small adjustments I improved the performance of this core system an order of magnitude.
> This is the kind of thing that is "hard and tedious" for only about five minutes of LLM query or web search time
not even! if you don't need to go super deep with tablespace configs or advanced replication right away, pgtune will get you to a pretty good spot in the time it takes to fill out a form.
It's easy. I just remind myself that, in about 5 billion years, the Sun will have sufficiently run out of fuel to begin its transition to a Red Giant. At that point, all remnants of biologic life that ever lived on the crust of the earth will be incinerated and it won't have mattered whether anyone carefully conserved the precious time remaining in their lives or not. I have so thoroughly incorporated this understanding into my psyche that I can merely blink now and all of that context is immediately present to me.