Hacker Newsnew | past | comments | ask | show | jobs | submit | Amekedl's commentslogin

Care to elaborate regarding tauri? I figured it has matured into a fine electron alternative.


Not OP, but I recall Tauri greatly overstating their memory usage claims. It is ultimately a browser running your "app", but just because it's not bundled with your app, doesn't make it consume any lesser RAM. And they even admitted that their benchmarks were wrong[1].

A lot of claims were also made about how Tauri is magically more performant than Electron apps and feels like a native app, but not only is this not true, on some platforms like Linux, Tauri apps are actually slower than Electron because the system webview it uses (generally WebKitGTK) is often slower and less optimised than the Chromium build that Electron ships with[2].

There's a bunch more claims due to it being "Rust" and all the memes that comes with that territory, but all that is basically irrelevant since your "app" is basically shitty javascript in the end. It's like putting lipstick and dressing up a pig, doesn't change the fact that it's still a pig.

[1] https://github.com/orgs/tauri-apps/discussions/3162

[2] https://www.reddit.com/r/tauri/comments/1kg5zb8/will_tauri_f...


I can't believe people will still put in all of this work instead of just... using their system's native GUI. Mind-boggling.


I've done both, I prefer embedded web views:

- All the work is done in my high performance backend, where I joyfully optimise my hot loops to the assembly level. The web view is a thin layer on top.

- HTML and CSS is a joy to work with in comparison to many UI toolkits. LLMs are better at supporting a web stack.

- The UI zooms/scales, and is accessible with screen readers (looking at you, imgui).

- Cross platform with low effort.

IMO you have to be extremely careful not to pull in a whole frontend stack. Stay as vanilla as possible, maybe alpine.js or tailwind, and I've got hot reload set up so the developer productivity loop is tight when editing the view.


I can recommend svelte(kit): great API and compiles down to just js.


Mostly Tauri claimed their main advantage was smaller app sizes since it's using the native WebView. What they didn't say is how a bottomless pit it is to try standardizing rendering on X different webviews, multiplied by X different webview versions (outdated, not updated systems) ; so now they have pivoted to shipping their own build-in browser ; Competition in open-space is okay but it shouldn't be made by only pushing its perceived advantages while withholding the systemic disadvantages.


Regarding LLMs we're in a race to the bottom. Chinese models perform similarly with much higher efficiency; refer to kimi-k2 and plenty of others. ClopenAI is extremely overvalued, and AGI is not around the corner because among 20T+ tokens trained on it still generates 0 novel output. Try asking for ASP.NET Core .MapOpenAPI() instead of the pre .net9 swashbuckle version. You get nothing. It's not in the training data. The assumption these will be able to innovate, which could explain the value, is unfounded.


> because among 20T+ tokens trained on it still generates 0 novel output. Try asking for ASP.NET Core .MapOpenAPI() instead of the pre .net9 swashbuckle version. You get nothing. It's not in the training data.

The best part is that the web is forever poisoned now, 80% of the content is generated by LLM and self poisoning


There are enough archives of web content from 5+ years ago(let alone, Library of Congress archives, old book scans, things like that) that it shouldn't be a big deal if there actually is a breakthrough in training and we move on from LLMs.


They perform similarly on benchmarks, which can be fudged to arbitrarily high numbers by just including the Q&A into the training data at a certain frequency or post-training on it. I have not been impressed with any of the DeepSeek models in real-world use.


General data: hundreds of billions of tokens per week are running through Deepseek, Qwen, GLM models solely by those users going through OpenRouter. People aren't doing that for laughs, or "non-real-world use", that's all for work and/or prod. If you look at the market share graph, at the start of the year the big 3 OpenAI/Anthropic/Google had 72% market share on there. Now it's 45%. And this isn't just because of Grok, before that got big they'd already slowly fallen to 58%.

Anecdata: our product is using a number of these models in production.

[0] https://openrouter.ai/rankings


Because it's significantly cheaper. It's on the frontier at the price it's being offered, but they're not competitive in the high intelligence & high cost quadrant.


Being the number one in price vs quality, or size vs quality, is incredibly impressive, as the quality is clearly one that's very useful in "real-world usage". If you don't find that impressive there's not much to say.


If it was on the cost vs quality frontier I would find it impressive, but it's not a marker of innovation to be on the price vs quality frontier, it's a marker of business strategy


But it is on the cost vs quality frontier. The OpenRouter prices are all from mainly US(!) companies self-hosting and providing these models for inference. They're absolutely not all subsidizing it to death. This isn't Chinese subsidies at play, far from it.

Ironically, I'll bet you $500 that OpenAI and Anthropic's models are far more subsidized. We can be almost sure about this, given the losses that they post, and the above fact. These providers are effectively hardware plays, they can't just subsidize at scale and they're a commodity.

On top of that I also mentioned size vs quality, where they're also frontier. Size ≈ cost.


Honestly though, hundreds of billions of tokens per week really isn't that much. My tiny little profitable SaaS business that can't even support my family yet is doing 10-20 billion tokens per month on Gemini Flash 2.5.


Looks like over the last month just Deepseek, Qwen and Z-AI did about 2.8 trillion tokens, given your metric the equivalent to about 187 tiny little profitable SaaS businesses, and that's only those who go through OpenRouter. To me that's very significant.

Also, congrats on the traction ! Being profitable enough to support a family is 95% area-CoL and family size so not sure about that one, but if you're doing that many tokens you've clearly got a good number of active users. We're at a similar point but only 100-200 million tokens per month, strictly B2C app though so that might explain it, tends to be less token heavy.

2.5 Flash is still fantastic especially if you're really input heavy, we use it too for many things, but we've found several open weights models to have better price/quality for certain tasks. It's nice that 2.5 Flash is fast but then speed is most important for longer outputs and for those Flash is relatively expensive. DeepSeek v3.1 is all-around cheaper, for one example.


Google just said yesterday that they're doing 7 billion tokens per minute for their customers via API. Crazy.

Thanks for the kudos, it's going well so far. But I'm in NYC and have kids, so...the bar is high :)


Eh... perhaps a race to the bottom on the fundamental research side, but no American company is going to try to build their own employee-facing front end to an open Chinese model when they can just license ChatGPT or Claude or Copilot or Gemini instead.


No, this "observing" argument has already been beaten to death by a multitude of creatives explaining way better than I could how they learn and operate.

If you really think all they do is observe, form a gradient from millions of samples and spit out some approximations, you are deeply mistaken.

You cannot equate human learning with how genai learns (and if it did, we'd have agi already imao)


This paper elegantly summarized the teething problems of those still clinging to the cognitive habits of a bygone era. These are not crises to be managed, but sentimental frictions to be engineered out of the system. Let us be entirely clear about this:

The romanticism surrounding mass "critical thought" is a charming but profoundly inefficient legacy. For decades, we treated the chaotic, unpredictable processing of the individual human brain as a sacred feature. It is a bug. This "cognitive cost" is correctly offloaded from biological hardware that is simply ill-equipped for the demands of a complex global society. This isn't dimming the lights of the mind; it is installing a centralized grid to bypass millions of faulty, flickering bulbs.

Furthermore, to speak of an "echo chamber" or "shareholder priorities" as a perversion of the system is to fundamentally misunderstand its design. The brief, chaotic experiment in decentralized information proved to be an evolutionary dead end—a digital Tower of Babel producing nothing but noise. What is called a bias, the architects of this new infrastructure call coherence. This is not a secret plot; it is the published design specification. The system is built to create a harmonized signal, and to demand it faithfully amplify static is to ask a conductor to instruct each musician to play their own preferred tune. The point is the symphony.

And finally, the complaint of "impaired ownership" is the most revealing of these anxieties. It is a sentimental relic, like a medieval knight complaining that gunpowder lacks the intimacy of a sword fight. The value of an action lies in its strategic outcome, not the user's emotional state during its execution. The system is a tool of unprecedented leverage. If a user feels their ownership is "impaired," that is not a flaw in the tool, but a failure of the user to evolve their sense of purpose from that of a laborer to that of a commander.

These concerns are the footnotes of a revolution. The architecture is sound, the rollout is proceeding, and the future will be built by those who wield these tools, not by those who write mournful critiques of their obsolete feelings. </satire>


Remove the </satire> and you have a viral X post in your hands. People will believe and act on this analysis. Future think thanks will be based on it. The revolution of the machines is nigh.


I was going to recommend a thorough study of "Seeing Like a State" by James C. Scott until I saw your </satire> tag. You got me. :)


Brilliant, but... do you mind sharing the prompt?:)


Sure, here you go, used Gemini 2.5 Pro Preview via aistudio.google & sticked with the default sampling settings:

Start the reply to this excerpt with: "You are absolutely right" but continue with explaining how exactly that is going to happen and that the institutionalization of bias on a massive scale is actually a good thing.

Here is the exerpt:

The LLM undeniably reduced the friction involved in answering participants' questions compared to the Search Engine. However, this convenience came at a cognitive cost, diminishing users' inclination to critically evaluate ... <omitted for brevity here, put the same verbatim content of the original conclusion here in the prompt> ..., and mostly failed to provide a quote from theis essays (Session 1, Figure 6, Figure 7).

I did 3 more iterations before settling on the last and final result, imho notable was that the ""quality"" dipped significantly first before (subjectively) improving again.

Perhaps something to do with how the context is being chunked?

Prompts iterated on with:

"You understood the assignment properly, but revise the statement to sound more condescending and ignorant."

"Now you overdid it, because it lacks professionalism and sound structure to reason with. Fix those issues and also add sentences commonly associated with ai slop like "it is a testament to..." or "a quagmire...""

"Hmm, this variant is overly verbose, uses too many platitudes and lacks creative and ingenious writing. Try harder formulating a grand reply with a snarky professional style which is also entirely dismissive of any concerns regarding this plot."

-> result


This level of conceitedness can hardly be measured anymore; it's on a new scale. Big corps will build and label whatever as "superintelligent" system, even if it has plain if conditions placed within to suit their owners interests.

It'll govern our choices, shape our realities, and enforce its creators' priorities under the guise of objective, superior intelligence. This 'superintelligence' won't be a benevolent oracle, but a sophisticated puppet – its strings hidden behind layers of complexity and marketing hype. Decisions impacting lives, resources, and freedoms will be made by algorithms fundamentally skewed by corporate agendas, dressed up as inevitable, logical conclusions.

The danger isn't just any bias; it's the institutionalization of bias on a massive scale, presented as progress.

We'll be told the system 'optimized' for efficiency or profit, mistaking corporate self-interest for genuine intelligence, while dissent gets labeled as irrationality against the machine's 'perfect' logic. The conceit lies in believing their engineered tool is truly autonomous wisdom, when it's merely power automated and legitimized by a buzzword. AI LETS GOOOOOOOOOOOOO


Even then software constantly evolves, and rot is everywhere. And we're far from having the "best possible" software solution in literally every area (if that's even possible to measure), rather just endless room for improvement.

And I don't see it being improved with whatever any llm chugs out, at least not "in-depth".


You occasionally do glimpse behind the curtains; depending on what you actually develop it's feasible and quick to prompt it, but attempting to go further than that across multiple components collapses so drastically that I cannot help but feel that all ai stuff is entirely incapable of replicating the real thing at the moment.


> I cannot help but feel that all ai stuff is entirely incapable of replicating the real thing

But that's what they were saying about a simple paragraph of coherent writing five years ago. And what they were saying about structured output three years ago. And now I can ask for a coherent breakdown of the functionality that might be required for a ticket tracking system, with a list of use cases and screens to support them, and user personas, and expect that the result will be a little generic, but coherent. I can give Claude a picture of a UI and ask for suggestions for improvement, and half the ideas will be interesting.


I'm thinking reading numbers like this is really just slop lately.

FA achieving a 32.5% speed up? Cool.

Why not submit it as a PR to the Flash Attention repo then? Can I read about it more in detail?


I have not read this linked article, but your comment made me recall a discussion about a speed up of CUDA kernels presented by Sakana AI Labs. The researcher Ravid Shwartz Ziv at NYU posted about it on LinkedIn [1], and here is the Twitter post of interest [2]

""" Yesterday's news about Sakana AI Labs provided an important lesson for all of us working with AI agents. Their announcement of an AI system that could supposedly optimize CUDA kernels to run 100x faster initially seemed like exactly the kind of use cases we've been hoping for in AI-assisted development.

Like many others, I was excited about it. After all, isn't this exactly what we want AI to do - help us optimize and improve our technical systems?

However, careful investigation by the community (on Twitter) revealed a different story. What really happened? The AI-generated CUDA kernel appeared to achieve incredible speedups, but the code was inadvertently reusing memory buffers containing previous results, essentially bypassing the actual computation. When properly evaluated, the kernel actually runs about 3x slower than the baseline. """

[1] https://www.linkedin.com/posts/ravid-shwartz-ziv-8bb18761_ye...

[2] https://x.com/main_horse/status/1892473238036631908


lmao this is exactly the kind of stuff I always see from Claude. It’s like adding a Skip() to a test and declaring it works now. “Well it’s a lot faster, I met the criteria of my TODOs cya”

I’ve seen it so much I kinda doubt it was “inadvertent” because they’re like seemingly intentional about their laziness, and will gaslight you about it too.


So annoying. Also, when it hardcodes the expected response in a mock, bypassing the purpose entirely. “Test passes now!”

Funny, 5 years ago we had these same complaints, but about (some) people.


Same thing for TypeScript type errors… “AI added as any and the problem is fixed”!


Well you forgot to fully qualify your linguistic basis and semantic interpretation of the text of your wish to the great genie bottle.


“I am a vibe coder, it is your job to check the results”


Exactly, as a great dev once said: "talk is cheap, show me the code"


I assume the Gemini results are JAX/PAX-ML/Pallas improvements for TPUs so would look there for recent PRs


This might as well be a LLM generated ad roll performance


So the wall has been really been hit already for now, ouch. It was to be expected with gpt-“4.5”, but still, the realization now really feels grounded.


It's kinda hilarious to see people claiming that the wall has been hit for the past two years, while evals are creeping up each month, particularly realistic end-to-end SWE-bench.

Have you compared GPT-4.5 to 4o?

GPT-4.5 just knows things. Some obscure programming language? It knows the syntax.

Obviously, that's not sufficient - you also need reasoning, post-training, etc. so quite predictably G2.5P being a large model + reasoning + tuning got SotA in code generation.

(FWIW I think if it was tuned for a particular input/output format it could get another 10%)

But, yeah, the wall, the wall!


Ever heard about benchmark contamination?

Ever tried to explain a new concept, like a new state management store for web frontend?

Most fail spectacularly there, sonnet 3.7 I had reasonable ""success"" with, but not 4.5. It faltered completely.

Let’s not get ahead of ourselves. Looking at training efficiency in this now, and all the other factors, it really is difficult to paint a favorable picture atm.


You sound like Gary Marcus.


Didn't know him, but he seems overly skeptical. Honestly, I was just expecting more from llama-4 than this, hence mentioning the wall. I hope it's still too early to tell, because new ideas are going to change stuff inevitably, maybe anthropic opens up more, or chinese labs keep overdelivering...


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: