pornel's comments | Hacker News

The Nano model has 3.2B parameters at 4-bit quantization. This is quite small compared to what you get from hosted chatbots, and even compared to open-weights models runnable on desktops.

It's cool to have something like this available locally anyway, but don't expect it to have reasoning capabilities. At this size it's going to be naive and prone to hallucinations. It's going to be more like a natural language regex and a word association game.


To me, the big win for these small local models isn't knowledge (I'll leave that to the large hosted models), but serving as a natural-language interface that can dispatch to tool calls and summarize the results. I think this is where they have the opportunity to shine. You're totally right that they're going to be awful for knowledge.


The point of these models isn't to have all the knowledge in the world available.

It's to understand enough of language to figure out which tools to call.

"What's my agenda for today" -> get more context

    cal = getCalendar()
    getWeather(user.location())
    getTraffic(user.location(), cal[0].location)

etc.

Then grab the return values from those and output:

"You've got a 9am meeting in Foobar, the traffic is normal and it looks like it's going to rain after the meeting."

Not rocket science and not something you'd want to feed to a VC-powered energy-hogging LLM when you can literally run it in your pocket.
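
A toy Rust version of that flow, fully hardcoded (none of these functions are a real API; they just stand in for the tool calls):

    struct Event { time: &'static str, title: &'static str, location: &'static str }

    // Stand-ins for the real tools; in practice these would hit the calendar,
    // weather, and traffic services.
    fn get_calendar() -> Vec<Event> {
        vec![Event { time: "9am", title: "meeting", location: "Foobar" }]
    }
    fn get_weather(_location: &str) -> &'static str { "rain after 10am" }
    fn get_traffic(_from: &str, _to: &str) -> &'static str { "normal" }

    fn main() {
        // The small model's only job is mapping "What's my agenda for today?"
        // onto these calls...
        let cal = get_calendar();
        let weather = get_weather("home");
        let traffic = get_traffic("home", cal[0].location);

        // ...and then phrasing the results.
        println!(
            "You've got a {} {} in {}, the traffic is {} and it looks like it's going to {}.",
            cal[0].time, cal[0].title, cal[0].location, traffic, weather
        );
    }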


Isn't this what Apple tried with Siri? I don't see anyone using it, and adding an LLM to the mix is going to make it less accurate.


They wrote a whole-ass paper about SLMs that do specifically this - small language models with narrow, specialized expertise.

And then went for a massive (but private and secure) datacenter instead.


Speculation: I guess the idea is they build an enormous inventory of tool-use capabilities, then this model mostly serves to translate between language and Android's internal equivalent of MCP.


I've had Gemma 3n in Edge Gallery on my phone for months. It's neat that it works at all, but it's not very useful.


There are two CUDAs – a hardware architecture, and a software stack for it.

The software is proprietary, and easy to ignore if you don't plan to write low-level optimizations for NVIDIA.

However, the hardware architecture is worth knowing. All GPUs work roughly the same way (especially on the compute side), and the CUDA architecture is still fundamentally the same as it was in 2007 (just with more of everything).

It dictates how shader languages and GPU abstractions work, regardless of whether you're using proprietary or open implementations. It's very helpful to understand peculiarities of thread scheduling, warps, different levels of private/shared memory, etc. There's a ridiculous amount of computing power available if you can make your algorithms fit the execution model.


Rust has safe and reliable GTK bindings. They used gir to auto-generate the error-prone parts of the FFI based on schemas and introspection: https://gtk-rs.org/gir/book/

Rust's bindings fully embrace GTK's refcounting, so there's no mismatch in memory management.
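
Here's roughly what that looks like with the gtk4 crate (assuming GTK 4, with the crate imported as gtk): cloning a widget handle is just a refcount bump on the underlying GObject, and the reference is released when the handle is dropped.

    use gtk::prelude::*;

    fn main() {
        gtk::init().expect("failed to initialize GTK");

        let button = gtk::Button::with_label("Hi");
        // Clone bumps the GObject refcount; both handles point to the same widget.
        let alias = button.clone();
        alias.set_label("Hello");
        assert_eq!(button.label().as_deref(), Some("Hello"));
        // Both wrappers unref on drop; the widget is freed with the last reference.
    }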


We also use gir to auto-generate our bindings. But stuff like this is not represented in gir: https://github.com/ghostty-org/ghostty/commit/7548dcfe634cd9... It could EASILY be represented in a wrapper (e.g. with a Drop trait) but that implies a well-written wrapper, which is my argument. It's not inherent in the safety Rust gives you.

EDIT:

I looked it up because I was curious, and a Drop impl is exactly what they use: https://github.com/gtk-rs/gtk-rs-core/blob/b7559d3026ce06838... and as far as I can tell this is manually written, not automatically generated from gir.

So the safety does rely on the human, not the machine.


This is a generic smart pointer. It had to be designed and verified manually, but that line of code was written once, 8 years ago, and nobody has had to remember to write this FFI glue or even call this method since. It makes the public API automatically safe for all uses of all weak refs of all GTK types.
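
The shape of the pattern is roughly this (a simplified sketch, not the actual gtk-rs-core code):

    use std::marker::PhantomData;

    // Minimal stand-in for the glib-sys bindings; the real GWeakRef layout is more involved.
    #[repr(C)]
    struct GWeakRef(*mut core::ffi::c_void);

    #[link(name = "gobject-2.0")]
    extern "C" {
        fn g_weak_ref_clear(weak_ref: *mut GWeakRef);
    }

    pub struct WeakRef<T> {
        inner: GWeakRef,
        _marker: PhantomData<T>,
    }

    impl<T> Drop for WeakRef<T> {
        fn drop(&mut self) {
            // Runs on every code path for every weak ref of every GObject type,
            // so no caller ever has to remember to clear the reference manually.
            unsafe { g_weak_ref_clear(&mut self.inner) }
        }
    }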

The Zig version seems to be a fix for one crash in a destructor of a particular window type. It doesn't look like a systemic solution preventing weak refs crashes in general.


Do you mean gtk-rs (https://gtk-rs.org/)? I have done a bit of programming with it. I respect the work behind it, but it is a monumental PITA - truly a mismatch of philosophies and design - and I would a thousand times rather deal with C/C++ correctness demons than attempt it again, unless I had hard requirements for soundness. Even then, if you use gtk-rs you are pulling in 100+ crate dependencies and who knows what lurks in those?


Yeah, Rust isn't OOP, which is usually fine or even an advantage, but GUIs are one case where it hurts, and there isn't an obvious alternative.

> gtk-rs you are pulling in 100+ crate dependencies and who knows what lurks in those?

gtk-rs is a GNOME project. A lot of it is equivalent to .h files, but each file is counted as a separate crate. The level of trust or verification required isn't that different, especially if pulling a bunch of .so files from the same org is uncontroversial.

Cargo keeps eliciting reactions to big numbers of "dependencies", because it gives you itemized lists of everything being used, including build deps. You just don't see as much inner detail when you have equivalent libs pre-built and pre-installed.

Crates are not the same unit as a typical "dependency" in the C ecosystem. Many "dependencies" are split into multiple crates, even when it's one codebase in one repo maintained by one person. Crates are Rust's compilation unit, so kinda like .o files, but not quite comparable either.

A Cargo Workspace would be conceptually closer to a typical small C project/dependency, but Cargo doesn't support publishing Workspaces as a unit, so every subdirectory becomes a separate crate.
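
As a concrete (made-up) example, one repo maintained by one person can easily show up as three entries in a dependency list:

    my-lib/
    ├── Cargo.toml    # [workspace] with members = ["core", "macros", "cli"]
    ├── core/         # published to crates.io as my-lib-core
    ├── macros/       # proc-macros must live in their own crate
    └── cli/          # published as my-lib-cli, depends on the other two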


Rust has all the features needed to do COM- and CORBA-style OOP.

As Windows proves, that is more than enough to write production GUI components, and it has powered the industry-leading 3D games API since the days of Visual Basic 5.

In fact, that is how most of Microsoft's new Rust components on Windows have been written: as COM implementations.


I remember it being bad enough for a project I was working on that the engineer working on it switched to relm: https://relm4.org

It is built on top of gtk4-rs, and fairly usable: https://github.com/hackclub/burrow/blob/main/burrow-gtk/src/...

I'm sure the gtk-rs bindings are pretty good, but I do wonder if anyone ran Valgrind on them. When it comes to C interop, Rust feels weirdly less safe just because of the complexity.


But the gtk-rs stuff has already abandoned GTK 3. Wait... I guess if the gtk-rs API doesn't change and it just uses GTK 4, that's a good way to go? Everyone can install both 3 and 4 on their system and the Rust apps will just migrate. Is that how they did it?


You're looking at this from the perspective of what would make sense for the model to produce. Unfortunately, what really dictates the design of the models is what we can train the models with (efficiently, at scale). The output is then roughly just the reverse of the training. We don't even want AI to be an "autocomplete", but we've got tons of text, and a relatively efficient method of training on all prefixes of a sentence at the same time.

There have been experiments with preserving embedding vectors of the tokens exactly without loss caused by round-tripping through text, but the results were "meh", presumably because it wasn't the input format the model was trained on.

It's conceivable that models trained on some vector "neuralese" that is completely separate from text would work better, but it's a catch-22 for training: the internal representations don't exist in a useful sense until the model is trained, so we don't have anything to feed into the models to make them use them. The internal representations also don't stay stable when the model is trained further.


It's indeed a very tricky problem with no clear solution yet. But if someone finds a way to bootstrap it, it could be a qualitative jump that reverses the current trend of innovating on ways to cut inference costs rather than improving the models.


I've wasted time debugging phantom issues due to LLM-generated tests that were misusing an API.

Brainstorming/explanations can be helpful, but also watch out for Gell-Mann amnesia. It's annoying that LLMs always sound smart whether they are saying something smart or not.


Yes, you can't use any of the heuristics you develop for human writing to decide if the LLM is saying something stupid, because its best insights and its worst hallucinations all have the same formatting, diction, and style. Instead, you need to engage your frontal cortex and rationally evaluate every single piece of information it presents, and that's tiring.


It's like listening to a politician or lawyer, who might talk absolute bullshit in the most persuasive words. =)


GitLab doesn't have an equivalent of GitHub Actions (except an alpha-quality prototype).

GitHub Actions can share the runtime environment, which makes them cheap to compose. GitLab components are separately launched Docker containers, which makes them heavyweight and unsuitable for small things (e.g. a CI component can't install a dependency or set configuration for your build, because your build won't be running there).

The components aren't even actual components. They're just YAML templates concatenated with other YAML that appends lines to a bash script. This means you can't write smart integrations that refer to things like "the output path of the Build component", because there's no such entity. It's just some bash with some env var.
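
For example, in a GitHub Actions job the steps share one environment, so a reusable action can set something up that your own build step then uses (a minimal sketch):

    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4   # installs Node into the job's environment
            with:
              node-version: 20
          - run: npm ci && npm run build  # same environment, sees that Node

A GitLab component that tried to play the setup-node role would run as its own job in its own container, so your build would never see what it installed.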


There's a lot of "promising" and "interesting" stuff, but I'm not seeing anything yet that actually works reliably.

Sooner or later (mostly sooner) it becomes apparent that it's all just a chatbot hastily slapped on top of an existing API, and the integration barely works.

A tech demo shows your AI coding agent can write a whole web app in one prompt. In reality, a file with 7 tab characters in a row completely breaks it.


> A tech demo shows your AI coding agent can write a whole web app in one prompt. In reality, a file with 7 tab characters in a row completely breaks it.

I like how all the demos just show these crazy simple, single page "web apps" that are effectively just coding tutorial material, and people eat it up. There's no talk of auth, persistence, security, deployment, performance, etc.

Cool... it vibe-coded a janky snake game with no collision detection, now what?


The simple box-shaped container and the low-framerate, low-gravity simulation don't show off what the FLIP algorithm can do.

The algorithm is a more expensive combination of two simulation methods to support both splashes and incompressibility, but the benefits are barely visible in the simple container.
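
For context, the "combination of two methods" is the per-particle velocity update: blend the resampled grid velocity (PIC, stable but dissipative) with the grid velocity delta added onto the particle's own velocity (FLIP, lively but noisy). A sketch of that core step:

    // PIC/FLIP blend; sampling the grid at the particle position is assumed to exist.
    // alpha near 1.0 keeps FLIP's splashes, alpha near 0.0 gives PIC's damped stability.
    fn blend_velocity(
        particle_vel: [f32; 2],
        grid_before: [f32; 2], // grid velocity at the particle, before the pressure solve
        grid_after: [f32; 2],  // the same sample after the incompressible pressure solve
        alpha: f32,
    ) -> [f32; 2] {
        std::array::from_fn(|i| {
            let pic = grid_after[i];
            let flip = particle_vel[i] + (grid_after[i] - grid_before[i]);
            (1.0 - alpha) * pic + alpha * flip
        })
    }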


For water simulation, look into learning compute shaders.

Eulerian (grid-based) simulation is one of the classic examples.


npm and Cargo use gzipped tarballs.

Tar is an awful format that has multiple ways of specifying file names and file sizes, so there could be some shenanigans happening.

It's also possible to make archives have different content based on case-sensitivity of the file system.
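
A sketch of why that's possible: a single tar member can carry its name in several places, and different extractors prefer different ones (the precedence below is typical, not universal):

    // Possible sources for one member's file name in a tar stream.
    fn member_name(
        classic_name: &str,          // 100-byte name field in the ustar header
        ustar_prefix: &str,          // 155-byte prefix field, joined with '/' in front
        gnu_longname: Option<&str>,  // payload of a preceding GNU 'L' (longname) entry
        pax_path: Option<&str>,      // "path" record from a preceding pax 'x' header
    ) -> String {
        if let Some(path) = pax_path {
            return path.to_string(); // pax-aware extractors prefer this
        }
        if let Some(long) = gnu_longname {
            return long.to_string(); // GNU-style extractors use this
        }
        if ustar_prefix.is_empty() {
            classic_name.to_string()
        } else {
            format!("{ustar_prefix}/{classic_name}")
        }
    }

File sizes have the same problem: the 12-byte octal field, GNU base-256 encoding, and a pax "size" record can all disagree.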


Ah. Python source distributions are the same, so there may be additional considerations there. Though in general it doesn't seem like there's much concern in the Python ecosystem about that, considering that building them will run arbitrary code anyway....

