The Nano model is 3.2B parameters at 4-bit quantization. This is quite small compared to what you get from hosted chatbots, and even compared to open-weights models runnable on desktops.
It's cool to have something like this available locally anyway, but don't expect it to have reasoning capabilities. At this size it's going to be naive and prone to hallucinations. It's going to be more like a natural language regex and a word association game.
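For a sense of scale, here is a back-of-the-envelope estimate of what 3.2B parameters at 4 bits means for storage. It's a rough sketch that counts weights only; the helper name is made up for illustration, and real quantized formats add some overhead for scales and metadata.

```python
def weights_size_bytes(num_params: int, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone.

    Assumption: ignores KV cache, activations, and per-block
    quantization metadata, so real files are somewhat larger.
    """
    return num_params * bits_per_weight / 8


NANO_PARAMS = 3_200_000_000

nano_4bit = weights_size_bytes(NANO_PARAMS, 4)
nano_fp16 = weights_size_bytes(NANO_PARAMS, 16)

print(f"4-bit: {nano_4bit / 1e9:.1f} GB")   # 4-bit: 1.6 GB
print(f"fp16:  {nano_fp16 / 1e9:.1f} GB")   # fp16:  6.4 GB
```

So the whole model is around 1.6 GB of weights, which is what makes it plausible to keep resident on a phone.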
To me, the big win for these small local models isn't knowledge-based (I'll leave that to the large hosted models), but rather a natural language interface that can dispatch to tool calls and summarize results. I think this is where they have the opportunity to shine. You're totally right that these are going to be awful for knowledge.
Speculation: I guess the idea is they build an enormous inventory of tool-use capabilities, then this model mostly serves to translate between language and Android's internal equivalent of MCP.
There are two CUDAs – a hardware architecture, and a software stack for it.
The software is proprietary, and easy to ignore if you don't plan to write low-level optimizations for NVIDIA.
However, the hardware architecture is worth knowing. All GPUs work roughly the same way (especially on the compute side), and the CUDA architecture is still fundamentally the same as it was in 2007 (just with more of everything).
It dictates how shader languages and GPU abstractions work, regardless of whether you're using proprietary or open implementations. It's very helpful to understand peculiarities of thread scheduling, warps, different levels of private/shared memory, etc. There's a ridiculous amount of computing power available if you can make your algorithms fit the execution model.
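To illustrate why fitting the execution model matters, here is a toy cost model of warp divergence. It's a deliberately simplified sketch in plain Python, not GPU code: it assumes 32-thread warps running in lockstep, and that when lanes in a warp take different branches, the warp executes each branch in turn. The function name and the costs are made up, and it ignores memory access, occupancy, and everything else.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def warp_steps(branch_of_lane, cost_of_branch):
    """Toy model: a warp pays for every distinct branch any of its lanes takes."""
    total = 0
    for start in range(0, len(branch_of_lane), WARP_SIZE):
        warp = branch_of_lane[start:start + WARP_SIZE]
        # Divergent branches are serialized within the warp.
        total += sum(cost_of_branch[b] for b in set(warp))
    return total


cost = {"cheap": 1, "expensive": 10}

# 64 threads doing the same work, assigned to lanes in two different ways.
interleaved = ["cheap", "expensive"] * 32            # every warp sees both branches
grouped = ["cheap"] * 32 + ["expensive"] * 32        # each warp sees one branch

print(warp_steps(interleaved, cost))  # 22
print(warp_steps(grouped, cost))      # 11
```

Same work, same thread count, but laying the data out so that neighbouring threads take the same path halves the cost in this model.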
Rust has safe and reliable GTK bindings. They used gir to auto-generate the error-prone parts of the FFI based on schemas and introspection: https://gtk-rs.org/gir/book/
Rust's bindings fully embrace GTK's refcounting, so there's no mismatch in memory management.
We also use gir to auto-generate our bindings. But stuff like this is not represented in gir: https://github.com/ghostty-org/ghostty/commit/7548dcfe634cd9... It could EASILY be represented in a wrapper (e.g. with a Drop trait) but that implies a well-written wrapper, which is my argument. It's not inherent in the safety Rust gives you.
This is a generic smart pointer. It had to be designed and verified manually, but that line of code has been written once 8 years ago, and nobody had to remember to write this FFI glue or even call this method since. It makes the public API automatically safe for all uses of all weak refs of all GTK types.
The Zig version seems to be a fix for one crash in a destructor of a particular window type. It doesn't look like a systemic solution preventing weak refs crashes in general.
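The "written once, safe for every type" idea can be sketched with Python's standard `weakref` module standing in for the Rust type. This is not gtk-rs code; `WeakRef`, `upgrade`, and `Window` are hypothetical names chosen for illustration. The point is only that callers can't touch a dead object, because the single accessor returns `None` once the target is gone.

```python
import gc
import weakref
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class WeakRef(Generic[T]):
    """Hypothetical generic weak reference, usable with any weak-referenceable type."""

    def __init__(self, target: T) -> None:
        self._ref = weakref.ref(target)

    def upgrade(self) -> Optional[T]:
        # Returns a strong reference, or None if the target was already freed.
        return self._ref()


class Window:
    def __init__(self, title: str) -> None:
        self.title = title


win = Window("main")
weak = WeakRef(win)

alive_before = weak.upgrade() is not None

del win        # drop the last strong reference
gc.collect()   # not needed on CPython, but harmless on other runtimes

dead_after = weak.upgrade() is None
```

Nothing here is specific to `Window`: the check lives in one generic place instead of in each type's destructor.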
Do you mean gtk-rs (https://gtk-rs.org/)? I have done a bit of programming with it. I respect the work behind it, but it is a monumental PITA - truly a mismatch of philosophies and design - and I would a thousand times rather deal with C/C++ correctness demons than attempt it again, unless I had hard requirements for soundness. Even then, if you use gtk-rs you are pulling in 100+ crate dependencies and who knows what lurks in those?
Yeah, Rust isn't OOP, which is usually fine or even an advantage, but GUIs are one case where it hurts, and there isn't an obvious alternative.
> gtk-rs you are pulling in 100+ crate dependencies and who knows what lurks in those?
gtk-rs is a GNOME project. A lot of it is equivalent to .h files, but each file is counted as a separate crate. The level of trust or verification required isn't that different, especially if pulling a bunch of .so files from the same org is uncontroversial.
Cargo keeps eliciting reactions to big numbers of "dependencies", because it gives you itemized lists of everything being used, including build deps. You just don't see as much inner detail when you have equivalent libs pre-built and pre-installed.
Crates are not the same unit as a typical "dependency" in the C ecosystem. Many "dependencies" are split into multiple crates, even when it's one codebase in one repo maintained by one person. Crates are Rust's compilation unit, so kinda like .o files, but not quite comparable either.
A Cargo Workspace would be conceptually closer to a typical small C project/dependency, but Cargo doesn't support publishing Workspaces as a unit, so every subdirectory becomes a separate crate.
I'm sure the gtk-rs bindings are pretty good, but I do wonder if anyone ran Valgrind on them. When it comes to C interop, Rust feels weirdly less safe just because of the complexity.
But the gtk-rs stuff has already abandoned GTK3. Wait... I guess if the gtk-rs API doesn't change and it just uses GTK4, that's a good way to go? Everyone can install both 3 and 4 on their system and the Rust apps will just migrate. Is that how they did it?
You're looking at this from the perspective of what would make sense for the model to produce. Unfortunately, what really dictates the design of the models is what we can train the models with (efficiently, at scale). The output is then roughly just the reverse of the training. We don't even want AI to be an "autocomplete", but we've got tons of text, and a relatively efficient method of training on all prefixes of a sentence at the same time.
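A minimal sketch of what "training on all prefixes at the same time" means, using a made-up four-word sentence in place of real tokens. The function name is illustrative only.

```python
def next_token_examples(tokens):
    """Every position in the sequence yields one (prefix, next token) example."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


sentence = ["the", "cat", "sat", "down"]

for prefix, target in next_token_examples(sentence):
    print(prefix, "->", target)
# ['the'] -> cat
# ['the', 'cat'] -> sat
# ['the', 'cat', 'sat'] -> down

# In practice the prefixes aren't materialized one by one. The same
# examples fall out of a single shifted pair of sequences, with causal
# masking making position i see only tokens up to i.
inputs, targets = sentence[:-1], sentence[1:]
```

One pass over a sequence of length n gives n-1 training signals, which is a large part of why the autocomplete objective is so cheap to scale.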
There have been experiments with preserving embedding vectors of the tokens exactly without loss caused by round-tripping through text, but the results were "meh", presumably because it wasn't the input format the model was trained on.
It's conceivable that models trained on some vector "neuralese" that is completely separate from text would work better, but it's a catch-22 for training: the internal representations don't exist in a useful sense until the model is trained, so we don't have anything to feed into the models to make them use them. The internal representations also don't stay stable when the model is trained further.
It’s indeed a very tricky problem with no clear solution yet. But if someone finds a way to bootstrap it, it may be a new qualitative jump that may reverse the current trend of innovating ways to cut inference costs rather than improve models.
I've wasted time debugging phantom issues due to LLM-generated tests that were misusing an API.
Brainstorming/explanations can be helpful, but also watch out for Gell-Mann amnesia. It's annoying that LLMs always sound smart whether they are saying something smart or not.
Yes, you can't use any of the heuristics you develop for human writing to decide if the LLM is saying something stupid, because its best insights and its worst hallucinations all have the same formatting, diction, and style. Instead, you need to engage your frontal cortex and rationally evaluate every single piece of information it presents, and that's tiring.
GitLab doesn't have an equivalent of GitHub Actions (except an alpha-quality prototype).
GitHub Actions can share runtime environment, which makes them cheap to compose. GitLab components are separately launched Docker containers, which makes them heavyweight and unsuitable for small things (e.g. a CI component can't install a dependency or set configuration for your build, because your build won't be running there).
The components aren't even actual components. They're just YAML templates concatenated with other YAML that appends lines to a bash script. This means you can't write smart integrations that refer to things like "the output path of the Build component", because there's no such entity. It's just some bash with some env var.
There's a lot of "promising" and "interesting" stuff, but I'm not seeing anything yet that actually works reliably.
Sooner or later (mostly sooner) it becomes apparent that it's all just a chatbot hastily slapped on top of an existing API, and the integration barely works.
A tech demo shows your AI coding agent can write a whole web app in one prompt. In reality, a file with 7 tab characters in a row completely breaks it.
> A tech demo shows your AI coding agent can write a whole web app in one prompt. In reality, a file with 7 tab characters in a row completely breaks it.
I like how all the demos just show these crazy simple, single page "web apps" that are effectively just coding tutorial material, and people eat it up. There's no talk of auth, persistence, security, deployment, performance, etc.
Cool...it vibe coded a janky snake game with no collision, now what?
The simple box-shaped container and low-framerate, low-gravity simulation don't show off what the FLIP algorithm can do.
The algorithm is a more expensive combination of two simulation methods to support both splashes and incompressibility, but the benefits are barely visible in the simple container.
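For reference, the blend usually looks something like the sketch below. This is the commonly used PIC/FLIP velocity update for a single particle and a single velocity component, not the code from this demo; the function name and the default `flip_ratio` are assumptions. The grid values are the velocities sampled at the particle's position before and after the incompressibility (pressure) solve.

```python
def blend_pic_flip(v_particle, v_grid_old, v_grid_new, flip_ratio=0.9):
    """Update one particle velocity component from grid velocities.

    PIC:  take the new grid velocity outright (stable, but smears detail).
    FLIP: add only the grid's change to the particle's own velocity
          (keeps splashy detail, but can get noisy).
    """
    v_pic = v_grid_new
    v_flip = v_particle + (v_grid_new - v_grid_old)
    return flip_ratio * v_flip + (1.0 - flip_ratio) * v_pic


pure_flip = blend_pic_flip(2.0, 1.0, 1.5, flip_ratio=1.0)  # 2.5
pure_pic = blend_pic_flip(2.0, 1.0, 1.5, flip_ratio=0.0)   # 1.5
half = blend_pic_flip(2.0, 1.0, 1.5, flip_ratio=0.5)       # 2.0
```

The difference between the two ends of the blend only becomes visible when particles carry velocities the grid can't represent, which is exactly what a calm box doesn't produce.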
Ah. Python source distributions are the same, so there may be additional considerations there. Though in general it doesn't seem like there's much concern in the Python ecosystem about that, considering that building them will run arbitrary code anyway....