Hacker News | new | past | comments | ask | show | jobs | submit | Tarean's comments | login

I think WasmGC is very hard to make work with laziness. A lazy value is always a closure on the heap.

If an expression might be unused, put a closure which computes it on the heap.

If the value is actually needed, invoke the closure. Optionally replace the closure with a black hole. A black hole is just a closure which pauses any thread that calls it, to be resumed once the first thread finishes evaluating the expression.

Once finished, replace it with a closure which immediately returns the computed result. (Or often skip the indirection, because most concrete values also act as closures which immediately return themselves, using info table pointer trickery.)

Anyway, iirc WasmGC wants very rigid types without dynamic type changes. Extra indirections could fix that, or maybe defunctionalizing thunks into a tagged union, but both sound expensive. Especially without being able to hook into the tracing step for indirection removal.
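For illustration, here's a minimal single-threaded sketch of that thunk lifecycle in Python (hypothetical code, not GHC's actual representation; the black-hole/blocking machinery is skipped entirely):

```python
# A thunk as a mutable cell: it starts as an "unevaluated closure" and is
# overwritten with its result the first time it is forced, which is the
# dynamic shape change that makes rigid heap types awkward.
class Thunk:
    def __init__(self, compute):
        self._compute = compute   # the suspended computation
        self._value = None
        self._forced = False

    def force(self):
        if not self._forced:
            self._value = self._compute()   # run the closure once
            self._compute = None            # drop the closure, keep the result
            self._forced = True             # now acts like an indirection
        return self._value

calls = []
t = Thunk(lambda: calls.append(1) or 42)
t.force()
t.force()   # the computation ran exactly once despite two forces
```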

Also, Haskell supports finalizers so WasmGC would need that as well.


> Anyway, iirc WasmGC wants very rigid types without dynamic type changes.

You can have dynamic type changes in the current WasmGC MVP, but they are modeled as explicit downcasts from a supertype of some sort. There's no explicit support for tagged unions; structs and downcasting are all you get at the moment.


Sometimes keeping a fixed shape for the variable context across the computation can make it easier to reason about invariants, though.

Like, if you have a constraint is_even(x) that's really easy to check in your head with some informal Floyd-Hoare logic.

And it scales to extracting code into helper functions and to multiple variables. If you must track which sets of variables form one context (x1+y1, x2+y2, etc.), I find it much harder to check the invariants in my head.

These 'fixed state shape' situations are where I'd grab a state monad in Haskell and start thinking top-down in terms of actions+invariants.
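A toy sketch of that "fixed state shape" idea in Python (hypothetical names; a state-monad-free stand-in where every action sees the same state record and the invariant is re-checked between steps):

```python
# Run a sequence of actions over one fixed-shape state, checking the
# invariant before and after every step, Floyd-Hoare style.
def run(state, invariant, actions):
    assert invariant(state)
    for action in actions:
        state = action(state)        # every action sees the same shape
        assert invariant(state)      # invariant holds at each step
    return state

is_even_x = lambda s: s["x"] % 2 == 0
bump = lambda s: {**s, "x": s["x"] + 2}      # preserves is_even(x)
final = run({"x": 0}, is_even_x, [bump, bump])
```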


I'm pretty sure recall was specifically a selling point for laptops with ai chips which could do the processing locally and reasonably efficiently?

Though storing the data locally still could make getting compromised by a targeted attack more dangerous.


This is correct - it was all on-device, with security guarantees that were instantly proven incorrect. Microsoft withdrew Recall, then brought it back with a newer, more secure implementation that was also proven insecure.

It also claimed that it wasn't going to record sensitive information but it did, to the point where some apps, like Signal, used available Windows APIs to set DRM flags on their windows so that Windows wouldn't capture those regions at all.

What Microsoft could have offered is an easy-to-implement API for application developers to opt into (but users can opt out of), plus a blanket Recall-esque toggle that users can apply to applications without explicit support. Applications like Firefox or Chrome could hook into the API to provide page content along with richer metadata than a simple screenshot could provide, while withholding that data when sensitive fields are on the page (and possibly providing ways for the HTML to mark a 'secure' area that shouldn't be indexed or captured, useful in lots of other circumstances).

But, as with everything AI, they don't want users to want it; they want users to use it regardless of whether or not they want it. This is the same reason they forced Copilot into everyone's Office 365 plans and then upped the price unless you tried to cancel; they have to justify the billions they're spending and forcing the numbers to go up is the only way to do that.


I have to wonder what edge AI would look like on a laptop. Little super mini Nvidia Jetson? How much added cost? How much more weight for the second and third batteries? And the fourth and fifth batteries to be able to unplug for more than a few minutes?

They're called NPUs and all recent CPUs from Intel, AMD, or Apple have them. They're actually reasonably power efficient. All flagship smartphones have them, as do several models down the line.

IIRC Linux drivers are pretty far behind, because no one who works on Linux is particularly interested in running personal info like screenshots or mic captures through a model and uploading the telemetry. While in general I get annoyed when my drivers suck, in this particular case I don't care.


It looks like a MacBook Pro and (maybe) a Snapdragon X2 device

Interval arithmetic is only a constant factor slower but may simplify at every step. For every operation over numbers there is a unique most precise equivalent op over intervals, because there's a Galois connection. But just because there is a most precise way to represent a set of numbers as an interval doesn't mean the representation is precise.

A computation graph which gets sampled like here is much slower but can be accurate. You don't need an abstract domain which loses precision at every step.
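A minimal interval-arithmetic sketch of that precision loss (illustrative code): each operation returns the tightest enclosing interval, yet the abstract domain still can't see that `x - x` is exactly zero.

```python
# Tightest interval versions of + and -: sound over-approximations of
# the concrete operations.
def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

x = (1.0, 2.0)
# For any concrete x, x - x == 0, but the best the interval domain can
# say is (-1.0, 1.0): each step is most precise, the result still isn't.
print(sub(x, x))
```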


It would have been sort of interesting if we’d gone down the road of often using interval arithmetic. Constant factor slower, but also the operations are independent. So if it was the conventional way of handling non-integer numbers, I guess we’d have hardware acceleration by now to do it in parallel “for free.”


You can probably get the parallelism for interval arithmetic today? Though it would probably require a bit of effort and not be completely free.

On the CPU you probably get implicit parallel execution with pipelines and re-ordering etc, and on the GPU you can set up something similar.


In my head the two dimensions are tail Vs non-tail jumps, and explicit Vs implicit scope passing.

The most interesting case is implicit scope + non-tail recursion; usually this requires you to capture the variable context in mutable objects/monads/effects or similar.

This explicit capturing is neat because you still have consistent shapes for your state to define invariants over, but it's easier to decompose the logic and ignore parts of the context which are irrelevant. It lets you split problems into domain specific languages, each of which has a set of (maybe overlapping) contexts, and ideally each of which can be proven in isolation.

Also, loop control flow is a very restricted case even of tail jumps. Tail recursion allows arbitrary jumps between basic blocks, while loops are properly nested basic blocks. Even with labeled breaks, efficiently simulating arbitrary tail recursion without goto is tough. Induction proofs/data flow analysis/abstract interpretation don't care, but I'm not sure it makes proofs easier.
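One standard goto-free workaround is a trampoline, sketched here in Python (names invented for illustration): each "basic block" returns either a final result or a thunk for the next block, and a driver loop performs the tail jumps in constant stack space.

```python
# Drive mutually tail-recursive "blocks" without goto or stack growth.
def trampoline(step):
    while callable(step):
        step = step()            # one tail jump per iteration
    return step

def even(n):
    return True if n == 0 else (lambda: odd(n - 1))

def odd(n):
    return False if n == 0 else (lambda: even(n - 1))

# Mutual tail recursion far beyond any recursion limit:
trampoline(lambda: even(100001))
```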


Twee (an equational theorem prover in Haskell used by quickspec) has an interesting take on this. Terms are slices of arrays, but you get a normal interface, including pattern matching, via synonyms. It can also be nice to use phantom types for your references (array offsets), so you can project them into flat view types type-safely.

Requires the language to have something equivalent to pattern synonyms to be as invisible as twee, though.

In twee a TermList is a slice of a bytearray (two ints for offset/length plus a pointer).

And a term is an int for the function symbol and an unpacked TermList for the arguments.

The pattern match synonyms load a flat representation from the array into a view type, and the allocation of the view type cancels out with the pattern matching so everything remains allocation free.

https://hackage.haskell.org/package/twee-lib-2.4.2/docs/Twee...


Forgot to mention: In the twee style, the int for the function id contains metadata (is it a unification variable or constant name? how many args does it take?). That way f1(f3(f5(), f7())) would be serialised as something like [1,3,5,7], without even references to other offsets
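A small Python sketch of that flat encoding (the ids, arities, and format here are invented for illustration; twee packs the arity metadata into the id itself rather than a side table): because each function id determines its arity, a preorder walk of ids alone is enough to rebuild the tree.

```python
# Arity metadata, normally packed into the function-id int itself.
ARITY = {1: 1, 3: 2, 5: 0, 7: 0}

def serialise(term):
    """Flatten (fid, args) trees into a preorder list of ids."""
    fid, args = term
    out = [fid]
    for a in args:
        out += serialise(a)
    return out

def deserialise(flat, i=0):
    """Rebuild the tree; arity tells us how many children to read."""
    fid = flat[i]
    i += 1
    args = []
    for _ in range(ARITY[fid]):
        arg, i = deserialise(flat, i)
        args.append(arg)
    return (fid, args), i

term = (1, [(3, [(5, []), (7, [])])])   # f1(f3(f5(), f7()))
serialise(term)                          # -> [1, 3, 5, 7]
```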


As of python 3.6 you can nest fstrings. Not all formatters and highlighters have caught up, though.

Which is fun, because correct highlighting depends on language version. Haskell has similar problems where different compiler flags require different parsers. Close enough is sufficient for syntax highlighting, though.

Python is also a bit weird because it calls the format methods, so objects can intercept and react to the format specifiers in the f-string while being formatted.
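Concretely, `f"{x:spec}"` calls `x.__format__("spec")`, so an object can react to whatever spec appears after the colon (the `Angle` class here is a made-up example):

```python
import math

class Angle:
    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # Intercept a custom format spec inside an f-string.
        if spec == "rad":
            return f"{self.degrees * math.pi / 180:.3f}"
        return str(self.degrees)

a = Angle(180)
print(f"{a} deg = {a:rad} rad")   # -> 180 deg = 3.142 rad
```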


I didn't mean nested f-strings. I mean this is a syntax error:

    >>> print(f"foo {"bar"}")
    SyntaxError: f-string: expecting '}'
Only this works:

    >>> print(f"foo {'bar'}")
    foo bar


You're using an old Python version. On recent versions, it's perfectly fine:

    Python 3.12.7 (main, Oct  3 2024, 15:15:22) [GCC 14.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print(f"foo {"bar"}")
    foo bar


This behaviour was introduced in 3.6 (and made part of the spec in 3.7 iirc)

From the python 3.6 change log:

New dict implementation: The dict type now uses a “compact” representation based on a proposal by Raymond Hettinger which was first implemented by PyPy. The memory usage of the new dict() is between 20% and 25% smaller compared to Python 3.5.

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon (this may change in the future, but it is desired to have this new dict implementation in the language for a few releases before changing the language spec to mandate order-preserving semantics for all current and future Python implementations; this also helps preserve backwards-compatibility with older versions of the language where random iteration order is still in effect, e.g. Python 3.5). (Contributed by INADA Naoki in bpo-27350. Idea originally suggested by Raymond Hettinger.)

https://docs.python.org/3.6/whatsnew/3.6.html#new-dict-imple...
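The observable effect of the compact dict (an implementation detail in 3.6, guaranteed by the spec since 3.7) is that iteration follows insertion order:

```python
# Keys come back in insertion order, not sorted or hash order.
d = {}
for k in ["banana", "apple", "cherry"]:
    d[k] = len(k)

print(list(d))   # -> ['banana', 'apple', 'cherry']
```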


For Regex I like lens-regex-pcre

    > import Control.Lens.Regex.Text
    > "Foo, bar" ^.. [regex|\p{L}+|] . match
    ["Foo", "bar"]
    > "Foo, bar" & [regex|\p{L}+|] . ix 1 . match %~ T.intersperse '-' . T.toUpper
    "Foo, B-A-R" 
For web requests wreq has a nice interface. The openssl bindings come from a different library so it does need an extra config line, the wreq docs have this example:

    import OpenSSL.Session (context)
    import Network.HTTP.Client.OpenSSL

    let opts = defaults & manager .~ Left (opensslManagerSettings context)
    withOpenSSL $
      getWith opts "https://httpbin.org/get"
There are native Haskell tls implementations that you could plug into the manager config. But openssl is probably the more mature option.


You are matching ASCII letters? Cute. What about Unicode character classes like \p{Spacing_Combining_Mark} and non-BMP characters?

Can you translate the examples at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe... to Haskell? This Control.Lens.Regex.Text library doesn't seem to believe in documenting the supported syntax, options, etc.


"Cute" comes across as very dismissive. I'm not sure if you intended that. lens-regex-pcre is just a wrapper around PCRE, so anything that works in PCRE will work, for example, from your Mozilla reference:

    ghci> "California rolls $6.99\nCrunchy rolls $8.49\nShrimp tempura $10.99" ^.. [regex|\p{Sc}\s*[\d.,]+|] . match
    ["$6.99","$8.49","$10.99"]
"Spacing combining mark" seems to be "Mc" so this works:

https://unicode.org/reports/tr18/#General_Category_Property

    ghci> "foo bar \x093b baz" ^.. [regex|\p{Mc}|] . match
    ["\2363"]

(U+093b is a spacing combining mark, according to https://graphemica.com/categories/spacing-combining-mark)

I think in general that Haskellers would probably move to parser combinators in preference to regex when things get this complicated. I mean, who wants to read "\p{Sc}\s*[\d.,]+" in any case?


U+093b is still in the BMP. By the way, what text encodings for source files are supported by GHC? Escaping everything isn't fun.

And I am not sold on lens-regex-pcre documentation; "anything that works in PCRE will work" comes across as very dismissive. What string-like types are supported? What version of PCRE or PCRE2 does it use?


> U+093b is still in the BMP

I'm sorry, I don't know what that means. If you have a specific character you'd like me to try then please tell me what it is. My Unicode expertise is quite limited.

> I am not sold on lens-regex-pcre documentation

Nor me. It seems to leave a lot to be desired. In fact, I don't see the point of this lens approach to regex.

> "anything that works in PCRE will work" comes across as very dismissive

Noted, thanks, and apologies. That was not my intention. I was trying to make a statement of fact in response to your question.

> By the way, what text encodings for source files are supported by GHC?

UTF-8 I think. For example, pasting that character into GHC yields:

    ghci> mapM_ T.putStr ("foo bar ः baz" ^.. [regex|\p{Mc}|] . match)
    ः
> What string-like types are supported?

ByteString (raw byte arrays) and Text (Unicode, internal representation UTF-8), as you can see from:

https://hackage.haskell.org/package/lens-regex-pcre

> What version of PCRE or PCRE2 does it use?

Whatever your system version is. For me on Debian it's:

    Package: libpcre3-dev
    Source: pcre3
    Version: 2:8.39-15


> version of PCRE

It uses https://hackage.haskell.org/package/pcre-light , which seems to link with the system version. So it depends on what you install. With Nix, it will be part of your system expression, of course.


Either Hacker News or autocorrect ate the p; it was supposed to be \p{L}, which is a Unicode character class.

As the other comment mentioned, PCRE-compatible regexes are a de-facto standard, though the PCRE spec isn't super readable. Some projects have more readable docs, like MariaDB and PHP, but it doesn't really make sense to repeat the spec in library docs: https://www.php.net/manual/en/regexp.reference.unicode.php

There are libraries for pcre2 or gnu regex syntax with the same API if you prefer those


Both strategies start from roots (e.g. the stack) and then transitively chase pointers. Any memory reachable this way is live.

To do this chasing precisely you need some metadata, saying which fields are pointers vs other data like ints. For OOP languages this metadata is stored with the vtables for objects. For the stack you need similar metadata, at the very least how many pointers there are if you put them first in the stackframe.

Not having this metadata and conservatively treating random ints as pointers isn't always catastrophic, but it has some drawbacks. Two big ones:

- a moving GC is tough: you have to update pointers after moving an object, but can't risk updating random ints that happen to have the same value

- You do more work during GC chasing random non-pointers, and free less memory meaning more GC runs

Generating this precise GC metadata for stack frames is sort of easy. You need specific safe points where all data is at known locations for the GC to work anyway, usually by spilling registers to the stack. These GC checkpoints usually coincide with heap allocation, which is why long non-allocating loops can block stop-the-world GC and send other threads into a spin-lock in many GC implementations.

Maybe a non-precise GC could treat registers as possible pointers and skip spilling to the stack at safe points? There are alternatives to spilling, like a precise stack map for each program instruction (so every instruction is a safe point), but those are expensive to process. Usually they're only used for debugging or exception handling, not for something frequent like GC.
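A toy sketch of the precise-tracing idea (hypothetical object layout, not any real collector): each object carries metadata naming its pointer fields, so the mark phase chases only real references and never mistakes an int for a pointer.

```python
# A "heap" of objects, each declaring which of its fields are pointers.
heap = {
    "a": {"ptr_fields": ["next"], "next": "b", "count": 7},
    "b": {"ptr_fields": [], "tag": 99},
    "c": {"ptr_fields": ["next"], "next": "b"},   # unreachable garbage
}

def mark(roots):
    """Transitively chase pointers from the roots; return the live set."""
    live = set()
    stack = list(roots)
    while stack:
        ref = stack.pop()
        if ref in live:
            continue
        live.add(ref)
        obj = heap[ref]
        for f in obj["ptr_fields"]:     # only known pointer fields
            stack.append(obj[f])
    return live

print(mark(["a"]))   # -> {'a', 'b'}; "c" can be freed
```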

