Kinda like advertising "Asbestos-Free Cereal", isn't it? If someone were marketing a product to me and they were super insistent about how super duper safe it was, I would probably start getting suspicious.
> But it's often disabled for the same reason as having router-level firewalls in the first place.
Yeah, anything that allows hosts to signal that they want to accept connections is likely the first thing a typical admin would want to turn off.
It’s interesting because nowadays it’s egress that is the real worry. The first thing malware does is phone home to its command-and-control (C&C) address, and that connection is what is used to actually control nodes in a botnet. Ingress being disabled doesn’t really net you all that much nowadays when it comes to restricting malware.
In an ideal world we’d have had IPv6 in the ’90s, and it would have been “normal” for firewalls to be things you have on your local machine rather than at the router level, with allowing ports being something the OS prompts the user for (similar to how Windows does it today with its “do you want to allow this application to listen for connections” prompt). But even if that were the case, I’m sure we would have still added “block all ingress” as a firewall best practice along the way regardless.
> Ingress being disabled doesn’t really net you all that much nowadays when it comes to restricting malware.
But how much of this is because ingress is typically disabled, so ingress attacks are less valuable relative to exploiting humans in the loop to install something that ends up using egress as part of its function?
Since we're talking about programs that are trying to set up a connection no matter what, I'm going to say "not much". It's not significantly shrinking the attack surface and forcing attackers onto a plan B that's meaningfully harder to do. It just adds this layer of awkwardness to everything, and attackers shrug and adapt.
You block inbound to block inbound. Of course it doesn’t do anything for outbound. Acting like you can just turn inbound filtering off because of that is disingenuous.
Port forwarding and hole punching have different objectives and outcomes, and I believe PCP only caters to the former.
While the outcomes might be similar (some inbound connections are possible), the scope (one specific external IP/port vs. everybody) and the semantics ("endorsement of public hosting" vs allowing P2P connections that are understood to require at least some third-party mediation) differ.
I also don't think that port forwarding is possible through multiple levels of firewalls (similar to "double NAT").
PCP has two operating modes, MAP and PEER. The latter should be similar to hole-punching.
And routers can forward PCP requests to their upstream routers. Some dualstack-lite routers do that and according to rumors (random internet forum comments) some CGNATs do support that.
The concurrent state machine example looks like a locking error? If the assumption is that the state shouldn't change in the meantime, doesn't that mean the lock should continue to be held? In that case Rust locks can help, because they can embed the data, which means you can't even touch it if the lock isn't held.
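A minimal sketch of what that buys you (the state machine and its `jobs` field are hypothetical):

```rust
use std::sync::Mutex;

// Hypothetical two-state machine: the Mutex owns the state, so the only
// way to touch it is through the guard returned by lock().
enum State {
    Idle,
    Running { jobs: u32 },
}

fn start(machine: &Mutex<State>) {
    // The guard keeps the lock held across the whole check-then-modify
    // sequence, so the state can't change between the check and the write.
    let mut state = machine.lock().unwrap();
    if let State::Idle = *state {
        *state = State::Running { jobs: 1 };
    }
}
```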
Does the training process ensure that all the intermediate steps remain interpretable, even on larger models? Not that we end up with some alien gibberish in all but the final step.
Training doesn’t encourage the intermediate steps to be interpretable. But they are still in the same token vocabulary space, so you could decode them. But they’ll probably be wrong.
Token vocabulary space is a hull around human communication (emoji, mathematical symbols, Unicode scripts, ...), and inside that hull there's lots of unused representation space that an AI could use to represent internal state.
So this seems to be a bad idea from a safety/oversight perspective.
What is a bad idea? Allowing reasoning to happen in continuous space instead of discrete token space? This paper can be seen as a variant of the Coconut models (continuous chain of thought). Continuous reasoning is certainly more efficient when it works. Lack of interpretability makes certain safety systems harder to enforce. Is that your point?
Yes. Coconut has the same issue. See also: a joint statement by researchers from several labs about CoT monitorability: https://arxiv.org/abs/2507.11473
It's hard to know which way this will go. Discrete/text reasoning has many advantages. Safety as you note. Interpretability, which is closely related. Interoperability - e.g. the fact that you can switch models mid-discussion in Cursor and the new model understands the previous model's CoT just fine, or the ability to use reasoning traces from a larger model to train a smaller model to reason.
Continuous latent reasoning is a big hassle, but it wins on efficiency, and in some situations I'm sure people will decide that benefit is worth the hassle, because efficiency is a fight against physics, which is hard to argue with on small battery-powered devices. So my guess is that we'll see some of each approach in the future, with most cloud stuff being discrete and a few highly-tuned edge applications being continuous.
Safety is a multi-faceted problem. I think it's easy to over-index on it because the impacts can be so huge. But there are so many different ways to approach the problem, and we must not rely on any one of them. It's like cyber-security - you need to use defense in depth. And sometimes it makes sense to sacrifice one kind of protection in order to get some convenience. e.g. if you decide to use continuous reasoning, that probably means you need to write a custom classifier to detect mal-intent rather than relying on an off-the-shelf LLM to analyze the reasoning trace. So I wouldn't ever take a position like "nobody should ever use continuous reasoning because it's too dangerous" - it just means that kind of safety protection needs to be applied differently.
This is concerning on two fronts. The questions are no longer open (SO is CC-BY-SA) and if Q&A content dies then this herds even more people towards LLM use.
It's basically draining the commons.
Yup. This, to me, provides another explanation for why the social contract is being used as toilet paper by the owner class. They literally see the writing on the wall.
I don't quite understand the issue with public error enums? Distinguishing variants is very useful if some causes are recoverable or, when writing webservers, could be translated into different status codes. Often both are useful: something representing internal details for logging, and a public interface.
I agree. Is he really trying to say that e.g. errors for `std::fs::read()` should not distinguish between "file not found" and "permission denied"? It's quite common to want to react to those programmatically.
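For example (a sketch; the filename and the fall-back-to-empty behavior are made up):

```rust
use std::io::ErrorKind;

fn load_config() -> Vec<u8> {
    match std::fs::read("config.toml") {
        Ok(bytes) => bytes,
        // A missing file is recoverable here: fall back to an empty config.
        Err(e) if e.kind() == ErrorKind::NotFound => Vec::new(),
        // Permission denied (and anything else) is not: surface it.
        Err(e) => panic!("cannot read config: {e}"),
    }
}
```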
IMO Rust should provide something like thiserror for libraries, and also something like anyhow for applications. Maybe we can't design a perfect error library yet, but we can do waaay better than nothing. Something that covers 99% of uses would still be very useful, and there's plenty of precedent for that in the standard library.
I doubt epage is suggesting that. And note that in that case, the thing distinguishing the cause is not `std::io::Error`, but `std::io::ErrorKind`. The latter is not the error type, but something that forms a part of the I/O error type.
It's very rare that `pub enum Error { ... }` is something I'd put into the public API of a library. epage is absolutely correct that it is an extensibility hazard. But having a sub-ordinate "kind" error enum is totally fine (assuming you mark it `#[non_exhaustive]`).
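Roughly the `std::io` shape, sketched with hypothetical names (`Display` and `std::error::Error` impls omitted for brevity):

```rust
/// Hypothetical library error: the struct itself stays opaque, and
/// callers match on a separate, non-exhaustive kind enum instead.
#[derive(Debug)]
pub struct Error {
    kind: ErrorKind, // private, so fields can be added later
}

#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorKind {
    NotFound,
    PermissionDenied,
}

impl Error {
    pub fn kind(&self) -> ErrorKind {
        self.kind
    }
}
```

Callers get `err.kind()` to match on, and the library stays free to grow both the struct and the kind enum.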
It's not uncommon to have it on the error itself, rather than a details/kind auxiliary type. AWS SDK does it, nested even [0][1], diesel[2], password_hash[3].
It's not necessarily about frequency, but about extensibility. There's a lot of grey area there. If you're very certain of your error domain and what kinds of details you want to offer, then the downside of a less extensible error type may never actualize. Similarly, if you're more open to more frequent semver-incompatible releases, then the downside also may never actualize.
- Struct variant fields are public, limiting how you evolve the fields and types
- Struct variants need non_exhaustive
- It shows using `from` on an error. What happens if you want to include more context? Or if you change your impl, which can change the source error type?
None of this is syntactically unique to errors. But this becomes people's first thought of what to do, and libraries like thiserror make it easy and showcase it in their docs; the sketch below shows the pattern with those hazards called out.
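A hypothetical thiserror-style error (everything here is made up, but it mirrors what the docs showcase):

```rust
use thiserror::Error;

// Hypothetical thiserror-style public error. Every caller can match on
// the variants and destructure the fields, so all of this is API surface:
#[derive(Debug, Error)]
pub enum Error {
    // `path` is public: renaming it or changing its type is a breaking
    // change, and without #[non_exhaustive] on the variant, adding a
    // field breaks callers that destructure exhaustively.
    #[error("failed to parse {path}")]
    Parse { path: String },
    // #[from] bakes the current source error type into the public API;
    // swapping the underlying impl changes this variant's signature.
    #[error(transparent)]
    Io(#[from] std::io::Error),
}
```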
> Struct variant fields are public, limiting how you evolve the fields and types
But the whole point of thiserror-style errors is to make the errors part of your public API. This is no different to having a normal struct (not error related) as part of your public API, is it?
> Struct variants need non_exhaustive
Can't you just add that tag? I dunno, I've never actually used thiserror.
> > Struct variant fields are public, limiting how you evolve the fields and types
> But the whole point of thiserror-style errors is to make the errors part of your public API. This is no different to having a normal struct (not error related) as part of your public API, is it?
Likely you should use private fields with public accessors so you can evolve it.
Having private variants and fields would be useful, yeah. Std cheats a bit with its ErrorKind::Uncategorized unstable+hidden variant to have something unmatchable.
Shipping base64 in JSON instead of a multipart POST is very bad for stream processing. In theory one could stream-process the JSON and the base64... but only the JSON keys that come before the blob would be available at the point where you need to make decisions about what to do with the data.
Still, at least it's an option to put base64 inline inside the JSON. With binary, this is not an option and you must send it separately in all cases, even for small binary data...
You can still stream the base64 separately and reference it inside the JSON somehow like an attachment. The base64 string is much more versatile.
Even with binary, you can store a binary inline inside of another one if it is a structured format with a "raw binary data" type, such as DER. (In my opinion, DER is better in other ways too, and (with my nonstandard key/value list type added) it is a superset of the data model of JSON.)
Using base64 means that you must encode and decode it, but binary data directly means that is unnecessary. (This is true whether or not it is compressed (and/or encrypted); if it is compressed then you must decompress it, but that is independent of whether or not you must decode base64.)
> Still, at least it's an option to put base64 inline inside the JSON. With binary, this is not an option and you must send it separately in all cases, even for small binary data...
There's nothing special about "text" or binary here. You can absolutely put binary inside other binary; you use a symbol that doesn't appear inside the binary, much like you do for text.
You use a divider, like `"` is for JSON strings, and a prearranged way to keep that symbol from appearing inside the inner binary (the same escaping approach that works for text works here).
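A toy sketch of that byte-stuffing idea (the END/ESC constants are arbitrary; SLIP framing works essentially like this):

```rust
// Toy byte-stuffing: END terminates the inner blob, and ESC marks the
// next byte as literal, so payload bytes equal to END or ESC survive.
const END: u8 = 0x00;
const ESC: u8 = 0xFF;

fn frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 1);
    for &b in payload {
        if b == END || b == ESC {
            out.push(ESC);
        }
        out.push(b);
    }
    out.push(END); // an unescaped END can only mean "end of blob"
    out
}
```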
What do you think a zip file is? They're not storing compressed binary data as text, I can tell you that.
This reminds me that I just learned the other day that .a files are Unix `ar` archives, which have a textual header format (and if all the bundled files are textual, there's no binary information in the bundle). I thought .a was just for static libraries for the longest time, and had no idea it was actually an old archive format.
It may amuse you to learn that tar headers are designed as straight up text tables with fixed-width columns, marred only by the fact that modern implementations pad with 0s instead of spaces. The numbers are encoded as octal digits!
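Concretely, the size field in a ustar header is 12 bytes of ASCII octal at offset 124. A sketch of reading it (ignoring the GNU base-256 extension for files too large to fit in octal):

```rust
// Skip the space/zero padding, then accumulate the octal digits.
fn tar_entry_size(header: &[u8; 512]) -> u64 {
    header[124..136]
        .iter()
        .skip_while(|b| !b.is_ascii_digit())
        .take_while(|b| b.is_ascii_digit())
        .fold(0, |n, &b| n * 8 + u64::from(b - b'0'))
}
```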
Binary usually means arbitrary byte sequences so you can't choose a single delimiting character. The usual approaches are storing the length somewhere or picking a sufficiently long random sequence that it's vanishingly unlikely to occur in the payload.
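A sketch of the length-prefix approach (the u32 big-endian prefix is an arbitrary choice):

```rust
// Length-prefix framing: no delimiter and no escaping; the reader knows
// exactly how many bytes belong to the payload.
fn write_frame(out: &mut Vec<u8>, payload: &[u8]) {
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
}

// Returns (payload, remaining input), or None if the buffer is too short.
fn read_frame(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    let len = u32::from_be_bytes(buf.get(..4)?.try_into().ok()?) as usize;
    let rest = &buf[4..];
    (rest.len() >= len).then(|| rest.split_at(len))
}
```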