Hacker News | creata's comments

The author seems to mostly be talking about the aliasing rules, but if you don't want to deal with those, can't you use UnsafeCell?

Imo, the more annoying part is dealing with exception safety. You need to ensure that your data structures are all in a valid state if any of your code (especially code in an unsafe block) panics, and that's easy to forget.


Sorry if this is a stupid question, but is there a direct link between the choice of weight function and the applications of the polynomial?

Like, is it possible to infer that Chebyshev polynomials would be useful in approximation theory using only the fact that they're orthogonal wrt the Wigner semicircle (U_n) or arcsine (T_n) distribution?


Chebyshev polynomials are useful in approximation theory because they're the minimax polynomials. The remainder of polynomial interpolation can be given in terms of the nodal polynomial, which is the polynomial with the interpolation nodes as zeros. Minimizing the maximum error then leads to the Chebyshev polynomials. This is a basic fact in numerical analysis that has tons of derivations online and in books.

The weight function shows the Chebyshev polynomials' relation to the Fourier series. But they are not what you would usually think of as a good candidate for L2 approximation on the interval. Normally you'd use Legendre polynomials, since they have w = 1, but they are a much less convenient basis than Chebyshev for numerics.
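To make the minimax point above concrete, here's a quick NumPy sketch (my own example, not from the thread): interpolating Runge's function 1/(1+25x^2) at equispaced nodes diverges badly, while the same-degree interpolant at Chebyshev (Type-II) nodes behaves.

```python
# A small numerical sketch of why Chebyshev nodes matter: interpolating Runge's
# function at equispaced vs. Chebyshev (Type-II) nodes. Runge's function is my
# choice of test function, the classic example where equispaced nodes fail.
import numpy as np
from numpy.polynomial import chebyshev as C

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)  # Runge's function
N = 20
xs = np.linspace(-1.0, 1.0, 1001)

equi = np.linspace(-1.0, 1.0, N + 1)
cheb = np.cos(np.pi * np.arange(N + 1) / N)  # Type-II (Chebyshev extreme) nodes

for name, nodes in [("equispaced", equi), ("chebyshev", cheb)]:
    # Fit in the Chebyshev basis for numerical stability; with N+1 points this
    # is exact polynomial interpolation, not least squares.
    coeffs = C.chebfit(nodes, f(nodes), N)
    err = np.max(np.abs(C.chebval(xs, coeffs) - f(xs)))
    print(f"{name}: max error = {err:.3g}")
```

The equispaced error is orders of magnitude larger than the Chebyshev one, and it gets worse as N grows.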


True, and there are plenty of other reasons Chebyshev polynomials are convenient, too.

But I guess what I was asking was: is there some kind of abstract argument why the semicircle distribution would be appropriate in this context?

For example, you have abstract arguments like the central limit theorem that explain (in some loose sense) why the normal distribution is everywhere.

I guess the semicircle might more-or-less be the only way to get something where interpolation uses the DFT (by projecting points evenly spaced on the complex unit circle onto [-1, 1]), but I dunno, that motivation feels too many steps removed.


If there is, I'm not aware of it. Orthogonal polynomials come up in random matrix theory. Maybe there's something there?

But your last paragraph is exactly it... it is a "basic" fact but the consequences are profound.


Could you guys recommend an easy book on this topic?

Which topic?

Some sort of numerical analysis book that covers these topics - minimax approximation, quadrature, etc. I've read about these separately but am curious what else would be covered in courses that include them.

I would check out "An Introduction to Numerical Analysis" by Suli and Mayers or "Approximation Theory and Approximation Practice" by Trefethen. The former covers all the major intro numerical analysis topics in a format that is suitable for someone with ~undergrad math or engineering backgrounds. The latter goes deep into Chebyshev approximation (and some related topics). It is also very accessible but is much more specialized.

I'd suggest: Trefethen, Lloyd N., Approximation theory and approximation practice (Extended edition), SIAM, Philadelphia, PA (2020), ISBN 978-1-611975-93-2.

Yes. It is precisely their orthogonality that makes them suitable.

If you are familiar with the Fourier series, the same principle can be applied to approximating with polynomials.

In both cases the crucial point is that you can form an orthogonal subspace, onto which you can project the function to be approximated.

For polynomials it is this: https://en.m.wikipedia.org/wiki/Polynomial_chaos


What you're saying isn't wrong, per se, but...

There are polynomials that aren't orthogonal that are suitable for numerics: both the Bernstein basis and the monomial basis are used very often and neither are orthogonal. (Well, you could pick a weight function that makes them orthogonal, but...!)

The fact of their orthogonality is crucial, but when you work with Chebyshev polynomials, it is very unlikely you are doing an orthogonal (L2) projection! Instead, you would normally use Chebyshev interpolation: 1) interpolate at either the Type-I or Type-II Chebyshev nodes, 2) use the DCT to compute the Chebyshev series coefficients. The fact that you can do this is related to the weight function, but it isn't an L2 procedure. Like I mentioned in my other post, the Chebyshev weight function is maybe more of an artifact of the Chebyshev polynomials' intimate relation to the Fourier series.

I am also not totally sure what polynomial chaos has to do with any of this. PC is a term of art in uncertainty quantification, and this is all just basic numerical analysis. If you have a series in orthogonal polynomials, if you want to call it something fancy, you might call it a Fourier series, but usually there is no fancy term...


I think you are very confused. My post is about approximation. Obviously other areas use other polynomials or the same polynomials for other reasons.

In this case it is about the principle of approximation by orthogonal projection, which is quite common in different fields of mathematics. Here you create an approximation of a target by projecting it onto an orthogonal subspace. This is what the Fourier series is about, an orthogonal projection. Choosing e.g. the Chebyshev polynomials instead of the complex exponential gives you an approximation onto the orthogonal space of e.g. Chebyshev polynomials.

The same principle applies e.g. when you are computing an SVD for a low rank approximation. That is another case of orthogonal projection.

>Instead, you would normally use Chebyshev interpolation

What you do not understand is that this is the same thing. The distinction you describe does not exist; these are the same things, just different perspectives. That they are the same easily follows from the uniqueness of polynomials, which are fully determined by their interpolation points. These aren't distinct ideas, there is a greater principle behind them, and that you are using some other algorithm to compute the approximation does not matter at all.

>I am also not totally sure what polynomial chaos has to do with any of this.

It is the exact same thing. Projection onto an orthogonal subspace of polynomials. Just that you choose the polynomials with regard to a random variable. So you get an approximation with good statistical properties.


No, I am not confused. :-) I am just trying to help you understand some basics of numerical analysis.

> What you do not understand is that this is the same thing.

It is not the same thing.

You can express an analytic function f(x) in a convergent (on [-1, 1]) Chebyshev series: f(x) = \sum_{n=0}^\infty a_n T_n(x). You can then truncate it keeping N+1 terms, giving a degree N polynomial. Call it f_N.

Alternatively, you can interpolate f at N+1 Chebyshev nodes and use a DCT to compute the corresponding Chebyshev series coefficients. Call the resulting polynomial p_N.

In general, f_N and p_N are not the same polynomial.

Furthermore, computing the coefficients of f_N is much more expensive than computing the coefficients of p_N. For f_N, you need to evaluate N+1 integrals, which may be quite expensive indeed if you want to get digits. For p_N, you simply evaluate f at N+1 nodes, compute a DCT in O(N log N) time, and the result is the coefficients of p_N up to rounding error.

In practice, people do not compute the coefficients of f_N, they compute the coefficients of p_N. Nevertheless, f_N and p_N are essentially as good as each other when it comes to approximation.
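The f_N vs. p_N distinction is easy to check numerically. Here's a sketch (my own, using NumPy, with f = exp as a stand-in analytic function): p_N comes from interpolation at Chebyshev nodes, f_N from truncating the Chebyshev series, with the series coefficients approximated by high-order Gauss-Chebyshev quadrature.

```python
# Sketch of the f_N vs. p_N distinction: the two coefficient vectors are close
# but not identical, yet both polynomials approximate f about equally well.
import numpy as np
from numpy.polynomial import chebyshev as C

f = np.exp  # stand-in analytic function (my choice, not from the thread)
N = 10

# p_N: interpolate at the N+1 Chebyshev (Type-I) nodes; NumPy computes the
# Chebyshev coefficients of the interpolant for us.
p = C.chebinterpolate(f, N)

# f_N: truncated Chebyshev series. a_n = (2/pi) * int f(x) T_n(x)/sqrt(1-x^2) dx,
# approximated here by Gauss-Chebyshev quadrature with many more nodes than N.
M = 2000
theta = (np.arange(M) + 0.5) * np.pi / M
x = np.cos(theta)
a = np.array([(2.0 / M) * np.sum(f(x) * np.cos(n * theta)) for n in range(N + 1)])
a[0] /= 2.0  # the n = 0 coefficient carries a factor 1/pi, not 2/pi

print(np.max(np.abs(a - p)))  # tiny but nonzero: p_N != f_N
xs = np.linspace(-1.0, 1.0, 1001)
print(np.max(np.abs(C.chebval(xs, p) - f(xs))))  # ...yet both approximate f
print(np.max(np.abs(C.chebval(xs, a) - f(xs))))  # about equally well
```

The coefficient difference is the aliasing error, on the order of the first neglected coefficient a_{N+1}, which is why the two approximations are essentially interchangeable in practice.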


At this point I really hate to ask. Do you know what "orthogonal subspace" means and what a projection is?

Ah, shucks. I can see you're getting upset and defensive. Sorry... Yes, it should be clear from everything I've written that I'm quite clear on the definition of these.

If you would like to read what I'm saying but from a more authoritative reference that you feel you can trust, you can just take a look at Trefethen's "Approximation Theory and Approximation Practice". I'm just quoting contents of Chapter 4 at you.

Again, like I said in my first response to you, what you're saying isn't wrong, it just misses the mark a bit. If you want to compute the L2 projection of a function onto the orthogonal subspace of degree N Chebyshev polynomials, you would need to evaluate a rather expensive integral to compute the coefficients. It's expensive because it requires the use of adaptive integration... many function evaluations per coefficient! Bad!

On the other hand, you could just do polynomial interpolation using either set of degree N Chebyshev nodes (Type-I or Type-II). This requires only N+1 function evaluations. Only one function evaluation per coefficient. Good!

And, again, since the polynomial so constructed is not the same polynomial as the one obtained via L2 projection mentioned in paragraph 3 above, this interpolation procedure cannot be regarded as a projection! I guess you could call it an "approximate projection". It agrees quite closely with the L2 projection, and has essentially the same approximation power. This is why Chebyshev polynomials are so useful in practice for approximation, and why e.g. Legendre polynomials are much less useful (they do not have a convenient fast transform).

Anyway, I hope this helps! It's a beautiful subject and a lot of fun to work on.


> patterns like using indices instead of pointers which are counter to high performance

Using indices isn't bad for performance. At the very least, it can massively cut down on memory usage (which is in turn good for performance) if you can use 16-bit or 32-bit indices instead of full 64-bit pointers.
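The arithmetic behind that claim, sketched in Python (the thread is about Rust/C, but the numbers are the same; `array` type codes stand in for machine-level storage):

```python
# One million references stored as 32-bit indices into an arena versus
# 64-bit machine-word "pointers". 'I' is a 32-bit unsigned int and 'Q' a
# 64-bit unsigned int on common CPython platforms.
from array import array

n = 1_000_000
idx32 = array("I", [0]) * n
ptr64 = array("Q", [0]) * n

print(idx32.itemsize * n)  # 4 MB of references
print(ptr64.itemsize * n)  # 8 MB: twice the memory, half the references
                           # per cache line
```

Halving the size of every reference also doubles how many fit in a cache line, which is where the performance win usually comes from.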

> "unsafe" (which is strictly more dangerous than even C, never mind Zig)

Unsafe Rust is much safer than C.

The only way I can imagine unsafe Rust being more dangerous than C is that you need to keep exception safety in mind in Rust, but not in C.


Not quite: you also need to keep pointer non-nullness, alignment, and aliasing safety in mind in Rust, which is very pervasive in Rust (all shared/mutable references) but very rare in C (the 'restrict' keyword).

In Rust, it's not just using an invalid reference that causes UB, but their very creation, even if temporary. For example, since references always have to be aligned, the compiler can assume the pointer they were created from was also aligned, and so suddenly some low bits of the pointer are ignored (since they must have been zero).

And usually the point of unsafe is to make safe wrappers, so unsafe Rust makes or interacts with safe shared/mutable references pretty often.


It's just hard for me to imagine someone accidentally messing up nonnullness or aliasing, because it's really in-your-face that you need to be careful when constructing a reference unsafely. There are even idiomatic methods like ptr::as_ref to avoid accidentally creating null references.

The points here aren't technically wrong, but it still feels like disabling DoH would be a reduction in security. For example:

> Cloudflare gets all your DNS queries.

That's true, but Cloudflare is more trustworthy than my ISP, and probably most people's ISPs.

> Complexity is the enemy of security.

That's true, but that's no reason to go from an imperfect solution to a nonsolution.

> there is DNS over TLS

That doesn't solve most of the issues that the author brought up.

> How does a modern company in the IT business earn money? By selling data.

Maybe I'm naive, but I thought they made money by using all the data they collect for better threat prevention, and from their paid services.


My ISP is bound by robust privacy, telecommunications interception and other legislation.

Cloudflare, on the other hand is based in a foreign jurisdiction that offers none of these protections.


> My ISP is bound by robust privacy, telecommunications interception and other legislation.

It really depends on which jurisdiction you're in, unfortunately. US ISPs are selling everything they can hoover up (including DNS information) to advertisers, and it is impossible to switch to another one unless you're lucky (because local monopolies are essentially maintained).


So is Cloudflare, which is a US ISP....

Cloudflare is not an ISP. They have other services they sell. Maybe they're selling your data, maybe not. I honestly have not read their agreements and terms, but it's not nearly as obvious that you're the product as it is with something like Google.

So this company based in the US which provides internet services is not an internet service provider.

Given that they are funded and run by the same forces American parasitical capitalism provides, I would trust them as much as I'd trust Google or Alphabet.

I'll continue to route my DNS to Quad9 over Mullvad over my specifically chosen ISP, and everything on my network does that, as I can easily intercept and redirect udp/53.

The weak point is treacherous devices which use DoH, which is a constant fight to block.


They provide network services on the internet, but unless I'm missing some product they don't list on their website, they do not provide actual basic IP internet connectivity for businesses; everything they sell is some service on top of your existing ISP services.

And until TLS is made secure they'll continue to rape privacy by scraping your https traffic.

The most important part of DoH, etc is that it allows you to make a choice. You can choose a vendor in your country. As a Canadian, I might want to use the service offered by my national TLD operator https://www.cira.ca/en/canadian-shield/configure/firefox/

Many ISPs explicitly sell DNS data, and are also advertising vendors.

Cloudflare, on the other hand, doesn’t share or sell data and retains minimal data: https://developers.cloudflare.com/1.1.1.1/privacy/public-dns...


> The most important part of DoH, etc is that it allows you to make a choice.

So does UDP based DNS, and TLS based DNS. It’s all the same in that regard.


With insecure DNS, the choice isn't meaningful since your ISP will see all of the data no matter which DNS server you pick to use. And those kinds of ISPs will probably block DoT because they want to keep seeing it all, but they can't block DoH.
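For concreteness, here's a sketch (mine, stdlib only) of what a DoH client actually sends per RFC 8484: an ordinary DNS wire-format query, base64url-encoded into an HTTPS URL. Nothing here performs a network request; it just builds the message.

```python
# Sketch: building the kind of request a DoH client sends (RFC 8484).
import base64
import struct

def dns_query(name: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS wire-format query (qtype 1 = A record)."""
    # Header: id=0 (RFC 8484 suggests 0 for cache friendliness), RD flag set,
    # one question, zero answer/authority/additional records.
    header = struct.pack(">HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(l)]) + l.encode() for l in name.split(".")) + b"\x00"
    return header + qname + struct.pack(">HH", qtype, 1)  # QTYPE, QCLASS=IN

q = dns_query("example.com")
# DoH GET: the wire-format query goes in the URL, base64url without padding.
param = base64.urlsafe_b64encode(q).rstrip(b"=").decode()
print(f"GET https://cloudflare-dns.com/dns-query?dns={param}")
# To the ISP this is just HTTPS to port 443, indistinguishable from any other
# web traffic; the same query over plaintext UDP/53 (or DoT's dedicated port
# 853) is readable or blockable.
```

That's the whole trick: DoT has its own port and can be filtered; DoH rides on 443 and can't be blocked without blocking HTTPS itself.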

I put my DNS service on a non-standard port. I'm the only one using it, so standards be damned. Windows doesn't allow setting a nonstandard port for DNS, but pretty much everything else does.

Do ISPs do deep packet inspection to get lookup data? Maybe, but it increases the cost of doing so and makes the business aspect of it less viable. Perhaps a minor win.


In addition, your ISP can also extract whichever metadata it wants from your communications, incl. a very likely perfect guess of the hostnames you visit at which times _even if you don't use DNS at all_, just by looking at IP traffic metadata such as addresses and packet sizes.

So you already have to trust your ISP anyway -- but there was no need to trust Cloudflare *. DoH to Cloudflare is almost certainly a net loss in privacy compared to using your ISP's DNS over clear text.

* Right until they became hosters of half of the WWW. So Cloudflare can pretty much also guess your activity even if you don't do DNS with them anyway.


> In addition, your ISP can also extract whichever metadata it wants from your communications, incl. a very likely perfect guess of the hostnames you visit at which times _even if you don't use DNS at all_, just by looking at IP traffic metadata such as addresses and packet sizes.

Big CDNs and ECH make that impossible.


Does it, really? Have you seen Wireshark output lately? (The GUI can be configured to do reverse lookups on all IP addresses.)

If I check right now, from the top 10 links on HN right now, it is trivial to distinguish the top-level domain from just the IPv4 or IPv6 address. Heck, even _for this website itself_ the current IPv4 reverse DNS points to ycombinator.com. I don't even need to go into packet size heuristics, or the myriad of ad networks, etc.

Sure there are some instances where you will share the IP of the CDN. This has been seen recently e.g. in the recent article of the "LaLiga" blocks in Spain. But bigger sites cannot afford for this to happen, and even smaller sites tend to have at least one paid IP address for mail (reputation is a bitch, and Cloudflare doesn't have any).


> If I check up right now, form the top 10 links in HN right now, it is trivial to distinguish the top-level domain from just the IPv4 or IPv6 address.

Two of the top 10 links in HN right now (https://news.ycombinator.com/item?id=44215603 and https://news.ycombinator.com/item?id=44212446) are to different subdomains of github.io that resolve to the exact same IP addresses, so reverse DNS doesn't tell you which one is being visited.

And you can't even tell the TLD, because the TLD is "io", but the reverse lookup on the IPs will give you a TLD ending in "com".

> Heck, even _for this website itself_ the current IPv4 reverse DNS points to ycombinator.com.

That's because HN isn't behind the kind of CDN I'm talking about. But a lot are. Is your argument "since your ISP can see some of the sites you're going to, we should remove all protections and let them see all sites you're going to?"


I said top-level domain. Anyway, you have a better estimate, for the types of sites people here would visit? If HN itself isn't an example, then Github subdomains definitely ain't (not even close to the traffic of the main domain).

> I said top-level domain.

"io" and "com" are top-level domains, and in the example I gave, you can't even distinguish between them.


Well, I appreciate the correction: I meant second level (or whatever is most distinguishing for that TLD). However, even if what you say is true, you really cannot disprove my claim with one nitpick; you need to talk about majorities. (And, in case it needs to be said: I really don't think the issue here is distinguishing activity to github.io vs github.com)

Okay, how about this then. Here's some of the IP addresses of posts on the HN front page right now:

  104.21.3.245
  104.21.68.247
  104.21.80.31
  104.21.95.131
  104.21.112.1
  104.26.4.133
None of them have reverse DNS records. Can you tell which is which?

So you take literally the worst possible set of IPs (all of them cloudflare), IPv4 only, and yet Copilot (!) is easily able to reverse 50% of them:

  104.21.3.245 -- trebaol.com
  104.21.80.31 -- diwank.space
  104.26.4.133 -- daringfireball.net 
  104.21.112.1 -- simonwillison.net , taras.glek.net
This was literally the worst example you could possibly do. I hope you kept which one was which, I'd like to know if Copilot was right.

In the meanwhile, from the current top #30 articles on HN (also via copilot script, but I removed non-cloudflare IPs):

  ycombinator.com -- no CDN
  letsbend.de -- no CDN
  grepular.com -- no CDN
  xania.org -- cloudfront
  github.io -- no common CDN
  owlposting.com -- AWS, but IPv4 remained static
  netfort.gr.jp -- no CDN
  simonwillison.net -- cloudflare, 104.21.112.1 fixed
  folklore.org -- azure, 13.107.246.1-255 range
  danq.me -- no CDN
  nature.com -- fastly, IPv4 remained static
  daringfireball.net -- cloudflare, 104.26.4.133
  ssp.sh -- no CDN
  trebaol.com -- cloudflare, 104.21.3.245
  glek.net -- cloudflare, 104.21.112.1
  gov.uk -- AWS, but IPV4 remained static
  phys.org -- no CDN
  diwank.space -- cloudflare, 104.21.80.31 
  free.fr -- no CDN   (my French ISP, btw)
  ericgardner.info -- AWS, but IPv4 remained static
  ghuntley.com -- fastly, IPv4 remained static
  paavo.com -- no CDN
  railway.com -- cloudflare, 104.18.24.53
  alloc.dev -- cloudflare , 188.114.96.2
Look at how many of them are self-hosted, have zero CDN, or otherwise return me always the same IP (even when I try from 3 different ISPs), which makes them trivial to reverse address. This is already a pretty huge success rate, and all my context is that you browsed HN first (which I know, see first result on the list). Now imagine the tools an ISP will have at its disposal:

- IPv6

- Its Geo region will actually match yours

- Routing tables

- The patience to also include resources fetched from these pages in the analysis (i.e. page X always gets its JS from Y domain which results in a constant Z KB transfer).

- The rest of your browsing activity

- The rest of everyone's browsing activity including most popular _current_ hosts for each hostname.

Do you still claim that it is "impossible" to track your activity because of CDNs? I still bet you your ISP can do it with _100%_ accuracy.


They're not all running single IP ECH yet. I was just making the point that it's not as trivial as a reverse DNS lookup, as you said it was.

It took me the whole of one Copilot conversation to do the entire thing. Most of the top #30 results are in fact one reverse DNS away. The rest is not much more complicated.

They're never going to be "1 IP ECH". That would be the end of the Internet as we know it.

If it ever happens that the majority of the WWW is 1 CDN, we have a bigger privacy problem than DNS. Much bigger.


> IP traffic metadata such as addresses and packet sizes.

Even if you use a VPN?


That just shifts the trust from your ISP to your VPN provider. Moreover, if you're already using a VPN, your DoH requests to Cloudflare are already anonymized.

If you are using WireGuard between endpoints, your traffic is opaque, but yeah, if/where it exits it becomes (depending on the encapsulated protocol) visible.

ISPs regularly capture NXDOMAIN.

They know your government id when you subscribe to their service.

Cloudflare, otoh, never has your identity. They only have the metadata.


Change it to Mullvad like I did then?

> That's true, but Cloudflare is more trustworthy than my ISP, and probably most people's ISPs.

Based on what?


> Based on what?

The bar is really low, mostly because ISPs in most if not all countries are mandated by law to track traffic flowing through their pipes.

Cloudflare provides relatively better privacy guarantees for the public DNS resolvers it runs: https://developers.cloudflare.com/1.1.1.1/privacy/cloudflare...


CF is certainly less trustworthy than my ISP, which is shibboleth compliant. Or my VPN provider.

CF issues are dealt with “hope to get a post on HN trending”.


In the UK you can typically pick from a dozen ISPs, some of which are more trustworthy.

All of which have infrastructure already in place to hand over all DNS queries if requested by HMG.

And you don't believe that Cloudflare has a similar infrastructure in place? :-(

Cloudflare specifically has infrastructure to prevent that: https://developers.cloudflare.com/1.1.1.1/encryption/oblivio.... It requires some additional setup, but for example if you're on an Apple device using Private Relay, you are using it.

Your next argument might be "but how do you know the server is really using ODNS?" You don't. If your security threat profile doesn't allow for this, whatever you're doing shouldn't be using a public internet network anyway.


Can you also choose which company provides the physical infrastructure that connects to your home?

If you live in a city or other urban area, typically you have the option of the decoupled telco (BT Openreach) that more or less everybody has, the entity which bought all the cable television companies (Virgin Media) and usually a fibre-for-purpose Internet company that decided to do your city or region.

If you live in a rural area where people are co-operative, there might be a community owned fibre operator plus Openreach; otherwise just Openreach.

If you live somewhere very silly, like up a mountain or on your own island, your only practical option will be paying Openreach to do the work.

Edited to add, Notably: Only Openreach is usable by an arbitrary service provider. So if you want to pick your service provider separately, the actual last mile delivery will always be Openreach. And if they're small it won't just be last mile, Openreach also sell backhaul to get your data from some distant city to the place where the ISP's hardware is, you're buying only the, like, actual service. Which is important - mine means no censorship, excellent live support and competent people running everything, but the copper under the ground is not something they're responsible for (though they are better than most at kicking Openreach when it needs kicking)


CityFibre is only available through wholesale ISPs. Other smaller alt-nets (such as the one I work for, Netomnia (including Brsk/YouFibre)) are gearing up to provide wholesale access.

In the UK there are even aggregators like Fibre Café [1] that make it easier for ISPs to connect through multiple networks.

[1] https://fibrecafe.co.uk/


If you are lucky, yes. For example, I have a choice between CityFibre (XGS-PON), Openreach (GPON) and Virgin Media (DOCSIS) as well as 2 different 5G networks. It is rare for a property to only be covered by a single wired network these days in the UK.

> That's true, but that's no reason to go from an imperfect solution to a nonsolution.

This is textbook politician's fallacy. Yes, it may be preferable to continue with a "non-solution" if the solution proposed is stupid enough.


No it's not. I'm saying don't let the perfect be the enemy of the good.

DoH does solve a problem for many people. Many large ISPs will sell your DNS requests, use them for targeted advertising, tamper with responses for various reasons, etc., and so DoH is an improvement over the status quo--not for everyone, but for many users, and I'd guess most users.

You're right, DoH might not be worth adopting if it were "stupid enough", but... it's not stupid enough.


Your ISP already has all this metadata and more from other sources, so it is pointless to switch to DoH in this case, and if you do you willingly give this metadata to Cloudflare, which (for the majority of users) may even be in a better position to do evil.

> Your ISP already has all this metadata and more from other sources

If you combine this with ECH and a good blocker, no they do not. That's exactly why Spain is blocking around 60% of the internet during football games now; the ISPs cannot tell which websites and subscribers are pirating football streams.


> Spain is blocking around 60% of the internet during football games now

[citation needed for the 60% figure]

It is precisely due to these blocks that I know Cloudflare is NOT 60% of the WWW, not yet at least. Certainly, if Cloudflare was serving 60% of the Internet, I would consider switching my DNS to them. But that would be a privacy nightmare for another day (replacing federated ISPs with a single big centralized one? great idea /s). It is not yet the case as of today.

In fact, as of today, and even if you have a "good blocker", I, a total noob, have a high chance of reliably identifying which HN news item from the top #30 you clicked from just the addresses: https://news.ycombinator.com/item?id=44219061 . Imagine what the non-noobs at your ISP could do.


In the Politician's Fallacy, the chosen solution doesn't solve the problem. In this example, DoH solves many of the problems, perhaps not optimally, but better than the "do nothing" choice.

So it doesn't really solve the problem, and may generate more (privacy) problems of its own. "doing nothing" may be the better solution here, which was the entire point made in the original episode.

To save some googling the Politicians Fallacy is this one:

We must do something. This is something. Therefore, we must do this.


Does the linked study actually check that the LLM solves the task correctly, or just that the code runs and terminates without errors? I'm bad at reading, but the paper feels like it's saying the latter, which doesn't seem that useful.

> Half a million lines of code in a couple of months by one dev. Seriously.

Not that you have any obligation to share, but... can we see?


45 implementations of linked lists... sure of it.

Can't now. Can only show publicly when it's released at an upcoming trade show. But it's a CAD app with many, many models and views.

I see this FOMO "left in the dust" sentiment a lot, and I don't get it. You know it doesn't take long to learn how to use these tools, right?

It actually does if you want to do serious work.

Hence these types of posts generate hundreds of comments: "I gave it a shot, it stinks."


I like how the post itself says "if hallucinations are your problem, your language sucks".

Yes sir, I know the language sucks; there isn't anything I can do about that. There was nothing I could do at one point to convince Claude that you should not use floating point math in kernel C code.

But hey, what do I know.


Did saying to Claude "do not use floating point math in this code" not work?

Correct, it did not work.

I personally can't see this example working out. I'll always want to get some kind of confirmation of which files will be deleted, and at that point, just typing the command out is much easier than reading.

You can just ask it to undelete what you want back. Or print out a list of possible files to delete with check boxes so you can pick. Or have it prompt you one-by-one. You can ask it to ask you verbally and respond through the mic. Or just put the files into a hidden folder, making a note of it so that when you ask about them again it knows where they are.

Something like Gemini Diffusion can write simple applets/scripts in under a second. So your options are enormous for how to handle those deletions. Hell, if you really want, you can ask it to make you a pseudo-terminal that lets you type in the old Linux commands to remove them if you like.

Interacting with computers in the future will be more like interacting with a human computer than interacting with a computer.


This. Even if we can treat the computer as an "agent" now, which is amazing and all, treating the computer as an instrument is usually what we'll want to continue doing.

In my uninformed opinion, the difficulty with code extraction (or going the other way, extracting a theorem prover representation from a program) is shared, mutable data. All the solutions I've seen (e.g., separation logic) feel clunky as heck.

> the fact that you have agree with the abstract model, the fact that the abstract model has to be a good abstract model of your system

Regarding seL4 in particular, they've plugged many of those sorts of holes.

https://trustworthy.systems/publications/nicta_full_text/737...


Rust eliminates shared mutable data, except for interior mutability and unsafe raw pointers.

Even with that asterisk, that makes Rust much more potent in the formal methods space. This has been noted by Graydon as well. I can dig up the reference if someone cares.


Yup, and Rust does also have an official Rust Formal Methods Interest Group https://rust-formal-methods.github.io/ It's likely that there will be significant future developments in this space.

True! Rust proves that you can get very far while only rarely using unrestricted shared mutable data (UnsafeCell). Do you know anything that uses borrowing-like ideas in a theorem proving context?

I know of Prusti, which is made by ETH Zurich. They use a form of separation logic; their backend (Viper) was initially aimed at other languages, mostly Go. But now Prusti is also aimed at Rust. From what I heard, Rust worked a lot better for them, because a lot of the annotation overhead of Viper was about excluding mutable aliasing, which Rust handily solves.

However, last time I checked, I think I saw that Prusti does not currently handle UnsafeCell at all. I don't think that's a fundamental limitation, just a matter of prioritized development time.


There are so many Rust things that Prusti doesn't understand that applying it to any real world project is still extremely challenging.

Kani interacts a little better with real code (see https://model-checking.github.io/kani/rust-feature-support.h...), as long as you don't accidentally need e.g. randomness for HashMap DoS protection (https://github.com/model-checking/kani/issues/2423#issuecomm...).

Loops are an utter pain to prove with Kani, though: https://model-checking.github.io/kani/tutorial-loop-unwindin... -- I believe people are using VeriFast for loop-heavy code instead: https://github.com/verifast/verifast (this whole patchwork of multiple tools mostly consumed as binary downloads is miserable)

Here's an attempt to gradually verify the Rust stdlib: https://github.com/model-checking/verify-rust-std


I love the usage model of Kani. I just dislike the underlying method. Bounded Model Checking cannot take enough advantage of the borrow checker massively curtailing mutable aliasing.

But the idea of incremental annotations that require only a little bit of the program to be annotated is great.

I think there's a niche for a deductive verification approach that allows for incremental verification.


Yeah. I'm more a working programmer than an academic, I fundamentally can't care about the fancy theory if the tool isn't usable in real code (I'm looking at you, `verus! { everything is here and utterly miserable to use }`).

Put differently: If the "academically superior" tooling isn't good enough, I'll be sticking to testing, fuzzers and sans-I/O style architecture with deterministic simulation testing.

The best thing the Rust project could do at this point is to come up with a precondition/postcondition/invariant/etc syntax that a majority of the academics agree with, with the power to express forall and exists and such. Something that is syntactically close to Rust, can be used to annotate normal code, and can be extended tool-by-tool with something that looks like standard attribute macros etc. Then get the academics to focus on the semantics and underlying verification mechanism, while the working programmer gets `debug_assert!` power out of that.
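The kind of contract annotation being wished for, sketched as a runtime-checked analogue in Python (my own toy decorators, purely illustrative; real tools like Kani or Prusti would consume such annotations statically, and in Rust they'd look like attribute macros over `debug_assert!`):

```python
# Toy precondition/postcondition decorators: the "debug_assert! power" a
# verifier-friendly contract syntax would give the working programmer.
import functools

def requires(pred, msg="precondition failed"):
    """Check a predicate over the arguments before the call."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            assert pred(*args, **kwargs), msg
            return fn(*args, **kwargs)
        return wrapper
    return deco

def ensures(pred, msg="postcondition failed"):
    """Check a predicate over (result, arguments) after the call."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            assert pred(result, *args, **kwargs), msg
            return result
        return wrapper
    return deco

@requires(lambda xs: len(xs) > 0, "need a non-empty list")
@ensures(lambda r, xs: r in xs, "result must come from the input")
def smallest(xs):
    return min(xs)

print(smallest([3, 1, 2]))  # → 1; smallest([]) trips the precondition
```

At runtime these are just assertions; the value of a standardized syntax is that a static verifier could discharge the same predicates with forall/exists quantifiers instead of executing them.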


I'd like to add, on top of that, some predicates like 'this memory is initialized'. Potentially also 'this pointer is valid'. If I get to dream, I'd add 'this value is never referenced through a pointer' or 'this is the only expression that can change this struct field'.
