Hacker News | tgamblin's comments

The more recent Lifschitz book is the easiest to learn from IMO:

- https://www.cs.utexas.edu/~vl/teaching/378/ASP.pdf

It starts with basics of using ASP and gives examples in clingo, not math.
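If you've never seen ASP, here's roughly the flavor, driven from clingo's Python API. This is a toy graph-coloring sketch I'm making up for illustration, not an example from the book:

    import clingo  # pip install clingo

    # Toy ASP program: 3-color a 4-cycle.
    program = """
    node(1..4).
    edge(1,2). edge(2,3). edge(3,4). edge(4,1).
    color(red;green;blue).
    % each node gets exactly one color
    1 { assign(N,C) : color(C) } 1 :- node(N).
    % adjacent nodes must get different colors
    :- edge(N,M), assign(N,C), assign(M,C).
    #show assign/2.
    """

    ctl = clingo.Control(["0"])             # "0" = enumerate all answer sets
    ctl.add("base", [], program)
    ctl.ground([("base", [])])
    ctl.solve(on_model=lambda m: print(m))

Each printed model is one valid coloring; the book builds up from examples like this to real modeling techniques.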

The Potassco book is more comprehensive and will help you understand better what is going on:

- https://potassco.org/book/

Things I don't like: it's denser, it doesn't use clingo examples (mostly math-style notation, so you kind of have to translate in your head), and while the proofs of how grounding works are interesting, the explanations are kind of short and don't always give the intuition I want.

I still think this is the authoritative reference.

The "how to build your own ASP system" paper is a good breakdown of how to integrate ASP into other projects:

- https://arxiv.org/abs/2008.06692

The Potassco folks are doing amazing work maintaining these tools. I also wish more people knew about them.

EDIT: I forgot to mention that, specifically for games stuff like enclose.horse, you should look at Adam Smith's Applied ASP course from UCSC:

- https://canvas.ucsc.edu/courses/1338

Forgot to mention that one... we use clingo in Spack for dependency solving, so other applications frequently slip my mind.


Thank you. Have noted these down.


I do for macOS and Linux :). Windows support is also coming along.

There isn’t anything particularly special about the HPC world other than the need for many different configurations of the same software for many different CPU and GPU architectures. You might want to have several versions of the same application installed at once, with different options or optimizations enabled. Spack enables that. Environments are one way to keep the different software stacks separate (though, like nix, spack uses a store model to keep installs in separate directories, as well).


> Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties from Prolog. It is often used as a query language for deductive databases.

https://en.m.wikipedia.org/wiki/Datalog


Do you know what you are writing about? I mean, have you actually done something with Datalog? And then _which_ Datalog? If yes, then you are probably someone working with it academically, or the answer is no. Because try to even set up a toy project with it (for the purpose of learning how to use it) and you'll quickly run into unmaintained interpreters, arguments about what Datalog is and isn't, and a choice between difficult-to-understand academic papers or simplistic introductions that lead you nowhere.


I have found two somewhat usable ones (your point still stands): Soufflé (high performance but more limited) and DES, which works well for some simple personal data management after some code massage (it's written in Prolog). Any other recommendations? And since the Prolog experts are here: what do you think about Ciao? It seems quite polished but also a bit adventurous to (non-expert) me.


Have you tried Datomic?


no.

https://github.com/Datomic/codeq : last update to that repo was 12 years ago.

it's JVM-based, which I find unappealing.

also, how close is it to Datalog?

https://github.com/gns24/pydatomic : last update 11 years ago.

and that's representative of pretty much anything regarding Datalog.

So, I'll just stick to Prolog then.

---

have you?

would you recommend it?


Ok just to clarify, Datomic is a Clojure thing. It is free to use but closed source. It is an excellent database that is used, owned, and financed by Nubank, the largest and most rapidly growing bank in Brazil.

It is not Datalog syntax but heavily inspired by Datalog.

I'm throwing this in here just for clarification, I don't want to see Datomic as collateral damage in this conversation.


There is DataScript. I am not a Clojure guy, so it is not clear to me whether it pulls in Datomic as a dependency.


DataScript is very similar to Datomic, except that it runs in ClojureScript and is an in-memory datastore with no concept of history or point-in-time queries. The schemas are also much looser than Datomic's.

Otherwise, much of the syntax and semantics is similar.

There's no dependency on Datomic, since Datomic runs on the JVM and DataScript runs on JavaScript.


Ah thank you.


I tried using Datomic Pro for a CMDB. I liked how logical the queries were, but I ended up going with Neo4j instead, because finding paths between two nodes is incredibly useful in IT.

https://central.sonatype.com/artifact/com.datomic/local/1.0....


mhm ... sounds like you don't know what you are talking about if you conflate Neo4j/Cypher with Datalog ... because "in IT".


I'm fully aware they are very different things. I'm just saying I have tried using Datomic and I really like its query language, but it cannot find a path between two objects the way Neo4j can, which is a killer feature in a CMDB. My dream DB would be a hybrid of Neo4j and Datomic.

An example of where Neo4j really shines: I found a site with BGP route dumps. The file contains over 57 million very redundant Autonomous System paths, each just an IP prefix and the AS path it is reachable by. By loading each IP prefix and AS hop as

    (:Prefix)-[:ANNOUNCED_BY]->(:AS)-[:BGP_NEXT_HOP]->(:AS)

I can easily trace paths from one prefix to another by going

    MATCH path = (p1:Prefix)-[*]->(p2:Prefix) RETURN path

which will return which ASes announce the prefixes and all the BGP paths between them. It really is very powerful.
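If you want to drive that from Python, a rough sketch with the official neo4j driver looks something like this. The connection details, the `cidr` property, the hop bound, and the example prefixes are placeholders I'm making up here, not my actual setup:

    from neo4j import GraphDatabase  # pip install neo4j

    # Placeholder connection details -- adjust for your own instance.
    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))

    # Bounded hop count and LIMIT keep the query manageable on a big graph.
    query = """
    MATCH path = (p1:Prefix {cidr: $src})-[*..10]->(p2:Prefix {cidr: $dst})
    RETURN path
    LIMIT 25
    """

    with driver.session() as session:
        for record in session.run(query, src="192.0.2.0/24", dst="198.51.100.0/24"):
            print(record["path"])

    driver.close()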


Spack doesn't require any particular prefix... it has a deployment model like nix, and the store can go anywhere. Binary caches are relocatable.


It's disappointing that the response OP linked was posted at all, and even more disappointing that context gets lost every time it shows up on HN.

The linked email is from an HPE/Cray employee interacting with the upstream gcc team, not from anyone in the U.S. government.

The U.S. Government, via lots of programs, national labs, etc. does pay people (and companies) to work on open source code. This has at various points included LLVM, clang, flang, gcc, and many other projects. We like it when things get upstreamed and we also contribute ourselves to these projects.

Certain companies' willingness to put in the work required to upstream has been an issue at times, but it is improving and it's something that we push on very hard.


> Maybe it's possible to have multiple _views_ of a filesystem, so via one API you "see" a flat namespace with hashed directories, and with another a conventional hierarchical one

Spack supports the notion of arbitrary "views" of the store, which can be defined declaratively [1]. Apparently we need to write more highlight blog posts and submit them here b/c people don't seem to know about this stuff :)

For example, if you want to make a view that includes only things that depend on MPI (openmpi, mpich, etc.), but not things built with the PGI compiler, laid out in directories by name, version, and the compiler used to build them, you could write:

    view:
      mpis:
        root: /path/to/view
        select: [^mpi]
        exclude: ['%pgi']
        projections:
          all: '{name}/{version}-{compiler.name}'
        link_type: symlink
That will get you symlinks to the prefixes of the packages that match the criteria, with fancy names.

If you wanted a different layout you might do:

    view:
      myview:
        projections:
          zlib: "{name}-{version}"
          ^mpi: "{name}-{version}/{^mpi.name}-{^mpi.version}-{compiler.name}-{compiler.version}"
          all: "{name}-{version}/{compiler.name}-{compiler.version}"
That puts all your zlib installs in a short name-version directory; puts most things in directories named by package and version, with a subdirectory for the compiler they were built with; and, for packages that depend on MPI, also adds the MPI implementation (openmpi, mpich, etc.) and its version to the path.

You can choose to map everything into a single prefix too:

    view:
      default:
        root: ~/myprefix
That is useful for what we call a "unified" environment where there is only one version of any particular package -- it basically converts a subset of a store into an FHS layout.

There are options for choosing what types of links to use and what types of dependencies to link in, e.g., you could exclude build dependencies if you didn't want the cmake you built with to be linked in.

[1] https://spack.readthedocs.io/en/latest/environments.html#fil...


> Apparently we need to write more highlight blog posts and submit them here b/c people don't seem to know about this stuff

100% you do. I find the term a bit odd, but I've been a "Linux professional" for coming up on 30 years now and I've never even heard of this stuff before.

There is _so much_ in the Linux world that is just badly explained, or not explained, and it's very hard to find cogent explanations. Trying to find them, summarise them, and in some cases, just write them myself is a large part of my job...

And it's a huge field.

One of the reasons some things dominate the headlines is that the creators spend a significant amount of time and effort on outreach. On just talking about what they are doing, why.

That's why there are tonnes of stories about GNOME, KDE, systemd, Flatpak, Snap, Btrfs, ZFS, etc. They talk. They explain.

It's also why there are next to none about Nix, Guix, Spack, and legions of others. They don't.

To pick a trivially small example: Nix talks about being declarative, but it doesn't explain what "declarative" means. And it mentions "purely functional" but it doesn't explain that either. Those things right there are worth about 1000-2000 words of explanation per word. Omit that, and the original message becomes worthless, because it becomes insiders-only.

As it happens through decades of reading about Lisp and things, I more or less get it, but not well, and I struggle to explain it.


This seems nicer and cooler than the Nix profile system, where there's just a single layout.

What have you found yourself (or customers or other users or whatever) using it for in practice?


FWIW this is what Spack does, and it uses a store layout like nix/guix. Here are some chunks of the spack install tree path. All you really need is the hash, but we provide a bit more.

    $ spack find -p
    ...
    cmake@3.27.9              ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/cmake-3.27.9-x6g5fl54kgt4nqmgnajrtfycivokap2p
    cmake@3.27.9              ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/cmake-3.27.9-x4bznyafesvpdbplqhmnwwvy2zj5fdzs
    coreutils@9.3             ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/coreutils-9.3-vlwyg7d3ysxfoom3erl6ai4yy7fuf2jn
    curl@8.4.0                ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/curl-8.4.0-zjz6twa32vxnevika34kkrxjwo27z6vj
    curl@8.6.0                ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/curl-8.6.0-da5tk7aqaqp6zdligjmahzwohzut65qc
    diffutils@3.9             ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/diffutils-3.9-lau4vo7zlsktmbhyn3pf3x72nxhihsaq
    double-conversion@3.3.0   ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/double-conversion-3.3.0-zihvj7vzedapjprk76awr7qied3k45n6
    emacs@29.2                ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/emacs-29.2-x5v4wdhp4vfsuxquadackqckdvxrppwi
    expat@2.5.0               ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/expat-2.5.0-hmdysf6hcrzuhcxu7fkecpvskzxecgps
    findutils@4.9.0           ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/findutils-4.9.0-atynmdjhdjpb2ipurlk7cxtdep77m7mu
    fish@3.6.1                ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/fish-3.6.1-sg3ws3rmbfgdfyhvjnhd5agepguj4yf2
    flex@2.6.3                ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/flex-2.6.3-quw6ammnjxfutqtqcifvjfsp5nyqpeab
    freetype@2.11.1           ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/freetype-2.11.1-ue2ofutkkawtzvcl47pqts4lasm5lgha
    gawk@5.2.2                ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/gawk-5.2.2-jld7vb24rls64nbzkyxcp4alluyedfzv
    gdbm@1.23                 ~/spack/opt/spack/darwin-sonoma-m1/apple-clang-15.0.0/gdbm-1.23-amtpuskagocirjhffgqrjas3bfbnzixi
    ...

You can also customize the install tree paths[1] with format strings; the default is:

    config:
      install_tree:
        root: $spack/opt/spack
        projections:
          all: "{architecture}/{compiler.name}-{compiler.version}/{name}-{version}-{hash}"
One reason `nix` uses `/nix/HASH` is that it results in shorter paths and avoids long-path issues. We use sbang[2] to work around the most common one -- long shebang lines don't work on some OS's (notably, most Linux distros).

[1] https://spack.readthedocs.io/en/latest/config_yaml.html#conf...

[2] https://github.com/spack/sbang


Sounds fantastic!

Thanks for this -- I'll read up on it.


There is also `spack develop`, on which people are building hack/build/test loops. You can `spack develop` any package in spack, and easily build with a modified version of something.

See also https://github.com/sandialabs/spack-manager.


This is stolen from @Theophite, sadly now banned: https://images.app.goo.gl/Rv5SK1qqWXZVNSZ56


Still posting under @revhowardarson AFAIK


The article mentions the key detail: MD5 is broken for cryptography (collisions) but not for second preimage attacks. I was hoping there would be some discussion of just how much more difficult the latter is. It is extremely difficult.

Let’s ignore that no second preimage attack is currently known for MD5. The software the author links to has a FAQ that links to a paper that lays out the second preimage complexity for MD4:

https://who.paris.inria.fr/Gaetan.Leurent/files/MD4_FSE08.pd...

It takes 2^102 hashes to brute force this for MD4, which is weaker than MD5. A Bitcoin Antminer K7 will set you back $2,000, and it gets 58 TH/s for SHA256, which is slower to compute than MD5 or MD4. Let's ignore that MD5 is more complex than MD4, and say conservatively that similar hardware might be twice as fast for MD5 (SHA256 is really only 20-30% slower on a CPU). It'll take 2^102/58e12/2/60/60/24/365, or about 1.4 billion years, to do a second preimage attack with current hardware. So you could do that 3 times before the sun dies.

If you want to reduce that to 1.4 years, you could maybe buy a billion K7's for $2 trillion. Each requires 2.8 kW, so you'll need to find 2.8 terawatts somewhere. That's 34 trillion kWh over 1.4 years; US yearly electricity consumption is about 4 trillion kWh.
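If you want to sanity-check that arithmetic, here it is as a few lines of Python (the hashrate, price, and power figures are the rough assumptions above, not measured MD5 numbers):

    # Back-of-the-envelope check of the estimate above.
    hashes_needed = 2 ** 102      # second-preimage complexity from the MD4 paper
    hashrate_md5 = 58e12 * 2      # one K7: 58 TH/s SHA256, assume ~2x that for MD5
    year_s = 60 * 60 * 24 * 365

    years_single = hashes_needed / hashrate_md5 / year_s
    print(f"one miner:  {years_single:.1e} years")                # ~1.4e9 years

    n_miners = 10 ** 9            # scale out to a billion miners
    years_farm = years_single / n_miners
    print(f"1e9 miners: {years_farm:.1f} years")                  # ~1.4 years
    print(f"hardware:   ${n_miners * 2000 / 1e12:.0f} trillion")  # ~$2 trillion
    power_kw = n_miners * 2.8     # 2.8 kW per miner
    print(f"power:      {power_kw / 1e9:.1f} TW")                 # ~2.8 TW
    print(f"energy:     {power_kw * years_farm * 8760 / 1e12:.0f} trillion kWh")  # ~34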

It will be a while, probably decades or more, before there’s a tractable second preimage attack here.

Yes, there are stronger hashes out there than MD5, but for file verification (which is what it’s being used for) it’s fine. Safe, even. The legal folks should probably switch someday, and it’ll probably be convenient to do so since many crypto libraries won’t even let you use MD5 unless you pass a “not for security” argument.
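In Python's hashlib, for instance, that argument is `usedforsecurity` (available since 3.9). A minimal sketch, with a made-up file name:

    import hashlib

    with open("evidence.img", "rb") as f:   # hypothetical file name
        data = f.read()

    # Flag the MD5 use as non-security-relevant; on some builds (e.g. FIPS mode)
    # a plain hashlib.md5() call may be refused outright.
    fingerprint = hashlib.md5(data, usedforsecurity=False).hexdigest()
    print(fingerprint)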

But there’s no crisis. They can take their time.


> The article mentions the key detail: MD5 is broken for cryptography (collisions) but not for second preimage attacks.

The problem with this argument is that people often don't properly understand the security requirements of systems. I can't count the number of times I've seen people say "md5 is fine for use case xyz" where in some counterintuitive way it wasn't fine.

And tbh, I don't understand people's urge to defend broken hash functions. Just use a safe one, even if you think you "don't need it". There's no downside to choosing a secure hash function, and it's far easier to do that than to actually show that you "don't need it" (instead of just having a feeling you don't need it).

In the unlikely event that you think performance matters (which is unlikely, as cryptographic hash functions are so fast that it's really hard to build anything where the difference between md5 and sha256 matters), even that's covered: blake3 is faster than md5.
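If you want to check that yourself, a rough micro-benchmark might look like this (assuming the third-party blake3 package from PyPI; exact numbers will vary by machine):

    import hashlib
    import timeit

    from blake3 import blake3  # pip install blake3

    payload = b"x" * (64 * 1024 * 1024)   # 64 MiB of dummy data

    t_md5 = timeit.timeit(lambda: hashlib.md5(payload).digest(), number=10)
    t_b3 = timeit.timeit(lambda: blake3(payload).digest(), number=10)
    print(f"md5:    {t_md5:.2f}s")
    print(f"blake3: {t_b3:.2f}s")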


> I can't count the number of times I've seen people say "md5 is fine for use case xyz" where in some counterintuitive way it wasn't fine.

I can count many more times that people told me that md5 was "broken" for file verification when, in fact, it never has been.

My main gripe with the article is that it portrays the entire legal profession as "backwards" and "deeply negligent" when they're not actually doing anything unsafe -- or even likely to be unsafe. And "tech" apparently knows better. Much of tech, it would seem, has no idea about the use cases and why one might be safe or not. They just know something's "broken" -- so, clearly, we should update immediately or risk... something.

> Just use a safe one, even if you think you "don't need it".

Here's me switching 5,700 or so hashes from md5 to sha256 in 2019: https://github.com/spack/spack/pull/13185

Did I need it? No. Am I "compliant"? Yes.

Really, though, the main tangible benefit was that it saved me having to respond to questions and uninformed criticism from people unnecessarily worried about md5 checksums.


>And "tech" apparently knows better.

The tech community has a massive problem with Dunning-Kruger, and basically always has. Hell, two decades ago, when I was a young guy working in the field, so did I.

I'm not sure if it's because the field is basically a young man's game and that's inherent to relative youth, or if there's something deeper going on, but it's hard to ignore once you notice it.

That said, the idea that you have a better handle on what's going on in the legal system, and on the needs and uses legal professionals have, than actual people in the legal profession and academics in the legal field is a pretty big leap even with those priors.


> I can't count the number of times I've seen people say "md5 is fine for use case xyz" where in some counterintuitive way it wasn't fine.

Help us out by describing a time when this happened. MD5's weaknesses are easily described, and importantly, it is still (second) preimage resistant.

I agree that upgrading is likely your best bet. But I've found the other direction of bad reasoning to be a more pernicious trap to fall into. "My system uses bcrypt somewhere, so therefore it is secure" and the like is often used as a full substitute for thinking about the entirety of the system.


> MD5's weaknesses are easily described, and importantly, it is still (second) preimage resistant

Most devs have no idea what that means, but most devs still need to use hash functions. They need to use primitives that match their mental model of a hash function. Said model is https://en.m.wikipedia.org/wiki/Random_oracle
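A minimal sketch of that mental model in Python, just for illustration: an ideal hash is a lazily-populated table of independent random outputs.

    import os

    _table = {}

    def random_oracle(msg: bytes) -> bytes:
        """Ideal hash: a fresh random 256-bit output for each distinct input,
        and the same output every time the same input is queried."""
        if msg not in _table:
            _table[msg] = os.urandom(32)
        return _table[msg]

    assert random_oracle(b"hello") == random_oracle(b"hello")

Real hash functions only approximate this, which is exactly where the confusion about collision vs. preimage resistance comes from.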

The usual answer here is "don't roll your own crypto", but in practice abstinence-only cryptography education doesn't work.


> Help us out by describing a time when this happened.

Linus Torvalds saying that SHA-1 is okay for git, while it is used for Git signatures as well. Signatures are a classic "you need collision resistance to have safe signatures, but people are often confused about it" case.


I might be mistaken, but wouldn't a git signature already be signing trusted things (i.e. the person making the original signature is trusted), making any attack enabled by the input hash function a second preimage attack (i.e. an attacker only knows the trusted input, not anything private like the signing key)?

Hash collisions mean you can't trust signatures from _untrusted_ sources, but git signatures don't seem to fit that situation.


As you pointed out, signatures make content trusted, but only to the degree of the algorithm's attack resistance. I think it's also important to define trust; for our purposes this means: authenticity (the signer deliberately signed the input) and integrity (the input wasn't tampered with).

If an algorithm is collision resistant a signature guarantees both authenticity and integrity. If it's just second preimage resistant, signing may only guarantee authenticity.

Now, the issue with Git using SHA-1 is that an attacker may submit a new patch to a project, rather than attack an existing commit. In that case they are in control of both halves of the collision, and they just need for the benign half to be useful enough to get merged.

Any future commits with the file untouched would allow the attacker to swap it for their malicious half, while claiming integrity thanks to the maintainers' signatures. They could do this by either breaching the official server or setting up a mirror.

One interesting thing to note though: in the case of human readable input such as source code, this attack breaks down as soon as you verify the repo contents. Therefore it's only feasible longer term when using binary or obfuscated formats.


> And tbh, I don't understand the urge of people to defend broken hash functions. Just use a safe one, even if you think you "don't need it".

The ideal discourse would not imply a binary sense of "safety" at all, much less for a function evaluated outside the context and needs of its usage....


The thing is: We have a binary definition of safety for cryptographic hash functions, and it works well.

You can add a non-binary sense of safety to cryptographic hash functions, but it makes stuff a lot more complicated for no good reason. If you use the "preimage-safe-but-not-collision-safe" ones, you need to do a lot more analysis to show the safety of your whole construction. You could do that, but it gives you no advantage.


Second preimage attacks aren't the only threat in a forensics environment.

Also, hand-wavy extrapolations from Bitcoin miners aren't a reliable estimate of how fast & energy-efficient dedicated MD5 hardware could become.


Which part was hand-wavy/unreasonable? Do you think that dedicated MD5 hardware could become billions or even millions of times more efficient within a decade? If so, why?


MD5 is already not "fine" or "safe, even" against malicious actors who might pre-prepare collisions, or pre-seed their documents with the special constructs that make MD5 manipulable to collision-attacks.

Even if your extrapolative method was sound, you've already got several factors wrong. The best SHA256 Bitcoin miners are today more than twice your estimate in hashrate, and on plain CPUs SHA256 is more like 4x slower than MD5. (Your smaller estimate of MD5's speed advantage is likely derived from benchmarks where there's special hardware support for SHA256, but not MD5, as common in modern processors.)

But it's also categorically wrong to think the CPU ratio is a good guide to how hardware optimizations would fare for MD5. The leading Bitcoin miners already use a (patented!) extra 'ASICBoost' optimization to eke out extra parallelized SHA256 tests, for that use-case, based on the internals of the algorithm. As a smaller, simpler algorithm – also with various extra weaknesses! – there's no telling how many times faster dedicated MD5 hardware, either for generically calculating hashes or with special adaptations for collision-search, might run with similar at-the-gates, on-the-die cleverness.

Further, attacks only get better & theory breakthroughs continue. Since MD5 is already discredited amongst academics & serious-value-at-risk applications – and has been since 1994, when expert cryptographers began recommending against its use in new work – there's not much above-ground scholarly/commercial activity refining attacks further. The glory & gold has mostly moved elsewhere.

But taking solace in the illusory lack-of-attacks from that situation is foolhardy, as is pronouncing, without reasoning, that it's "probably decades or more" before second-preimage attacks are practical. Many thought that with regard to collision attacks versus SHA1 – but then the 1st collision arrived in 2017 & now they're cheap.

You can't linear-extrapolate the lifetime of a failed, long-disrecommended cryptographic hash that's already failed in numerous of its original design goals. Like a bridge built with faulty math or tainted steel, it might collapse tomorrow, or 20 years from now. Groups in secret may already have practical attacks – this sort of info has large private value! – waiting for the right time to exploit, or currently only exploiting in ways that don't reveal their private capability.

You are right that there's no present 'crisis'. But it could arrive tomorrow, causing a chaotic mad-dash to fix, putting all sorts of legal cases/convictions/judgements in doubt. Evidentiary systems should be providing robust authentication/provenance continuity across decades, as that's how long cases continue, or even centuries, for related historical/policy/law issues to play out.

Good engineers won't wait for a crisis to fix a longstanding fragility in socially-important systems, or deploy motivated-reasoning wishful-thinking napkin-estimates to rationalize indefinite inaction.


If I understand this correctly, the paper only shows a particular attack of complexity 2^102. Someone may find a different attack with much lower complexity. That's the usual way cryptography gets broken -- people find better and better attacks, and suddenly the latest attack has low enough complexity to be practical.

