Why would they be less likely to be bombed? Zaporizhzhia Nuclear Power Plant got bombed in 2022.
There's no strong deterrent there. These plants don't blow up like nukes, or even like Chernobyl. Nuclear disasters require very precise conditions to sustain the chain reaction. Blowing up a reactor with conventional weapons will spread the fuel around, which causes nasty pollution, but it's localized enough that it's the victim's problem, not the aggressor's.
Why do you even mention transformers and cables as an implied alternative to nuclear power plants? Power plants absolutely require power distribution infrastructure, which is vulnerable to attacks.
From the perspective of resiliency against military attacks, solar + batteries seem best: you can have them distributed without any central point of failure, you can move them, and the deployments can be as large or small as you want.
(BTW, this isn't an argument against nuclear energy in general. It's safe, and we should build more of it, and build as much solar as we can, too.)
Nuclear plants and their cooling towers tend to be made of reinforced concrete. That makes them harder to bomb. If you want to take out power, you bomb the transmission lines or substations instead, as they are far less durable.
I recall hearing in school that the 9/11 masterminds had considered flying planes into nuclear power plants but abandoned the idea after doing the math and realizing it would do little damage. Not sure how true that is, admittedly.
Depends what you're trying to protect yourself from.
Reinforced concrete is great if they're just shelling you. Sure, all the outdoor infrastructure will be toast but your reactor probably won't get damaged. It'll take a bit to get back on the grid but you don't need to rebuild the plant.
Bunker busters, on the other hand, eat reinforced concrete for breakfast. A pinpoint strike into each reactor hall and you're down for good.
The former is cheaper, less risky for the attacker, and hurts you bad enough for most military purposes, so the latter isn't really worth worrying about unless you're Iran or North Korea.
The US is neither that exceptional nor that principled. The concept of "freedom of speech" is absolute when Republicans want to say Republican things, but it's a "national security issue" when Muslims make too much noise. When sexual minorities want to speak, the priority is to "protect family values" instead. Corporations have "freedom of speech", but TikTok boosting black-green-red flags isn't protected speech; it's an agent of the enemy corrupting the youth.
European countries have their own dogmas and hypocrisy; they just draw the line at different topics (especially where everyone had their grandparents traumatized in a war started by Grok's favorite character).
Could you give examples of when a U.S. citizen's speech rights were legally taken away? Let's go with one of your examples, "When sexual minorities want to speak". Please elaborate.
None of the examples you gave are actually examples of speech being restricted. It's people (sometimes politicians) freely voicing their opinions on others' speech, and that is not restriction.
Literally in the last week, the Supreme Court ruled that books featuring gay couples need to be opt-out in schools. They've quite literally taken the stance that someone literally just seeing the existence of a gay couple in a children's picture book is a violation of their freedom.
> They've quite literally taken the stance that someone literally just seeing the existence of a gay couple in a children's picture book is a violation of their freedom.
No.
They've taken the stance that parents get to decide what books their kids see.
Other parents are free to make a different decision.
Do you really think that there's a "right" to force others to read books that you choose?
> They've taken the stance that parents get to decide what books their kids see.
So why draw the line at books depicting gay couples, rather than literally all books? Because this has nothing to do with the ban, except for being a “family-friendly” bullshit justification.
That's not how the Supreme Court works. They are selective about the cases they hear. Especially looking at a 6-3 ruling with this court it's clear to see this was an ideological selection.
Yes, the case was appealed to the Supreme Court who chose to hear it instead of choosing not to hear it. That is ultimately why they ruled on the case.
Given that, it really does seem that the court ruled 6-3 in favor of the plaintiffs who are trying to draw a line around gay couples because the court is trying to draw a line around gay couples.
Other parents making a different decision doesn't matter if the schools find it virtually impossible to have these books because of the logistical requirements of allowing kids to leave the classroom every time certain books are read.
> Do you really think that there's a "right" to force others to read books that you choose?
Do I really think that public schools have a right to assign reading of certain books for classes? Is this even a real question? How do you think English classes work?
I don't think memory mapping does anything to prevent false sharing. All threads still get the same data at the same address. You may get page alignment for the file, but the free-form data in the file still crosses page boundaries and cache lines.
Also, you don't get contention when you don't write to the memory.
The speedup may be from just starting the work before the whole file is loaded, allowing the OS to prefetch the rest in parallel.
You probably would get the same result if you loaded the file in smaller chunks.
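To illustrate that last point, here's a minimal Rust sketch (the function name and the per-chunk "work" are invented for illustration) of streaming a file in fixed-size chunks so that processing overlaps with the OS's readahead, rather than waiting for the whole file:

```rust
use std::fs::File;
use std::io::Read;

// Sketch: process a file chunk by chunk instead of mapping it all at once.
// Work on each chunk starts while the kernel prefetches the next one.
fn process_in_chunks(path: &str) -> std::io::Result<u64> {
    let mut file = File::open(path)?;
    let mut buf = vec![0u8; 1 << 20]; // 1 MiB per chunk; the size is a tuning knob
    let mut checksum: u64 = 0; // stand-in for whatever per-chunk work you actually do
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break; // EOF
        }
        // Do the real work on buf[..n] here.
        for &b in &buf[..n] {
            checksum = checksum.wrapping_add(b as u64);
        }
    }
    Ok(checksum)
}
```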
You could get 100% on the benchmark with an SQL query that pulls the answers from the dataset, but it wouldn't mean your SQL query is more capable than LLMs that didn't do as well in this benchmark.
We want benchmarks to be representative of performance in general (in novel problems with novel data we don't have answers for), not merely of memorization of this specific dataset.
My question, perhaps asked in too oblique of a fashion, was why the other LLMs — surely trained on the answers to Connections puzzles too — didn't do as well on this benchmark. Did the data harvesting vacuums at Google and OpenAI really manage to exclude every reference to Connections solutions posted across the internet?
LLM weights are, in a very real sense, lossy compression of the training data. If Grok is scoring better, it speaks to the fidelity of their lossy compression as compared to others.
There's a difficult balance between letting the model simply memorize inputs and forcing it to figure out generalizations.
When a model is "lossy" and can't reproduce the data by copying, it's forced to come up with rules to synthesize the answers instead, and this is usually the "intelligent" behavior we want. It should be forced to learn how multiplication works instead of storing every combination of numbers as a fact.
You're not answering the question. Grok 4 also performs better on the semi-private evaluation sets for ARC-AGI-1 and ARC-AGI-2. It's across-the-board better.
If these things are truly exhibiting general reasoning, why do the same models do significantly worse on ARC-AGI-2, which is practically identical to ARC-AGI-1?
It's not identical. ARC-AGI-2 is more difficult, both for AI and for humans. In ARC-AGI-1 you kept track of one (or maybe two) kinds of transformations or patterns; in ARC-AGI-2 you are dealing with at least three, and the transformations interact with one another in more complex ways.
Reasoning isn't an on-off switch. It's a hill that needs climbing. The models are getting better at complex and novel tasks.
The 100.0% you see there just verifies that all the puzzles got solved by at least 2 people on the panel. That was calibrated to be so for ARC-AGI-2. The human panel averages for ARC-AGI-1 and ARC-AGI-2 are 64.2% and 60% respectively. Not a huge difference, sure, but it is there.
I've played around with both, and yes, I'd also personally say that v2 is harder. Overall a better benchmark. ARC-AGI-3 will be a set of interactive games. I think they're moving in the right direction if they want to measure general reasoning.
There are many basic techniques in machine learning designed specifically to avoid memorizing training data. I contend any benchmark which can be “cheated” via memorizing training data is approximately useless. I think comparing how the models perform on say, today’s Connections would be far more informative despite the sample being much smaller. (Or rather any set for which we could guarantee the model hasn’t seen the answer, which I suppose is difficult to achieve since the Connections answers are likely Google-able within hours if not minutes).
Of all places, I think China has the least sentiment for protecting the businesses of industries it doesn't want just to keep a line going up on paper.
Their push for renewables and energy independence is very deliberate. When they reach the goal, it's not "oh noes, our precious coal jobs, how are we going to placate rural voters and coal lobbyists", it's cheaper energy, and workers freed to be moved to more productive things.
It's funny that our hope for the future now seems to stand upon the Chinese Communist Party being the paragons of enlightened, unsentimental capitalism that we never were.
Oh, I know. I'm just saying China currently needs to stimulate its internal consumption to maintain its economic growth targets. But cheaper energy that keeps getting cheaper each year is a weird problem to have, and it will be interesting to see how it plays out in the next 5-10 years.
There's no guarantee that a "super intelligent" AI will have goals and values aligned with what's good for humanity, or even care.
If we train the AI to value what we value, we may make it reflect our own vices and contradictions. Or we may try not to, and create a paperclip maximizer.
Even if we manage to create a super intelligent AI, a separate question is whether we'll listen to it.
It seems unlikely that we'd give it power to rule over us by force if we don't like what it says; we like whatever already agrees with our not-so-super-intelligent views. AIs that desire to escape and take over the world are projections of ourselves.
LLMs use tokens, with 1D positions and rich, complex, fuzzy meanings, as their native "syntax", so for them LISP is alien and hard to process.
That's like reading binary for humans. 1s and 0s may be the simplest possible representation of information, but not the one your wet neural network recognizes.
Over two years ago, using GPT-4, I experimented with code generation in a relatively unknown dialect of Lisp for which there are few online materials or discussions. Yet the results were good. The LLM slightly hallucinated between that dialect, Scheme, and Common Lisp, but corrected itself when instructed clearly. When given a verbal description of a macro available in the dialect, it was able to refactor the code to take advantage of it.
This has been true since the beginning of HTML email. It hasn't stopped it from proliferating. It hasn't stopped it from being de-facto mandatory, and has no chance of reversing the course now.
HTML is going to be an inseparable part of e-mail for as long as e-mail lives, and yeah, it seems more likely that e-mail will die as a whole than that it will get any simpler technically.
At this point we can only get better at filtering the HTML.
Rust already supports switching between borrow checker implementations.
It has migrated from a scope-based borrow checker to the non-lexical borrow checker, and has the experimental Polonius implementation as a next option. However, once the new implementation becomes production-ready, the old one gets discarded, because there's no reason to choose it: borrow checking is fast, and the newer implementations accept strictly more (correct) programs.
You also have the Rc and RefCell types, which give you greater flexibility at the cost of some runtime checks.
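For anyone unfamiliar with those types, a minimal sketch of that trade-off (the data and usage here are invented for illustration):

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Shared, mutable data: Rc counts owners at runtime, and RefCell moves
    // the aliasing check from compile time to run time.
    let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
    let alias = Rc::clone(&shared);

    alias.borrow_mut().push(4); // dynamic borrow check happens here
    println!("{:?}", shared.borrow()); // [1, 2, 3, 4]

    // Holding two simultaneous borrow_mut() guards would panic at runtime
    // instead of failing to compile, which is the cost of the flexibility.
}
```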
> I recommend watching the video @nerditation linked.

I believe Amanda mentioned somewhere that Polonius is 5000x slower than the existing borrow checker; IIRC the plan isn't to use Polonius instead of NLL, but rather to use NLL and kick off Polonius for certain failure cases.
I think GP is talking about somehow being able to, for example, more seamlessly switch between manual borrowing and "checked" borrowing with Rc and RefCell.
They're spending public money, so the cost doesn't matter to them either. With this administration they can get unlimited funding.