I am struggling to imagine the frame of mind of someone who, when met with all this LLM progress in standardized test scores, infers that the tests are inadequate.
These tests (if not individually, at least in summation) represent some of society’s best gate-keeping measures for real positions of power.
This has been standard operating procedure in AI development forever: the instant a system passes some test, the goalposts move and people suddenly begin claiming it was a bad test all along.
You’ve added a technical constraint. I didn’t say arbitrary. Standardised tests are standard. The point is that a simple lookup is all you need. There are lots of interesting aspects to LLMs, but their ability to pass standardised tests means nothing for standardised testing.
You think that it’s being fed questions that it has a lookup table for? Have you used these models? They can answer arbitrary new questions. This newest model was tested against tests it hadn’t seen before. You understand that that isn’t a lookup problem, right?
The comment I replied to suggested that the author was fearful of what LLMs meant for the future because they can pass standardised tests. The point I’m making is that standardised tests are literally standardised for a reason: to test information retention in a standard way. They do not test intelligence.
Information retention and retrieval is a long-solved problem in technology; you could pass a standardised test using technology in dozens of different ways, from a lookup table to Google searches.
The fact that LLMs can complete a standardised test is interesting because it’s a demonstration of what they can do, but it has not one iota of impact on standardised testing! Standardised tests have been “broken” for decades; the tests and answers are often kept under lock and key because simply having access to the test in advance can make it trivial to pass. A standardised test is, in the end, just a list of questions.
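To make the lookup-table argument concrete, here is a minimal sketch (in TypeScript; the questions and answers are hypothetical placeholders, not real test content): given advance access to the test, “passing” reduces to pure retrieval.

```typescript
// Minimal sketch of the lookup-table argument: if the questions are
// known in advance, passing a standardised test reduces to retrieval.
// The questions and answers below are hypothetical placeholders.
const answerKey = new Map<string, string>([
  ["Question 1: ...", "B"],
  ["Question 2: ...", "D"],
]);

function takeTest(questions: string[]): string[] {
  // No reasoning involved: each question is answered by exact retrieval.
  return questions.map((q) => answerKey.get(q) ?? "no answer on file");
}

console.log(takeTest(["Question 1: ...", "Question 2: ..."]));
```

This is, of course, exactly why the tests are kept under lock and key: the whole scheme collapses once the question list leaks.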
I have no idea what you are talking about now. You claimed to be able to write a program that can pass the LSAT. Now it sounds like you think the LSAT is a meaningless test because it... has answers?
I suspect that your own mind is attempting to do a lookup on a table entry that doesn't exist.
The author of the original comment I replied to is scared for the future because GPT-4 passed the LSAT and other standardised tests — they described it as “terrifying”. The point I am making is that standardised tests are an invention to measure how people learn, through our best attempt at a metric: information retention. You cannot measure technology in the same way, because it’s an area where technology has been beating humans for decades — a spreadsheet will outperform a human on information retention. If you want to beat the LSAT with technology you can use any number of solutions; an LLM is not required. I could score 100% on the LSAT today if I were allowed to use my computer.
What’s interesting about LLMs is their ability to do things that aren’t standardised. The ability of an LLM to pass the LSAT is orders of magnitude less interesting than its ability to respond to novel questions, or to appear to engage in logical reasoning.
If you set aside the arbitrary meaning we’ve ascribed to “passing the LSAT”, then all the LSAT is, is a list of questions… some of the most practiced and most answered questions in the world. More has been written and read about the LSAT than about most other subjects, because there’s an entire industry dedicated to producing the perfect answers. It’s like celebrating Google’s ability to provide a result for “movies” — completely meaningless in 2023.
Standardised tests are the most uninteresting and uninspiring aspect of LLMs.
Anyway good joke ha ha ha I’m stupid ha ha ha. At least you’re not at risk of an LLM ever being able to author such a clever joke :)
If a person with zero legal training were to sit down in front of the LSAT, with all of the prep material and no time limit, are you saying that they wouldn’t pass?
We’re rapidly approaching problems (AP Calculus BC, etc) that are in the same order of magnitude of difficulty as “design and implement a practical self-improving AI architecture”.
Endless glib comments in this thread. We don’t know when the above prompt leads to takeoff. It could be soon.
And funnily enough, with the AI community’s dedication to research publications being open access, it has all the content it needs to learn this capability.
Since when was "design and implement a practical self-improving AI architecture" on the same level as knowing "the requisite concepts for getting Transformers working"?
This is such garbage logic. The semantics of that comment are irrelevant. Creating and testing AI node structures is well within the same ballpark. And even if it weren't, the entire insinuation of your comment is that the creation of AI is a task too hard for AI, or for any AI we can create anytime soon: a refutation of the feedback hypothesis. Well, that's completely wrong. On all levels.
We can't predict what is coming. I think it probably ends up making the experience of being a human worse, but I can't avert my eyes. Some amazing stuff has come, and will continue to come, from this direction of research.
I passed Calculus BC almost 20 years ago. All this time I could have been designing and implementing a practical self-improving AI architecture? I must really be slacking.
In the broad space of all possible intelligences, those capable of passing calc BC and those capable of building a self-improving AI architecture might not be that far apart.
Hey, I'm very concerned about AI and AGI and it is so refreshing to read your comments. Over the years I have worried about and warned people about AI, but there are astonishingly few people to be found who actually think something should be done, or even that anything is wrong. I believe that humanity stands a very good chance of saving itself through very simple measures. I believe, and I hope that you believe, that even if the best chance we had at saving ourselves was 1%, we should go ahead and at least try.
In light of all this, I would very much like to stay in contact with you. I've connected with one other HN user so far (jjlustig) and I hope to connect with more, so that together we can effect political change around this important issue. I've formed a Twitter account to do this, @stop_AGI. Whether or not you choose to connect, please do reach out to your state and national legislators (if in the US) and convey your concern about AI. It will be more valuable than you know.
That's a pretty unfair comparison. We know the answers to the problems in AP Calculus BC, whereas we don't even yet know whether answers are possible for a self-improving AI, let alone what they are.
Yeah, I'm not sure if the problem is moving goalposts so much as everyone has a completely different definition of the term AGI.
I do feel like GPT-4 is closer to a random person than that random person is to Einstein. I have no evidence for this, of course, and I'm not even sure what evidence would look like.
We also don't have much real research into actually trying it. And it doesn't have to be all-the-way self-replicating. It's more like using local materials to build the heavy parts of robots. Maybe those robots then couldn't build another one of themselves.
While ice is a somewhat extreme example, the idea of bringing over the electronics and using local materials to put together whatever structural components are needed isn't that crazy. It'd save a lot of mass, and things like 3d printers can produce them with reasonable precision. There already is a decent amount of research into the prospect of 3d printing structures with Lunar or Martian regolith, so structural components for robots or machines don't seem too crazy.
I don't remember the details, and I don't remember exactly where I heard an interview with somebody who works on this stuff.
They wouldn't use pure ice. But in cold places, ice mixed with some other materials can actually make quite a good material. Consider that in most places gravity is much lower than on Earth, so it doesn't need to be carbon fiber to be useful.
I know what you said; I just used carbon fibre to make a point: you don't always need the best materials to have something useful.
Yes, things still have mass, but if you are building a robot that moves around, there is a big difference in the quality of structural materials you need for the robot to be viable.
Part of the research that would go into such a project would be to look at what the local resources are, and how to turn them into useful materials. For example, ice in combination with some filler material has been shown to be quite usable at cold temperatures.
The exact materials you would use depend on where you want to deploy this kind of system. Maybe in the far future these kinds of systems would look around, analyse the environment, and make smart choices about what materials to use to build themselves.
You can use Flutter as an imperative framework. Just ignore widgets (the reactive top layer of Flutter) and use the underlying imperative elements directly.
Well, I'd say the main value proposition of Qt/Delphi/wxWidgets/WinForms/etc. is a rich set of widgets/controls, including their event handling and layout mechanisms. Is such stuff available in the imperative impl of Flutter?
1. Humans have not been eating cows produced by modern methods for millions of years. Cow diet, genetics, and lifecycle are quite unnatural at this point. (This is the overwhelming bulk of cow consumption; of course, your local micro farm produces cows more similar to ancient cows.)
2. Humans have never before in history eaten cow (or meat generally) on this scale.
3. Even despite the above, we genuinely don’t know to what extent baseline human health is dependent on traditional diet. It’s not impossible that there exists a modern radical diet that greatly improves health and longevity without including any “natural” foods.
Our product works by taking a screenshot using a headless Chrome instance. In this case, it's helpful because we can look at not just the status code of the HTTP request to the page itself, but also any resources the page may fetch. This is particularly useful for SPAs, since they may return a 200 for the page itself, but an API call they make might return a non-200 when logged out.
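As a rough sketch of how this kind of check can be done with Puppeteer (the product's actual implementation may well differ, and the URL below is a placeholder): load the page in headless Chrome, record the status of every response it triggers, and take a screenshot.

```typescript
// Rough sketch: load a page in headless Chrome, record the status of
// every response it triggers, and screenshot it. Uses Puppeteer; the
// URL passed in below is a placeholder, not a real endpoint.
import puppeteer from "puppeteer";

async function checkPage(url: string) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const failures: { url: string; status: number }[] = [];
  page.on("response", (response) => {
    const status = response.status();
    // A 200 on the document itself can hide a failing API call,
    // so track every sub-resource, not just the main request.
    if (status >= 400) {
      failures.push({ url: response.url(), status });
    }
  });

  await page.goto(url, { waitUntil: "networkidle0" });
  await page.screenshot({ path: "snapshot.png" });
  await browser.close();
  return failures;
}

checkPage("https://example.com/app").then((failures) => {
  console.log(failures.length ? failures : "all resources returned OK");
});
```

Listening on the response event is what catches an SPA's background API calls, so a logged-out page that renders with a 200 can still register as a failure.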
Discussion about the relevance of Liu Cixin’s The Three-Body Problem trilogy to US defense strategy with respect to Chinese perspectives.
Two things I found interesting from reading this journal article:
1. There exists an academic community of US defense strategists that publishes open-access material regarding strategy. I didn't know such a community existed.
2. That community sees value in understanding potential Chinese perspectives regarding military strategy through published popular fiction.