I played with this and it is super interesting (almost made me register a couple of domain names!). That said, to me it once again reinforces the belief that a large factor in GPT-3's amazingness is the taste in prompting/filtering that humans apply to it. That is, it produces a ton of crap that doesn't catch our eye, which we silently ignore and discard, while we amplify and share amongst ourselves the output that is interesting. That imposes a huge selection bias, external to the intelligence of the model itself, which we may perceive as the model's.
Computers aren't creative; the excitement about GPT-3 is humans projecting something into its output or filtering the small number of bits that appear to make sense, as you say.
Neural language models are just recycling bits that humans have said before: they handle the "how to say" part of NLG (Natural Language Generation) well, but fail at the "what to say" part.
I was starting to write a response, and realised how patterned the debate is.
We've been discussing the topic of machine intelligence since the start of CS, and the points of contention tend to show up in the same places.
Anyway, it occurred to me that, given 10 tries, GPT-3 could probably produce my comment adequately. I have an uneasy suspicion that it might take fewer tries than that... especially if prompted with "netcan, please respond."
Anyway, the upshot is that for someone like me, on the "if you can't tell the difference" side of the cliché... your side's insistence that machines can't be creative makes me doubt my own creativity, to the extent that the argument is convincing.
I'm on the "if you can't tell the difference..." side, too, but the issue is the "10 tries" and the prompting. GPT-3 has no ability to revise or reconsider or judge anything it writes. We're still safe, IMO, until the main loop goes
>generate 10 "netcan comments"
>drop 5 least conformant results
>reword remaining comments several times
>reassess rewordings, select best of each set
>select 1 most conformant "netcan comment"
Right now (AFAIK, anyway) all GPT-3 does is the first step; a rough sketch of the full loop is below. Think about how a human is creative -- lots of drafts, lots of dead ends, lots of borrowing, lots of revision. I appreciate that "reflection on its own output" is waaaay out of scope for a glorified, omni-contextual Markov chain, but I think we're safe, both from the threat of GPT-3 being "creative" in the same way we are (and therefore, maybe "alive" in the same way we are) and from the threat of our own mental processes being revealed to be as merely computational as those of GPT-3.
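To make the loop above concrete, here's a minimal Python sketch of it. The helpers generate(), reword(), and conformance() are hypothetical stand-ins for a language-model call and a scoring model; nothing here is a real API.

    def netcan_comment(prompt, n=10, keep=5, rewordings=3):
        # 1. generate n candidate "netcan comments"
        candidates = [generate(prompt) for _ in range(n)]
        # 2. drop the least conformant half
        candidates = sorted(candidates, key=conformance, reverse=True)[:keep]
        # 3. reword each survivor several times; 4. keep the best of each set
        candidates = [
            max([c] + [reword(c) for _ in range(rewordings)], key=conformance)
            for c in candidates
        ]
        # 5. select the single most conformant result
        return max(candidates, key=conformance)

The interesting part is that every step except generate() is selection and revision, i.e. exactly the stuff GPT-3 doesn't do on its own.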
Oh, I don't disagree that the "best of gpt-3" on twitter is an automated hyperbole machine.^ But I think at least on HN, a lot of people have looked at somewhat more objective sources too. The standard way to prove a trick shot is to film yourself doing two or three in a row. I agree these are tricks, and a lot are unproven. Some are proven, though, FWIW.
Anyway, if the damn thing can do me 10%-60% of the time in a non-trivial context... IDK, maybe this just becomes the default standard for "non-trivial" and it's no big deal. It is disconcerting, though. In any case, assuming GPT-3 progresses, there might be uses for software that does just the first of those steps.
All my above comment really means for sure is that NLP can now participate in, simulate, and perhaps instigate flame wars better. Flame wars were always one of the easy targets, so that's not much of a standard. Who knows, though; maybe the road to AGI is a gradual refinement of a flamewar bot into a dialectic philosopher. Cheating on essays is gonna start getting fun.
^If a person uses the Tom Sawyer fence trick to automate a task, is that automation? If the machine does it, is the turn tables?
Absolutely. The "problem" essentially goes away with a cyborg approach: let GPT-3 generate its 10 netcan comments, then have a human do a little refining, maybe reframe the prompt, edit the result a little, and bam! Much scarier! Not that a sufficiently determined human couldn't ape someone's comment style, but GPT-3 is a force multiplier in the same way that sockpuppet management software is for the professional sockpuppeteer -- scales better, more effective, just makes everything easier.
I've heard serious speculation that GPT-3 (or certainly its successors) might find utility for writers as a combination of a ghostwriter and GitHub Copilot.
>^If a person uses the Tom Sawyer fence trick to automate a task, is that automation? If the machine does it, is the turn tables?
It's not here yet, but the grim day on which I no longer communicate with anyone on the Internet for fear they're not actually a living conversational partner approaches. Even those I can cryptographically prove are people that I know will be suspects: "euurgh, I'm too busy to talk to this guy today, he's so boring. I'll just feed GPT-10 all our old conversations, tell it to be me, aaaand..."
Messaging apps already do a bare version of this by suggesting auto-replies. It's mostly a texting-while-driving aid, so they keep it concise and uncomplicated. Code completion already exists. Spam. Gradual steps have places to start stepping.
This is the only way to reliably use GPT-X in a production setting: you have to postprocess the responses with a second model until you find one that is 1) reasonable, 2) coherent, and 3) related to the topic of the prompt.
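Something like this retry-and-filter loop, sketched in Python. complete_text() and the three check functions are hypothetical placeholders for the generator and the second (filtering) model, not real library calls:

    def usable_completion(prompt, max_tries=10):
        for _ in range(max_tries):
            response = complete_text(prompt)  # first model: the generator
            # second model checks each property of the candidate response
            if (is_reasonable(response)
                    and is_coherent(response)
                    and is_on_topic(response, prompt)):
                return response
        return None  # better to return nothing than ship a bad response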
The fact that humans can be creative does not mean that all humans are. Nor does it mean it's actually all that common. True creativity is very rare. Even people we normally describe as "creatives" are mostly working off of convention and are influenced by others.
I've come to accept that 99% of what humanity does is derivative and mediocre. That's ok.
> humans projecting something into its output or filtering the small number of bits
I think this has value and is an interesting form of machine assisted creation.
There is something similar, but more simplistic (no AI, just old-school RNGs), going on in generative music. For modular synthesizers, some of the more popular modules are random sequence generators [0]. These allow the musician to generate random elements of the music, say a melody or drum pattern, but curate them. You might generate dozens of elements, carefully tweaking parameters, before you hear "the right one" – which you then actually use in the music.
I guess my point is, curation of generated material is a form of creation! The generator is perhaps not the creator, but it is useful to develop better generators.
[0]: such as “Turing Machine” from Music Thing Modular and “Marbles” from Mutable Instruments.
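For the curious: the "Turing Machine" module is roughly a looping shift register with a knob setting the probability that the recirculating bit gets flipped. A toy Python sketch of the idea (my own simplification, not the module's actual firmware):

    import random

    class TuringMachineSequencer:
        """Toy locking shift register: flip_prob=0.0 locks the loop,
        0.5 is pure noise, small values slowly mutate the pattern."""

        def __init__(self, length=16, flip_prob=0.1):
            self.bits = [random.randint(0, 1) for _ in range(length)]
            self.flip_prob = flip_prob

        def step(self):
            bit = self.bits.pop(0)
            if random.random() < self.flip_prob:
                bit ^= 1  # occasionally mutate the recirculating bit
            self.bits.append(bit)
            # read the first 8 bits as a 0-255 "note"/voltage value
            return int("".join(map(str, self.bits[:8])), 2)

The curation step is exactly the "tweak until you hear the right one" part: once flip_prob reaches zero, the pattern you chose repeats forever.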
I wonder how much of that is due to the task GPT-3 was trained for? Perhaps the "what to say" part is working perfectly, but what it wants to say is as generic a text stream as possible. It already seems like GPT-3 has more "knowledge" than is obvious at first glance, but you need to prompt it correctly.
I agree it's fragile and dreamlike - but it is, in my opinion, genuine understanding. If you have, or ever do have, children, you can observe them pass through a similar kind of stage: fractured, incomplete understanding, dreamlike, yet still understanding. It crystallises slowly over time. I predict the same will happen as we expand the models and add more complexity.
Your statement that computers aren't creative is not true in general and it's especially not true for GPT-3. It can generate text that appears random at first glance but which has a structure hidden within. As I wrote above, there is no way to prove conclusively whether any given sentence generated by GPT-3 was actually written by GPT-3. But if we assume that all sentences were indeed written by GPT-3 then some of them must be very good indeed because many people find them interesting and worth reading. That means that GPT-3 has demonstrated creativity.
I feel like this implies that creativity is the creation of the interpretation of a work, rather than literally its creation, and I don't really agree with that conclusion.
The caveat being that what catches our eye when flipping through GPT-3 output is more often comically absurd [0] instead of meaningful or 'intelligent.'
0. dishwitter
dish·wit·ter
a person who eats hot wax or other food
"she was a dishwitter"
It's true that selection bias is a factor, and the way GPT-3 is distributed tends to aggravate that.
Also, this is a human impersonator. It's not surprising that we find it personable and interesting. We find ceramic dolls personable and interesting. People can't really be trusted to evaluate human-looking things well. It's just too triggering to the instinctive biases of a social mammal. A gorilla impersonator would probably amaze gorillas in much the same way.
That said, I do think there's something to these NLP systems that we didn't have before.
The poke-poke stage of examining novel tech tends to be quirky, and most almost-insights are bogus. Stuff that's cool but useless is as compelling as useful stuff, at first. This isn't a long stage though. We usually gain a grounded understanding of a technology only when we find a use for it. Technology being used for something is what makes it technology, for some definition of.
Prompting & filtering by humans is how GPT-3 is "operated." That it's merely augmenting the creativity of the prompter is a philosophical concern. The practical one is that it needs prompting and filtering. Also, it's trivial to automate some kinds of prompting and filtering.
There's a subset of words produced by the site that trigger obvious and amusing ideas of what they could mean in the reader, but the generated definitions have nothing to do with that, and usually do not make much sense (besides being sometimes grammatically incorrect). It's evident that the software has no semantic understanding of what it generates. The rare occasions where the definitions appear to (almost) make creative sense are just statistical flukes.
That said, if I go through this for 5 minutes and pick the 10 best words, they will be much better than the 10 best words I could make up in 5 minutes by myself.
I feel you could use it for like 10 seconds to get some inspiration and then trivially come up with something much better. It’s kind of like brainstorming with somebody who speaks before they think.
This isn't even 'GPT-2', since that usually refers to GPT-2-1.5b. It's GPT-2-117M... distilled, so an even smaller version of the smallest GPT-2. Amazing that it works so well anyway.