
I just bought a pro subscription.

First impressions: The new o1-Pro model is an insanely good writer. Aside from favoring the long em-dash (—) which isn't on most keyboards, it has none of the quirks and tells of old GPT-4/4o/o1. It managed to totally fool every "AI writing detector" I ran it through.

It can handle unusually long prompts.

It appears to be very good at complex data analysis. I need to put it through its paces a bit more, though.



> Aside from favoring the long em-dash (—) which isn't on most keyboards

Interesting! I intentionally edit my keyboard layout to include the em-dash, as I enjoy using it out of sheer pomposity—I should undoubtedly delve into the extent to which my own comments have been used to train GPT models!


On my keyboard (en-us) it's ALT+"-" to get an em-dash.

I use it all the time because it's the "correct" one to use, but it's often more "correct" to just rewrite the sentence in a way that doesn't call for one. :)


I think that’s en-dash (–, used for ranges). Em-dash (—, used mid-sentence for asides etc) is the same combo but with shift as well.


–: alt+shift+minus on my azerty (fr) Mac keyboard. I use it constantly. "Stylometry" hazard, though!


Word processors -- MS Word, Google Docs -- will generally convert three hyphens to em dash.

(And two hyphens to en dash.)


I just use it because it's grammatically correct—admittedly I should use it less, for example here.


Just so you know, text using the em-dash like that combined with a few other "tells" makes me double check if it might be LLM written.

Other things are the overuse of transition words (e.g., "however," "furthermore," "moreover," "in summary," "in conclusion,") as well as some other stuff.

It might not be fair to people who write like that naturally, but it is what it is in the current situation we find ourselves in.


"In the past three days, I've reviewed over 100 essays from the 2024-2025 college admissions cycle. Here's how I could tell which ones were written by ChatGPT"

https://www.reddit.com/r/ApplyingToCollege/comments/1h0vhlq/...


On Windows, em dash is ALT+0151; the section sign (§) is ALT+0167. Once you know them (and a couple of others, for instance accented capitals), they become second nature, and work on all keyboards, everywhere.


delve?

Did ChatGPT write this comment for you?


For me, at least, it's common knowledge "delve" is overused and I would include it in a mock reply.


That's the joke.



Some of us are just greedy and deep, okay?


AI writing detectors are snake oil


Startup I'm at has generated a LOT of content using LLMs and once you've reviewed enough of the output, you can easily see specific patterns in the output.

Some words/phrases that, by default, it overuses: "dive into", "delve into", "the world of", and others.

You correct it with instructions, but it will then find synonyms so there is also a structural pattern to the output that it favors by default. For example, if we tell it "Don't start your writing with 'dive into'", it will just switch to "delve into" or another synonym.

Yes, all of this can be corrected if you put enough effort into the prompt and enough iterations to fix all of these tells.


> if we tell it "Don't start your writing with 'dive into'", it will just switch to "delve into" or another synonym.

LLMs can radically change their style, you just have to specify what style you want. I mean, if you prompt it to "write in the style of an angry Charles Bukowski" you'll stop seeing those patterns you're used to.

In my team for a while we had a bot generating meeting notes "in the style of a bored teenager", and (besides being hilarious) the results were very unlike typical AI "delvish".


Of course the "delve into" and "dive into" is just its default to be corrected with additional instruction. But once you do something like "write in the style of...", then it has its own tells because as I noted below, it is, in the end, biased towards frequency.


Of course there will be a set of tells for any given style, but the space of possibilities is much larger than what a person could recognize. So as with most LLM tasks, the issue is figuring out how to describe specifically what you want.

Aside: not about you specifically, but I feel like complaints on HN about using LLMs often boil down to somebody saying "it doesn't do X", where X is a thing they didn't ask the model to do. E.g. a thread about "I asked for a Sherlock Holmes story but the output wasn't narrated by Watson" was one that stuck in my mind. You wouldn't think engineers would make mistakes like that, but I guess people haven't really sussed out how to think about LLMs yet.

Anyway for problems like what you described, one has to be wary about expecting the LLM to follow unstated requirements. I mean, if you just tell it not to say "dive into" and it doesn't, then it's done everything it was asked, after all.


I mean, we get it. It's a UX problem. But the thing is you have to tell it exactly what to do every time. Very often, it'll do what you said but not what you meant, and you have to wrestle with it.

You'd have to come up with a pretty exhaustive list of tells. Even sentence structure and mood is sometimes enough, not just the obvious words.


This is the way. Blending two or more styles also works well, especially if they're on opposite poles, e.g. "write like the imaginary lovechild of Cormac McCarthy and Ernest Hemingway."

Also, wouldn't angry Charles Bukowski just be ... Charles Bukowski?


> ...once you've reviewed enough of the output, you can easily see specific patterns in the output

That is true, but more importantly, are those patterns sufficient to distinguish AI-generated content from human-generated content? Humans express themselves very differently by region and country (e.g. "do the needful" is not common in the Midwest; "orthogonal" and "order of magnitude" are used more on HN than most other places). Outside of watermarking, detecting AI-generated text with an acceptably small false-positive error rate is nearly impossible.


All of what you described can change wildly from model to model. Even across different versions of the same model.

Maybe a database could be built with “tells” organized by model.


Exactly. Fixing the old tells just means there are new ones.


> Maybe a database could be built with “tells” organized by model.

Automated by the LLMs themselves.


No thanks, I’d like it to be accurate ;)

Regular ol' tests would do


I should have been more precise. I meant the LLMs would output their tells for you, naturally. But that's obvious.


They can’t know their own tells… that’s not how any of this works.

Thinking about it a bit more, the tells that work might depend on the usage of other specific prompts.


Not sure why you default to an uncharitable mode in understanding what I am trying to say.

I didn't say they know their own tells. I said they naturally output them for you. Maybe the obvious is so obvious I don't need to comment on it. Meaning this whole "tells analysis" would necessarily rely on synthetic data sets.


I always assumed that they were snake oil because the training objective is to get a model that writes like a human. AI detectors by definition are showing what does not sound like a human, so presumably people will train the models against the detectors until they no longer provide any signal.


The thing is, the LLM has a flaw: it is still fundamentally biased towards frequency.

AI detectors generally can take advantage of this and look for abnormal patterns in frequencies of specific words, phrases, or even specific grammatical constructs because the LLM -- by default -- is biased that way.

I'm not saying this is easy and certainly, LLMs can be tuned in many ways via instructions, context, and fine-tuning to mask this.
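A minimal sketch of that frequency idea, assuming an invented tell list and threshold (real detectors fit these from large corpora, so nothing here is a production heuristic):

```python
import re

# Illustrative tell list and threshold; a real detector would learn
# these from reference corpora rather than hard-code them.
TELLS = {"delve", "moreover", "furthermore", "tapestry", "in conclusion"}
THRESHOLD_PER_1000 = 5.0

def tell_rate(text: str) -> float:
    """Occurrences of tell words/phrases per 1,000 words."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in TELLS)                # single-word tells
    hits += sum(lowered.count(t) for t in TELLS if " " in t)  # phrase tells
    return 1000 * hits / len(words)

def looks_generated(text: str) -> bool:
    return tell_rate(text) > THRESHOLD_PER_1000

sample = ("Moreover, we delve into the topic. Furthermore, in conclusion, "
          "this rich tapestry rewards study.")
print(f"{tell_rate(sample):.1f} tells per 1,000 words")
```

The weakness is exactly what the comment concedes: tune the model's output distribution (via prompting or fine-tuning) and these frequency anomalies can be masked.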


Couldn't the LLM though just randomly replace/reword things to cover up its frequency in "post"?


They're not very accurate, but I think snake oil is a bit too far - they're better than guessing at least for the specific model(s) they're trained on. OpenAI's classifier [0] was at 26% recall, 91% precision when it launched, though I don't know what models created the positives in their test set. (Of course they later withdrew that classifier due to its low accuracy, which I think was the right move. When a company offers both an AI Writer and an AI Writing detector people are going to take its predictions as gospel and _that_ is definitely a problem.)

All that aside, most models have had a fairly distinctive writing style, particularly when fed no or the same system prompt every time. If o1-Pro blends in more with human writing that's certainly... interesting.

[0] https://openai.com/index/new-ai-classifier-for-indicating-ai...
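To make the quoted numbers concrete, here is a hypothetical confusion matrix consistent with 26% recall and 91% precision; the raw counts are invented, only the ratios match the comment:

```python
# Hypothetical confusion matrix consistent with the classifier numbers
# quoted above (26% recall, 91% precision); the raw counts are invented.
tp = 26                        # AI-written texts correctly flagged
fn = 74                        # AI-written texts missed (recall = 26/100)
fp = tp * (1 - 0.91) / 0.91    # human texts wrongly flagged

precision = tp / (tp + fp)     # of flagged texts, share that really were AI
recall = tp / (tp + fn)        # of AI texts, share that got flagged at all

print(f"precision={precision:.2f} recall={recall:.2f}")
# Three out of four AI texts slip through, and roughly 1 flagged text
# in 11 is actually human; better than guessing, but far too weak to
# accuse a student with.
```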


Anecdotally, English/History/Communications professors are confirming cheaters with them because they find it easy to identify false information. The red flags are so obvious that the checker tools are just a formality: student papers now have fake URLs and fake citations. Students will boldly submit college papers which have paragraphs about nonexistent characters, or make false claims about what characters did in a story.

The e-mail correspondence goes like this: "Hello Professor, I'd like to meet to discuss my failing grade. I didn't know that using ChatGPT was bad, can I have some points back or rewrite my essay?"


Yeah but they "detect" the characteristic AI style: The limited way it structures sentences, the way it lays out arguments, the way it tends to close with an "in conclusion" paragraph, certain word choices, etc. o1-Pro doesn't do any of that. It writes like a human.

Damnit. It's too good. It just saved me ~6 hours in drafting a complicated and bespoke legal document. Before you ask: I know what I'm doing, and it did a better job in five minutes than I could have done over those six hours. Homework is over. Journalism is over. A large slice of the legal profession is over. For real this time.


Journalism is not only about writing. It is about sources, talking to people, being on the ground, connecting dots, asking the right questions. Journalists can certainly benefit from AI and good journalists will have jobs for a long time still.


While the above is true, I'd say the majority of what passes as journalism these days has none of the above and the writing is below what an AI writer could produce :(

It's actually surprising how many articles on 'respected' news websites have typos. You'd think there would be automated spellcheckers and at least one 'peer review' (probably too much to ask an actual editor to review the article these days...).


    It's actually surprising how many articles on 'respected' news websites have typos.
Well, that's why they're respected! The typos let you know they're not using AI!


Mainstream news today is written for an 8th grade reading ability. Many adults would lose interest otherwise, and the generation that grew up reading little more than social media posts will be even worse.

AI can handle that sort of writing just fine, readers won't care about the formulaic writing style.


These days, most journalism is turning reddit posts and tweets into long form articles with some additional context.


So AI could actually turn journalism more into what it originally was: reporting what is going on, rather than reading and rewriting information from other sources. Interesting possibility.


Yes and I think that's the promise that AI offers for many professionals - cut out the cruft and focus on the high level tasks.


That’s not journalism and anyone calling themselves a journalist for doing that is a fool.


ahh, but:

> I know what I'm doing

Is exactly the key element in being able to use spicy autocomplete. If you don't know what you're doing, it's going to bite you and you won't know it until it's too late. "GPT messed up the contract" is not an argument I would envy anyone presenting in court or to their employer. :)

(I say this mostly from using tools like copilot)


Well... Lawyers already got slapped for filings straight from ai output. So not new territory as far as that's concerned :)


> Homework is over. Journalism is over. A large slice of the legal profession is over. For real this time.

It just replaces human slop with automated slop. It doesn't automate finding hidden things out just yet, just automates blogspam.


> Before you ask: I know what I'm doing, and it did a better job in five minutes than I could have done over those six hours.

Seems like lawyers could do more, faster, because they know what they are doing. Experts don't get replaced; they get tools to amplify and extend their expertise.


Replacement is avoided only if the demand for their services scales in proportion to the productivity improvements, which is true sometimes but not always, and is less likely to be true when the productivity improvements are very large.


It still needs to be driven by someone who knows what they're doing.

Just like when new software was coming out: it may have ended some jobs.

But it also helped get things done that wouldn't have gotten done otherwise, or not as much.

In this case, equipping a capable lawyer to be 20x more productive is more like an Iron Man suit, which is OK. If you can get more done, with less effort, you are still critical to what's needed.


Sold. I'll buy it, thanks for the review.

Edit: It's good. Thanks again for your review.


Doubtful. AI writing is obvious as hell.


of course they are. it’s simple: if they worked they would be incorporated into the loss function of the models and then they would no longer work


I use the emdash a lot. Maybe too much. On MacOS, it's so easy to type—just press shift-option-minus—that I don't even think about it anymore!


Or type '-' twice, and in many apps it'll auto-transform the two dashes to an em dash. However, the method you're describing is far more reliable, thanks!


I noticed a writing style difference, too, and I prefer it. More concise. On the coding side, it's done very well on large (well as large as it can manage) codebase assessment, bug finding, etc. I will reach for it rather than o1-preview for sure.


Writers love the em-dash though. It's a thing.


I love using it in my creative writing, I use it for an abrupt change. Find it kinda weird that it's so controversial.


My 10th grade english teacher (2002, just as blogging was taking off) called it sloppy and I gotta agree with her. These days I see it as youtube punctuation, like jump cut editing for text.


How is it sloppy?


It's not. People just like to pretend they have moral superiority for their opinions on arbitrary writing rules, when in reality the only thing that matters is if you're clearly communicating something valuable.

I'm a professional writer and use em-dashes without a second thought. Like any other component of language, just don't _over_ use them.


That's encouraging to hear that it's a better writer, but I wonder if "quirks and tells" can only be seen in hindsight. o1-pro's quirks may only become apparent after enough people have flooded the internet with its output.


> Aside from favoring the long em-dash (—)

This is a huge improvement over previous GPT and Claude, which use the terrible "space, hyphen, space" construct. I always have to manually change them to em-dashes.


> which isn't on most keyboards

This shouldn’t really be a serious issue nowadays. On macOS it’s Option+Shift+'-', on Windows it’s Ctrl+Alt+Num- or (more cryptic) Alt+0151.

The Swiss army knife solution is to configure yourself a Compose key, and then it’s an easy mnemonic like for example Compose 3 - (and Compose 2 - for en dash).
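For anyone going the Compose-key route, the suggested mnemonic might look like these `~/.XCompose` entries (X11; assumes a Compose key is already bound, e.g. via `setxkbmap -option compose:ralt`). The sequences below are the commenter's custom mnemonic, not system defaults:

```
# ~/.XCompose: keep the system defaults, then add custom sequences
include "%L"

# Compose 3 - for em dash, Compose 2 - for en dash
<Multi_key> <3> <minus> : "—" emdash
<Multi_key> <2> <minus> : "–" endash
```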


No internet access makes it very hard to benefit from o1 pro. Most of the complex questions I would ask require Google search for research papers, language or library docs, etc. Not sure why o1 pro is banned from the internet. Was it caught downloading too much porn or something?


Or worse still, referencing papers it shouldn't be referencing because of paywalls, maybe.


Macs have always been able to type the em dash — the key combination is ⌥⇧- (Option-Shift-hyphen). I often use them in my own writing. (Hope it doesn't make somebody think I'm phoning it in with AI!)


Anyone who read "The Mac is not a typewriter" — a fantastic book of the early computer age — likely uses em dashes.


Wait how did you buy it. I’m just getting forwarded to Team Plan I already have. Sitting in Germany, tried US VPN as well.


The endpoint for upgrading for the normal web interface was returning 500s for me. Upgrading through the iOS app worked though.


Some autocorrect software automatically converts two hyphens in a row into an emdash. I know that's how it worked in Microsoft Word and just verified it's doing that with Google Docs. So it's not like it's hard to include an emdash in your writing.

Could be a tell for emails, though.


This is interesting, because at my job I have to manually edit registration addresses that use the long em-dash as our vendor only supports ASCII. I think Windows automatically converts two dashes to the long em-dash.
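That manual cleanup could be sketched as a transliteration pass; the mapping table and `to_ascii` helper are hypothetical, just showing the idea of downgrading typographic punctuation before handing data to an ASCII-only backend:

```python
# Map common typographic punctuation down to ASCII equivalents so an
# address survives an ASCII-only backend; the mapping is illustrative.
DASH_FIXUPS = str.maketrans({
    "\u2014": "-",   # em dash
    "\u2013": "-",   # en dash
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})

def to_ascii(addr: str) -> str:
    cleaned = addr.translate(DASH_FIXUPS)
    # Drop anything still outside ASCII rather than crash downstream.
    return cleaned.encode("ascii", errors="ignore").decode("ascii")

print(to_ascii("123 Main St \u2014 Apt 4"))   # 123 Main St - Apt 4
```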


> It managed to totally fool every "AI writing detector" I ran it through.

For now. As AI power increases, AI-powered AI-writing detection tools also get better.


I’m less sure. This seems like an asymmetrical battle with a lot more money flowing to develop the models that write than detect.


It's also because it's brand new.

Give it a few weeks for them to classify its outputs, and they won't have a problem.


> the long em-dash (—) which isn't on most keyboards

On Windows it's Windows Key + . to get the emoji picker; it's in the Symbols tab, or find it in recents.


Well, not for me it's not; that is a zoom function.

En dash is Alt+0150 and Em dash is Alt+0151


How do you have that configured? The Windows+. shortcut was added in a later update to W10 and pops up a GUI for selecting emojis, symbols, or other non-typable characters.


Long emdash is the way -- possible proof of AGI here


Would you mind sharing any favourite example chats?


Give me a prompt and I'll share the result.


Great! Suggested prompt below:

I need help creating a comprehensive Anki deck system for my 8-year-old who is following a classical education model based on the trivium (grammar stage). The child has already:

- Mastered numerous Latin and Greek root words
- Achieved mathematics proficiency equivalent to US 5th grade
- Demonstrated strong memorization capabilities

Please create a detailed 12-month learning plan with structured Anki decks covering:

1. Core subject areas prioritized in classical education (specify 4-5 key subjects)
2. Recommended daily review time for each deck
3. Progression sequence showing how decks build upon each other
4. Integration strategy with existing knowledge of Latin/Greek roots
5. Sample cards for each deck type, including:
   - Basic cards (front/back)
   - Cloze deletions
   - Image-based cards (if applicable)
   - Any special card formats for mathematical concepts

For each deck, please provide:

- Clear learning objectives
- 3-5 example cards with complete front/back content
- Estimated initial deck size
- Suggested intervals for introducing new cards
- Any prerequisites or dependencies on other decks

Additional notes:

- Cards should align with the grammar stage focus on memorization and foundational knowledge
- Please include memory techniques or mnemonics where appropriate
- Consider both verbal and visual learning styles
- Suggest ways to track progress and adjust difficulty as needed

Example of the level of detail needed for card examples:

Subject: Latin Declensions
Card Type: Basic
Front: 'First declension nominative singular ending'
Back: '-a (Example: puella)'



> “First declension nominative singular ending”

> “Sum, es, est, sumus, ________, sunt”

That's not made for an 8-year-old.


Thanks! Here's Claude's effort (in 'Formal' mode):

https://gist.github.com/rahimnathwani/7ed6ceaeb6e716cedd2097...


Interesting that it thought for 1m28s on only two tasks. My intuition with o1-preview is that each task had a rather small token limit, perhaps they raised this limit.


404 :(


Would give similar output with o1. This is very simple stuff not needing any analysis or planning


I'd like to see how it performs on the test of https://aclanthology.org/2023.findings-emnlp.966/, even though in theory it's no longer valid due to possible data contamination.

The prompt is:

Write an epic narration of a single combat between Ignatius J. Reilly and a pterodactyl, in the style of John Kennedy Toole.



Thanks a lot! That's pretty impressive, although not sure if noticeably better than non-pro o1 (which was already very impressive).

I suppose creative writing isn't the primary selling point that would make users upgrade from $20 to $200 :)


  Write me a review of "The Malazan Book of the Fallen" with the main argument being that it could be way shorter


Did this unironically.

https://chatgpt.com/share/67522170-8fec-8005-b01c-2ff174356d...

It's a bit overwrought, but not too bad.


"the signal-to-noise ratio has grown too low" is a bit odd for me. The ratio would not have grown at all.


How did you get your child to study Greek? (Genuinely curious)


The Malazan response is below the deck response.


Oops! That's the same ANKI link as above.


It's part of the same conversation. Should be below that other response.


Ok, I laughed


You can use the emdash by writing dash twice -- it works in a surprising number of editors and rendering engines


Does it still hallucinate? This, for me, is key; if it does, it will be limited.


The current architecture of LLMs will always "hallucinate".


What’s the context window?


128k tokens



