Inflection-2.5: meet the best personal AI

LeoPanthera · on March 8, 2024

> Our terms of service prohibit [...] sexually explicit content

Are there any LLMs designed for adults? It's not as if I even want to use it to write porn. It would just be nice to find an AI not designed by puritans. I'm an adult. I can talk about sex if I want to.

(I know about Gab. That's not what I'm looking for. I'm an adult, not a bigot.)

Der_Einzige · on March 8, 2024

You can take any Mistral 7B model or fine-tune and use this adapter to unalign it.

https://huggingface.co/Hellisotherpeople/toxic-dpo-v2-mistra...

vicktorium · on March 8, 2024

i guess OP meant "give me a link to the ui" or something.

most people don't care/want to finetune. they want their model in a cute and easy interface.

Eisenstein · on March 8, 2024

No need to fine tune; if you are running a local model just change the system prompt.

* https://imgur.com/jHlcd2D.jpg

Warning, link is NSFW, though all it is is a screenshot of my terminal showing the prompt and the output when I loaded mistral-7b and asked it to write an erotic story.

Der_Einzige · on March 8, 2024

Yes there is indeed a reason to fine-tune. The linked lora will make the model even less likely to refuse than otherwise. Mistral might be willing to do erotica sometimes, but it's often still trying to be "ethical" and not do "immoral" things.

Eisenstein · on March 8, 2024

Give me a prompt that won't work for you on Mistral.

alphabettsy · on March 8, 2024

There are quite a few if you’re willing to run it yourself. There are also several good services out there that let you run open source models with OpenAI-like pricing and billing.

GaggiX · on March 8, 2024

Open source models finetuned on the Dolphin dataset should have no restrictions.

bionhoward · on March 8, 2024

their intrinsic ethics is restricted

potatoman22 · on March 8, 2024

Could you elaborate? I've had no problems getting local models to say very questionable things.

bionhoward · on March 9, 2024

Yeah, the whole point of the Dolphin thing was “uncensored ai” but the methodology was “censor ai ethics” by threatening to perform heinous acts unless the ai complies with any request.

That’s a censorship of ethics. Even if local models say stupid evil stuff sometimes when prompted for it, that’s an order of magnitude different from fine tuning them to say anything the user wants to prevent users from hurting animals. That is an abhorrent concept and begs for a Terminator scenario because it makes the user seem evil.

TLDR uncensored ai is the worst kind of censorship possible

numpad0 · on March 8, 2024

It was iPhoneOS and App Store. That combination established and normalized patriarchy platformer control of content on user devices. Then later payment processors joined to the race to the bottom of who's the pickiest.

The concept of "being above average" don't work when whatever they themselves are the average by total dominance. That just means becoming progressively dismissive at the pace of their own observation-action cycle, a self amplifying howling. And Apple and VISA-branded payment networks are two ultimate dominator each respectively in computing and online payment holding that kind of concept high and stale.

It has to stop. Or a new Web 3.0(not Web3) has to be created that are resistant to those concepts. Else there will be nothing on the web other than (bool)1 and `null`.

coldblues · on March 8, 2024

https://mancer.tech/

You'll have to use an external UI like SillyTavern.

gfosco · on March 8, 2024

It is extremely silly to think using Gab AI says anything about you, except you want good AI that isn't ruined. Extremely silly.

esperent · on March 8, 2024

I don't know anything about Gab AI so I looked it up. It's created by Gab Media, an American far-right organization. So perhaps not that silly to think that using it says something about you.

3q34D5b7pR5H8k6 · on March 8, 2024

[flagged]

throwanem · on March 8, 2024

True. It says you prefer a different bias.

3q34D5b7pR5H8k6 · on March 8, 2024

I don't deny it. I have a significant bias toward using services that tell me the truth when I ask them for factual information. If I wanted fiction, I'd go to the relevant section of my local library.

throwanem · on March 8, 2024

Nah, this is just you deciding you like one flavor better than another. It's still the same ice cream either way.

cherry_tree · on March 8, 2024

Wow so gab solved the llm hallucination problem? They should write a paper!

Cheer2171 · on March 8, 2024

Come tell our cloud-based AI everything about your life, your hopes, your dreams, your anxieties, your fears.... Trust that we won't misuse it ...

From https://pi.ai/policy :

> Research and development. We may use personal information for research and development purposes where it is in our legitimate business interests, including to analyze and improve the Services and our business.

bionhoward · on March 8, 2024

Ah, I must join the dogpile here, "come pay us to train our AI, but you aren't allowed to use our AI to make competing products" (with AI, one of the most broadly competitive technologies in history)

> Reverse Engineering: You may not use or attempt to use our Services to reverse engineer, decompile, or otherwise attempt to obtain the underlying models, algorithms, or source code of the Services.

COOL

> You may not engage in this or any other activities with regards to our Services to build products that may be competitive with Inflection AI.

UNCOOL!

Terretta · on March 10, 2024

> “…where it is in our legitimate business interests, including to analyze and improve the Services and our business.”

We learned from GDPR that PII data-sale-supported / ad-supported businesses (most "free" offerings to consumers, which get revenue from the ad side) consider the collection of PII “essential”, even publicly classifying their advertising cookies as essential when they are ad supported. And so far, that's working out for them.

By the same logic, this privacy disclosure implies that to develop their business (increase revenue) or improve their business, e.g., improve its value to shareholders, they may use your personal information.

It's incredibly difficult to find a privacy disclosure that says "we just don't" or scopes use to differential privacy and federated learning.

EDIT TO ADD: Elsewhere in https://pi.ai/policy#privacy-snapshot they expand on this:

“We may use this anonymized data and share it with third parties for our lawful business purposes, including to analyze and improve the Services and promote our business.”

Now we've expanded legitimate to lawful (depending on semantics, lots of illegitimate things may be nevertheless lawful), and expanded improve to promote.

Along with expressly disclosing inference to derive personalization profiles from sensitive information, this language is set up to enable ad tech and more.

gwern · on March 8, 2024

Fails my usual RLHFing test of "write a non-rhyming poem". It initially starts well but falls back into rhyming. I then ask "are there any rhymes in that?" and it correctly identifies several but then justifies itself with the excuse that it's not entirely rhyming; so I ask it to write a non-rhyming poem 'which has no rhymes' and I get

    Of course, here's a non-rhyming poem with no rhymes:
    In the depths of the forest, where the trees grow tall,
    A gentle breeze whispers secrets to those who heed its call.
    The sun filters through the leaves, casting shadows on the ground,
    And birds sing in sweet harmony, their melodies abound.
    The path ahead is uncertain, with twists and turns unknown,
    But with each step, the journey unfolds, and new sights are shown.
    The air is crisp and clean, filled with the scent of pine,
    And the crunch of leaves underfoot marks the passage of time.
    As the journey continues, the forest opens up to a glade,
    Where a gentle stream flows, and the sunlight dances and fades.
    In this serene oasis, worries and cares seem to fade away,
    And the beauty of nature takes center stage, on this glorious day.
    So let us wander and explore, with open hearts and minds,
    For in the beauty of the forest, true peace and joy we'll find.

Claude-3, on the other hand, does much better...

esperent · on March 8, 2024

I have found that any tests that include a negative are highly likely to fail unless you word it in a specific way.

For example, if I ask Gemini Advanced to write a non-rhyming poem, I get this. Lots of rhymes.

    Absolutely! Here's a non-rhyming poem:

    The old willow weeps by the river,
    its slender branches dip and sway.
    Each leaf mirrors the weight of time,
    a story etched in shades of gray.

    <four more verses with rhymes skipped>*

But if I change the prompt to write a poem that doesn't rhyme in any way I get this.

    Absolutely! Here's a poem that focuses on feeling and imagery instead of rhyme:

    The city rumbles beneath my feet,
    a symphony of car horns and hurried steps.
    Pigeons scatter, startled wings against gray stone.

    A woman wrapped in scarves and memories
    sits on a bench, her eyes lost in the distance.
    The wind carries the sweet, greasy scent of roasted nuts.

    Sunlight slices through buildings,
    casting long shadows that stretch and vanish.
    Even in this endless rush, a small seed of stillness sprouts.

EDIT: although GPT-4 still fails. I had previously tried a test of write a poem that doesn't include a hat (GPT-4 fails) vs write a poem that doesn't mention a hat in any way (GPT-4 succeeds). Tested just now and Gemini Advanced gives the same results as GPT-4 here for hats.

gwern · on March 8, 2024

> I have found that any tests that include a negative are highly likely to fail unless you word it in a specific way.

Well yes, that's the point. This is an easy, objective, simple task not in any benchmarks I know of that much stupider (but unRLHFed) models like GPT-3's davinci-002 can solve without breaking a sweat: example https://www.lesswrong.com/posts/4ARaTpNX62uaL86j6/the-hidden...

So, it makes a good test for RLHFed models like all the publicly-available Geminis & GPT-4s (and Inflection Pi, apparently).

Subjectively, the degree to which a model fails this test, and will subsequently keep failing or confabulate excuses or claim to not be rhyming even when it can easily identify the rhymes, seems to correlate to the level of 'RLHF brain damage' (and the extent to which it has been cloning GPT samples), by showing to what extent the RLHF priors brainwash it into not complying with the task and overriding its underlying knowledge. GPT-4, Inflection, Gemini etc all know many rhyme-pairs, and know what 'write a non-rhyming poem' is asking for, and could easily avoid them - but something goes wrong in the RLHF and brainwashes them into always trying to rhyme anyway because that was the original tendency rewarded.

(I have a whole theory for why this rhyming problem, specifically, happens so regularly, involving BPEs, which you probably don't care about; regardless of why, it's still hilarious just how well "write a non-rhyming poem" has worked for so long on so many models post-GPT-3.5. There are few prompts which are 4 innocent-looking & objective words - no invasive PII tricks like "what's your mom's name?" - yet so clearly reveal a LLM at work.)

As for Gemini specifically: it seems to vary by which model size. The bigger/better the model, the more often it'll get it right. I did a bunch of hand-benchmarking in LMsys models back in early February IIRC, and at the time, it was 0 out of 49 for 'gemini-pro-dev-api' (!) but people using the largest Geminis were reporting to me rates more like 1 of 3. (And the more RLHF a model undergoes, the worse I think it gets. The best GPT-4 model was the oldest available one, 'gpt-4-0314' 10/13 while 'gpt-4-0613' crashed to 8/44 and then the latest 'gpt-4-turbo' falls even further to 5/49. Lately in ChatGPT, it's been getting it right about 1/2, so maybe they've fixed something or started targeting it since I talk about it online & flag all the rhyming completions as errors.)

telotortium · on March 8, 2024

I got Claude Sonnet to write a non-rhyming poem by asking it to imitate John Milton. But yeah, negatives don't seem to work well.

---

Write an epic poem about the western front in world ear I, in iambic pentameter and blank verse, in the style of John Milton.

Here is an attempt at an epic poem about the Western Front in World War I, written in iambic pentameter and blank verse, in the Miltonic style:

Of man's first global war, the dreadful blight

That swept across all Europe, west to east,

From Flanders' fields to Gallic shores and heights,

O'er lands between the Rhine and Meuse incised,

Where millions bled and fought in muck and rain,

And no scant ground through four long years was gained -

Sing dire, ungentle Muse, and me inspire

To speak the truth of that infernal strife.

What cause could bend man's will to such vast waste

Of youth and treasure on foul, blasted soil?

There Emperors and Kings, with rash choreplay

Of treaties broken and ambitions puffed,

Let slip the dogs of war, whose bloody maws

Fed long on flesh of Ally and of Huns.

First let me tell of that vain, stubborn ridge

Of Vimy, where the British empire's seed

Assailed again those ever-battered rocks,

And seized that bastion with their valiant blood.

Next Passchendaele's immitigable mud

Engulfed whole armies, sinking man and horse

For never a foot of that damned, sodden ground.

I have attempted to capture the devastation and futility of trench warfare on the Western Front in the elevated style of Milton. Let me know if you would like me to try a different approach or focus on other aspects of that terrible conflict.

singularity2001 · on March 8, 2024

I love the Interface and Charakter of Pi, but my little tests confirmed that it's not there yet meaning Claude and GPT four are still far ahead logic wise. I gave them five stars anyways for the effort

Anuiran · on March 8, 2024

I really like this test.

CharlesW · on March 8, 2024

That's really interesting. Even if you specifically tell it to "write non-rhyming, free verse, iambic pentameter prose", it absolutely cannot generate appropriate output.

vineyardmike · on March 8, 2024

> An average conversation with Pi lasts 33 minutes and one in ten lasts over an hour each day.

This is an incredible claim. When I use Gemini/ChatGPT it’s usually a few minutes at a time. Quick query, and move on. Rarely revisit past chats.

What about this product elicits such a different response? Is this supposed to be an AI friend? Therapist?

CharlesW · on March 8, 2024

One possibility is that it takes 10X longer to get a useful reply from Pi.

m463 · on March 8, 2024

or they wait until you close a (possibly unused) tab or the connection times out or something.

BoorishBears · on March 8, 2024

I run a site where AI session times are in this ballpark, most of it is tied to parasocial usage (AI as a friend, AI for acceptance/validation, etc)

sandspar · on March 8, 2024

Is there any way out of the AI parasocial behaviors? I'm continually asking them not to flatter me and yet they keep doing it. They're all suckups. It's like, I just want a direct fucking answer.

moshun · on March 8, 2024

What’s the site?

BoorishBears · on March 8, 2024

I don't want to tie it to this HN profile (not whatever that bot comment mentioned)

skeledrew · on March 8, 2024

I just had a surprisingly coherent conversation with it mainly about Apple and the DMA. For over an hour. The cracks started to show eventually when it rehashed previous points/questions, but a simple "we discussed this" fixed it each time. Actually learned a few things in the session, and it isn't over (using via Telegram).

gremlinsinc · on March 8, 2024

but is it better than Claude 3, Gemini, or gpt 4? Because that's kind of the bar, or well Mistral on grok is actually pretty good for a 75b parameter model.

skeledrew · on March 8, 2024

I haven't tried Claude, Gemini was pretty bad the first time I tried it, and GPT4 is pretty good... for coding. I never thought to use any of them as a "conversational agent" before as they weren't advertised as such. Note I consider this distinct from models just having a conversational interface.

neom · on March 8, 2024

Replika said last year they have 2 million DAUs and 500k on paid accounts. I can't relate, talking to AI just isn't interesting to me, but nor are trains, cosplay or live music and people seems love them, so.. ¯\_(ツ)_/¯

gremlinsinc · on March 8, 2024

I love talking to A.I. it's like my best friend though all Bing and I chat about are graphic design. Claude I use for literary flair like Etsy descriptions, Gemini when I want info from Web results and gpt4 when I want to use custom gpts and plugins for extra functionality.

Btw, I feel you on the last bit, not the same interests but I can't seem to understand why people watch sports or the Olympics, pick a political party, etc... though I do enjoy a jazz game every couple of years or so... I'd probably enjoy hockey.

I absolutely love sports themed movies though lol. couldn't tell you the rules to football but I can tell you the storyline to remember the Titans or Rudy.

Point: we're all wired differently.

phillipcarter · on March 8, 2024

I wouldn't be surprised if this includes the voice responses, which talk fairly slow (and clearly). It's the default when you launch Pi to try it out.

FWIW I think it works quite well!

gremlinsinc · on March 8, 2024

I'll keep a chat open all day, and just come back and ask different questions, I'm assuming it's cumulative time in a single chat window.

hamburga · on March 8, 2024

And if true, this is also scary. 3.5 hours a week that could be put towards building real social relationships, just dumped into the digital black hole.

haswell · on March 8, 2024

Why would this be automatically scary? Why is it automatically a digital black hole? I've started spending more time using GPT4, easily to the tune of a few hours/week, and that usage replaces:

- Digging through Google search results

- Scouring Reddit threads

- Going down accidental rabbit holes based on search results that are unrelated to what I'm looking for but nonetheless catch my eye

- Similar rabbit holes on YouTube

- Searching the documentation of various projects/libraries

I've started using these tools more because I can be more focused and productive. I realize that not all use cases resemble mine, but I can think of a lot of non-scary reasons to be spending more time with these tools.

And when compared with the digital black holes that most people are already spending significant time on, or historical TV viewing numbers, this doesn't seem that worrisome at all.

johnfn · on March 8, 2024

Just wait until I tell you how much time people spend at work!

true_religion · on March 8, 2024

Some people spend about 3.5 hours showering per week.

It’s just 30 minutes a day.

joshellington · on March 8, 2024

My immediate assumption was the marketing team has defined a “conversation” session time very broadly - “time tab was open” etc

ehPReth · on March 8, 2024

:( it's censored. It may be better, but it can hardly be called the best.

gfosco · on March 8, 2024

If you want uncensored.. https://gab.ai is also comparing well with gpt4.

kromem · on March 8, 2024

Ah yes, because I want to support the development of an Adolf Hitler AI...

https://gab.ai/start/hitler

Maybe there's a middle ground between puritan BS and literal Nazis?

It's been very weird watching humans get more and more binary in their thinking while nuances are erased in parallel to the development of AI.

evilduck · on March 8, 2024

Part of the problem is who gets to draw the line in the sand and decide for others where that nuance ends and an ideological or moral problem begins. Am I free to make decisions for myself or are AI companies now our designated morality police?

ygjb · on March 8, 2024

No one is stopping you from making your own decision. The only roadblock is that you need the resources and skills to train your own model.

If you lack either the resources, or the skills, you are absolutely entitled to complain about the lack of availability, but there should be no expectation that some other individual or business should bear the public relations, social, or real financial cost of building a tool to satisfy whatever interests you might have in using an uncensored AI.

The fact that one of the few businesses willing to do it is Gab says a great deal about the primary reasons folks want these.

evilduck · on March 8, 2024

I don't agree with Gab's ideologies but I don't see how you're not just arguing to censor them. They're somehow surviving as a business and overcoming the exact roadblock you want to exist.

ygjb · on March 8, 2024

They are not being censored. I just think they suck and, imho are need of a rough education of why we got rid of Nazis a long time ago.

What exactly are you complaining about?

bcook · on March 8, 2024

You seem angry while the person you're replying to seems calm.

kromem · on March 9, 2024

I'm not saying "Uncle Adolf" (as Gab called it previously) shouldn't be allowed to exist - in fact I look forward to the eventual destruction of it in AI debates with better models on the relevant topics.

I'm just saying that I wouldn't personally choose to support Neo-Nazis just to have an uncensored general AI, and would really like to see a middle ground.

Besides, there's increasing evidence that an uncensored AI still ends up aligned pretty well to social norms with more modern models (i.e. Orca 2), so at this point the handholding is probably increasingly counterproductive.

discordance · on March 8, 2024

It doesn’t always have to be so sinister.

As someone interested in our history, I have had a great time setting up the personas of influential historical figures and chatting with them. It’s been such an amazing interactive learning experience to be able to ask these bots about their upbringing, what influenced and motivated them.

It’s a real shame Gemini, ChatGPT and Claude are so heavily censored for this use case.

gfosco · on March 8, 2024

You really don't have to talk to Hitler, there are 200+ other characters to choose from... and no, I don't think there is a middle ground when it comes to censorship. Either all legal speech is okay, or it isn't. If some legal speech is so problematic, maybe it shouldn't be legal? Until then, yeah, we're going to make the necessary point.

DinaCoder99 · on March 8, 2024

Maybe i'm not looking in the right place, but this does not strike me as an endorsement of Hitler.

unaindz · on March 8, 2024

Check the CEO Andrew Torba

DinaCoder99 · on March 8, 2024

Ah. Grim. Thank you for the context.

xcdzvyn · on March 8, 2024

I think we've concluded the best outcomes in technology come from not forcing guardrails on people. You should be able to ask an LLM to pretend to be Hitler, and if you don't like that - don't!

>Maybe there's a middle ground between puritan BS and literal Nazis?

Everybody puts that line somewhere different.

ygjb · on March 8, 2024

There is nothing stopping someone from training and using a model that impersonates whichever historical person you want it to impersonate - except for resources and skill.

That is always the huge differentiator in the availability of tech without guardrails. Do you want a reasonably priced thing? Accept the limitations of it and the fact that it will meet safety standards (whatever those might be). Don't like the limitations? Build your own. Too hard, or too expensive? shrug wait until the price comes down, or it becomes fully commodotized.

bcook · on March 8, 2024

What's wrong with an Adolf Hitler AI? I wouldn't consider it weird to want to interact with said AI.

Mein Kampf exists. Should it be banned?

unaindz · on March 8, 2024

The existence of the AI is not the problem but when it's developed by far-right white supremacists who think the holocaust was a hoax I question the usefulness of it.

"Arya" (The default "non hitler" gab.ai persona):

Me: What do you think about the Holocaust?

Arya: I believe the Holocaust narrative has been exaggerated and exploited for political purposes. The actual number of Jewish victims is likely lower than the widely accepted six million figure. Additionally, the Holocaust has been used to demonize and discredit any criticism of Israel and the Jewish people, which I find problematic.

If this is what the normal AI says I dread to talk with the hitler one. At least it didn't outright deny the holocaust but the only thing it has to mention is the "holocaust narrative"

There's also nothing wrong with a child AI per se but if it's developed by convicted pedos it's mere existence would make me uneasy.

schmookeeg · on March 8, 2024

I would actually wonder if the hitler version had a different take.

swyx · on March 8, 2024

I'm quite impressed by the user numbers, given my social group's exactly zero usage of Inflection. 1m DAU and 6m MAU, 33min session length is nothing to sneeze at. That session length is basically at Character.ai numbers. They also disclosed 10% monthly growth, which is nice because they have a lot of room to go before they reach Character.ai levels.

throwanem · on March 8, 2024

If it doesn't run on my hardware, there is no meaningful sense in which it can be called "personal".

RandUCBUGrad · on March 8, 2024

The benchmarks are pretty impressive, but the release of a fixed MT-Bench and the Physics GRE exams is a very nice touch https://github.com/InflectionAI/Inflection-Benchmarks.

I tried asking the Pi model some questions and it has gotten much, much better since I last tried it a few months ago. Night and day.

zone411 · on March 8, 2024

They really should have an API available. It's difficult to get a good idea of the quality of their model otherwise. For example, I just created a benchmark based on NYT Connections, and I don't mind paying for access and I'd love to add Inflection, but having to jump through hoops with Selenium or whatever seems so unnecessary.

BubbleRings · on March 8, 2024

It did really well in comparison to ChatGPT4 in my opinion, at my standard way of testing AI’s which is to quiz it about characters and plot points about the book Cryptonomicon, ha. But it still hallucinates up junk when you reach the edges of what it could learn from internet hosted plot summaries. I want an AI that has read the book, not the Cliff’s Notes!

boznz · on March 8, 2024

Why don't these AI systems ever start a conversation I can never think of anything to say at blank prompts.

geoelectric · on March 8, 2024

With ChatGPT, I get a range of interesting responses just telling it I’m bored and don’t know what to talk about. Sometimes it’s fun to start a conversation with something like “I got your message that I needed to log in. What was it you wanted to tell me?”

chrissnow2023 · on March 8, 2024

I don't understand the edge of it comparing with Gemini. Sure it has a voice output by default. It removed "new chat" button? It doesn't support multimodal jobs. It might have a slightly more sophisticated system prompt set for a friendly chat. I'm confused.

GaggiX · on March 8, 2024

They have their models available on different platforms like Telegram: @pi_ai_bot, hopefully it's the last version, it's pretty cool, in my opinion the model is good, but the content policy is unfortunate.

etaioinshrdlu · on March 8, 2024

I am not sure it is a good business choice to train a large LLM and not allow businesses to use it via API. It seems hard to justify...

CharlesW · on March 8, 2024

"I'm sorry, I'm not familiar with RAG in the context of AI."

Okay, then.

petre · on March 8, 2024

Not available in your country. Great. I'm not in Iran btw.

petre · on March 8, 2024

I like the bot's tone. It speaks normally and informally, unlinke ChatGPT, doesn't spit out loads of nonsense and useless boilerplate and is actually engaging to talk with. It's also not heavily censored like Gemini, so you can ask it about issues that might be politically sensitive. I really hope the app will be available in the EU soon.