Mistral Agents (mistral.ai)
161 points by eitanturok 10 months ago | 42 comments



Wait, this 'Agents' thing seems to be just a way to couple a system prompt and temperature to a model. That's it?

What's the difference from sending the system prompt in the API call, as usual?

Edit: Oh, missed that: "We’re working on connecting Agents to tools and data sources."


There's this massive gap between those who can call an API and those who can't. If you can't, then you get the same aspirational-AGI chat UI as everyone else.

I agree with the implied statement that 'Agents' doesn't feel right. Reminds me more of the projects that put the model in a loop.

It does feel like a really tough thing to name and market. I'm about to release an app for this across all providers; I call it "Scripts" with "Steps" like chat, search, retrieval, art...


I implemented a number of enterprise Conversational AI tools for customer service back before the GenAI craze started, and we used to just call it service orchestration and data/application integration. The chatbot was used to figure out what the customer wanted to do, and from there it was just about automating some business workflow. Customer wants to pay their bill: the bot needs to pull their current balance, get their payment information, and process the payment. Customer wants to return a product: the bot needs to retrieve the order info, initiate an RMA, process a refund, etc. These were all well-established business processes that the bot would execute by making API calls or kicking off an RPA routine.

The "agent" talk sounds to me like "let the LLM figure out what it needs to do and then do it," which I'm not even sure is the right approach for most enterprise use cases; it's how you get people tricking chatbots into selling them a new car for $1.
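
Roughly, the pattern was something like this sketch - classify_intent() stands in for a real NLU model, and all the names here are hypothetical, not anything we actually shipped:

    # Sketch of the pre-GenAI pattern: the bot only classifies intent;
    # a fixed, deterministic workflow does the actual work.
    # classify_intent() stands in for a real NLU model; names are hypothetical.

    def classify_intent(utterance: str) -> str:
        # A real system would call an intent classifier here.
        text = utterance.lower()
        if "pay" in text:
            return "pay_bill"
        if "return" in text:
            return "return_product"
        return "unknown"

    def pay_bill(customer_id: str) -> str:
        # Deterministic workflow: pull balance, get payment info, process payment.
        return f"Payment processed for {customer_id}"

    def return_product(customer_id: str) -> str:
        # Deterministic workflow: retrieve order, initiate RMA, process refund.
        return f"RMA created for {customer_id}"

    WORKFLOWS = {"pay_bill": pay_bill, "return_product": return_product}

    def handle(utterance: str, customer_id: str) -> str:
        workflow = WORKFLOWS.get(classify_intent(utterance))
        return workflow(customer_id) if workflow else "Sorry, I didn't get that."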


Why is tool picking such a hard functionality for these vendors to implement?

Seems like a lot of the heavy lifting will come from third parties making their APIs compatible with LLMs.

There should be some sort of extension-type app where people can build extensions or "tools" for LLMs and share them (I guess OpenAI sort of attempts to do this). Say I want to build one for Toast to order food. I could collect the info needed to run that tool (Toast account info or whatever) and an API key for an appropriate LLM, then use that configuration to build middleware that turns natural language into an order and sends the request to Toast via some function call.
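
To make that concrete, here's a rough sketch of what such a tool definition might look like in the JSON-schema format OpenAI-style chat APIs accept for function calling - the Toast field names and middleware behavior are entirely made up:

    # Hypothetical "order food via Toast" tool in OpenAI-style
    # function-calling format. Field names are invented for illustration.
    order_food_tool = {
        "type": "function",
        "function": {
            "name": "place_toast_order",
            "description": "Place a food order with a restaurant via Toast.",
            "parameters": {
                "type": "object",
                "properties": {
                    "restaurant_id": {"type": "string"},
                    "items": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Menu item names to order.",
                    },
                },
                "required": ["restaurant_id", "items"],
            },
        },
    }

    def place_toast_order(restaurant_id: str, items: list) -> dict:
        # Middleware: translate the model's structured call into a real
        # Toast API request (an authenticated HTTP POST would go here).
        return {"status": "submitted", "restaurant": restaurant_id, "items": items}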

This seems very doable, and I don't understand why there aren't a million of these "tools" already built into some LLM-centric tool aggregator/web store. What is the holdup? Is it just third parties not wanting to hand out API access for things that require payment to applications controlled by LLMs? Would these third parties rather run their own assistant tools? I'd imagine that a central LLM-extension aggregator could have a central mechanism for payment methods that the LLM has access to, which could be used to implement safeguards.

Or is it simply that any assistant-type task that could be easily generalized, like ordering food, booking a flight, or inputting calendar events, is easier to do yourself than to ask an LLM to do for you?


A lot of models are hit-and-miss when it comes to invoking tools. I have Llama 3 8B set up with a weather tool, but half the time it just hallucinates made-up info instead of running the tool.

I imagine the big sites have similar issues and it undermines customer trust when they're given false information.


Genuine question, are there any examples of agents in production?


Depends on your definition. I created a mapping application that lets one navigate and style the map with natural language (more or less), as well as some prototype database interaction. When the user inputs a prompt, it gets sent to an "agent" whose sole purpose is to send a request to the API with a custom system prompt containing few-shot examples, stating something along the lines of "determine which agent should handle this request...only respond with one of [NavigationAgent, StyleAgent, ...]". When the response comes back, the prompt is then sent to the proper agent to handle the request. Each agent has function definitions for returning the parameters used to manipulate the map.

I don't use any special libraries like LangChain or anything; it's just regular API calls organized into classes, each with a specific system prompt, function definitions, and some user-prompt context ingestion when required (e.g. the current extent of the map).
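
In case it helps anyone, the routing step looks roughly like this - chat() stands in for a plain chat-completions API call, and the prompts and class bodies are simplified, not my actual code:

    # Simplified sketch of the two-step routing pattern described above.
    # chat() stands in for a plain chat-completions API call.

    ROUTER_SYSTEM = ("Determine which agent should handle this request. "
                     "Only respond with one of [NavigationAgent, StyleAgent].")

    def chat(system: str, user: str) -> str:
        # Replace with a real API call; stubbed so the sketch runs.
        return "NavigationAgent"

    class NavigationAgent:
        SYSTEM = "You navigate the map by calling the provided functions."
        def handle(self, prompt: str, extent: dict) -> str:
            # Ingest the current map extent as extra user-prompt context.
            return chat(self.SYSTEM, f"Current extent: {extent}\n{prompt}")

    class StyleAgent:
        SYSTEM = "You restyle the map by calling the provided functions."
        def handle(self, prompt: str, extent: dict) -> str:
            return chat(self.SYSTEM, prompt)

    AGENTS = {"NavigationAgent": NavigationAgent(), "StyleAgent": StyleAgent()}

    def route(prompt: str, extent: dict) -> str:
        choice = chat(ROUTER_SYSTEM, prompt).strip()
        return AGENTS.get(choice, AGENTS["NavigationAgent"]).handle(prompt, extent)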


Why can’t this be done with functions? I don’t see why you need the complexity and unpredictability of using “agents” to do that.

I might be missing something?


I call them agents just because. I'm just doing function calling with custom system prompts for the most part. It's hardly complex; I wrote most of it in a few days.

EDIT: I should add that the first step is used to cut down on the number of function definitions I need to send to the model on each user prompt. Navigating a map can be done with as few as four function definitions but styling a map gets out of control fast (google "Mapbox Style Specification" if you want to see why).


Not quite sure if this is what you're looking for, but Amazon has hidden their description/review search box behind an "Ask Rufus about this product" box. For example, go here and Ctrl+F for "Looking for specific info?":

https://www.amazon.com/PolyScience-Temperature-Controlled-Co...


Doctors are already using agents for scheduling. These agents can access the calendar and talk to patients to arrange scheduling, changes, conflicts, etc.


This use-case actually makes a ton of sense. How many other low-hanging fruit applications for agents are out there like that?

Although, now that I think about it, a lot of doctors' practices have a MyChart-style portal where you can schedule an appointment yourself. Why does an LLM need to be involved in that process? I guess for people who still want to schedule over the phone, the LLM agent makes sense. Kind of, assuming you don't have any special-case problems. Which patients most likely do, if they're calling in. Is an LLM actually a good solution here?


The LLM works great for this use case for doctors in Brazil, where a lot of the scheduling happens on WhatsApp. Hooking up a WhatsApp bot to ChatGPT and Google Calendar is incredibly powerful.


Yes, here's a directory of many of them https://staf.ai


Depends what you mean by "agents".


The article is about LLM agents, so I’m asking in context.

To clarify more: I see frameworks like CrewAI and similar, with tools even from Microsoft to define these “agents” quickly. But when I tried them, I noticed they are no more than chain-of-thought (CoT) functions to ask/extract/generate based on user input and function output.

As such, they can be quite unpredictable, hence my question of examples of LLM agents being used in production. I just don’t see their value, but I might be missing something so wanted to see examples to understand more.


That’s the problem: the term “LLM agents” doesn’t have a clear, unambiguous meaning either.


The SOTA models have excellent instruction-following capability and can output in any format you want, including JSON.

That's all you need from the model to be able to use it in an agent. Tell it to output commands in a given JSON format.
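
Roughly like this - the command vocabulary here is invented, but the parse-and-validate step is what matters:

    import json

    # Constrain the model to a fixed JSON command shape, then validate
    # before acting. The command set here is invented for illustration.
    SYSTEM = ('You are an agent. Respond ONLY with JSON of the form '
              '{"command": "<search|open_url|finish>", "argument": "<string>"}')

    ALLOWED = {"search", "open_url", "finish"}

    def parse_command(model_output: str) -> dict:
        cmd = json.loads(model_output)  # raises ValueError on malformed JSON
        if cmd.get("command") not in ALLOWED:
            raise ValueError(f"unknown command: {cmd}")
        return cmd

    print(parse_command('{"command": "search", "argument": "weather in Paris"}'))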

I assume that Mistral's API already allowed you to define the system prompt, right?


This is basically Mistral's attempt at custom GPTs?


I've been complaining like a broken record for months about how vague and loosely defined the term "agents" is. This is not going to help.


It is quite simple to explain: it is a while(1) and some if.
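
Taken literally (llm() and the tool table here are stand-ins, not any vendor's API):

    # An "agent", reduced to its skeleton: a while(1) and some if.
    # llm() is a stand-in for a real chat-completions call.

    def llm(history: list) -> dict:
        return {"type": "final", "content": "done"}  # stub so this runs

    TOOLS = {"search": lambda q: f"results for {q!r}"}

    def agent(task: str) -> str:
        history = [task]
        while True:                       # the while(1)
            step = llm(history)
            if step["type"] == "final":   # ...and some if
                return step["content"]
            if step["type"] == "tool":
                history.append(TOOLS[step["name"]](step["args"]))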


The problem is that if you ask two people you're likely to get two different answers. And those people probably incorrectly think that their version of a definition for "agents" is the same as everybody else's.


Yeah, but there is still no while(1) here.


One sad part of the GenAI wave happening right now is that we're past the golden age of open APIs.

It's hard to read data with widespread anti-abuse checks (CAPTCHAs), lack of open-format data (RSS support being spotty), and restricted APIs (ex: Twitter API). Companies have all the incentives to prevent bot use, and select for human eyeballs.

If we had a Yahoo Pipes sort of golden age, GenAI agents would have a much vaster playground and would be more useful to us.

Consider building an agent for choosing what to do on weekends for a group of friends. The agent would need to keep state for past activities (X, Y, and Z went upstate to Storm King last week) and users' preferences (ex: liking dosas or Calder, dietary restrictions). This part is easy enough - you could just keep a notebook that's passed as context. Older context simply gets deleted or condensed into high-level points.
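
A sketch of that notebook idea, with summarize() standing in for another LLM call and the facts purely illustrative:

    # Keep recent notebook entries verbatim; condense older ones.
    # summarize() stands in for an LLM call; the data is illustrative.

    def summarize(entries: list) -> str:
        # A real system would ask an LLM to condense these into bullet points.
        return f"{len(entries)} older facts (condensed)"

    class Notebook:
        def __init__(self, keep_verbatim: int = 3):
            self.entries = []
            self.keep_verbatim = keep_verbatim

        def add(self, fact: str):
            self.entries.append(fact)

        def as_context(self) -> str:
            old = self.entries[:-self.keep_verbatim]
            recent = self.entries[-self.keep_verbatim:]
            parts = ([summarize(old)] if old else []) + recent
            return "\n".join(parts)

    nb = Notebook()
    for fact in ["X, Y, Z went to Storm King", "A likes dosas",
                 "B is vegetarian", "C likes Calder", "last week: museum"]:
        nb.add(fact)
    print(nb.as_context())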

But would it be easy for the agent to:

1) Look up nearby restaurants and events? (Perhaps Resy/OpenTable allow listing restaurants, but it's likely they have tons of anti-abuse tech. Is there even a place where you could see a list of public events? Google pays a third party for this feed.)

2) Actuate on behalf of the user? (Do Resy and OpenTable allow authority delegation so the agent could book restaurants for users? There's no standard way to do this across venue types - concerts, museums, cooking classes. Is it realistic for agents to click through these sites on their own?)


Seems to be a monetisation problem: everybody wants their cut. So any super-agent needs to figure out how to pay them.

We could imagine a data/API marketplace where such an agent could pay for the data and subscriptions.


You are correct: if a workflow/agent company starts bundling data and API access, they'll multiplicatively increase the capability of their agents.

LLMs themselves are becoming a commodity, plus or minus prompt-following/format-following growing pains. In a year or two, we'll have pretty decent general LLMs that can make use of databases and tools/APIs.

It's a race to see who can integrate all these things in a good way. It really is an execution problem, not an idea problem - it's so obvious.


This could/should be a direction for companies like Zapier...

Edit: or Stripe.


Since we've apparently moved from calling everything a Copilot to calling everything an Agent, this seems much closer to OAI's GPT Store than anything that is truly agentic.


Finally, finally we have a true and worthy successor to “AI” as buzzword. It’s “Agents” ladies and gentlemen.

Make sure to put it into your pitch as often as possible.


Computers, desktops, and now we have these things called "Virtual Machines".

Meaningless buzzword central.

<<eats gallery peanuts>>


Agent Intelligence or "AI"


Doesn't "AI" already stand for Apple Intelligence, so that might be a bit confusing...


Anything Intelligence


Because Copilot implies a human working in tandem with AI. An agent is an autonomous process. The goal of agents is to remove any human agency from the process. Need software done? It won't be a human's job. It will be the agent's in an iterative loop.


I understand why. Most products claiming to be agents today are simply prompts.


But worse! There are no tools or RAG/data yet...


No memory, no reasoning, no planning.


We just have to wait for LLaMA to do it; then suddenly they will have it.

Mistral is like FitGirl Repacks for LLaMA.


Don't forget about agentic workflows.


>Agents help you create custom behaviour and workflows with a simple set of instructions and examples.

So, it's just custom instructions baked in? I hope at least it's harder for them to get overwritten by the user?

>We’re working on connecting Agents to tools and data sources...

So tools and RAG for data sources aren't available yet.

Way behind GPTs/Assistants. What's the point of this right now?


To me, this looks like a direct competitor to AI21 Labs: https://www.ai21.com/


OpenAI is supposedly working on agents as well: https://news.ycombinator.com/item?id=41125900 (subthread)



