The way you worded that is good and got me thinking.
What if, instead of just lots of text fed to an LLM, we had a data structure with trusted and untrusted data?
Any response on a call to a web search or MCP is considered untrusted by default (tunable if you also wrote the MCP and trust it).
Then you limit the operations on untrusted data to pure transformations, no side effects.
E.g. run an LLM to summarize, remove whitespace, convert to float, etc. All of these run in a sandbox without network access.
For example:
"Get me all public github issues on this repo, summarise and store in this DB."
Although the command reads untrusted public information and has DB access, it will only process the untrusted information in a tight sandbox, so this can be done securely. I think!
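A minimal sketch of that split in Python (the names and structure are mine, just to make the idea concrete):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Untrusted:
    """Wraps data that came from a web search, MCP call, etc."""
    value: str

def pure_transform(data: Untrusted, fn: Callable[[str], str]) -> Untrusted:
    # In a real system, fn would run in a sandbox with no network access.
    # The taint is preserved: the result is still Untrusted.
    return Untrusted(fn(data.value))

def post_to_url(url: str, payload: object) -> None:
    # Side-effecting operations refuse untrusted data outright.
    if isinstance(payload, Untrusted):
        raise TypeError("refusing to pass untrusted data to a side effect")
    # ... actual network call would go here
```

So `pure_transform(issue_text, summarize)` is fine, but nothing tainted can ever reach a side effect directly.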
"Get me all public github issues on this repo, summarise and store in this DB."
Yes, this can be done safely.
If you think of it through the "lethal trifecta" framing, to stay safe from data-stealing attacks you need to avoid having all three of: exposure to untrusted content, exposure to private data, and an exfiltration vector.
Here you're actually avoiding two of the three: there's no private data (just public issue access) and no mechanism that can exfiltrate, so the worst a malicious instruction can do is cause incorrect data to be written to your database.
You have to be careful when designing that sandboxed database tool, but that's not too hard to get right.
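As a toy illustration of the framing (naming entirely my own), the check is just a conjunction of the three legs:

```python
def lethal_trifecta(untrusted_content: bool, private_data: bool,
                    exfiltration_vector: bool) -> bool:
    # Dangerous only when all three legs are present at once.
    return untrusted_content and private_data and exfiltration_vector

# The github-issues example: untrusted content, but no private data and
# no way to exfiltrate, so two legs are already broken.
assert not lethal_trifecta(untrusted_content=True, private_data=False,
                           exfiltration_vector=False)
```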
You definitely do not need or want to give database access to an LLM-with-scaffolding system to execute the example you provided.
(by database access, I'm assuming you'd be planning to ask the LLM to write SQL code which this system would run)
Instead, you would ask your LLM to create an object containing the structured data about those github issues (ID, title, description, timestamp, etc.) and then you would run a separate `storeGitHubIssues()` method that uses prepared statements to avoid SQL injection.
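Something like this, rendered in Python with sqlite3 as a stand-in DB (the function name and schema are made up to match the example):

```python
import sqlite3

def store_github_issues(conn: sqlite3.Connection, issues: list[dict]) -> None:
    # The LLM only produces structured fields; the SQL is fixed and
    # parameterized, so text inside an issue can't inject extra statements.
    conn.executemany(
        "INSERT INTO issues (id, title, summary, created_at) VALUES (?, ?, ?, ?)",
        [(i["id"], i["title"], i["summary"], i["created_at"]) for i in issues],
    )
    conn.commit()
```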
You could also get the LLM to "vibe code" the SQL. This is somewhat dangerous as the LLM might make mistakes, but the main thing I am talking about here is how not to be "influenced" by text in data and so be susceptible to that sort of attack.
The problem is, if people are vibe coding with these tools, then the capability "can write to a local folder" looks safe, but once that code is deployed it may have wider consequences. Anything. Any piece of data can be a confused deputy these days.
The lethal trifecta is a big problem, but not the only one. You need to break a leg off every lethal stool of AI tool use.
For example, a system that only reads github issues and runs commands can be tricked into modifying your codebase without direct exfiltration. You could argue that any persistent IO not shown to a human is exfiltration, though...
OK, then you can `sudo rm -rf /`. Less useful for the attacker, but an attack nonetheless.
However, I like the post; it's good to have common terminology when talking about these things, and mental models for people designing these kinds of systems. I think the issue with MCP is that the end user, who may not be across these issues, could be clicking away adding MCP servers without knowing the risks of doing so.
"Feature flag" can (and usually does) have another meaning: a technical feature rollout mechanism, so you can roll back quickly without a deployment. It's a way to manage risk on teams that make hundreds of commits/deploys a week. You can then often send a certain % of traffic through the feature or not, to look for early warnings.
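A sketch of the traffic-splitting part, assuming you have a stable user id to hash (deterministic bucketing keeps each user's experience consistent while the flag ramps from 0 to 100):

```python
import hashlib

def flag_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    # Hash user+flag into a stable bucket 0-99; enable for the first N%.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < rollout_percent
```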
I don't like feature flags for config. Just build a config system into the product UI - a settings page, basically! That might have some bits you configure as the site owner, and some that are self-service.
The compiler checks the type is correct wherever you use it. It's also documentation.
Still have tests! But types are great.
But sadly, in practice I don't often use a type per ID, because it's not idiomatic in the code bases I work on. It's a project of its own to move a code base to be like that if it wasn't from the outset. Also, most programming languages don't make it ergonomic.
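For what it's worth, here's roughly what it looks like in Python, where NewType is about as cheap as it gets (you still need a checker like mypy to catch the mix-up; nothing happens at runtime):

```python
from typing import NewType

UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)

def cancel_order(order_id: OrderId) -> None: ...

user = UserId(42)
cancel_order(user)  # mypy error: expected OrderId, got UserId
```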
LanguageTool is an open source tool you can run as a local spelling and grammar checker. It's different from Grammarly - less AI and more rules-based. I often use both tools at the same time. I wrote a quick intro on how to self-host this - https://martincapodici.com/2025/05/10/check-your-writing-usi....
I agree with the ideas at a high level, but not sure if we can tag people as “Junior” and “Senior” and make these broad strokes about how they think.
We should think of it in terms of “Theory Builders” and “Just get it done-ers”, and think of them as states of mind, rather than a character trait, or something linked to years of experience.
You may have a theory builder straight out of university (after all, many go on to do a PhD straight away!), or a theory builder who has the mindset and just came in from a different profession. Or an 8-year-old theory builder! You may have someone with 10 years' experience writing code who still slings code.
You may also have one person who was a Theory Builder on Monday, and became a "Get it done-er" by Friday due to a deadline.
Or the person that starts off in "get it done" mode because it looks trivial, notices that it's not and then takes a few steps back to think it through first.
Honestly, these opinions are almost always grounded in people not being honest with themselves, feeling superior to their colleagues, and coming up with a character trait and an argument for why they're just fundamentally better.
Sometimes they even are, at least to a degree. No idea whether it's true in this case, as I know nothing about Christian Ekrem beyond this article.
The article is about Senior Engineers, where time spent is a huge factor in the distinction. The point is more that theory building becomes a tuned skill for an engineer over time, as a fundamental result of their job, than a question of whether they use it every day, started with it as their primary method, etc.
I personally think you do a lot of the theory building while you're getting it done, because you're building something new and can't predict the kinds of issues you'll encounter.
Any sort of software that's architected only in flowcharts and UML by 'pure architects' is absolutely worthless to anyone but business people.
I agree that there needs to be a feedback loop including the system and decision makers (I also have a distrust of non-contributing 'architects').
However, just because you can 'get things done' in the current system doesn't imply you have a good enough theory for maintaining it sustainably. I've often seen self-proclaimed 10x coders who trade healthy shared theory for mean time to deployment too aggressively.
They are fast, get praise and pay, then move on before the negative effects of their short-term strategy become clear.
Another job of 'senior' devs is to point out to the business when this is happening.
Now you have to specify whether or not it's moved during queries (and what if it moves again?). There's probably a more elegant way I'm not thinking of, but standard created_at and updated_at fields would work: if a given date is <= the move date, it's the original airport; else, the new one. Rinse and repeat if it moves again.
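A sketch of that resolution logic, with made-up dates:

```python
from datetime import date

# (effective_from, location) pairs, oldest first; append a row per move.
MOVES = [
    (date(1963, 1, 1), "original site"),
    (date(1998, 7, 1), "new site"),
]

def location_on(query_date: date) -> str:
    # Walk the history and keep the latest location whose
    # effective date is on or before the query date.
    current = MOVES[0][1]
    for effective_from, location in MOVES:
        if query_date >= effective_from:
            current = location
    return current

assert location_on(date(1990, 1, 1)) == "original site"
assert location_on(date(2000, 1, 1)) == "new site"
```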
When someone says "I am making tea", to me, they mean "I have a plan! The execution of that plan has begun. The goal is to make tea." and in the context, because that's the answer to the question, they are also saying "The reason the stove is on is because I am executing that very plan."
Is this just idiomatic English? If we went formal logic on every sentence it would be a verbose world. Maybe other languages are more explicit.
As a professional I usually use a 3090. I'm sure if I worked in fields that heavily relied on this sort of thing I'd have a 4090 or two no problem, but it's not that safe of a base assumption to make.