Hacker News | freddref's comments

How is Devin different from Cursor?

I recently used Cursor and it felt very capable at implementing tasks across files. I get that Cursor is an IDE, but its AI functionality feels very agentic. Where do you draw the line?


Cursor Composer (in both "normal" and "agent" mode) fits the colloquial definition of an agent, for sure.


I had to look up MCST: it means Model-Centric Software Tools, as opposed to autonomous agents.

Devin is closer to a long-running process that you can interact with while it is processing tasks, whereas Cursor is closer to a function call: once you've made the call, the only thing you can do is wait for the result.


It stands for Monte Carlo search tree.

I.e., better outputs from the models themselves, not from external tooling and prompt engineering.

https://github.com/zz1358m/MCTS-AHD-master


Thanks for the correction. I guess I was lured by yet another LLM confabulation.


Two lads set themselves up in the business of selling conkers one year.

Any accidentally dropped conkers were stamped on by any and all in the vicinity.

A conker that survived to the next year was considered "seasoned", although many's the wizened tippex-covered lump of questionable provenance appeared under this explanation.


Any stats available on accuracy?


I'm curious here too, I only flipped through your channels for a minute, but found something interesting immediately.

I go to YouTube and seem to run out of quality quickly. I even went as far as crawling the HN front page for videos - see Hacker News TV - https://xiliary.com/bck/hn-tv.html


Can you say more about flow?

I'm curious about practical use cases, could you share some examples?


PS acquired GitPrime and rebranded it as Flow, and then basically just kinda ignored it tbh. It analyzed git data and gave info like velocity, how much work was new features vs refactors, etc. That's a very simplified explanation, but for places where code-level analysis is used for performance tracking it can be useful. The problem imo is that while that kind of info is certainly interesting, in most cases it's not really that useful or actionable in practice.
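To give a rough idea of what "analyzing git data" can mean, here is a minimal sketch that totals added and deleted lines per author from git log output. This is not how Flow works internally (nothing above says); the function name `summarize_numstat` and the `@` author marker are assumptions made up for this illustration. It expects text produced by `git log --numstat --format=@%an`.

```python
from collections import defaultdict


def summarize_numstat(log_text: str) -> dict[str, dict[str, int]]:
    """Sum added/deleted lines per author.

    Expects the output of: git log --numstat --format=@%an
    (the leading "@" is an arbitrary marker chosen for this sketch).
    """
    totals = defaultdict(lambda: {"added": 0, "deleted": 0})
    author = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            # Start of a new commit: remember who wrote it.
            author = line[1:]
            continue
        parts = line.split("\t")
        if author is None or len(parts) != 3:
            continue  # blank lines or anything that is not a numstat row
        added, deleted, _path = parts
        if added == "-" or deleted == "-":
            continue  # git prints "-" for binary files
        totals[author]["added"] += int(added)
        totals[author]["deleted"] += int(deleted)
    return dict(totals)


sample = "@alice\n\n10\t2\tapp.py\n-\t-\tlogo.png\n@bob\n\n3\t3\tREADME.md\n@alice\n\n5\t0\tutil.py\n"
print(summarize_numstat(sample))
```

Raw line counts like these are exactly the kind of number that is interesting but hard to act on: they say nothing about whether a change was a feature, a refactor, or a rename.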


I would guess there's a sizable chunk of people who buy these coins and are not investors as such.

The long tail of people buy the long tail of coins.

I bought coins when I first discovered them, driven by curiosity mostly.


Humans probably have about the same error rate. It's easy to miss a comma or quote.

These systems compete with humans, not with formatters.


A system of checks and balances overseen by several humans can have orders of magnitude lower error rates, though.


A system of checks and balances also costs orders of magnitude more money.


This is my fear regarding AI: it doesn't have to be as good as humans, it just has to be cheaper, and it will get implemented in business processes. Overall quality of service will degrade while profit margins increase.


You probably also need that for the AI as well though


The point was that for many tasks, AI has similar failure rates compared to humans while being significantly cheaper. The ability for human error rates to be reduced by spending even more money just isn't all that relevant.

Even if you had to implement checks and balances for AI systems, you'd still come away having spent way less money.


How about self-driving cars as public transport?


Cars carry fewer people per area of road than buses do, and it's even worse for self-driving cars that are empty half the time.


My car is empty 99% of the time; it could be very highly utilised by the general public.


Given the current lower bound of one passenger in personal vehicles, the process for using driverless cars needs to incentivize carpooling to the point that the average occupancy of traveling cars is considerably above 1, to make up for all the cars taking up road space while empty.

There is a possibility of municipalities having many mid to large vans that do routes at higher frequencies because they don't need drivers.
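The occupancy argument can be put into numbers with a small sketch. It assumes a simplified model in which a fraction of all car travel happens with no passengers; the function name `required_occupancy` and the model itself are illustrative assumptions, not measured data.

```python
def required_occupancy(empty_fraction: float, target: float = 1.0) -> float:
    """Average passengers needed on occupied trips so that the fleet-wide
    average, counting empty travel, reaches `target` passengers per car."""
    if not 0.0 <= empty_fraction < 1.0:
        raise ValueError("empty_fraction must be in [0, 1)")
    # Fleet-wide average = occupied-trip average * (1 - empty_fraction),
    # so solve for the occupied-trip average.
    return target / (1.0 - empty_fraction)


# Cars that drive around empty half the time need 2 passengers per
# occupied trip just to match a personal car carrying only its driver.
print(required_occupancy(0.5))  # 2.0
```

Under this model, anything short of two riders per occupied trip leaves a half-empty fleet using more road per passenger than today's single-occupant cars.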


Doesn’t solve the underlying problems that are caused by having car-centric cities. Walkable cities where most people’s needs are met within walking distance, with mass transit for the times you need to go further, are the real solution.


Agree on walkable, although I really like the idea of personal public transport that would be door-to-door and on-demand. I expect it would distribute cities more, and alleviate the hub-and-spoke model that public transport is sometimes built to, e.g. Dublin, Ireland.


Which would still cause traffic issues, wasting public land on building more roads and wasting energy and resources. Plus propping up the auto industry that caused the problem in the first place.


What if you don't want to live in a city as dense as that?


YouTube as curated by HN: https://xiliary.com/bck/hn-tv.html


I am amazed at the number of engineers I've worked with who jump straight into the full implementation. I've done it myself on a few occasions thinking "how hard can this be".

Building the "toy" first is great.

In my experience, about one third of the time the toy is all you need, you can stop there, and what you were going to build fully would be over-engineering. About one third of the time, building the toy tells you you're taking the unworkable approach as you mentioned. And the other third of the time you can extend the toy.


I have a directory on my drive called `sandbox` where I basically throw together small toys for anything with some unknown complexity. Anything from how an ORM might model a specific type of relationship (throw together a basic replica with a Docker compose file) to replicating the deployment model (poking AWS Copilot, for example) to testing out some tooling flow (e.g. local build process change).

The main thing with a toy model is speed. You can build/deploy/test with a smaller scope, progressively scale it up to some reasonable fraction of the full thing (whatever you're testing for), and iterate on your testing faster. But many times, the key issues show up quite early in the process of grokking the toy model.


I too have this "sandbox" directory. I use it for what you describe, but also for troubleshooting, bug reports, or a Stack Overflow question.

Just throw up a new project, hack around in it for an hour, and most often the problem/bug in my original code becomes apparent because of the isolation. I'll easily write four such sandbox projects per week.


'scratch' for me. Everything from quick regex tests to full library rewrites tends to start there.


> In my experience, about one third of the time the toy is all you need, you can stop there

In my experience, about two-thirds of the time management sees the toy and ShipsIt thinking that's all you need.


Nothing wrong with that if the toy meets all of the functional and non-functional requirements.

If you take offense at this, the easiest way to mitigate it is with process and practice: sandbox code goes into a dedicated "Sandbox" mono-repo, and if it's suitable for production, you rebuild it appropriately in a production repo.


Exactly. That's why I don't build the toy anymore: Too many broken promises of "Yes, we won't put it into production until it's ready", and then my team is left maintaining a system in production that had no business of ever being in production.


> In my experience, about one third of the time the toy is all you need, you can stop there, and what you were going to build fully would be over-engineering.

On the flip side, there are also cases where the toy is promoted to a production service and ends up riddled with technical debt, missing features, and buggy behavior that jeopardizes the whole project.

Survivorship bias is also a major problem. It's easy to presume that the winning bet you took is the right path.


This scenario emerges as often when starting from the toy as it does starting from the "correct" full implementation.


Sometimes the toy becomes the full implementation.


Yes, hopefully. That's the best scenario.

