
Is there any evidence this works better than Claude 3.5?


I work with a team at Nubank that has been using Devin. I would say that it doesn't quite make sense to compare it to Claude 3.5, because Devin isn't really like Copilot; it's more like an assistant to which you can assign a project. We're using it only for particular use cases, but for those particular use cases it's like having a superpower.


Based on this, what is the outlook for software dev generally, and junior and mid level devs?


More specifically: What kind of advice does GP have for Computer Science students in school right now?

I've been frankly terrified of the pace of LLM development since 2022.


Do you have any examples of the kinds of projects you would assign it to?


The reason it makes sense to compare them is there are problems that Claude 3.5 (or o1) can’t solve. Can Devin solve them? If yes, it’s easily worth the $500. If no, it’s a harder sell.


> We're using it only for particular use cases

Can you share concrete examples?


I can’t really be too specific. But I can say that at least one pattern of problem it tackles very effectively is: “we’re migrating from X to Y, and it’s going to touch a ton of files, and the nature of that migration is much more involved than what we can reasonably hope to accomplish with sed and a bash script.”


I tasked Devin with writing a project proposal (on a topic I am not going to disclose here) with multiple documents, including feasibility analysis, grant applications, legal analysis, and post-implementation training materials, and it was almost perfect at it.


Amazing claims, if only they could be publicly shared and scrutinized.


I use this every day, and a lot of the magic is in the workflow and agent layer. Claude 3.5 can generate a snippet of code for you, but it isn't going to open a browser, read API docs, actually make calls to the API, debug, run the code, and make sure it builds and works, etc.


Anthropic and OpenAI have certainly been working on this behind the scenes. While they see how much better they can make their models, they will let others pay for the current state until they find it valuable. The shift is already under way, and they are taking an even broader approach by building computer/tool use along with the context protocol, so that when it's released it will work with almost any IDE and system...


Why wouldn't it? Just give it a shell tool. (Something like claude.vim, perhaps.)
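
To make "a shell tool" concrete, here is a minimal sketch in Python. The name `run_shell` and the shape of the returned dict are my own assumptions for illustration, not how claude.vim or any vendor actually implements it. The idea is just that the agent loop hands the model's command to a function like this and feeds the result back:

```python
import subprocess


def run_shell(command: str, timeout: float = 30.0) -> dict:
    """Run a command in a shell and return what the model would need to see.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    try:
        proc = subprocess.run(
            command,
            shell=True,           # let the model use pipes, redirects, etc.
            capture_output=True,  # collect stdout and stderr instead of printing
            text=True,            # decode bytes to str
            timeout=timeout,      # do not let a hung command block the loop
        )
    except subprocess.TimeoutExpired:
        return {
            "exit_code": None,
            "stdout": "",
            "stderr": f"timed out after {timeout}s",
        }
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    print(run_shell("echo hello"))
```

Anything real would want sandboxing and an allowlist before running model-generated commands; this sketch does neither.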



People are saying it’s apples and oranges, but with Computer Use taken into account, this seems like a fair question.

https://docs.anthropic.com/en/docs/build-with-claude/compute...


I wish they offered a computer use reference implementation on Windows instead of a Linux Docker container.



