
I have an old Django site I'm maintaining for a long-time customer of mine. They often want to make small changes - things that are only a few lines of code, but would take an hour to just spin up the system, remind myself how it works, commit, push, update the server and all that.

Last week I moved the whole infrastructure to Railway and taught the customer to use Jules. They make their own PRs now, and Railway spins up an environment with the changes, so the customer can check it themselves. It works about 75% of the time, and when it doesn't, the customer sees that it doesn't before it even reaches me. Only if they're happy with the changes, I step in to review the code and press merge. It's been such a huge time saver so far.





Do they still pay you the same amount?

I can't speak for the OP, but I have customers I still support, because they supported me many years ago when I was a teenager taking my first steps into industry.

Does it make me money? Barely a cent. But I can spare an hour or two a year for the guy who gave me a leg up and trusted a teenager who probably shouldn't have been trusted. And I like the feeling of having something I worked on still going strong 20+ years later, when so much of my later work has been thrown away by the endless corporate rewrite treadmill.


Same situation, 10+ yrs deep with my first client, project still chugging along while I tackle bigger fish.

Can't justify spending much time on it now but a DIY no/low code solution for them isn't a bad idea.


I think he meant it in the sense that AI is now making all the changes.

They've always paid me per hour. The fewer hours, the better for me: just like the sibling post, I'm also not in it for the money. I care for both the customer and the project, and I'm happy that we've found a way to get the development going again with really minimal effort from my side.

How expensive are the API charges? Seems like it might be a bit too easy for a customer to rack up a big bill testing out minor changes if things weren't configured correctly.

Literally free. No API - the reason I went for Jules instead of Claude Code / Gemini CLI, for example, is specifically its relatively polished web interface, which I assumed my customer would appreciate. They're using their own Google account, and the free daily task limit seems to be more than enough for them.

There is a free plan with 15 tasks/sessions. It doesn’t count tokens AFAIK. There would obviously be a runtime limit of some sort. But it’s not the same as the API keys and tokens situation.

The free tier is 15 tasks per day (of gemini-2.5-pro) which is EXTREMELY generous. I've had plenty of tasks run for 1-2 hours. I do think that after 1 or 2 hours it's told it needs to wrap up and just present what it's done; I couldn't get it to keep going longer than 2 hours. But Jules is very slow as it seems to be batch processing on spare capacity, so 15+ hours a day is not quite as absurd as it sounds.

I haven't tried Jules in a couple of weeks, but the UI/UX had a lot of issues, such as not being given any progress updates for a very long time. The worst thing was not being able to see what it was doing and correct it: you only see the state of files (without a usable diff viewer, WTF) at the last point that the agent decided to show you anything (the last time it completed a todo list item, I think, and I couldn't get it to update the state when asked, though it will send a PR if you ask), and gemini-2.5-pro can often try really stupid things as it tries to debug. I've also been impressed by its debugging abilities a number of times.

Still, I found Jules far more usable than Gemini CLI (free tier), where Gemini just constantly stops for no reason and needs to be told to continue, and I exhausted the usage limit in minutes.

Aside from the unlimited free tier, probably the best part of Jules is its automated code reviews. Once, I was writing up some extensive comments on its code when a code review unexpectedly dropped into the conversation, giving exactly the same feedback I was writing. Unfortunately, if it never reaches the point of submitting for review, it doesn't get an automated review. It does often ask for feedback before it's done, which is nice. So I probably needed to prompt better.


> I've had plenty of tasks run for 1-2 hours.

I think they throttle it - they note it is an asynchronous service

I agree that it is generally a pretty useful service.


I wonder if on Google's end it's basically a low-priority job that runs whenever a region has idle GPUs.

I hope they don't store any user data in their app. Trusting LLMs blindly is a bad idea.

There is a human being (GP) reviewing the proposed code before merging. I wouldn't describe that as trusting the LLM blindly.

No, there is not

Yes, there is. From the OP:

"Only if they're happy with the changes, I step in to review the code and press merge."


Ok, thanks, I misunderstood that.

So presumably it spins up a review app from the PR for the customer to review, really smart actually.

Jules has access to the codebase, not the database. It doesn't see any user data.

I was talking about potential security problems introduced in the code by LLMs.

It's pretty easy to introduce something like IDOR when asking LLMs to write the code.
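For anyone unfamiliar with the term: IDOR (insecure direct object reference) is when code looks up a record by a client-supplied ID without checking that the requester is allowed to see it. A minimal sketch in plain Python, assuming a made-up in-memory table; `INVOICES`, `get_invoice_vulnerable` and `get_invoice_checked` are hypothetical names, not anything from the site discussed here:

```python
# Hypothetical in-memory "table"; every name here is illustrative only.
INVOICES = {
    1: {"owner": "alice", "total": 120},
    2: {"owner": "bob", "total": 75},
}


def get_invoice_vulnerable(current_user: str, invoice_id: int) -> dict:
    # IDOR: trusts the client-supplied ID and never checks ownership,
    # so any logged-in user can read any invoice by guessing IDs.
    return INVOICES[invoice_id]


def get_invoice_checked(current_user: str, invoice_id: int) -> dict:
    # Scope the lookup to the requesting user before returning anything.
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["owner"] != current_user:
        # Same error for "missing" and "not yours", so IDs cannot be probed.
        raise PermissionError("invoice not found")
    return invoice
```

In a Django view the equivalent fix is usually to filter the queryset by the logged-in user rather than fetching by primary key alone, which is the kind of thing a reviewer has to look for in generated code.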


I review the PRs Jules makes just like I review any PR.

This is the original poster, you downvoters. I think we can assume he knows what he gave access to.

How do you handle the customer database? Do you push this in its entirety to the VM?

No, Jules was usually able to edit the code blind and get things working. If it didn't work, the customer saw it on the automatic environment created for the PR, told Jules, and Jules fixed it. I think I saw one task or maybe two in which Jules actually ran the HTTP server, set up Postgres, ran all the migrations and created a superuser, only to then write some Playwright code that it used to log in and take some screenshots.

In other words, so far it hasn't felt like including a database would provide us with much, but I am playing with the idea of creating a tiny mock database and including it in the repo, as the real database is around 15GB and contains passwords and names.
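One way such a tiny mock database could be sketched is below. This is only an illustration under stated assumptions: it uses `sqlite3` from the standard library rather than the Postgres setup mentioned above, and the `app_user` table, its columns, and the `build_mock_db` helper are all hypothetical, not the real schema.

```python
import random
import sqlite3

# Obviously fake name parts, so nothing resembles real customer data.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]
LAST_NAMES = ["Example", "Sample", "Placeholder", "Test"]


def build_mock_db(path: str = ":memory:", n_users: int = 20, seed: int = 0) -> sqlite3.Connection:
    """Create a small database filled with synthetic users (hypothetical schema)."""
    rng = random.Random(seed)  # fixed seed keeps the generated data reproducible
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE app_user ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "email TEXT NOT NULL, password TEXT NOT NULL)"
    )
    rows = []
    for i in range(1, n_users + 1):
        name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        # Placeholder string instead of a password hash: never copy real ones.
        rows.append((i, name, f"user{i}@example.com", "!unusable"))
    conn.executemany("INSERT INTO app_user VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn
```

Calling `build_mock_db("mock.sqlite3", n_users=50)` would write a small file that could be committed to the repo; generating the rows from scratch, rather than scrubbing a copy of the 15GB database, means no real names or passwords can leak into it.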


That's honestly incredibly cool. Could I perhaps encourage you to write a blog post about the details, with some examples of what the PRs from your customer look like?

That's an interesting idea! It's been just a little bit over a week now that we're doing it, but maybe by the end of the month I'll have some more conclusions to share.


