
i agree that scalable infrastructure is needed to manage a production pipeline, as others have explained well.

i found this article a useful reminder, because sometimes a job doesn't require full-blown infrastructure. i commonly get one-off requests that don't overlap with existing infrastructure and won't need any followup. in that particular case a hadoop cluster, or heck, even loading into a pg db, would be wasted effort.

but i wouldn't want to manage our clickstream analytics pipeline with shell scripts and cron jobs.

is there any lightweight tooling out there that can schedule/run basic pipeline jobs in a shell environment?



Airflow? It might not be what you consider lightweight, though.


Manta? Definitely not lightweight.



