One thing I often do with CSVs is sum up all or specific rows in a column.
For example maybe you're doing end of year taxes and now you have this large CSV export from your bank or payment provider with multiple categories and you want to get the totals for certain things.
In a GUI tool it's really easy to sort by a column and drag your mouse to select what you want and see it summed in real time.
Oftentimes things aren't clean enough to have 100% confidence that you can solve this with an automated script because maybe something is spelled slightly different but it's really the same thing. This feels like one of those things where spending a legit 10-15 minutes once a year to do it manually is better than trying to account for every known and unknown edge case you could think of. The stakes are too high if you get it wrong since it's related to taxes.
Has anyone found a really good standalone basic spreadsheet app that "just works" which isn't Microsoft Excel that works on Windows or Linux? I don't know why but Libre and Open Office both struggle to parse columns out in certain types of CSVs and the sorting behavior is typically a lot worse than Google's spreadsheet app but I'd like to remove some dependence on using Google.
If you want to use the CLI, Visidata might be what you want. It does have a bit of a learning curve. Beyond that, I've found it quite handy to do quick data explorations. e.g. there are shortcuts for histograms, filtering, x-y plots, etc.
You can but it's back to depending on code to get the totals. This is one spot where IMO being able to visualize the data by seeing the rows and have immediate feedback on the sum is useful. You can continue dragging or use CTRL + SHIFT clicking to select more stuff as needed while letting your brain decide what should be grouped together.
With a SQL / code approach you have to account for these things without being able to see them and then adjust the code afterwards to include your custom groupings. It ends up taking more time. If the categories didn't change every year it would for sure be worth it to code up a solution since you'll know the edge cases by looking at the existing CSV but it's a moving target because it could change next year.
Have you looked at data wrangling software like Easy Data Transform, Knime or Alteryx? It is specifically written for cleaning, merging, summing and reshaping tabular data. They can all handle CSV and Excel files (plus a lot more).
Nope, but I just tried. I went to their site and noticed there's no binaries for Windows. The site is also served over HTTP (not HTTPS) and the default experience for apt installing it on Ubuntu 22.04 didn't work due to a bunch of packages no longer existing:
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince-common_42.3-0ubuntu3_all.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler118_22.02.0-2ubuntu0.2_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/poppler/libpoppler-glib8_22.02.0-2ubuntu0.2_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9-common_9.55.0%7edfsg1-0ubuntu5.5_all.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/g/ghostscript/libgs9_9.55.0%7edfsg1-0ubuntu5.5_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevdocument3-4_42.3-0ubuntu3_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/libevview3-3_42.3-0ubuntu3_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/e/evince/evince_42.3-0ubuntu3_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libjavascriptcoregtk-4.0-18_2.42.1-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/w/webkit2gtk/libwebkit2gtk-4.0-37_2.42.1-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 91.189.91.83 80]
For example maybe you're doing end of year taxes and now you have this large CSV export from your bank or payment provider with multiple categories and you want to get the totals for certain things.
In a GUI tool it's really easy to sort by a column and drag your mouse to select what you want and see it summed in real time.
Oftentimes things aren't clean enough to have 100% confidence that you can solve this with an automated script because maybe something is spelled slightly different but it's really the same thing. This feels like one of those things where spending a legit 10-15 minutes once a year to do it manually is better than trying to account for every known and unknown edge case you could think of. The stakes are too high if you get it wrong since it's related to taxes.
Has anyone found a really good standalone basic spreadsheet app that "just works" which isn't Microsoft Excel that works on Windows or Linux? I don't know why but Libre and Open Office both struggle to parse columns out in certain types of CSVs and the sorting behavior is typically a lot worse than Google's spreadsheet app but I'd like to remove some dependence on using Google.