Do you get upset when you see one of `create_engine("postgresql://")`, `create_engine("mysql://")`, or `create_engine("sqlite://")`?
The author is just adding a new backend for a protocol that's common in the data world. Pandas, Polars, Dask, DuckDB, etc. all support it, and the kind of person who wants to access a dedicated CSV data archive would much rather keep their current client API and add a new connection URI string than adopt an entire client API for a single data source (or deal with making requests themselves and feeding the response into the dataframe).
I didn't say it was uncommon, just that it's a bad idea.
There's no need whatsoever for a separate client API. There could be a convention like:
import pandas as pd
from csvbase import loader as csvloader

df = pd.read_csv(csvloader("calpaterson/onion-vox-pops"))
The user wouldn't need to know anything beyond what to import, and it's explicit about where the data comes from. There's also less risk of mistyping the URL string ("oops, I just typed cvsbase and accidentally loaded a list of CVS drugstores"), and code completion can tell you that `csvloader()` fetches things through the csvbase module.
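A loader like that could be a few lines. Here's a sketch of what I mean; the export URL format, the host constant, and the function name are my assumptions for illustration, not the real csvbase client:

```python
from urllib.parse import quote

CSVBASE_ROOT = "https://csvbase.com"  # assumed host, illustrative only


def csvloader(ref: str) -> str:
    """Map a table reference like "calpaterson/onion-vox-pops" to the
    URL of its CSV export.

    pandas' read_csv already accepts plain URLs, so returning a string
    is enough -- no bespoke URI scheme or client API required.
    """
    user, table = ref.split("/", 1)
    return f"{CSVBASE_ROOT}/{quote(user)}/{quote(table)}.csv"
```

The point being: the explicit import makes the dependency visible, while pandas stays completely unaware of csvbase.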
You're missing the point. It's not about the use of bespoke URIs. This is the "convention over configuration" debate revisited. Conventions are cute and always look clever while you're reading the documentation. But they're a pain during maintenance, because they force anyone who doesn't know them to go read the docs first, instead of simply inferring how all the wires connect from previously researched, clearly stated declarations.
When I see `pd.read_csv("csvbase://")` during debugging, I wonder how pandas knows to speak to csvbase (as the article anticipates). Nothing is imported. Nothing is configured. Things just speak to one another. So can I also call `pd.read_csv("other_csv_server://")` like this? If I replace pandas with koalas, will `koalas.read_csv("csvbase://")` also work? How the wires connect between pandas and csvbase is hidden. Unless you know that the two obey some implicit lower layer (the fsspec standard), it's a mystery. Mysteries are the last thing you want when debugging.
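To be fair, the wiring is less magical than it looks, even if I agree it's hidden: fsspec keeps a registry mapping URL schemes to filesystem implementations, and backend packages add themselves to it at install time (via entry points), which is exactly why nothing shows up in your imports. A toy version of that dispatch mechanism, as I understand it (this is my own miniature, not fsspec's actual code):

```python
from urllib.parse import urlparse

# Toy scheme registry. Real fsspec maps "csvbase" -> a filesystem
# class; packages register via entry points at install time, so the
# connection between pandas and csvbase never appears in user code.
_registry = {}


def register(scheme, opener):
    """Associate a URL scheme with a callable that handles it."""
    _registry[scheme] = opener


def open_url(url):
    """Dispatch a URL to whichever handler claimed its scheme."""
    scheme = urlparse(url).scheme
    if scheme not in _registry:
        raise ValueError(f"no handler registered for {scheme}://")
    return _registry[scheme](url)


# A backend package would do this on import/install:
register("csvbase", lambda url: f"would fetch {url}")
```

So `open_url("csvbase://calpaterson/onion-vox-pops")` works only because something, somewhere, called `register()` — which is precisely the "mystery" being complained about.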
I don't know which `create_engine()` function you're alluding to. The one I know and have used comes from SQLAlchemy, and how it works has always been obvious. I've never seen any mention of fsspec there. I looked at SQLAlchemy's code and it's predictably just a convenient syntax for specifying connection information in a single string: the string is parsed to extract connection attributes, which are then relayed to the underlying DBAPI driver. There's no mystery involved.
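To illustrate that point: the connection string decomposes into ordinary attributes with nothing but stdlib URL parsing (SQLAlchemy has its own `make_url`, but the idea is the same):

```python
from urllib.parse import urlparse

# A DB URL is just structured connection info packed into one string.
# The parsed pieces are what get handed to the DBAPI driver.
url = urlparse("postgresql://alice:secret@db.example.com:5432/sales")

dialect = url.scheme            # "postgresql" -> which driver to load
user = url.username             # "alice"
password = url.password         # "secret"
host, port = url.hostname, url.port  # "db.example.com", 5432
database = url.path.lstrip("/")      # "sales"
```

Nothing implicit connects two libraries here; the URL is consumed entirely inside the one library you explicitly imported.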