
With this sort of perspective, why use a database at all? You can store literally any digital object as some sequence of bytes, and that's what file systems are for.

At some point, you actually want the software you use to have meaningful features.



Querying filesystems is extremely primitive compared to what SQLite can do.


So is storing dates as integers, and JSON as strings.


But dates are integers (all databases store dates as an integer that is an offset from some fixed date), and JSON documents are strings. SQLite has the functions to work with these datatypes (dates, JSON, etc.), but fundamentally they are not different from integers or strings.
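To make that concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `events` table and `created_at` column are made up for illustration, and the integer is assumed to be seconds since the Unix epoch. The column holds a plain integer, and SQLite's built-in `datetime()` and `strftime()` functions supply the date interpretation at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the timestamp column is declared as a plain INTEGER.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.execute("INSERT INTO events (created_at) VALUES (1700000000)")

# datetime(x, 'unixepoch') treats x as seconds since 1970-01-01 00:00:00 UTC.
raw, rendered = conn.execute(
    "SELECT created_at, datetime(created_at, 'unixepoch') FROM events"
).fetchone()

# Going the other way: strftime('%s', ...) yields the seconds as text, so cast it.
back = conn.execute(
    "SELECT CAST(strftime('%s', ?) AS INTEGER)", (rendered,)
).fetchone()[0]

print(raw, rendered, back)
```

Note that the `'unixepoch'` modifier is what tells SQLite how to read the number; nothing in the column itself says so.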


And integers are bytes and strings are bytes and tables are bytes and indexes are bytes. If you're not happy with databases being intelligent, why stop with integers and strings, when you can do everything you want with bytes?! Memcached, or LevelDB, or even a file system is the ultimate database if you follow your arguments to their logical conclusion.

But data representation isn't data. You can't look at a serialized sequence of bytes and know what it represents without context - without a serialization scheme. For relational databases, the table definition is the serialization scheme - it is the context. For example, you can't know what date an integer is representing without contextual information like the offset. By storing dates as dates in your database, that context is baked in.
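A small sketch of why the context matters, assuming a hypothetical stored value of `19000` and two plausible conventions (seconds since the Unix epoch versus days since the Unix epoch). Both conventions are assumptions for illustration; the point is only that the same integer reads as two very different dates.

```python
from datetime import datetime, timedelta, timezone

stored = 19000  # hypothetical raw value found in an integer column

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Interpretation 1: seconds since the Unix epoch.
as_seconds = UNIX_EPOCH + timedelta(seconds=stored)

# Interpretation 2: days since the Unix epoch.
as_days = UNIX_EPOCH + timedelta(days=stored)

print("as seconds:", as_seconds.isoformat())
print("as days:   ", as_days.date().isoformat())
```

Without knowing which convention the application used, a reader of the raw table cannot tell a date in 1970 from a date in 2022.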

It is helpful to have that full context inside your database because it allows you to operate that database more efficiently (by carefully indexing on the properties of the data, not the data representation), but it also allows you to use that database in ways that are not tightly coupled to your application, such as analytics, because you are storing data itself and not just the bare minimum required to represent that data.


Because SQLite lets you index and query the data without having to write your own layer to do that.
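A rough sketch of what that looks like in practice, again with Python's standard `sqlite3` module. The `events` table, the `idx_events_ts` index, and the generated rows are hypothetical. The range query is plain SQL, and `EXPLAIN QUERY PLAN` shows how SQLite intends to execute it, which will typically mention the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT, ts INTEGER)")

# Hypothetical data: one event per hour, timestamps as Unix seconds.
base = 1_700_000_000
conn.executemany(
    "INSERT INTO events (name, ts) VALUES (?, ?)",
    [(f"event-{i}", base + i * 3600) for i in range(1000)],
)

# One statement gives a B-tree index on the timestamp column.
conn.execute("CREATE INDEX idx_events_ts ON events (ts)")

# Half-open range: hours 10, 11 and 12.
lo, hi = base + 10 * 3600, base + 13 * 3600
query = "SELECT name FROM events WHERE ts >= ? AND ts < ? ORDER BY ts"

rows = [r[0] for r in conn.execute(query, (lo, hi))]

# The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (lo, hi))]

print(rows)
print(plan)
```

Doing the same over flat files would mean writing the index structure, the range scan, and the ordering by hand.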


Having rich data types allows a lot of things without having to write your own layer. Spatial indexes are extremely efficient, and I can't query geometries efficiently when I'm storing them as blobs. The same goes for dates, JSON, XML, ranges, etc.

The problem with reserving specialized logic for the application layer is that it limits you to simplistic indexing schemes, and you end up doing excessive IO and filtering in memory to get what you actually want.

The idea that databases shouldn't have specialized datatypes really only works in simplistic CRUD apps. The world is much bigger than that.



