
> And doing it like you suggest would mean overview grids would have to do 50+ subqueries for each row, and loading a record would mean hundreds of queries.

Only if you insist on loading the data into a flat record type with all of those fields. You have a dynamically evolving datatype, so your class should also be dynamic:

    public enum FieldId
    {
        Field1,
        Field2,
        //...
    }

    public class YourRecord
    {
        // One dictionary per underlying SQL value type, keyed by field id;
        // a field only appears in the dictionary matching its type.
        public Dictionary<FieldId, string> TextValues { get; set; }
        public Dictionary<FieldId, int> IntValues { get; set; }
        public Dictionary<FieldId, DateTime> DateValues { get; set; }
        // ...
    }

At most, the number of subqueries increases by the number of types your class needs. SQL has a fixed number of datatypes, and most apps use a small subset of them (dates, text, integral, floating-point and decimal types, so typically 5 at most). You could also collapse these all into a single table with a nullable column per value type, and then it's only one query.
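
To make that concrete, here is a minimal sketch of loading one record with one query per value-type table, assuming hypothetical tables TextValues, IntValues and DateValues, each shaped as (RecordId, FieldId, Value):

    using System;
    using System.Collections.Generic;
    using Microsoft.Data.SqlClient;

    public static class RecordLoader
    {
        // Loads one record with one query per value-type table (three here),
        // instead of one query per field.
        public static YourRecord Load(SqlConnection conn, long recordId)
        {
            var record = new YourRecord
            {
                TextValues = new Dictionary<FieldId, string>(),
                IntValues = new Dictionary<FieldId, int>(),
                DateValues = new Dictionary<FieldId, DateTime>()
            };

            using (var cmd = new SqlCommand(
                "SELECT FieldId, Value FROM TextValues WHERE RecordId = @id", conn))
            {
                cmd.Parameters.AddWithValue("@id", recordId);
                using var reader = cmd.ExecuteReader();
                while (reader.Read())
                    record.TextValues[(FieldId)reader.GetInt32(0)] = reader.GetString(1);
            }

            // IntValues and DateValues load the same way, one query each.
            return record;
        }
    }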

> And insertion performance would crater I assume, since the DB now needs to do hundreds of inserts per record rather than a single row. We do have some cases where we get 100k inserts per hour once a day or so.

That's a fairly small amount of data; I don't really see an issue here.

Also, you're neglecting the fact that since the fields are independent, adding a field value doesn't involve locking and rewriting a large 400-column row to disk, only appending a tiny 4-column row to a separate table.



> At most, the number of subqueries increases by the number of types your class needs.

Fair enough. Application logic will want to access the data as a flat record, but that could be handled through getters. We have views that are also used by customers and by our reporting tool, but we're moving the customers to API access, and reporting could probably be done with something better if we were starting from scratch.
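
For the getter part, a flat facade over the dictionaries could look something like this (a sketch only; CustomerName and OrderDate are made-up field names mapped onto the enum from the parent comment):

    public class YourRecordView
    {
        private readonly YourRecord _record;

        public YourRecordView(YourRecord record) => _record = record;

        // Flat, strongly typed getters over the dynamic storage;
        // missing fields come back as null instead of throwing.
        public string CustomerName =>
            _record.TextValues.TryGetValue(FieldId.Field1, out var v) ? v : null;

        public DateTime? OrderDate =>
            _record.DateValues.TryGetValue(FieldId.Field2, out var d) ? d : (DateTime?)null;
    }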

> That's a fairly small amount of data, I don't really see an issue here.

Well, it turns 100k inserts into tens of millions, on tables where users need to work without slowdown while this is going on.

It probably works if you throw enough hardware at it, but currently we get by with quite modest DB hardware.

That said, how do you create compound indexes over these fields? Say you need to index one date and one varchar column together? Such demands can arise down the line and are often hard to predict up front.


> Well it turns 100k inserts into tens of millions.

I don't think "number of inserts" is the right metric because the total amount of bytes being written is almost the same, it's just written in different areas and still mostly contiguously. I think "number of distinct tables being written" is a better metric. Assuming all 400 columns become 400 records in the data type tables, say evenly divided among the 4 most common data types (int, decimal, text, date), that would be more like (4 or 5) x 100k = 400k-500k. I would still hesitate to naively compare it this way without a benchmark though, because with 4 or 5 tables being written there's also less contention than there is when everyone is writing to 1 table.

Regarding indexes: you can index the field tables, but obviously such an index applies to the whole table. Without more understanding of your domain I can't really say whether this breaks down. There's also the possibility that this one table is serving two competing goals, e.g. perhaps the user should be able to add data incrementally (so this pattern applies), but then at some point, when all of the data is filled in, it should be migrated to an actual table with all non-null columns that can be indexed and queried/processed as you usually would.
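
To illustrate the first point, a lookup against one indexed field table might look like this (a sketch only; the DateValues table, its (RecordId, FieldId, Value) shape and the index over (FieldId, Value) are all assumptions):

    using System;
    using System.Collections.Generic;
    using Microsoft.Data.SqlClient;

    public static class FieldQueries
    {
        // Finds records whose (hypothetical) date field falls in a range,
        // relying on an index over (FieldId, Value) on the DateValues table.
        public static List<long> FindRecordsByDate(
            SqlConnection conn, FieldId dateField, DateTime from, DateTime to)
        {
            const string sql = @"
                SELECT RecordId
                FROM DateValues
                WHERE FieldId = @fieldId AND Value >= @from AND Value < @to";

            using var cmd = new SqlCommand(sql, conn);
            cmd.Parameters.AddWithValue("@fieldId", (int)dateField);
            cmd.Parameters.AddWithValue("@from", from);
            cmd.Parameters.AddWithValue("@to", to);

            var ids = new List<long>();
            using var reader = cmd.ExecuteReader();
            while (reader.Read())
                ids.Add(reader.GetInt64(0));
            return ids;
        }
    }

Filtering on a date field and a varchar field together would then mean joining a DateValues lookup and a TextValues lookup on RecordId, rather than relying on a single compound index.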

In any case, what I've sketched out isn't really new; it's been used at least since the 90s under the names "Dynamic Object Model" and "Entity-Attribute-Value" [1,2], so if they were using it on 90s hardware, I can't imagine the pattern would be unusable on modern hardware.

[1] https://www.cs.sjsu.edu/~pearce/oom/patterns/new/Riehle.pdf

[2] https://softwarepatternslexicon.com/patterns-sql/4/5/



