
Something we discovered years ago about joins in every SQL DBMS we used (from the lowly MS-Access to ORACLE and SQL-Server): the order in which you write the joins in the SQL determines how bad the cardinality of the cartesian product is. The DBMSs seemed incapable of rearranging the joins internally to get the smallest number of records.

We had quite a few instances where a judicious change in the join structure of the SQL meant a change in speed of many orders of magnitude, due to the decrease in cardinality.

Our process was to test different join orders in problematic SQL and see what effect each had on the DBMS. It usually required knowing how many tuples were in each of the tables and how they would be restricted. The upshot was that it should have been possible for the DBMS to do its own analysis to produce the best result. We found none of them capable of this task in any meaningful way.
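
A toy sketch of that kind of analysis, assuming a simple left-to-right join model. The table names, row counts, and selectivities are made up for illustration, and `intermediate_sizes` and `cost` are hypothetical helpers, not how any particular DBMS estimates plans:

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical row counts; not taken from any real schema.
rows = {"orders": 1_000_000, "customers": 50_000, "regions": 20}

# Hypothetical join selectivities: the fraction of the cross product
# that survives the join predicate between two tables.
# "orders" and "regions" share no predicate, so joining them directly
# is a plain cross product (selectivity 1).
selectivity = {
    frozenset({"orders", "customers"}): Fraction(1, 50_000),
    frozenset({"customers", "regions"}): Fraction(1, 20),
}

def intermediate_sizes(order):
    """Estimated row count after each join when tables are joined left to right."""
    joined = [order[0]]
    size = Fraction(rows[order[0]])
    sizes = []
    for table in order[1:]:
        size *= rows[table]
        # Apply every predicate linking the new table to tables already joined.
        for prev in joined:
            size *= selectivity.get(frozenset({prev, table}), Fraction(1))
        joined.append(table)
        sizes.append(size)
    return sizes

def cost(order):
    """Crude cost: total rows materialised across all intermediate results."""
    return sum(intermediate_sizes(order))

best = min(permutations(rows), key=cost)
worst = max(permutations(rows), key=cost)

for order in permutations(rows):
    print(" -> ".join(order), [int(s) for s in intermediate_sizes(order)])
```

In this model every order ends with the same estimated number of rows; only the size of the intermediate results changes, and that is where the speed difference would come from.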




That's strange. While the cartesian product is not commutative (because of the order in which items appear in the final product), the cardinality of the result set should be independent of the order in which records are joined, because it is obtained by multiplying the cardinalities of the constituent sets.

https://en.wikipedia.org/wiki/Cartesian_product#Cardinality
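
A quick check of that claim on three small made-up sets (the names `a`, `b`, `c` are just for illustration). It covers only the full, unrestricted product, not joins with predicates:

```python
from itertools import permutations, product
from math import prod

# Three small illustrative sets of sizes 2, 3 and 4.
a = ["a1", "a2"]
b = ["b1", "b2", "b3"]
c = ["c1", "c2", "c3", "c4"]

# Size of the full cartesian product for every ordering of the three sets.
counts = {len(list(product(*order))) for order in permutations([a, b, c])}

# The product of the individual cardinalities.
expected = prod(len(s) for s in (a, b, c))

# The tuples themselves differ with the order, which is why the
# product is not commutative even though its size is.
first_ab = next(product(a, b))
first_ba = next(product(b, a))

print(counts, expected, first_ab, first_ba)
```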



