Hacker News

The "Why?" questions are getting downvoted, but to dissect the why section from the page a little...

"designed to be small and efficient" – adding a layer on top of SQL is necessarily less efficient, and adding a layer of indirection over the underlying optimisations means it is likely (though not guaranteed) to generate less efficient queries as well.

"make developing queries simple" – this seems to be just syntactic preference. The examples are certainly shorter, but in part that's the SQL style used.

I think it needs either more evidence that the syntax is actually better (cases it simplifies, ways it can optimise that are hard in SQL), or perhaps to be more honest that the intent of the project is just to be a different interface for those who prefer it that way.

It's an interesting exercise, and I'm glad it exists in that respect, and hope the author enjoyed making it and learnt something from it. That can be enough of a why!



The main goal was to help security engineers / analysts, who _loathe_ sql (for better or worse).

I tend to think this is a little more user friendly, personally, and it's nice to give some open-source competition to the major languages that are used in security (SPL, Sumologic, KQL, and ES|QL).

We were surprised that there weren't syntactic competitors (while PRQL has some similar goals, its syntax and intended audience are very different).


How's perf for the compiled queries? The first thing I see in the examples is what appears to be a CTE-by-default approach that, in most (all?) engines, means the generated query ultimately runs over an unindexed (and maybe materialized!) intermediary resultset.
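To make the concern concrete, here is a minimal sketch using SQLite via Python (the `events` table, its columns, and the values are invented for illustration; PQL's actual output may differ). A transpiler that wraps each pipeline stage in a CTE produces the first shape below; the hand-written equivalent is the second. Both return the same rows, but whether the CTE version can still use the index depends on the engine's ability to inline (rather than materialize) the CTE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (host TEXT, severity TEXT)")
conn.execute("CREATE INDEX idx_host ON events (host)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("web-1", "high"), ("web-2", "low"), ("web-1", "low")])

# CTE-by-default shape: the intermediate result set may be materialized
# (and thus unindexed) in engines that don't flatten CTEs.
cte = """WITH t AS (SELECT * FROM events WHERE severity = 'high')
         SELECT host FROM t WHERE host = 'web-1'"""

# Hand-written shape: a single indexed scan with both predicates.
flat = """SELECT host FROM events
          WHERE severity = 'high' AND host = 'web-1'"""

print(conn.execute(cte).fetchall())   # [('web-1',)]
print(conn.execute(flat).fetchall())  # [('web-1',)]
```

Comparing `EXPLAIN QUERY PLAN` output for the two forms on your target engine is the quickest way to see whether the wrapper's output actually pays a cost.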


The join syntax in particular seems really clunky compared to what it could have been: it's unclear whether they support joining on different column names (e.g. fooid = foo.id, or even fooid = parentid), which would be really restricting if unsupported. It'd also be nice if they used the USING/ON distinction that SQL has, instead of supporting only USING but calling it ON; that's a weird thing to mentally translate.
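For reference, the SQL distinction being pointed at: ON takes an arbitrary predicate and so can join columns with different names, while USING requires the same column name on both sides (and merges it in the output). A small SQLite sketch (the `foo`/`bar` tables and their rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (id INTEGER, name TEXT);
CREATE TABLE bar (fooid INTEGER, qty INTEGER);
INSERT INTO foo VALUES (1, 'a');
INSERT INTO bar VALUES (1, 5);
""")

# ON: arbitrary predicate, so differently-named columns work.
rows = conn.execute(
    "SELECT foo.name, bar.qty FROM bar JOIN foo ON bar.fooid = foo.id"
).fetchall()
print(rows)  # [('a', 5)]

# USING would instead require the join column to be named identically
# on both sides, e.g.: SELECT ... FROM a JOIN b USING (id)
```

A wrapper that only supports the USING semantics (same name on both sides) can't express the `fooid = foo.id` join above at all.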

Pipe-driven SQL dialects have been interesting in the past, but I much prefer ones where you start with a big blob of joins to define the data source before getting into the applied operations. The SQL that PQL produces looks like it would have serious performance issues, since it forces certain join orders and limits how much the query planner can rearrange operations.


> I much prefer when you start with a big-blob of joins to define the data source before getting into the applied operations

That's essentially the model we've chosen for XTQL, with the addition of a logic var unification scope for even more concise joins: https://docs.xtdb.com/intro/what-is-xtql#unify

Also, anyone interested in this post-SQL space would probably enjoy this recent paper: https://www.cidrdb.org/cidr2024/papers/p48-neumann.pdf


Higher level languages often have opportunities for additional optimisation over the straightforward implementation in the target language. This is because the semantics of the high-level language offer guarantees that the target doesn't, or because the intent is more clearly preserved in the source language.

Whether or not this is true for PQL/SQL, I don't know enough to say. But I do know that I don't write SQL at a high-enough level to be sure that a wrapper couldn't compile to something more efficient than what I produce, especially for complicated queries.


When dealing with SQL, these higher-level languages that are converted to SQL (such as FetchXML and PQL) are traditionally quite limited in the use cases they suit. They are not suitable for complex joins and grouping, particularly across large tables and result sets, nor for joining remote data sources, especially with large result sets.

They are suitable for reads on tables that are tuned specifically for the desired use cases. The amount of optimization required in the conversion should be limited intrinsically by this assumption, and the higher-level language should be strict enough that the conversion never has to choose between alternative SQL approaches (because the results of such a choice will be unpredictable, depending on table sizes etc.).


SQL also has a lot of edge cases where, because half the syntax got bolted on after the language was initially finalized, a conceptually simple request can turn into a multi-level mess of a query.
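A classic instance of this is "greatest-n-per-group": a one-sentence request ("latest login per user") that SQL can't express directly, needing a window function wrapped in a subquery (or a correlated subquery or self-join in older dialects). A minimal SQLite sketch (the `logins` table and its rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logins (user TEXT, ts INTEGER, ip TEXT);
INSERT INTO logins VALUES
  ('alice', 1, '10.0.0.1'), ('alice', 2, '10.0.0.2'),
  ('bob', 1, '10.0.0.3');
""")

# "Latest login per user": rank rows within each user by timestamp,
# then keep only the top-ranked row of each partition.
rows = conn.execute("""
    SELECT user, ip FROM (
        SELECT user, ip,
               ROW_NUMBER() OVER (PARTITION BY user ORDER BY ts DESC) AS rn
        FROM logins
    ) WHERE rn = 1
    ORDER BY user
""").fetchall()
print(rows)  # [('alice', '10.0.0.2'), ('bob', '10.0.0.3')]
```

Window functions were themselves bolted on long after SQL was standardized, which is exactly the kind of seam the comment above is describing.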


> adding a layer on top of SQL is necessarily less efficient

It's not if it's compile time


Adding a compile step to SQL is less efficient than not adding one. Whether it's workflow efficiency, simpler tooling, or actual runtime cost, a process that goes straight to SQL is necessarily more efficient than one that goes via something else and ends up with SQL.

That cost may be offset by other benefits, but that isn't obviously true in this case.





