It takes a lot more than that. LINQ providers work by accepting a LINQ expression tree instead of an opaque function, which lets the provider inspect and traverse the expression's AST and translate it into a query for the data source it targets.
This Expression AST is constructed by the compiler, not something that can be tacked on by a library later.
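To make the distinction concrete, here is a minimal sketch of what a provider sees when the compiler hands it an expression tree instead of a delegate. `TinySqlTranslator` and `Person` are made-up names for illustration, the translator only handles `&&`, `==`, `>` and constants, and it inlines values where a real provider would emit parameters.

```cs
using System;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class Person
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public int Age { get; set; }
}

// Hypothetical translator: walks the compiler-built tree and emits a
// SQL-like WHERE clause for a very small subset of C# expressions.
public static class TinySqlTranslator
{
    // Because the parameter type is Expression<Func<T, bool>>, the compiler
    // passes the lambda as data (a tree) rather than as compiled code.
    public static string ToWhere<T>(Expression<Func<T, bool>> predicate)
        => Visit(predicate.Body);

    private static string Visit(Expression e) => e switch
    {
        BinaryExpression { NodeType: ExpressionType.AndAlso } b
            => "(" + Visit(b.Left) + " AND " + Visit(b.Right) + ")",
        BinaryExpression { NodeType: ExpressionType.Equal } b
            => "(" + Visit(b.Left) + " = " + Visit(b.Right) + ")",
        BinaryExpression { NodeType: ExpressionType.GreaterThan } b
            => "(" + Visit(b.Left) + " > " + Visit(b.Right) + ")",
        // A member read directly off the lambda parameter is a column reference.
        MemberExpression { Expression: ParameterExpression } m => m.Member.Name,
        // String literal: quote it and double any embedded single quotes.
        ConstantExpression { Value: string s } => "'" + s.Replace("'", "''") + "'",
        ConstantExpression c => c.Value?.ToString() ?? "NULL",
        // Anything else (method calls, captured variables, ...) is not handled.
        _ => throw new NotSupportedException("Cannot translate: " + e)
    };
}
```

Calling `TinySqlTranslator.ToWhere<Person>(p => p.LastName == "B" && p.Age > 18)` yields `((LastName = 'B') AND (Age > 18))`. Anything outside the handled subset, such as `p.FirstName.StartsWith("A")`, throws `NotSupportedException` at run time, since the tree only gets inspected when the query is built.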
Yes, but I think the point is practically every high level language can already do this pretty trivially.
If it's scripted you can typically just get a string representation of the function.
If it's Java, JAR inspection/dynamics have been a thing for a long time. Other languages usually support metaprogramming directly (like Rust) and let you plug code into the compilation logic.
If it were trivial you'd see LINQ-like providers implemented in "practically every high level language".
Working from the function's source code means you have to ship a lexer/parser to convert it into a usable AST, which is bad for both runtime performance and library size.
I very much doubt this is available in Java. Which Java ORM lets you use native Java language expression syntax to query a database?
You're replying to a thread about what it takes to implement a LINQ provider, which was dismissed with the claim that every high-level language implements it with iterables, and then you proceed to give non-equivalent examples.
IQueryable<> manipulation has tools available to it beyond the brute-force iteration that streams do. Streams may be the closest thing Java has, but they're still a fundamentally different thing.
Wait what? Am I gonna include a source code parser and AST analyser in my JavaScript library, for example, to examine the provided expression source and do this? This reads like the infamous Dropbox comment from when it first got released.
You could also bundle your JS. Or pretend like any number of other solutions like caching parsed ASTs exist instead of being as obtuse as possible, or something idk
Having used it since its inception, I've come to the conclusion that the SQL translator is kind of a misfeature. It creates so many weird bugs and edge-cases and tedium.
I love LINQ, I love having a typesafe ORM as a standard feature of C#, but the convenience of being able to reuse my POCOs and some expressions for both in-memory and in-SQL doesn't outweigh the downsides.
If I were designing SQL/LINQ today, I'd keep the in-memory record classes and in-database record classes distinct and use some kind of codegen/automapping framework for keeping them synched up. Maybe allow predicate operators to return things other than booleans so we could make `a == b` return some kind of expression tree node.
For ad-hoc queries using anonymous classes? Support defining an interface inline in a generic so you can say
    public T MyQuery<interface { string Firstname { get; set; }; string Lastname { get; set; } } T>();
To elaborate, suppose you were doing some kind of JSON-based codegen (alternatively, you could have a separate hand-written POCO Model assembly and use reflection against it to generate your DbModel classes, so it's still Code First). Yes, I know MS tried and abandoned this approach; I used LinqToSQL and EF3.5 and whatnot and suffered all that pain.
Your master datatable file would be something like:
```cs
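// Sketch only: DbRecord, DbInt, DbNVarChar, DbModel, DbTable<T>, DbQuery<T>, AutoGenerated<T>,
// ThisIsAFakeClassException and DoSomeLazyStuff() are hypothetical framework pieces.
using System;
using System.Collections.Generic;
using DataRecordsNamespace;
using PocosNamespace;

namespace DataRecordsNamespace {
    // Query-only description of the table; never meant to be instantiated.
    public class DbPerson : DbRecord {
        public DbPerson() { throw new ThisIsAFakeClassException(); }
        public DbInt PKID {
            get => throw new ThisIsAFakeClassException();
            set => throw new ThisIsAFakeClassException();
        }
        // Property name assumed; LastName would follow the same pattern.
        public DbNVarChar FirstName {
            get => throw new ThisIsAFakeClassException();
            set => throw new ThisIsAFakeClassException();
        }
    }
}
namespace PocosNamespace {
    // Plain in-memory record that query results are mapped into.
    public partial class Person {
        public AutoGenerated<int> PKID { get; init; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
}
public class MyDbModel : DbModel {
    public DbTable<DbPerson> Persons => DoSomeLazyStuff();
}
public static class MyDbContextExtensions {
    public static List<Person> Resolve(this DbQuery<DbPerson> dbPersons)
    {
        // Call code to execute the actual query and map the rows to Person.
        throw new NotImplementedException();
    }
}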
```
Am I making sense? Then you wouldn't have the problem of "oops I used an untranslatable method or member of Person", because MyDbModel can't have any of those. You'd lose the ability to switch a query between in-memory and in-database just by removing the ToList(), but I'd argue that's a misfeature, and better handled by having some kind of InMemory implementation. Having DbQuery expose a simple `.ToLocalMemory()` function, as a hint that the next part should be done locally instead of in the database, would be a better way to do that. Then you could still do
Guess everyone has their preferred style, I personally avoid code-gen data models like the plague and much prefer code-first libraries.
Here's how you'd do something similar in our OrmLite ORM [1]:
    public class Person
    {
        [AutoIncrement]
        public int Id { get; set; }
        public string? FirstName { get; set; }
        [Required]
        public string LastName { get; set; }
    }

Create Table:

    var db = dbFactory.Open();   // Resolve ADO.NET IDbConnection
    db.CreateTable<Person>();    // Create RDBMS Table from POCO definition

Execute Query:

    // Performs SQL Query on Server that's returned in a List<Person>
    var results = db.Select<Person>(x => x.FirstName.StartsWith("A") && x.LastName == "B");

    // Use LINQ to further transform an In Memory collection
    var to = results.Where(MemoryFilter).OrderBy(MemorySort).ToList();

Everything works off the POCO, no other external tools, manual configuration mapping, or code gen needed.
This would fail at run-time instead of compile-time.
That's why I'd rather see the DB classes auto-generated with a mapper to convert them. Having the "master" be POCOs instead of JSON/XML/YAML/whatever isn't something I'm convinced on in either direction, but imho the in-database classes not being real POCOs is the important part, because it reduces the problem of somebody writing Person.MyMethod() and then blowing up because it's not a SQL function.
How would you perform this regex query with your code generated solution? What would have to be code generated and what would the developer have to write?
As there's a lot more features available in different RDBMS's than what's available in C# expression syntax, you can use SQL Fragments whenever you need to:
Yes, it's a trivial example. I'm not looking to support it, I'm looking to catch it at compile-time.
If "Person.FirstName" is a string, that encourages users to use string operations against it, which will fail when the expression is translated to SQL for execution in the DB.
If "Person.FirstName" is some other type with no meaningful operations supported on it (one that gets converted into a string when the query is executed), then it prevents many, many classes of logic errors.
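A rough sketch of that idea, for what it's worth: C# already lets an overloaded `==` return something other than `bool`, so a column type can hand back an expression node. `DbNVarChar` here is a hypothetical stand-in, `DbPredicate` is an invented node type, and values are inlined only to keep the example short (a real implementation would bind parameters).

```cs
using System;

// Hypothetical expression node produced by comparing a database column.
public sealed class DbPredicate
{
    public string Sql { get; }
    public DbPredicate(string sql) => Sql = sql;

    // & and | combine predicates; && and || cannot be overloaded directly.
    public static DbPredicate operator &(DbPredicate a, DbPredicate b)
        => new DbPredicate("(" + a.Sql + " AND " + b.Sql + ")");
    public static DbPredicate operator |(DbPredicate a, DbPredicate b)
        => new DbPredicate("(" + a.Sql + " OR " + b.Sql + ")");
}

// Equals/GetHashCode are intentionally not overridden in this sketch.
#pragma warning disable CS0660, CS0661
// Hypothetical column type: it deliberately exposes no string members
// (no Length, Substring, ToUpper, ...), only comparisons that build nodes.
public sealed class DbNVarChar
{
    private readonly string _column;
    public DbNVarChar(string column) => _column = column;

    private static string Quote(string value) => "'" + value.Replace("'", "''") + "'";

    // == and != return a DbPredicate node instead of a bool.
    public static DbPredicate operator ==(DbNVarChar column, string value)
        => new DbPredicate(column._column + " = " + Quote(value));
    public static DbPredicate operator !=(DbNVarChar column, string value)
        => new DbPredicate(column._column + " <> " + Quote(value));
}
#pragma warning restore CS0660, CS0661
```

With `var firstName = new DbNVarChar("FirstName");` and `var lastName = new DbNVarChar("LastName");`, the expression `(firstName == "A") & (lastName != "B")` produces a `DbPredicate` whose `Sql` is `(FirstName = 'A' AND LastName <> 'B')`, while `firstName.ToUpper()` or `firstName.Length` simply doesn't compile because the type has no such members.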
Saw EF now supports custom SQL queries, so been considering that once we've moved to MSSQL (old db server isn't supported by EF).
We're quite accustomed to writing our own SQL select statements and would like to continue doing that to have known performance, but the update, insert and delete statements are a chore to do manually, especially once you're 4-5 parent-child levels deep.
We're not doing that where we come from. All child tables have the main id so we can load the data for all child rows with just one query per child table, and we load everything at once.
We were planning on sticking with this, it has worked well so far, but good to know to avoid getting tempted by the alternative.
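For anyone curious what that pattern looks like in code, here is a minimal sketch under assumed names: `LineRow`, `OrderId` and `ChildLoader` are invented for the example, and the rows are taken to have come from a single query against the child table filtered by the root ID.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical child row: it carries the root ID (OrderId) even though its
// direct parent might sit several levels further down the hierarchy.
public sealed class LineRow
{
    public int OrderId { get; set; }
    public int LineId { get; set; }
    public string Sku { get; set; } = "";
}

public static class ChildLoader
{
    // Indexes rows that were already fetched with one query per child table
    // by their root ID, so attaching children to each root is an in-memory
    // lookup rather than another round trip to the database.
    public static ILookup<int, T> ByRoot<T>(IEnumerable<T> rows, Func<T, int> rootId)
        => rows.ToLookup(rootId);
}
```

`ChildLoader.ByRoot(lines, l => l.OrderId)[42]` then returns the lines for root 42, or an empty sequence if there are none.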