The article says DON'T sanitize when putting it into the database. I think contextual escaping counts as "sanitizing input", so the solution of "don't try to sanitize input" is undermined.
If the user says his name is "Bob'; drop tables students --", that is what you should store on your database. Unless, of course it's not a valid name for the rest of the system.
That's so old and obvious advice that I'm surprised people keep posting here and upvoting. And even more surprised when people keep disagreeing here.
The confusion is what is input and what is output. The string "Bob'; drop tables students --" should not be sanitized/encoded on *input* to the application. However, if you're not using parameterized queries, it should be encoded on *output* to the database.
Data should only be sanitized in transit and not stored in an sanitized form. That's what the article is really saying.
I interpreted the message as not sanitizing inputs at the point they are received, a la PHP magic quotes. Instead, escape at the output (the output to the database engine).
The article doesn't explicitly say the words "outputting SQL to the database engine", but that's because the focus is on XSS attacks and the part about SQL injection is just an aside. Clearly it's what they were trying to imply with language like this:
> The only code that knows what characters are dangerous is the code that’s outputting in a given context. And of course use your SQL engine’s parameterized query features so it properly escapes variables when building SQL: ... This is sometimes called “contextual escaping”.
The "context" is that you are outputting to the database engine.
> your SQL engine’s parameterized query features so
> it properly escapes variables when building SQL
This is wrong. Parameterized queries do not build an SQL string by escaping the input. The input is actually sent to the database separately from the SQL.
Well, in all sane implementations, anyway. PHP has an PDO::ATTR_EMULATE_PREPARES option that does build SQL from a parameterized query. And, of course, Wordpress has $wpdb->prepare() that returns an SQL string with the parameter escaped. Also, so far as I know, one cannot run a prepared statement from the SQLite CLI, so no parameterized queries there either:
>This is wrong. Parameterized queries do not build an SQL string by escaping the input. The input is actually sent to the database separately from the SQL.
Your blanket observation is not necessarily true of all databases or database drivers. You found three counter-examples yourself, but there's no reason to not consider them "sane". It's not less correct than for databases that do support prepared statements in the driver protocol.
Sure, maybe it does not literally send a substituted SQL string, but in order to send the parameters "separately" from the query, do they not still eventually get concatenated into a single binary string of some form to be sent across the wire? In spirit I think the same arguments apply there, it's just that the format of the data is not strictly SQL. It's actually the wire format of the database protocol.
You are correct that the parameters go across the wire, obviously, but I've never heard of an attack in which the parameters caused any type of compromise in the wire protocol. I would highly appreciate examples if any exist.
It probably wouldn't result in an attack (unless you were dealing with a really sophisticated attacker), it's just necessary for correctness. Which is also true of all these examples: for example, people won't appreciate having backslashes wrongly inserted around legitimate characters of their names or other personal information, or having the software fail to process their request due to the characters in their name. It's not just a security concern.
In the general case there are certainly many examples of security vulnerabilities created by wrong serialization of data into the wire protocols of services, but maybe not specifically for this situation of query parameters. But maybe there are, I have no idea really. Either way, it's not the application developer's responsibility at that point, it's the responsibility of the people who developed the database driver.
For a long while, input sanitization in the web world was about modifying inputs to strip the problem areas. As such many consider escaping and sanitization to be completely different practices.
It seems like this article is using this differentiation. In my experience, it's very common. It's not worth arguing about.