The article says DON'T sanitize when putting it into the database. I think conte...

marcosdumay · on Jan 13, 2022

If the user says his name is "Bob'; drop tables students --", that is what you should store in your database. Unless, of course, it's not a valid name for the rest of the system.

That's such old and obvious advice that I'm surprised people keep posting it here and upvoting it. And I'm even more surprised when people keep disagreeing with it.

ehutch79 · on Jan 13, 2022

If you're storing "Bob'; drop tables students --" in the database, you had to have sanitized your inputs, or there would be no students table.

The article title says NOT to sanitize inputs. Perhaps it's that nuance doesn't fit in a headline, but eh...

wvenable · on Jan 13, 2022

The confusion is what is input and what is output. The string "Bob'; drop tables students --" should not be sanitized/encoded on *input* to the application. However, if you're not using parameterized queries, it should be encoded on *output* to the database.

Data should only be sanitized in transit and not stored in a sanitized form. That's what the article is really saying.
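A minimal sketch of that idea in Python, assuming only the standard library: the raw string is kept untouched, and each output context applies its own encoding at the moment of output. The variable names are made up for illustration.

```python
import html
import json

# The value exactly as the user typed it; this is what gets stored.
raw_name = "Bob'; drop tables students --"

# Encode only when writing into a specific context.
as_html = "<td>" + html.escape(raw_name) + "</td>"   # HTML context
as_json = json.dumps({"name": raw_name})              # JSON context

# Both encodings are reversible, so the stored value never had to change.
round_trip_html = html.unescape(as_html[len("<td>"):-len("</td>")])
round_trip_json = json.loads(as_json)["name"]
```

The stored value is the same in both cases; only the encoder differs per destination.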

IshKebab · on Jan 13, 2022

No, you don't. You use a parameterized query: execute("INSERT INTO foo VALUES (?)", user_input)
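For anyone who wants to try it, here is a small runnable sketch using Python's built-in sqlite3 module. The students table and its name column are assumptions for the demo, not something from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")  # hypothetical demo table

user_input = "Bob'; drop tables students --"

# The value is passed as a bound parameter, not spliced into the SQL text.
conn.execute("INSERT INTO students (name) VALUES (?)", (user_input,))

# Read it back: the string is stored verbatim and the table still exists.
stored = conn.execute("SELECT name FROM students").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```

No escaping or stripping happens anywhere in application code, and the name comes back byte for byte.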

shawnz · on Jan 13, 2022

I interpreted the message as not sanitizing inputs at the point they are received, a la PHP magic quotes. Instead, escape at the output (the output to the database engine).

marcosdumay · on Jan 13, 2022

> a la PHP magic quotes

To this day, the official way to deal with XSS in .NET is by doing sanitization at the receiving point. I imagine the article is directed at that.

shawnz · on Jan 13, 2022

That sounds pretty terrible. Do you have an example of some docs that demonstrate that practice?

gkoberger · on Jan 13, 2022

Nowhere in the article do they use "output" to mean output to the database engine; they use it to mean "outputting HTML".

shawnz · on Jan 13, 2022

The article doesn't explicitly say the words "outputting SQL to the database engine", but that's because the focus is on XSS attacks and the part about SQL injection is just an aside. Clearly it's what they were trying to imply with language like this:

> The only code that knows what characters are dangerous is the code that’s outputting in a given context. And of course use your SQL engine’s parameterized query features so it properly escapes variables when building SQL: ... This is sometimes called “contextual escaping”.

The "context" is that you are outputting to the database engine.

dotancohen · on Jan 13, 2022

  > your SQL engine’s parameterized query features so
  > it properly escapes variables when building SQL

This is wrong. Parameterized queries do not build an SQL string by escaping the input. The input is actually sent to the database separately from the SQL.

Well, in all sane implementations, anyway. PHP has a PDO::ATTR_EMULATE_PREPARES option that does build SQL from a parameterized query. And, of course, WordPress has $wpdb->prepare() that returns an SQL string with the parameter escaped. Also, so far as I know, one cannot run a prepared statement from the SQLite CLI, so no parameterized queries there either:

https://stackoverflow.com/questions/20065990/how-to-prepare-...

Arnavion · on Jan 13, 2022

>This is wrong. Parameterized queries do not build an SQL string by escaping the input. The input is actually sent to the database separately from the SQL.

Your blanket observation is not necessarily true of all databases or database drivers. You found three counter-examples yourself, but there's no reason not to consider them "sane". Building the SQL string in the driver is no less correct than what happens for databases that do support prepared statements in the driver protocol.

shawnz · on Jan 13, 2022

Sure, maybe it does not literally send a substituted SQL string, but in order to send the parameters "separately" from the query, do they not still eventually get concatenated into a single binary string of some form to be sent across the wire? In spirit I think the same arguments apply there, it's just that the format of the data is not strictly SQL. It's actually the wire format of the database protocol.
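To make that concrete, here is a toy sketch in Python. The framing is entirely hypothetical (a field count, then each field as a 4-byte length plus UTF-8 bytes) and is not the wire format of any real database. It only illustrates that the query and the parameters can share one binary string without the parameter ever being parsed as SQL.

```python
import struct

def encode_frame(query, params):
    """Toy framing: field count, then length-prefixed UTF-8 fields."""
    fields = [query.encode("utf-8")] + [p.encode("utf-8") for p in params]
    out = struct.pack(">I", len(fields))
    for field in fields:
        # The length prefix marks where the field ends, so no escaping is needed.
        out += struct.pack(">I", len(field)) + field
    return out

def decode_frame(data):
    """Inverse of encode_frame: returns (query, params)."""
    (count,) = struct.unpack_from(">I", data, 0)
    offset = 4
    fields = []
    for _ in range(count):
        (size,) = struct.unpack_from(">I", data, offset)
        offset += 4
        fields.append(data[offset:offset + size].decode("utf-8"))
        offset += size
    return fields[0], fields[1:]

frame = encode_frame("INSERT INTO foo VALUES (?)", ["Bob'; drop tables students --"])
query, params = decode_frame(frame)
```

The serialization still has to be correct for its own format, which is the same point in a different context.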

dotancohen · on Jan 13, 2022

You are correct that the parameters go across the wire, obviously, but I've never heard of an attack in which the parameters caused any type of compromise in the wire protocol. I would highly appreciate examples if any exist.

shawnz · on Jan 13, 2022

It probably wouldn't result in an attack (unless you were dealing with a really sophisticated attacker), it's just necessary for correctness. Which is also true of all these examples: for example, people won't appreciate having backslashes wrongly inserted around legitimate characters of their names or other personal information, or having the software fail to process their request due to the characters in their name. It's not just a security concern.

In the general case there are certainly many examples of security vulnerabilities created by wrong serialization of data into the wire protocols of services, but maybe not specifically for this situation of query parameters. But maybe there are, I have no idea really. Either way, it's not the application developer's responsibility at that point, it's the responsibility of the people who developed the database driver.
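The backslash problem is easy to reproduce. In this Python sketch, add_slashes is a hypothetical stand-in for input-time escaping in the style of PHP magic quotes, and the people table is made up for the demo. Both rows go through a parameterized query; only the first was "sanitized" on the way in.

```python
import sqlite3

def add_slashes(value):
    """Hypothetical input-time escaping: backslash-escape \\, ' and "."""
    for ch in ("\\", "'", '"'):  # backslash first, to avoid double escaping
        value = value.replace(ch, "\\" + ch)
    return value

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")  # hypothetical demo table

name = "O'Brien"
# Escaped at input, then stored: the backslash becomes part of the data.
conn.execute("INSERT INTO people (name) VALUES (?)", (add_slashes(name),))
# Stored raw: the name survives unchanged.
conn.execute("INSERT INTO people (name) VALUES (?)", (name,))

rows = [r[0] for r in conn.execute("SELECT name FROM people ORDER BY rowid")]
```

The first row now contains a stray backslash that every later reader of the data has to know about and undo, which is a correctness bug even when nobody is attacking anything.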

bcrosby95 · on Jan 13, 2022

For a long while, input sanitization in the web world was about modifying inputs to strip the problem areas. As such, many consider escaping and sanitization to be completely different practices.

It seems like this article is using this differentiation. In my experience, it's very common. It's not worth arguing about.