
Even more evil would have been to replace this line

    (void)SYS_landlock_create_ruleset;
with this:

    (void)SYS_landloсk_create_ruleset;


For those squinting: the regular "c" in "landlock" is replaced with a Cyrillic "с" (U+0441).


Yip, this was impressive. Only a copy & paste into VSCode (Windows) revealed the rectangle around the "c" in the second line.

Impressive.


Is there a GCC option to error on non-standard English characters?


Disclosure: I've got zero C/C++ on my resume. I was asked to diagnose a kernel panic and backport kernel security patches once, but it was very uncomfortable. ("Hey, Terr knows the build system, that's close enough, right?")

That said, perhaps something like disabling the default -fextended-identifiers [0], and enabling the -Wbidi-chars [1] warning.

[0] https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html...

[1] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#inde...
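A minimal sketch of the first of those flags in action, using a made-up evil.c containing the Cyrillic homoglyph from upthread (the exact diagnostic wording, column numbers, and follow-on syntax errors vary by GCC version):

    $ cat evil.c
    int SYS_landloсk_create_ruleset;  /* the 'с' here is Cyrillic U+0441 */

    $ gcc -fno-extended-identifiers -c evil.c
    evil.c:1:15: error: stray '\321' in program
    evil.c:1:15: error: stray '\201' in program

Note that -Wbidi-chars targets a different problem: the invisible bidirectional control characters used in the Trojan Source attack, not homoglyphs like this one.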


Cool, the latter was added to fix CVE-2021-42574: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103026


Not yet. I was working on the standardization for C23, but this was postponed to C26, and it doesn't have much support. MSVC and SDCC liked it; Clang and GCC not so much.

Here is my bad variant of the feature, via confusables: https://github.com/rurban/gcc/tree/homoglyph-pr103027

The better variant would be to use my libu8ident, following UTR 39. I only did that for binutils.


Erroring is the point here, and it's what the period achieved. It's a feature-detection snippet, so if it fails to compile, the feature is disabled.
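For context, the sabotaged check has roughly this shape; a hedged sketch of a CMake check_c_source_compiles probe, not the literal xz CMakeLists.txt:

    include(CheckCSourceCompiles)
    check_c_source_compiles("
        #include <linux/landlock.h>
        #include <sys/syscall.h>
        #include <unistd.h>
        int main(void) {
            (void)SYS_landlock_create_ruleset;
            return 0;
        }
    " HAVE_LINUX_LANDLOCK)
    # Any compile failure at all -- a missing header, an unknown syscall
    # number, or a smuggled stray '.' -- leaves HAVE_LINUX_LANDLOCK unset,
    # and the build silently proceeds without the Landlock sandbox.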


It seems like there should be a way to catch these types of "bugs": some form of dynamic analysis tool that extracts the feature-detection code snippets and tries to compile them; if one fails with something like a syntax error, flag it as a broken check.

Expanding macros on different OSes could complicate things, though, as would determining what flags to build the feature-check code with; so perhaps filtering based on the type of error would be best done as part of the build system's own feature-checking functionality.
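As a rough illustration of the filtering idea (the file name and the error-message heuristic here are made up, and the grep pattern is GCC-specific):

    # Hypothetical scanner pass: recompile a captured feature-check
    # snippet and separate "feature absent" failures (missing headers,
    # undeclared identifiers) from outright syntax errors.
    gcc -fsyntax-only check_landlock.c 2> err.log
    if grep -q "error: expected" err.log; then
        echo "suspicious: feature check fails with a syntax error"
    fi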


I'd prefer if the ecosystem standardized on some dependency-management primitives, so critical projects aren't expected to invent buggy and insecure hacks ("does this string parse?") in order to accurately add dependencies.


It would be interesting to see what the most common compile feature checks are for, and see what alternative ways could be used to make the same information available to a build system — it seems like any solution that requires libraries being updated to “export” information on the features they provide would have difficulties getting adoption (and not be backwards compatible with older versions of desired dependencies).


From my experience looking at Rust builds, Pareto applies here: most checks are trivial and used by a lot of projects, and a handful are super complex and niche.


> if they fail for something like a syntax error, flag it as a broken check.

A syntax error might be exactly what they're looking for, e.g. if they're feature-testing a new bit of syntax or a compiler extension.

> so perhaps filtering based on the type of error would be best done as part of the build system functionality for doing the feature checking.

Which would require every compiler to have detailed, consistent, and machine-readable failure reporting.


At least for newer C++ standards it seems like there is decent support for feature test macros, which could reduce the need for a feature check involving compiling a snippet of test code to decide if a feature is available: https://en.cppreference.com/w/cpp/feature_test
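For example, a sketch of the macro-based approach (assuming a C++20 toolchain, which is when the <version> header arrived):

    #include <version>  // exposes the standard __cpp_lib_* feature-test macros

    #ifdef __cpp_lib_format
    #include <format>   // std::format is available; no compile-the-snippet probe needed
    #else
    // fall back to snprintf or a third-party formatting library
    #endif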

Handling the output from the several most recent GCC and Clang versions would probably cover the majority of cases; add in MSVC for Windows. If the output isn't from a recognized compiler or doesn't match expectations, the only option is falling back to the current behavior. Not ideal, but better than the status quo…


That would only work for projects that care only about current compilers, whereas C in general has more desire to support niche compilers.

A mitigation here would be to make the result of autoconf only provide instructions for humans to change their build config, instead of doing it for them silently. The latter is an anti-pattern.

FWIW, the approach you propose is how the UI tests for rustc work, with checks for specific annotations on specific lines, but those have the benefit of being tied to a single implementation/version and modified in tandem with the app. Unless all compilers could be made to provide reasonable machine readable output for errors, doing that for this use case isn't workable.


Universal support of SARIF by compilers would be most of that.


Well - that would be exactly the point of the attacker. GCC errors out, and it does not matter whether this is because the intentionally typoed header does not exist or because non-English characters are not allowed. Errors encountered during compilation of test programs go to /dev/null anyway, as they are expected not to compile successfully on some systems - that's exactly the point of the test. So no, this option would not have helped.


Probably the code review tools should be hardened as well, to indicate when extended identifiers have been introduced on a line that previously had none. That would help catch the replacement of a 'c' character with a Russian one.

Btw, the -fno-extended-identifiers compiler parameter gives an error if UTF-8 identifiers are used in the code:

    <source>:3:11: error: stray '\317' in program
    float <U+03C9><U+2083> = 0.5f;


> Probably the code review tools should be hardened as well, to indicate when extended identifiers have been introduced on a line that previously had none.

Maybe in the future more languages/tools will have the concept of per-project character sets, as opposed to trying to wrangle all possible Unicode ambiguity problems.

I suppose then the problem is how to permit exceptions when integrating with some library written in another (human) language.


Or we could just accept English as the lingua franca of computing and not try to support anything other than ASCII in source code (at least not outside string constants). That way not only do we eliminate a whole class of possible exploits but also widen the number of people who can understand the code and spot issues.


The -fno-extended-identifiers option seems to do something in this area, though I don't know if it is sufficient. But it may block some characters which are used in standard English (for some values of "standard").


> But it may block some characters which are used in standard English

So what? Source code was fine with ASCII for a long time; this push for Unicode symbols is a recent endeavor and IMO a huge mistake, not just because of the security implications.


I mean, GCC erroring out is how the exploit works here. CMake tries to compile the source code: if it compiles, the feature is available; if it fails, the feature is not available. Forcing the check to fail is exactly what the attacker wants.


Putting random Unicode confusables in source code would be far easier to consider malicious


Except many people have a policy of ASCII only in source code and therefore would catch it immediately.


Another point where I appreciate the Rust compiler:

    warning: the usage of Script Group `Cyrillic` in this crate consists solely of mixed script confusables
     --> src/lib.rs:1:4
      |
    1 | fn SYS_landloсk_create_ruleset() {
      |    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      |
      = note: the usage includes 'с' (U+0441)
      = note: please recheck to make sure their usages are indeed what you want
      = note: `#[warn(mixed_script_confusables)]` on by default

https://play.rust-lang.org/?version=stable&mode=debug&editio...


The character was in a string, not directly in what was being compiled. The contents of the string failing to compile was the point, as landlock was then disabled.

From what I understand, this landlock disabling wasn't relevant to the sshd attack. It appears it was setting up for something else.


With the . the author can claim it was unintentional. It's impossible (or at least very hard) to claim the Cyrillic c is unintentional.

To me, that makes the . more evil.


Well, on Russian keyboards, they are on the same key.


Good point - it might be plausible if the author actually were Russian, but they aren't.


Or are they?


Even someone provably not Russian could just claim that they were learning the language (or a different one that also uses Cyrillic).


Plausible deniability


Also applies for c/с: the с (Cyrillic s) is on the same key as the Latin c on a Russian keyboard; the bottom row starts with zxc/ячс respectively.


After the concept of such attacks blew up, idk, 1? 2? 3? years ago, a lot of places now warn at least on mixed scripts/ranges, and sometimes on any usage of non-ASCII printable characters.

So stuff like that is increasingly reliably caught.

Including by third parties doing ad-hoc scanning.

Though "does this compile" checks failing with syntax errors should definitely be added to the list of auto-scan checks, at least in C/C++ (it's not something you would do in most languages, tbh; even in C it seems to be an antipattern).


Nearly every modern IDE and diff viewer instantly highlights this though? I doubt this would get far.


The compiler would perhaps "see" it?


Sure; but so long as the compiler error is silently discarded (as it is here), the configure script will assume landlock isn't available and never use it.


A misplaced punctuation mark has some plausible deniability. The author could say he was distracted and fat-fingered some nonsense; an honest mistake.

A Unicode character from a language that has zero chance of being mapped to your programmer's keyboard in the middle of a code line, though, would be obvious intentional tampering, or at the very least raise some eyebrows.


> A Unicode character from a language that has zero chance of being mapped to your programmer's keyboard in the middle of a code line, though, would be obvious intentional tampering, or at the very least raise some eyebrows.

I accidentally get a Cyrillic "с" in my code a few times a year. It's on the same key as Latin "c", so if I switch keyboard layouts, I don't notice until the compiler warns me. Easy to do if I switch between chats and coding. Now my habit is to always type a few characters in addition to "c" to check which layout I'm _really_ using.

Granted, it's easier with a one-letter variable called "c", but with longer names I can easily see myself not typing the "c" the first time (e.g. not pressing hard enough), starting a build, chatting in some windows, coming back, facepalming, and "fixing" the error by adding "c" or "с" depending on my keyboard layout.


Even worse, I have Punto Switcher, which automatically switches the language when I start typing. With the default config, it changes a Latin c to the Russian one, because Russian includes the word "с" while it's nonsense in English.

And since it's the only Cyrillic character that's placed on the same key as the same-looking English character, I don't even see the problem with my eyes when the autoreplacement fires.


I mean, I agree that the punctuation mishap is a better cover story, but why would any particular language have “zero chance” of being mapped to the programmer’s keyboard?


> why would any particular language have “zero chance” of being mapped to the programmer’s keyboard?

You left out a very important qualifier: in the middle of a code line

The chances of someone accidentally fat-fingering a keyboard layout switch, typing one character, then switching back again without noticing while typing a line of code are very slight indeed.

Plus the fact that you'd need to speak a language written in that alphabet to have a plausible reason for having the keyboard layout enabled in the first place.



