How the XZ Backdoor Works

188 points
smitelli 8 hours ago

> The file, supposedly a corrupted XZ file, is actually a valid XZ stream with some bytes swapped — for example, 0x20 is swapped with occurrences of 0x09 and vice versa.

My under-caffeinated brain can't unsee that as a silly reference to the old tabs vs. spaces squabble. Maybe it's simply a rational obfuscation choice, or maybe it's indeed a masterstroke of satire.

acdha 7 hours ago

I think it’s a very shrewd use of that history: their goal was to get a potential reviewer to ignore it, and pretty much everyone qualified to look at that code has seen those specific values often enough to tune them out as more tabs-vs-spaces noise, or perhaps to assume the original test case started with a file corrupted by some whitespace-conversion process. That would be a very easy explanation to accept, since very similar things have legitimately happened to other projects.

Whoever did this shows signs of being very familiar with programming culture and I’d bet that this was another deliberate attempt to hide in the background noise.

bonyt 8 hours ago

Wait, you're right, 0x20 is space and 0x09 is ht/tab... they've swapped spaces for tabs...

pdpi 7 hours ago

The payload is supposedly a corrupted xz file. An overzealous "replace spaces with tabs" formatter could easily cause this sort of corruption, so it seems like a perfectly reasonable test case.

y_xc 8 hours ago

"The exploit was caught promptly, so almost no users were affected. Debian sid, Fedora Rawhide, the Fedora 40 beta, openSUSE Tumbleweed, and Kali Linux all briefly shipped the compromised package."

One thing that is noteworthy is that Kali Linux integrated the backdoor briefly.

So we should have a discussion about the security of distros' packaging speed. Is it more secure to integrate all patches as fast as possible, or is it more secure to wait a bit until they have been tested elsewhere? The "elsewhere" would then catch the bad effects. Integrating security patches as fast as possible could be very costly.

hypeatei 8 hours ago

It probably depends on what's being updated in the new version. If it's a vulnerability fix, then fast is good. If not, then waiting a little bit is ideal.

Not sure how that would work in an automatic way without some sort of tagging system marking it as "critical" but that could be abused by bad actors too.

greggsy 7 hours ago

Most major distros include mechanisms to install security updates separately from other package updates, either via the package manager or through specific repositories.

jeltz 7 hours ago

Wasn't Debian testing also affected? That should have affected some people's PCs.

deng 5 hours ago

Yes, it was. And with that, for instance, also all Docker images that derive from debian:testing, which is not that uncommon.

ajross 7 hours ago

> Is it more secure to integrate all patches as fast as possible, or is it more secure to wait a bit until they have been tested elsewhere?

That's a chicken/egg argument. Someone needs to be "elsewhere". Integration testing is absolutely part of security defense in depth architecture. In point of fact this was caught by running Debian testing. Delaying inclusion would have delayed discovery.

y_xc 7 hours ago

Exactly, that is my fear too. It's like zero trust in everything:

- Don't trust that any other party does the security testing for you
- Don't trust any developer; they could have malicious intent
- Don't trust the patches; they could introduce bugs/backdoors/...

So we're straight going in the direction of protectionism and isolation. Wouldn't that be a big step back for the whole OSS/FOSS community?

One thing I read in the xz write-ups over the last days (I think it was here) was that we at least shouldn't trust any devs/maintainers who have no online identity. Though the counterargument stands that you could create a fake identity anyway. So I'm still stuck on the question: what are the lessons learned for the community from this, and how restrictive will participation in OSS become?

ajross 6 hours ago

I don't think that's quite right: Trust the community. Trust the process. Sure, any one actor might be compromised, but that will be rare and exceptional. So build layers such that their bad actions get detected and corrected.

Some parts of that apparatus worked poorly in this case, it's true: xz had a single maintainer who turned out to be susceptible to a deliberate social engineering attack, and yet was trusted to be linked into some of the highest-trust parts of the system. That's bad, and we should work to avoid that kind of situation in the future.

But other bits worked very well: multiple downstreams detected the flaw (though only Andres saw it for the attack it really was), and it was corrected before it reached any production releases. The community as a whole is set up in the right way and doing the right things.

Topfi 11 hours ago

Great summary, especially since I missed this yesterday. To be honest, it's far above my knowledge, so I'd just like to thank Andres Freund and everyone else for investing their time, effort, and knowledge into keeping all of us secure.

bsmartt 9 hours ago

> To be honest, it's far above my knowledge

Just means it's an even better opportunity to learn!

coleca 8 hours ago

This may be a dumb question, but is law enforcement investigating this? Is it even technically a crime?

EasyMark 2 hours ago

Doubtful that law enforcement is, but you can bet the CIA, NSA, and SS are looking into it, hoping to find a thread to pull on the sweater.

lambersley 7 hours ago

In Canada, it would fall under a number of federal laws (Criminal Code):

1. Unauthorized use: 342(1)
2. Mischief in Relation to Data: 430(1.1)
3. Interception of private communications: 184(1)
4. Deceit/fraud: 380(1)

H8crilA 8 hours ago

Of course it is a crime. This is in fact more than a crime, it's a counter-intelligence problem, even if done by a non-state actor.

brookst 7 hours ago

I’m not sure that all counter-espionage problems are necessarily crimes, in the sense that a specific law was violated.

alickz 7 hours ago

What would the crime be? Misuse of computers? Espionage?

Curious as to the legal angle of it

PennRobotics 6 hours ago

just a guess: Illegal Electronic Surveillance

more of a guess from the below link?

18 U.S.C. § 2512, which prohibits the manufacture, possession, advertisement, sale, and transportation in interstate or foreign commerce of devices that are primarily useful for the surreptitious interception of communications

(although is this a hardware-specific prohibition?)

H8crilA 7 hours ago

This is a ChatGPT-level question :). Pasting the GPT-4 response:

If someone is caught installing a backdoor into a software library such as libxz, particularly one that interacts with a secure communication protocol like OpenSSH, they could be charged with several offenses under United States law. The specific charges would depend on the details of the case, but here are some possibilities:

1. Computer Fraud and Abuse Act (CFAA) Violations: The CFAA is the primary federal law in the U.S. for computer crime. It prohibits a variety of different types of computer-related activities, including unauthorized access to a computer system, causing damage to a computer system, trafficking in passwords or similar information, and more. A person who installs a backdoor could be charged with unauthorized access and/or causing damage.

2. Wire Fraud: If the backdoor was used to obtain sensitive information or to cause harm, the person could be charged with wire fraud. This is a federal crime that involves using interstate wire communications to carry out a fraudulent scheme.

3. Identity Theft: If the backdoor was used to steal personal identifying information, the person could be charged with identity theft.

4. Economic Espionage Act (EEA) Violations: If the backdoor was used to steal trade secrets, the person could be charged under the EEA.

5. National Stolen Property Act (NSPA) Violations: If the backdoor was used to steal data or other "property," the person could be charged under the NSPA.

6. The USA PATRIOT Act: If the backdoor was used in a way that could be considered "cyberterrorism," such as causing harm to a critical infrastructure system, the person could be charged under the USA PATRIOT Act.

It's also worth noting that if the person was working on behalf of a foreign government or organization, they could be charged with additional crimes, such as espionage.

Keep in mind that this is a complex legal issue, and the specific charges would depend on the details of the case. If you're dealing with a situation like this in real life, you should consult with a legal professional.

magic_hamster 7 hours ago

This is 100% a state actor. We can also kind of narrow down who.

greggsy 7 hours ago

Based on the Chinese-sounding name alone? They also used two other sock puppet accounts that sound Indian and Anglo:

VHRanger 6 hours ago

The Chinese name may be a red herring, as it mixes Mandarin and Cantonese names.

jeltz 7 hours ago

And a Scandinavian and a Russian sock puppet too.

shaky-carrousel 7 hours ago

The author's name may be a decoy. I'd have done that.

coffeeaddicted 6 hours ago

As someone on reddit mentioned yesterday "Jia Cheong Tan" is an anagram of "CIA Agent John". Which may be accidental or a funny pun by the backdoor coder.

fl7305 7 hours ago

That would be far from unlikely.

But have we seen anything that would require more than a very smart individual with some time on his hands?

throwaway4good 9 hours ago

Is the original git repository still available somewhere?

deng 8 hours ago

Note the malicious m4 build scripts were not checked into git, but only put into the released tarballs. You can see the original content here:

throwaway4good 6 hours ago

The test file containing the backdoor seems to be here:

deng 5 hours ago

Yes, the binary "test files" were checked in. But the code that actually decompresses and executes the shellcode in that file is in this m4 script, which only exists in the tar archive:

somemisopaste 11 hours ago

> (...) the backdoor adds an audit hook. The dynamic linker calls all the registered audit hooks when it is resolving a symbol.

How was this possible without also modifying the LD_AUDIT var? Haven't seen that mentioned yet, or perhaps I'm missing something.

rwmj 11 hours ago

When you're running inside the binary you can do mostly whatever you want, especially in this case, where the backdoor could run before mprotect(2) has been used to write-protect critical structures like the GOT and PLT (not that that is watertight either).

tgv 11 hours ago

I guess this is the answer (from

> Symbols of type STT_GNU_IFUNC (GNU-specific extension) are treated differently from normal symbols. Such IFUNC symbols point to the resolver function, and all calls to such functions are delayed until runtime.

nwellnhof 10 hours ago

It's probably as easy as modifying "extern struct rtld_global_ro _rtld_global_ro", exported from ld-linux, the dynamic linker/loader. During IFUNC resolution this struct seems to be writable.

somemisopaste 10 hours ago

So in other words LD_AUDIT is useless? If it's that easy to overwrite the GOT I fail to see the purpose in audit functionality.

INTPenis 9 hours ago

OpenWRT also uploaded backdoored packages.

zekica 8 hours ago

Not really: they updated to 5.6.x but used code from the git tag, not from the tarball, and only in their snapshot (not stable) version. The git repository never contained build-to-host.m4, so the injected build step was never executed and no backdoored library was produced.

H8crilA 10 hours ago

Do we have any idea at all who has done it? From what I've read there's no credible attribution in public sources.

npteljes 8 hours ago

I read an analysis that creates a profile, but that's as far as public information goes at the moment.

kencausey 10 hours ago

No. I'm sure it is being looked into by multiple parties.

bjoli 9 hours ago

Pretty much this. Some of the actors that would be affected by this are as large as can be.

Sshd is used everywhere.

1oooqooq 8 hours ago

Full of "is still being investigated".

Yet another summary of the initial findings. So, clickbait.

Jgrubb 8 hours ago

First summary I’ve bothered to click on so mission accomplished, and I learned a lot.

1vuio0pswjnm7 13 hours ago

If the backdoor relies on IFUNC then it would not work on distributions using musl. Like the one I use. Always amusing when HN commenters try to belittle use of musl and promote use of glibc. I like musl. I like choice.

1vuio0pswjnm7 22 minutes ago

Linux is not the only system I use. It is just one option. I compile sshd from source. It's not always OpenSSH, but when it is, it does not include any patches for systemd. I like choice.

fl0ki 9 hours ago

A lot of people who don't like musl just prefer not to have the slowest memory allocator on the planet, especially when multiple threads are involved, which they usually are in modern code.

I've watched release-optimized builds with musl run orders of magnitude slower than unoptimized builds with mimalloc, and this was on only 20 cores, it wasn't exactly pushing the envelope of big iron scaling.

Even after waiting years for its mallocng, which is better, it is still the slowest. It's no longer just that it wasn't a priority, it's inherent fallout from musl's practice of reinventing everything without learning nearly enough about why other implementations were built that way in the first place.

With memory allocators there were several legendary implementations to learn from. My pick would be mimalloc because it's not just fast, it's also hardened, which seems relevant in a thread about security.

At least once you know this, you can substitute the allocator for your program even if you otherwise use musl. That's common practice when producing static binaries from Rust code.

The problem remains that far too many people just use musl without actually benchmarking their program under a real workload to see just how much they've sacrificed in return for, well, what exactly, because it also wasn't hardening.

_joel 8 hours ago

Agreed, it's not an insignificant addition to the time of a pipeline run (in our use case) and adds up if you're running a good number of them. Spent a bit of time porting some images to Alpine and it wasn't really worth the effort. If you're restricted by device resources, which is what it was created for I guess, then fine.

probably_wrong 9 hours ago

If you really want to bait HN you should point out instead that this backdoor exclusively affected distros using systemd.

dist-epoch 9 hours ago

Hey, you should put a trigger-warning on that

lifthrasiir 13 hours ago

IFUNC was probably chosen because most systems run glibc anyway, so that is enough for a widespread vulnerability. The exploit would have used another mechanism tailored for musl if that were the majority instead.

adql 9 hours ago

The backdoor relied more on compromising the supply chain than on any particular glibc feature. And note that the notify support in sshd wasn't implemented directly but via libsystemd, which is what dragged the liblzma dependency into sshd.

deng 8 hours ago

Yes, using a niche system is a good way to avoid getting hacked, but then why would you use Linux at all? Looks like you'd be happier with OpenBSD, or maybe Plan 9?

realusername 11 hours ago

xz is bundled in millions of combinations of OS/architecture/tooling/config; even the attacker had to make a choice.

SV_BubbleTime 6 hours ago

Lends some credence to the idea that they had a target or series of targets in mind when deploying this seemingly general attack vector.

dmitrygr 9 hours ago

I do now wonder if systemd itself is part of a similar long game by someone. It is part of everything. One compromise there and it is game over.

It has all the signs:

- Replaces old perfectly working systems? Check

- Large and inscrutable? Check

- Touches a lot of things an init system has no business touching? Check

- Pushed in rather suddenly, in a coordinated fashion, and against much reasonable opposition? Check

npteljes 8 hours ago

>One compromise there and it is game over.

Consider that every agency like the NSA, and private groups like the NSO Group (authors of the Pegasus spyware) already sit on a pile of unpatched vulnerabilities, for various systems. If one vuln in systemd is game over, then the game is already over. And if what we currently experience is the game over state, then it's manageable.

fl0ki 6 hours ago

It looks manageable to most of us because we don't know who was targeted and what would have happened if they weren't.

We don't even know what whistleblowers were caught before they could publish their information, and we don't know what information that would have been.

Imagine if nobody ever leaked what Edward Snowden did. Imagine that something similarly important, anywhere in the world, has not been leaked because of successfully exploited vulnerabilities.

npteljes 6 hours ago

I'd rather have a safer world for the software, and for the people especially. What I wanted to reflect on is just OP's assertion that if bad thing X happens, it's game over. It's not, I think, because if it were, the game would have been over long ago. So now, if we find a vulnerability in systemd, we'll just fix it and put it out with the rest of the fire.

andrewshadura 9 hours ago

Same old arguments debunked many times already.

1. Old systems it replaced were difficult to use and maintain.

2. It is in fact modular and the code is actually easy to understand. When I ran into a bug, I managed to find and fix it.

3. Systemd itself replaces init and service management (initscripts etc.), which is squarely in scope. Other bits and pieces, like timers and resource management, are also in scope. Other things, like network management, are separate components that are merely developed under the umbrella of the systemd project.

4. Systemd was not pushed suddenly; it was many years in the making, and it became gradually adopted when it was actually ready for production use.

swader999 9 hours ago

Real-time sophisticated astroturfing system in place to stifle any criticism? Check.

Just kidding, I think the defense given holds up.