Reflections on Distrusting xz

286 points • 4 months ago • joeyh.name

irdc • 4 months ago

One thing that comes to mind is that “Jia Tan” might be more accurately seen as a “sleeper” of some sort: a foot soldier who infiltrates a juicy open source project and waits for further instructions; backdooring sshd might not have been part of the original plan.

Which raises the concerning question of how many more sleeper maintainers there are.

constantcrying • 4 months ago

>Which raises the concerning question of how many more sleeper maintainers there are.

Given how easy the infiltration is and how extremely hard to detect it is, likely a lot.

For an intelligence operation it is also extremely cheap: you just need a few knowledgeable developers spending some time each week on the project. The upside is a backdoor into a significant portion of infrastructure; the downside is wasted time.

I do not think it is unlikely that in many important open source software projects there are one or two people assigned to keep an eye on things. They don't even need to be malicious, just being somewhat trusted contributors is enough. I would be extremely surprised if the NSA hasn't a couple of guys who keep watch on the Linux Kernel.

moritonal • 4 months ago

The irony is that this would make an oddly effective way of having paid open source devs who for the most part just honestly improve projects. The massive downside is that they undermine it at a critical moment.

rightbyte • 4 months ago

Ye then maybe thanks to them finally we'll have The year of the air gapped Linux desktop.

I dunno what to do if e.g. Debian gets compromised, as in, I can't trust the collective of maintainers.

I assume any Windows machine is backdoored. Trivially proven by forced auto updates.

Maybe air gapping some home computer for sensitive data might be a good idea.

pants2 • 4 months ago

The ideal situation is having multiple intelligence agencies all working on one project and spotting each others' backdoors, so at the end of the day we just have a really secure and well-maintained project.

david_draco • 4 months ago

> Given how easy the infiltration is and how extremely hard to detect it is, likely a lot.

I read it exactly the other way around: the infiltration took years and detecting it was the default with a fuzzer, which had to be disabled for the exploit to succeed.

It speaks to the hardening and that hardening should be required more. And of course the precarious roles of maintainers, which have been discussed elsewhere.

dullcrisp • 4 months ago

The exploit could be detected with a fuzzer perhaps. The infiltration seems like it was something that would be very easy for a funded intelligence agency to do, and would be nigh undetectable if they had been subtly introducing bugs rather than shipping a sophisticated backdoor to every Linux distro.

You have to assume if this was an intelligence agency they didn’t burn their only agent like this.

HankB99 • 4 months ago

> they didn’t burn their only agent like this.

I think I'd characterize that as "their only identity." What's the chance that some number (>1) of actual agents were sharing this identity? If that's the case, I'd extrapolate that to multiple identities. In fact the social engineering to gain the trust of the original maintainer likely involved several identities.

I suspect that detectability of intentionally injected bugs would be very low.

geggo98 • 4 months ago

You are right, it took quite some time. On the other hand, it looks like the legitimate part of contributing to xz was only a part time job for the attacker. The rest of the time, they either worked on the exploits, or on other things, like infiltrating other projects using a different handle.

Basically I can imagine the attackers being a well organized group, using work sharing and pipelining. Some members of the group would be preparing exploits, some would infiltrate projects and some would make sure not to get caught. And since infiltrating takes time, they would make sure to have multiple projects in the pipeline, some in the early contributor stage, some in the social pressure stage, and some in the exploiting stage.

Thorrez • 4 months ago

>a fuzzer, which had to be disabled for the exploit to succeed.

According to this comment, the fuzzer wouldn't have detected it. It wasn't necessary to disable the fuzzer:

>https://news.ycombinator.com/item?id=39911249

jannes • 4 months ago

At this point I wouldn’t be surprised if the NSA also has a couple of Microsoft employees on their payroll.

Gormo • 4 months ago

> Given how easy the infiltration is and how extremely hard to detect it is, likely a lot.

This particular case seems to be an example of the exact opposite. It took "Jia Tan" two years to conduct all of the social engineering necessary to get into a position to introduce the backdoor, then upon doing so, was caught almost immediately, with the initial discovery not even coming from a dedicated security researcher, but from a sysadmin who kept digging when he saw unusual performance issues.

And the threat actor here deliberately went after the weakest link in the chain. Assuming that the goal was to compromise sshd, the antagonist evidently found it too much of a challenge to attempt to infiltrate OpenSSH itself, so targeted a small-team compression project that only some deployments even link to, and still got caught extremely rapidly.

constantcrying • 4 months ago

I don't think you are looking at it the right way. Two years matter if you are a hobbyist or your goal is to compromise the system for some individual gain.

For a nation state actor two years is nothing. Likely the entire attack didn't cost more than a couple of thousand hours of developer time. I would guess it was easily cheaper than 100k in financial terms. That is extremely efficient for an intelligence operation, with the upside being access to a large number of servers.

The way he got caught also depended largely on random chance: several major "ifs" had to line up. The performance reduction was an unintended side effect from the attacker's perspective, and a really minor mistake exposed him. Even if he had been exposed only a couple of months later, the damage would have been enormous; if that version had become a stable part of any major distro for the next few years, hundreds of thousands of machines would have been vulnerable.

There is absolutely no reason to assume that if another attack of this quality happens it will not find its way into some stable distro.

kevans91 • 4 months ago

fwiw, characterizing Andres as a sysadmin isn't really the whole picture; he's a postgres developer that conducts benchmarking operations with some frequency (and he's quite good at what he does)... he's perhaps naturally a bit more sensitive to things like the cumulative effect of 500ms or so over a number of sshd invocations.

Gormo • 4 months ago

You're right -- I went back and changed "sysadmin" to "engineer". Either way, though, he was not a dedicated security researcher, and managed to unravel this entire thing upon noticing an anomaly in the course of his regular work.

cactusfrog • 4 months ago

I believe LLMs could be useful here as a component of a pre-commit hook.

berkes • 4 months ago

I once got a (probably scam) offer for adding a cryptominer to a library that I maintained at that time. And a more serious offer to add trackers to a popular >1M installs app.

In both cases I obviously ignored it. But it made me aware of a nasty attack vector: someone who's thanklessly building a wordpress-plugin, pip, npm, or whatever software, thanklessly dealing with issues, PRs, support, maintenance, often for no pay, suddenly gets offered three figure sums to add a few lines of "affiliate stuff" or such. There are many places in the world, or people in situations where this amount of money really makes a compelling case.

irdc • 4 months ago

“Given enough underfunded maintainers, all security is shallow.”[0]

0. https://en.wikipedia.org/wiki/Linus%27s_law

lenerdenator • 4 months ago

I wouldn't say "funding" is necessarily the problem.

Most maintainers do it because they like doing it. Their main limiting factor is time. I can drop a million dollars an hour into a maintainer's lap; that doesn't mean they can dedicate every waking moment to a project. They still have human needs that money can't buy like sleep, family obligations, and health concerns. And that's making the assumption that the maintainer uses that million/hr to quit their job.

No, the problem is a lack of trustworthy candidates for maintainership and a lack of time. There are components of a GNU userland that are now too complex for a single human to both maintain and enhance at the same time. We now need to target multiple distros (really, more than are necessary, strictly speaking) and ISAs. Most are written in systems programming languages like C that are more complex than what the average software engineer in 2024 works with.

We need consolidation, simplification, maintainer redundancy, and a trust/governance framework for packages.

mikrotikker • 4 months ago

We need to utilise a specialised AI to scan through the code looking for bugs and security holes. Imagine if openai donated server time to this.

strogonoff • 4 months ago

I believe the problem of thankless maintenance is best solved with two things: the thanks (yes, we are all human and want recognition and appreciation from fellow humans)[0], and a stable employment (work for a good large business while open-sourcing what’s possible)[1].

If you do OSS for profit, then it can become a question of where there is more money; but if you work a reliable job with insurance, relationships and other implications then the stakes may be a bit different.

Many of the biggest OSS projects today were started by people who had no money in mind whatsoever. Some had other jobs, others were students, etc. If we feel relatively secure, we are driven by our innate desire to tinker, create cool things and show it off.

[0] Undermined by LLMs that are used to gobble up your code and suggest it to others commercially and without attribution.

[1] Undermined by low employment protections (if you can expect to be fired at any time, you would be less loyal), and by LLMs (whatever you open-source now more directly benefits Microsoft or whatever).

aleph_minus_one • 4 months ago

> and a stable employment (work for a good large business while open-sourcing what’s possible)

Even if it was hypothetically possible to open-source basically everything that the team in which I work produces:

The software that I work on is very specialized software that is used by the company's employees and customers for specialized purposes. Imagine some nice LoB application that is actually somewhat comfortable to use. It basically does "what the users need" and is thus deeply ingrained in some parts of the company's workflows. The only use someone outside the industry might have for it is "cosplaying being employed in this industry".

A lot of software that is developed (in particular in companies that don't sell or rent software) is of this kind.

Thus: the open-source scene does in my opinion not have any use for a huge amount of software that is actually developed and actively used.

strogonoff • 4 months ago

> The software that I work on is very specialized software that is used by the company's employees and customers for specialized purposes

Oh really? Welcome to the club. Our very specialized software for very specialized purposes used Django with a certain auth provider. So I refactored that into a standalone Django app that painlessly handles this specific OAuth provider, configurable via settings with sane defaults, and open-sourced it. (The refactoring was very beneficial to myself, that part of the project got instantly nicer to work with.)

Of course, it is small beans compared to something algorithmically hardcore (I was a junior myself back then), but it’s just an example.

Any software, no matter how specialized and bespoke, can be expressed as many self-contained isolated components that individually know nothing about that specialization and bespokeness. In fact, such factoring is generally a sign of good design: you may have heard of the loose coupling & high cohesion principle—once you follow it, open-sourcing a particular component is very straightforward.

Note, though, that if your contract has certain licensing provisions in certain countries you may not be allowed to unilaterally open-source anything during the full term of employment (even if it is unrelated to your dayjob). You may need to get approval first. However, many good tech companies are reasonable when it comes to open-sourcing non-core components.

david_allison • 4 months ago

3. More maintainers

Days of 100+ notifications aren't easy. Things will slip through

strogonoff • 4 months ago

Agreed, but especially in light of recent events it’d be important to know who they are, and that’s not always easy.

kijin • 4 months ago

This is what worries me more.

It's easy to point a finger at a specific Bad Guy® and shout "He did it!" It's much harder to face the reality that any maintainer of any open-source project can slowly burn out to a point where they become accomplices in an attack, or at least turn a blind eye.

The pool of open-source developers does not split cleanly into honest contributors and evil agents. The boundary is quite fluid -- more so in some circles than in others -- and there are always temptations to move from one side to the other and back again.

david_allison • 4 months ago

> suddenly gets offered three figure sums to add a few lines of "affiliate stuff" or such

Back of the envelope calculation: you're looking at 2 orders of magnitude more money from "affiliate stuff" than you would be from generous user donations

berkes • 4 months ago

Well, yes. But it's also something you can do once. When (not if) it comes out, all credibility is lost.

Whereas donations, regardless of how puny, are recurring and potentially forever.

xign • 4 months ago

That's definitely true. And a lot of times it could be a random open source project that is under the radar and rarely thought about. E.g. The Great Suspender Chrome extension, which was sold to an unknown buyer who later turned it into malware: https://www.bleepingcomputer.com/news/security/the-great-sus...

moritonal • 4 months ago

It's why I actually always encourage app devs to charge for their apps, even open source ones. It creates an exchange of value, so the author feels valued, and it detracts from these vectors.

acdha • 4 months ago

> There are many places in the world, or people in situations where this amount of money really makes a compelling case.

It’s especially easy to imagine that using the classic intelligence agency playbook: monitor high-impact maintainers and look for leverage before making the approach (“hey, saw your post about the divorce settlement and that $%#@ cleaning you out. My affiliate marketing pays in bitcoin…”) just as they’ve done for ages.

andrewinardeer • 4 months ago

I believe this is a nation state actor and there is a fleet of 'Jia Tans' working on other OSS projects to backdoor operating systems.

And some have probably succeeded.

prmoustache • 4 months ago

I wouldn't limit that to OSS projects. How many of them managed to get hired and are working for Microsoft, Apple, Google, Oracle or Amazon?

In some cases they don't even need to introduce backdoors themselves; they can just review code and spot bugs that they neither correct nor raise issues for, but report to the mothership. They could even work as a team, with one building the backdoor and the other approving the code.

Most companies have more thorough processes to avoid this but that doesn't mean those processes are applied correctly every time, especially if more than one malicious engineer is involved.

adql • 4 months ago

I'm now imagining a department where every single worker is a spy for a different government and they play an endless game of "add exploit, close off the other people's exploits". And all of them think they are the 10x developer because all the other people do, in their view, is push shoddy code

simonvc • 4 months ago

I worked on a project just like this once.. a mobile phone network built in the Middle East before the Arab Spring. 10/10 would not repeat the experience.

belorn • 4 months ago

Most of the time they don't need infiltrators. Governments can just pressure companies with export controls or warrantless surveillance to get backdoors into commercial systems. OSS projects require different methods because the more direct method would be discarded by the community and forked.

StayTrue • 4 months ago

> Most of the time they don't need infiltrators. Governments can just pressure companies with export controls or warrantless surveillance to get backdoors into commercial systems.

Or they simply pay companies with a “support contract” in return for embedding spyware that sells out customers. Seen that first hand (private key exfiltration), resigned the same day.

Lots of comments saying we need to do something about the OSS supply chain but in my estimation the problem is much worse with closed source commercial software.

autoexec • 4 months ago

> I wouldn't limit that to OSS projects. How many of them managed to get hired and are working for Microsoft, Apple, Google, Oracle or Amazon?

They don't have to sneak into those companies tho, they just hand over something like a national security letter and do whatever they want while making it clear to the heads of the company that anyone who talks or pushes back will rot in gitmo. Why wouldn't there be at least an equivalent to Room 641A (https://en.wikipedia.org/wiki/Room_641A) in every major US corporation that deals with massive amounts of people's sensitive data and communication?

acdha • 4 months ago

> Most companies have more thorough processes to avoid this but that doesn't mean those processes are applied correctly every time, especially if more than one malicious engineer is involved.

I’d also bet that you could exploit the tiers at many companies: how many places have more robust review for the staff engineers but then assume that some lowly “ops monkey” will take care of the build environment, etc.? I’d hope that wouldn’t work at Google, Microsoft, etc. but have heard enough stories about disparities between which jobs are contracted out and which have the coveted FAANG benefits that I wouldn’t exactly be shocked if it turned out otherwise.

thewanderer1983 • 4 months ago

Deleted.

prmoustache • 4 months ago

The thing is they only need one member sometimes, to observe what is in use.

Example scenario: "malicious engineer in say Microsoft, finds out that office365 is using xz internally and the library is pulled directly without code review. Same engineer or another member of same group would be that Jia Tan doing the necessary backdooring in xz to target office365. And bam all worldwide Office365 accounts would be backdoored."

I am not saying Office365 is using xz, I have no idea really, but this would be a possible scenario. I know MsTeams is using ffmpeg for example.

So I think scoping this discussion to Linux distributions only is a big mistake. The xz project was particularly interesting as a target because it is distributed under the BSD zero-clause license, which is pretty much a public domain license. It lacks the attribution requirement of the BSD license, so there is probably a myriad of proprietary software using it too without acknowledging it.

Y_Y • 4 months ago

I believe the term "nation state actor" is a term that means "country" but with the bonus of connoting that the writer is an armchair infosec wizard. This speculation is not valuable without adding information, otherwise it's just McCarthyist bluster.

MattPalmer1086 • 4 months ago

Nation state actor is a standard info sec term. Using it does not imply any kind of wizardry.

Edit: most threat actors do not have the patience or the motive to behave in this way. It is reasonable to suppose that this is a nation state actor.

HelloNurse • 4 months ago

It's also a term that implies the competent, official hacking departments or espionage agencies of that country, rather than government-supported amateurs (e.g. an untrained policeman) or generic people from that country.

hyperhopper • 4 months ago

Wales is a country but it is not a nation-state and probably does not have its own APT.

irdc • 4 months ago

At this point, considering the apparent ease with which a project that is used pretty much everywhere was taken over, that seems like a reasonable position.

arp242 • 4 months ago

I can walk out on the street and stab someone to death if I wanted to. This is surprisingly easy.

Just because something is relatively easy to pull off doesn't mean it happens a lot.

It's also not that easy to pull off because you need to have a project with relatively few eyes and a place to hide it. In this case: binary tests. But most projects don't have those.

There is no evidence for any of this, including that it's a nation-state actor. There's also a case to be made that it's NOT a nation-state actor as nation states use Linux and want a secure Linux. The NSA and such have somewhat conflicting interests here. We just don't know. It's likely we will never know.

All of this is starting to resemble the spy paranoia of the first world war. A few spies got caught and suddenly everyone was now a suspected German spy (including a general, if I recall correctly, who was detained for a while because he couldn't answer a question about baseball or some such).

I suspect that very soon people will start demanding maintainers put some of their blood in a Petri dish to be tested with a hot needle. Just in case.

acdha • 4 months ago

> There's also a case to be made that it's NOT a nation-state actor as nation states use Linux and want a secure Linux. The NSA and such have somewhat conflicting interests here. We just don't know.

I agree that we do not know that it’s a nation-state but this point seems to work in the opposite direction: this attack was very carefully constructed so only someone with a particular key pair could exploit it. That’s reminiscent of what the NSA did with the Dual EC constants, and they were confident enough about that to push it into the FIPS requirements for federal IT.

mannykannot • 4 months ago

Motive, opportunity, means - and consequences: it is primarily the absence of a motive, and secondarily the likelihood of consequences, that keeps the prevalence of street stabbings way lower than if means and opportunity were the only factors.

The argument against nation-states being involved has some problems: a state can avoid becoming victim to its own work, while its own restraint would not prevent developments elsewhere.

lolc • 4 months ago

You're commenting under a link where commits to the xz-decoder are discussed. Some level of paranoia is warranted.

The binary files look like a sideshow in comparison. Maybe we're lucky the attacker was tempted to hide something in there.

The_Colonel • 4 months ago

> I believe this is a nation state actor

It is certainly possible, but we don't really have a good indication for that. This whole thing would be definitely doable by a single individual.

Barrin92 • 4 months ago

doable yes, but what seems to me like a strong indication is the duration, multiple years, and the effort to set up a quasi patch infrastructure for the backdoor which I can't remember ever having seen in some amateur or ransomware hack.

mrkramer • 4 months ago

My assumption is that this was a state-sponsored mass surveillance campaign of some kind but God knows what exactly they were looking for.

I think if the backdoor had been discovered 2 or 3 months later, we maybe could understand better what they wanted to do. My speculation is that they wanted to build a massive botnet and then snoop on machines' processes and traffic looking for something. It's hard to speculate because luckily they were caught soon enough.

swed420 • 4 months ago

I find it intriguing that out of all the speculative comment threads I've read so far, none of them have suggested it was Microsoft attempting to make FOSS look bad/vulnerable.

beardedwizard • 4 months ago

How would that benefit Microsoft, who owns GitHub, the home of OSS? It's not a secret that oss is vulnerable, the opportunity for MS is to sell the solution to a captive audience.

swed420 • 4 months ago

Microsoft making the decision to own GitHub in the first place also speaks to my suspicion. Embrace, Extend, Extinguish.

epr • 4 months ago

I've never been concerned about spies infiltrating open source projects compared to legitimate maintainers being hacked, even now after this whole xz incident.

I'll put it this way. Let's say a bad guy had a decent budget to spend on paying agents/criminals to break into maintainers' homes on their behalf with a rubber ducky, etc. I'd expect a pretty high success rate compromising their hardware...

dartos • 4 months ago

You’re ignoring scale.

A single Jia Tan can be infiltrating 10s or more OSS projects each week without needing to travel around the world physically stealing hardware from various maintainers who they then need to impersonate.

They can just impersonate some anons with no real lives or connections and just get the keys to OSS projects given time.

epr • 4 months ago

> A single Jia Tan can be infiltrating 10s or more OSS projects each week

Single? 10s or more per week?! I can't help but think you are underestimating the cost of developer time. How many hours of work did it take JT to infiltrate to the point of finally implementing a backdoor? How much does that time cost?

> just get the keys to OSS projects given time.

This is not what JT did though, and for good reason. Trust of anons in open source is generally built through contributions of real developer work over time. That does not scale.

> without needing to travel around the world physically stealing hardware from various maintainers

I wasn't suggesting stealing hardware to impersonate someone. I'm talking about hiring petty criminals or using field agents to break into a house, using physical hardware access to install a backdoor, etc. into the legit maintainer's hardware. The field guy's goal is to not get caught, so the maintainer is unaware they are compromised.

I suppose the limitation with both approaches (maintainer plant vs compromising maintainers) is cost. My educated guess is that the cost of hiring skilled developers from a very limited pool for multiple years is more than it would cost to hire criminals that are already breaking into houses for low risk jobs where they don't even need to steal anything.

berniedurfee • 4 months ago

When you find one cockroach, you can be sure there are thousands more you haven’t found.

echelon • 4 months ago

We could all be Jia Tan.

Someone could be bought, killed and replaced, or simply shadowed when they die or go to jail. Anonymity makes this even easier.

craftkiller • 4 months ago

> killed [...] die or go to jail

All of my commits are signed with a PGP key that is on hardware security tokens and password-protected. In the event of my death, my digital identity could not be stolen without backdoors in my hardware security tokens.

That being said, $5 wrenches and large sums of money are still possible attack vectors.

acdha • 4 months ago

Also don’t forget that not everyone expects perfection and a canny attacker can exploit that. It’s really easy to focus on how you’d avoid trojans, keyloggers, etc. but I’d also ask how likely it is that if someone sent a message from your email address claiming you’d lost your token in a minor accident, etc. that they’d believe it - or simply accept it if commits started showing up with a new key (maybe with an upgraded crypto system) since 99% of Git users never check those.
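To make the "commits started showing up with a new key" case concrete, here is a minimal sketch of a check a project could run over its history. It assumes the input is the text produced by something like `git log --reverse --format='%H|%G?|%GF|%ae'` (commit hash, signature status, signing key fingerprint, author email). The function name `audit_signatures`, the `Finding` record, and the "first key seen is the expected key" policy are all hypothetical choices for illustration, not an established tool or workflow.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    commit: str
    author: str
    reason: str


def audit_signatures(log_text: str) -> list[Finding]:
    """Flag unsigned commits, non-good signatures, and signing key changes.

    Expects one commit per line, oldest first, formatted as
    hash|status|fingerprint|author_email, where status is git's %G? code
    ("G" good, "N" no signature, "B" bad, "U" good but unknown validity, ...).
    """
    findings: list[Finding] = []
    first_key: dict[str, str] = {}  # author email -> first fingerprint seen

    for line in log_text.splitlines():
        if not line.strip():
            continue
        commit, status, fingerprint, author = line.strip().split("|")

        if status == "N":
            findings.append(Finding(commit, author, "unsigned commit"))
            continue
        if status != "G":
            # Assumption: anything other than a fully good signature is
            # worth a human look, including "U" (key not marked trusted).
            findings.append(Finding(commit, author, f"signature status {status}"))
            continue

        # Assumed policy: the first good key seen for an author is pinned;
        # every later commit signed with a different key is reported until
        # a human decides the rotation was legitimate.
        expected = first_key.setdefault(author, fingerprint)
        if fingerprint != expected:
            findings.append(Finding(commit, author, "signing key changed"))

    return findings


if __name__ == "__main__":
    sample = (
        "aaa111|G|FPR_OLD|dev@example.org\n"
        "bbb222|G|FPR_OLD|dev@example.org\n"
        "ccc333|N||dev@example.org\n"
        "ddd444|G|FPR_NEW|dev@example.org\n"
    )
    for finding in audit_signatures(sample):
        print(finding.commit, finding.reason)
    # prints:
    # ccc333 unsigned commit
    # ddd444 signing key changed
```

A check like this only surfaces the event; it does nothing about the social side described above, where a plausible "I lost my token" message gets the new key accepted anyway.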

ccccccc1 • 4 months ago

A cool tax-free no questions 500k can convince a lot of people

TheCondor • 4 months ago

One thing I’ve learned, not from direct experience but from observation: these things are way cheaper than the more ethical and optimistic of us in society think. Your point is totally valid but the number is probably more like $5k-10k.

saagarjha • 4 months ago

Tax free $500k? I don't want the IRS to come after me. Please mark all your bribes as regular income thanks

brabel • 4 months ago

Everyone working on important open source code should have a real identity associated with them. The fact that "Jia Tan" was able to become a maintainer without anyone ever trying to figure out their real identity shows a huge weakness in our trust model in OSS (everyone real would have something like a Linked In page, Facebook, Twitter, Instagram, or better, their own website with stuff that could be used to ensure they're a real person - that could be faked as well but the amount of effort would be high, and checking this would be much better than just allowing effectively anonymous users to be maintainers - there's just no need for anonymity in this scenario!).

ninkendo • 4 months ago

> everyone real would have something like a Linked In page, Facebook, Twitter, Instagram, or better, their own website with stuff that could be used to ensure they're a real person

Oof, I guess I’m not real then, as I have none of those things.

SanitaryThinkin • 4 months ago

On top of what you mentioned, I also dislike the TSA-like response the OSS community is taking to this happenstance.

I have anonymously contributed to many projects because I enjoy my privacy. All of my founding projects have also been done with anonymity.

Because someone wants their anonymity and privacy does not mean they're nefarious, and I find it funny the group that takes to these principles most is negging on those ideas.

nottorp • 4 months ago

I have Facebook and the account and what's on it is no one's -ing business in a professional context.

ed_elliott_asc • 4 months ago

Also this can all be faked

reisse • 4 months ago

No-one working on open source code on their own time owes anything to anyone using the code. If you want an important open source project to be maintained by a non-anonymous person, surprise-surprise, hire that person and pay them.

Besides, some of the best open source contributors I know are almost-anonymous people behind nicknames and anime girls avatars.

xign • 4 months ago

At the same time no one is obligated to use your source code. I think the point here is from now on people (companies and large projects) may be more paranoid about anonymous contributors and refuse to sign off on using code maintained exclusively by them. It's fine for people to stay anonymous, but they just run the risk of not having the credibility for adoption and need to accept that.

But yes, I do trust certain figures like that, e.g. Asahi Lina. It's a fine ambiguous line. But at least in Asahi Linux there are real known human figures and they know who Asahi Lina is.

brabel • 4 months ago

It is not about owing someone... it's about having provenance of code.

If you're just an anonymous guy doing stuff for free and want to remain anonymous, that's fine, but then your software shouldn't be used by anyone who cares about toolchain attacks as there's just no way to trust you, and no way to verify every single commit you make on new releases.

For software that gets used by many, which is a goal of OSS (otherwise just don't even bother to publish stuff, what's the point?), there needs to be a face behind it.

I do agree with others that identity is a hard problem, but people here are pretending there's no solution to that (or misinterpreting what I wrote to mean people should have a Facebook or Twitter account, which is absolutely not what I was trying to say - I just mentioned the most popular websites real people are likely to be found on, as that could be used to prove their identity... for example, I have a Keybase account where my proof of identity, which is tied to my public keys, can be found on my GitHub profile - but they let you choose Facebook or Twitter for that purpose as well) when obviously there is. I should know, I work in this space.

reisse • 4 months ago

> If you're just an anonymous guy doing stuff for free and want to remain anonymous, that's fine, but then your software shouldn't be used by anyone who cares about toolchain attacks as there's just no way to trust you, and no way to verify every single commit you make on new releases.

What is, from a security point of view, the difference between a toolchain attack performed by an anonymous contributor and by an identifiable real person?

> For software that gets used by many, which is a goal of OSS

It's not. The goal of OSS is to give users the possibility to study, change and improve the software. And that includes giving you the ability to independently audit the code. All of that does not need any person behind it.

acdha • 4 months ago

> everyone real would have something like a Linked In page, Facebook, Twitter, Instagram, or better, their own website with stuff that could be used to ensure they're a real person

Have you seen the campaigns people have run building fake LinkedIn profiles and slowly adding “connections”? There was one a few years ago which roped in a lot of infosec people who should have known better and it’s gotten much worse with AI generators. Even before LLMs what you described would have been a godsend for intelligence agencies - who has more time for it, an open source developer writing actual code or the dedicated social media team at the IRA? – and now that’s increasingly worse.

brabel • 4 months ago

I believe the solution to identity on the Internet needs to be tied to governments, that's unfortunate but in real life, that's always been the case and I can see no alternative here. Blockchain is a pipedream and no serious work is going to associate a person's identity to a key which cannot be revoked, cannot be recovered in case of "loss", can be tracked on a public ledger etc. etc...

But there's actual good work going on in the identity industry, like Verifiable Credentials, so this will become a reality soon: you will be able to verify someone's identity as long as you trust the issuer of their "credential" (which in the case here would mean basically a username and a public key or reference to a JWKS which can be used to verify the signature of the person, very much like the digital version of an identity card which can be used to check the signature on some piece of paper, but actually cryptographically safe)... so you would need to add a few governments to your list of "approved issuers", or something more indirect like universities (which themselves would rely on the government-issued identity) or traffic authorities (if you rely on driving licenses). Sure, governments can lie, and people go to great lengths to steal others' identities in real life, but in the current world, we're still able to get bank accounts, passports etc. based on this model... just because the system is not perfect doesn't mean it's not good enough, especially when there's no better alternative at all.

iron-s • 4 months ago

What is real identity? Anything online can be faked. A state-issued ID? How does that protect against a nation state?

brabel • 4 months ago

Nothing protects you if you're up against a state. That doesn't mean we should give up completely.

Do you have a passport? That's a real identity in most places. Soon, it may be possible to use that to link your identity to a set of public keys which you can then use to identify yourself.

There's a lot of work to be done to make this a reality, but work is surely going on right now and this is going to be possible one day.

Check this out, as a starting point: https://curity.io/resources/learn/verifiable-credentials/

nottorp • 4 months ago

Why would I trust an "important open source project" with my identity?

It goes both ways.

Besides, the 'state actor' the security theater people keep mentioning would have no trouble creating such real identities.

brabel • 4 months ago

If you don't trust the project, you wouldn't contribute to it.

The state actor may be able to fake identities, but that would still allow tracking the identity to a particular state... and if caught multiple times, that state would start losing credibility and projects may choose to stop trusting people of that nationality, unfortunately, or at least require stronger evidence that the person is real and trustworthy if they come from known rogue nations.

nottorp • 4 months ago

> If you don't trust the project, you wouldn't contribute to it.

Trusting them to merge a bugfix is different from trusting them with my identity, isn't it?

There are degrees of trust. For example I have a gmail address in my profile because the spam filter on there is better than what I have on my personal domain. People I've known for longer, business or otherwise, get the other (that I read more often).

executesorder66 • 4 months ago

News just in: NSA et al. defeated after having to create a LinkedIn and Instagram profile for their agents.

the8472 • 4 months ago

If you want security, pay for independent code audits (not compliance bullshit). Repeatedly. Don't offload your desires onto one-man-shows that the world decided are useful tools.

MattPalmer1086 • 4 months ago

None of those things prove identity. A well funded or just patient attacker can spoof all of those. Sure, it raises the bar a tiny bit, but it's no proof of identity.

tomn • 4 months ago

Yeah, it's not enforced (and certainly not with linked-in and facebook) but it's really not uncommon to require use of real names for contributions.

Linux doesn't allow anonymous contributions:

https://www.kernel.org/doc/html/latest/process/submitting-pa...

and this guide has been adopted by a lot of GPL-licensed projects (at least openwrt, glibc and gcc).

xign • 4 months ago

Weren't there some controversies around this before? I remember there was some talk of why Asahi Lina (anonymous vtuber working on Asahi Linux) can contribute code to Linux. From a casual search: https://www.spinics.net/lists/kernel/msg4888830.html

FWIW I like Asahi Lina, just trying to understand the discrepancies

tomn • 4 months ago

Interesting. My understanding is that these projects don't allow anonymous contributions to make their copyright situation clear, so in theory if marcan42 sent a letter to the linux project saying that contributions from Asahi Lina are actually theirs, they might reasonably be fine with that.

It seems like this is how the ASF runs: you can be anonymous publicly, but you have to sign their CLA (or whatever they call it) properly.

To me, the people trying to unmask Asahi Lina are being simultaneously mean and silly. If it's so obvious that it's marcan42 doing a voice, do you really need to point it out? That's kind of the joke.

ghaff • 4 months ago

I'm not sure why the downvotes. That seems to be a statement of fact.

You can do a certain amount of identity obfuscation online but for anyone with a real professional profile you're generally not really anonymous if anyone really cares to find out your true name.

j-krieger • 4 months ago

Who are you to propose requirements onto people who work for free?

brabel • 4 months ago

I am not imposing anything on anyone. I am only saying that an OSS project that aims to be used as part of important infrastructure should impose at least some sort of identity vetting and not just make random anonymous users maintainers of anything.

If your project is not important and you don't care about any of this security stuff, feel free to continue publishing your untrustable projects.

loftsy • 4 months ago

I took a look at the diff linked in the article with code that "we are all running". The top of the diff certainly looks interesting. They remove the bounds check in dict_put() and add a safe version dict_put_safe().

This kind of change is difficult to make without mistakes because it silently changes the assumptions made when code calling dict_put() was originally written. ALL call sites would need to be audited to ensure they are not overflowing the dictionary size.

The diff I am referring to is here:

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=de5c5e41764...

justinsaccount • 4 months ago

Also because the 'safe' version only checks

  dict->pos == dict->limit

and not

  dict->pos >= dict->limit

if you can get one call of dict_put somewhere to pass the limit, all later calls of dict_put_safe will happily overwrite memory and not actually be safe.
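To make the concern concrete, here is a toy model in Python. It is not the liblzma code: the names (dict_put, dict_put_safe, pos, limit) are borrowed from this thread, and the buffer layout and the helper writes_past_limit are assumptions made for illustration. It only shows that an equality test stops guarding once pos has already moved past limit, while a >= test still does. Whether any unchecked call can actually overshoot in the real code is a separate question.

```python
# Toy model of the concern above; NOT the actual liblzma code.
# The names mirror the thread, the layout is a simplifying assumption.

class Dict:
    def __init__(self, limit, capacity):
        self.buf = bytearray(capacity)  # backing memory, larger than limit
        self.pos = 0                    # next write position
        self.limit = limit              # writes are supposed to stop here


def dict_put(d, byte):
    """Unchecked write: the caller must guarantee pos < limit."""
    d.buf[d.pos] = byte
    d.pos += 1


def dict_put_safe_eq(d, byte):
    """Checked write that tests equality only; returns True when 'full'."""
    if d.pos == d.limit:
        return True
    dict_put(d, byte)
    return False


def dict_put_safe_ge(d, byte):
    """Checked write using >=, which also refuses once pos is past limit."""
    if d.pos >= d.limit:
        return True
    dict_put(d, byte)
    return False


def writes_past_limit(safe_put):
    """Hypothetical scenario: one unchecked call too many, then 'safe' calls."""
    d = Dict(limit=4, capacity=16)
    for _ in range(5):      # pos ends at 5, one past the limit
        dict_put(d, 0x41)
    for _ in range(8):      # checked calls made after the overshoot
        safe_put(d, 0x42)
    return d.pos - d.limit  # how far past the limit we ended up
```

With the equality check, every one of the later "safe" calls keeps writing; with >=, they all refuse and the damage stays at the single overshooting byte.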

Calzifer • 4 months ago

No, because dict_put will update the limit value if the new pos exceeds it.

justinsaccount • 4 months ago

I don't see anything like what you are describing. What line exactly are you talking about?

ahartmetz • 4 months ago

Wow, that is 1000% obviously malicious

Matumio • 4 months ago

Agree, nice catch. Also, there are many other opportunities in this patch to hide memory safety bugs.

This is the kind of optimization I might have done in C 10 years ago. But coming back from Rust, I wouldn't consider it any more. Rust, despite its focus on performance, will simply not allow it (without major acrobatics). And you can usually find a way to make the compiler optimize it out of the critical path.

kmfpl • 4 months ago

I agree, this looks extremely sketchy. Especially because the code is just writing a fully controlled byte in the buffer and incrementing its index.

This would give you a controlled relative write primitive if you can repeatedly call this function in a loop and go OOB.

liendolucas • 4 months ago

I think at this point it is clear that everybody has to assume that XZ is completely rotten and can no longer be trusted. Is XZ easy to replace with some other compression tool? Or has it been so widely adopted that it is going to take a huge effort to move away from it?

dralley • 4 months ago

There is no reason to assume that. Even if you assume every commit since Jia became a maintainer is malicious, the version from 3 years ago is perfectly fine.

Zstd has a number of benefits over Xz that may warrant its use as a replacement of the latter, and this will likely be a motivating factor to do so. But calling it entirely rotten is going way too far IMO

mmd45 • 4 months ago

There is an interesting argument to be made that pre-JT xz code is probably pretty secure due to the fact that the threat actors would have already audited the code for existing exploits prior to exerting effort to subvert it.

tripflag • 4 months ago

I always use "zstd --long=31 -T0 -19" to compress disk images, since that is a usecase where it generally offers vastly superior compression to xz, deduplicating across bigger distances.

XZ offers slightly better compression on average, but decompression is far slower than Zstd.

dralley • 4 months ago

IIRC memory consumption is generally worse for Zstd at comparable levels of compression. Which, these days, is generally fine, but my point is you can't thoughtlessly substitute the two.

liendolucas • 4 months ago

What keeps ringing in my head is the "." that was found that invalidates compilation. I personally don't buy it (but that is my opinion).

kzrdude • 4 months ago

Huge effort, because it is the default .deb compressor in Debian for example

rthnbgrredf • 4 months ago

Arch Linux has replaced it with zstd in 2020 already. It's doable for the next major release of Debian.

logro • 4 months ago

This is 100% malicious or a novice coder. And we surely know it's not the latter.

If you need an unsafe call, you add a dict_put_unsafe(). That again should of course be rejected in a code review.

rwmj • 4 months ago

I think Joey's right that we should all go back to the "pre-Jia-Tan" xz, and I've raised this with Red Hat too. It's actually not a big deal as xz and liblzma are relatively stable and the version from 2 years ago is fine, although I understand that Debian's dpkg uses some new API(s) from liblzma which makes this a problem, albeit a minor one.

(Unfortunately the Debian bug report that Joey filed got derailed with a lot of useless comments early on.)

iso8859-1 • 4 months ago

How do you know what 'pre' means given that pseudo-anonymous identities are free and Tan is already suspected of having some (e.g. Hans Jansen and Jigar Kumar: https://research.swtch.com/xz-timeline)

rwmj • 4 months ago

I mean we go back before all possible sockpuppets. We do have a reasonably good idea of when the attempt started.

glitchcrab • 4 months ago

The point that the person you replied to is trying to make is how do you know when the repo is clean? How can you ever be sure that someone hasn't introduced a backdoor at some point? It's bigger than just what has been discovered.

rwmj • 4 months ago

How do you know anything has not been compromised? You go and look at the commits and the code. It's hard work with no easy answers despite what many think.

rrr_oh_man • 4 months ago

By not letting the perfect be the enemy of the good-enough-for-now

meinersbur • 4 months ago

The greater concern should be how many other sleeper contributors are out there. Anonymous contributions are accepted every day, and we know of cases with malicious intent, such as the one by "James Bond" (https://lore.kernel.org/lkml/20200809221453.10235-1-jameslou...).

I am not specifically worried about other contributions by "Jia Tan", those are being extensively looked at right now. They and other sleepers may just as well have contributed to any project with a different name and therefore "Jia Tan" does not pose more danger than any other contribution whose submitter cannot be held responsible.

rkta • 4 months ago

What's malicious about that patch? From reading the thread it looks like an attempt to fix a FP from some tooling.

meinersbur • 4 months ago

It is one of the patches for which the University of Minnesota was banned from contributing to the Linux kernel. They were trying to introduce a use-after-free (Fig. 9 in their paper).

https://news.ycombinator.com/item?id=26887670

meinersbur • 4 months ago

I just had to think about how ironic it would be if "Jia Tan" turned out to be a Post-Doc from the University of Minnesota continuing that research on hypocrite commits.

codezero • 4 months ago

Consider “Jia Tan” started working on xz because they already found a critical vulnerability and wanted to maintain it, or more tin foil, they burned xz to get upstreams to use another compression that is also already backdoored. When dealing with state actors there’s really no limit to how complex the situation can be.

alt227 • 4 months ago

This is something I also wondered but haven't seen discussed anywhere. This could all be a smokescreen to get distros to switch to the next best compression library which already contains malicious code. Hopefully maintainers of any upstream compression libraries are all looking hard at their code bases right now.

throwaway63467 • 4 months ago

Seems like a sensible thing to do, assuming this is a state-level threat actor there’s really no easy way to prove that their contributions are free of back doors. Seems not worthwhile risking the security of a large part of the Internet over a few thousand lines of code.

afc • 4 months ago

But why would the entity behind this submit all their attacks through the same single identity? Removing all this code could just be removing 1% of their harmful code. How do you deal with the rest? How do you discover the other identities?

rwmj • 4 months ago

You start with what you know about, and you investigate other projects carefully at the same time. There's no easy answer here, you do what you can.

berkes • 4 months ago

Full on tinfoil hat here. But warranted and practical.

I'm wondering what fallout we'll see from this backdoor in the coming weeks, months or years. Was the backdoor used on obscure build servers or obscure pieces of build infrastructure somewhere? Lying dormant for a moment in future to start injecting code into built packages maybe? Are distros going to go full-on tinfoil-hat and lock down their distribution, halting progress for a long time? Are software developers (finally?) going to remove dependencies(now proven to be liabilities!), causing months of refactoring and rewriting, without any other progress?

constantcrying • 4 months ago

>Are software developers (finally?) going to remove dependencies(now proven to be liabilities!), causing months of refactoring and rewriting, without any other progress?

How is that even a possibility? xz was very useful software, which can only exist if people with significant knowledge put effort into it. Not every OSS project has the ability or resources to duplicate that. The same goes for many, many other dependencies.

I believe that there is essentially nothing you can do to prevent these attacks with the current software creation model. The problem here is that it is relatively simple for a committed actor to make significant contributions to a publicly developed project, but this is also the greatest asset of that development model. It is extremely hard to judge the motivation of such an individual; for most benign contributors it is interest in the project, which they project onto their co-contributors.

TeMPOraL • 4 months ago

Agreed. More than that, there's not much of a way to prevent these kinds of attacks, period, whether in software or otherwise, if the perpetrator is some intelligence agency or such.

For threats lesser than a black op, the standard way of mitigating supply chain attacks in the civilized world is through contracts, courts, and law enforcement. I could, in theory, get a job at a local food manufacturer, and over the course of a year or two, reach the point where I could start adding poison to the products. But you're relatively confident that this won't happen, because should it ever happen, the manufacturer will be smeared and sued and they'll be very quick to find me and hand me over to the police. That's how it works for pretty much everything; that's how trust is established at scale.

Two key components of that: having responsibility over quality/fitness for use of your product, and being able to pass the blame up to your suppliers, should you be the victim too. In other words: warranty and being an easily identifiable legal entity. Exactly the two components that Open Source development does away with. Software made by a random mix of potentially pseudoanonymous people, offered with zero warranties. This is, of course, also the reason OSS is so successful. Rapid, unstructured evolution. Can't have one without the other.

Or in short: the only way I see to properly mitigate these kinds of threats is to ditch OSS and make all software commercial again (and legally force vendors to stop with the "no warranty" clause in licensing). That doesn't seem like a worthwhile trade-off to me, though.

pixl97 • 4 months ago

>the only way I see to properly mitigate these kinds of threats is to ditch OSS and make all software commercial again (and legally force vendors to stop with the "no warranty" clause in licensing).

Which just pushes the problem to commercial companies getting a 'friendly' national security letter they can't talk about to anyone stating they should add REDACTED to the library they provide.

TeMPOraL • 4 months ago

Correct. Hence the disclaimer in my first paragraph, which could also be stated as the threat duality principle, per James Mickens[0]:

"Basically, you’re either dealing with Mossad or not-Mossad. If your adversary is not-Mossad, then you’ll probably be fine if you pick a good password and don’t respond to emails from ChEaPestPAiNPi11s@virus-basket.biz.ru. If your adversary is the Mossad, YOU’RE GONNA DIE AND THERE’S NOTHING THAT YOU CAN DO ABOUT IT."

[0] - https://www.usenix.org/system/files/1401_08-12_mickens.pdf

coretx • 4 months ago

Code is law. As such, the "standard way" you mention is appropriate for people with zero strategic foresight. There is no absolute need to depend on third parties to solve your problems and the possibility to limit and disperse trust to mostly yourself is real. Sure, glowies can always get to you but they can't get to you everywhere nor all the time. Security/Assurance models and both proprietary and free software architecture are already adapting to such facts.

berkes • 4 months ago

Not all dependencies are "xz" complexity.

Minimizing dependencies probably means keeping a few libs for things like crypto or compression and such.

But do you need a library to color console output? Even if colored console output is a business critical feature, you don't need a (tree of) dependencies for that. I see so much rather trivial software that comes with hundreds or thousands of dependencies, it's mind boggling really. Why have 124 million people downloaded a rubygem that loads and parses a .env file, something I do in a bash oneliner? Why do 21k public npm packages depend on a library that does "rm -f"?

The answer, I'm afraid is mostly that people don't realize this isn't just some value added to their project but rather a liability.

Some liabilities are certainly worth it. XZ is probably one of them. But a Library that does "rm -f" certainly isn't.
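For a sense of scale, the .env case above fits in a couple of dozen lines without any dependency. This is a minimal sketch, not a drop-in replacement for any real dotenv library: the function names parse_env and load_env are made up here, and it deliberately ignores things such libraries may handle (an "export" prefix, variable interpolation, multi-line values).

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines; skip blank lines and # comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)   # split on the first '=' only
        key, value = key.strip(), value.strip()
        if not key:
            continue
        # Strip one pair of matching surrounding quotes, nothing fancier.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        result[key] = value
    return result


def load_env(path, environ=os.environ):
    """Read a .env file and set only variables that are not already defined."""
    with open(path, encoding="utf-8") as f:
        for key, value in parse_env(f.read()).items():
            environ.setdefault(key, value)
```

Usage would be a single load_env(".env") call at startup. Whether owning these lines beats pulling in a maintained package is exactly the liability trade-off being argued here.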

pdimitar • 4 months ago

It's impossible to insure against in any practical terms.

The way forward is to invest heavily in a much more security-oriented kernel(s) and make sure that each program has the bare minimum to achieve what it offers as a value-add.

The human aspect of vetting seems like an impossibly difficult game of whack-a-mole. Though realistically I doubt that the bad actors have infinite agents everywhere, this also has to be said. So maybe a "sweep" could eliminate 90% of them, though I'd be skeptical.

galangalalgol • 4 months ago

Agreed, as a developer: minimize your dependencies while providing your core function. Don't grant dependencies permissions they don't need. Be granular about it. Austral lets you select what filesystem, network, etc. access each library gets.

Also, in big organizations, risk assessment is more about making sure there is someone to point the finger at, than actual security. Treating libfubar as golden because it ships with something you paid another company money for makes sense in that light. But not from an actual security mindset.

mrspuratic • 4 months ago

"reduce the attack surface" is Security 101. Noting again that sshd doesn't natively use xz/liblzma (just libz) or systemd, so I don't think I need to point out where the billowing attack surface is ;)

Apache (by way of mod_systemd) is similarly afflicted, as is rsyslogd, I guess most contemporary daemons that need systemd to play fair are (try: "fuser -v /lib64/liblzma.so.?" and maybe "ldd /lib64/libsystemd.so.?" too).

Like a Luddite I still use Slackware and prefer to avoid creeping dependencies, ever since libkrb5 started getting its tentacles into things more than a decade ago.
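The fuser check above can also be approximated without extra tools by reading /proc directly. A rough sketch, assuming a Linux system with the usual /proc/<pid>/maps layout; the function names mapped_libraries and processes_using are hypothetical, and processes you are not allowed to inspect are silently skipped, so run it as root for a complete picture.

```python
import os


def mapped_libraries(maps_text, needle):
    """Return sorted mapped file paths from a /proc/<pid>/maps dump that contain needle."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [path]
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


def processes_using(needle, proc_root="/proc"):
    """Map pid -> matching library paths for every process we can read."""
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "maps")) as f:
                hits = mapped_libraries(f.read(), needle)
        except OSError:  # process exited, or permission denied
            continue
        if hits:
            found[int(entry)] = hits
    return found
```

Calling processes_using("liblzma") lists the running processes that currently have the library mapped, which is a quick way to see how far a "small" dependency actually reaches.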

galangalalgol • 4 months ago

Yeah, it's almost like "do one thing and do it well" had security benefits...

SElinux has the desired sort of granular permissions at the OS level, but if everything is dynamically linked to everything else that doesn't help as the tiniest lib is now part of every process and can hence pick and choose permissions.

But even if we go full monolith OS when systemd takes over the job of the kernel and the browser, that just changes where we need those permissions implemented. We can't practice zero trust when there is no mechanism for distrust in the system.

adql • 4 months ago

> Agreed, as a developer: minimize your dependencies while providing your core function. Don't grant dependencies permissions they don't need. Be granular about it. Austral lets you select what filesystem, network, etc. access each library gets.

Still wouldn't help for this particular exploit.

berkes • 4 months ago

In a way it would.

If a software project has hundreds of dependencies, finding that one that was compromised is hard, impossible even. But if it has three dependencies (that aid in the core functionality) keeping a keen eye on them is much easier.

When I look at a typical `node_modules` or `pipenv` directory, I see there's absolutely no way I can vet that all is safe in there. When I look at my typical cargo tree, the four or five dependencies (of dependencies) are doable to just go over every so often.

Automation helps. But that doesn't give me the confidence that just opening the project pages of the stuff that I use, once every few months does for me.

pdimitar • 4 months ago

Since I didn't keep as current as I wanted to be (work and life happen a lot lately), what could have prevented it?

usefulcat • 4 months ago

> The way forward is to invest heavily in a much more security-oriented kernel(s)

While I don't disagree that kernels should be secure, I also don't see how that would have helped in this case, given that (AFAICT) this attack didn't rely on any kernel vulnerabilities..

pdimitar • 4 months ago

True, I wasn't specific enough. The attack exploited that nobody thinks security is a serious enough problem. It's a failure of us (the technical community) as a whole, and a very shameful one at that.

skywhopper • 4 months ago

IIRC Debian has wiped and is rebuilding all their build hosts, so yes.

But while I understand what you mean, I would not call improving the security of a piece of software “halting progress”. Security improvements are progress, too. Plus, revisiting processes and assumptions can also give opportunities to improve efficiency elsewhere. Maintenance can be an opportunity if you approach it properly.

berkes • 4 months ago

What I meant with "halting progress" is what commonly happens when a piece of software is "rewritten from scratch". Users (or clients or customers) see no improvements for years or weeks, while the business is burning money like mad.

The main reason why I am firmly opposed to "rewrite from scratch" or "we'll need some weeks to refactor¹".

Removing upstream dependencies and replacing them with other deps, with no-code, or with self-written code² is a task that takes a long time, during which stakeholders see no value added other than "we reduced the risk"; in the case of e.g. SaaS that's not even a risk these stakeholders are exposed to, so they then see "nothing improving". I'm certain a lot of managers, CTOs and developers suddenly realize that, wow, dependencies really are a liability.

¹ I am not against refactoring, just very much against refactoring as standalone, large task. Sometimes it's unavoidable because of poor/hard choices made in the past, but it's always a bad option. The good option would be "refactor on touch" - refactoring as part of our daily jobs of writing software.

² Too often do I see dependencies that are ridiculously simple. Left-pad being the poster child. Or dependencies that bring everything and the kitchen-sink, but all we use is this one tiny bit that would've cost us less than 100 lines to write. Or dependencies to solve -- nothing really? Just that no-one took the time to go through it and remove it. And so forth and so on.

coretx • 4 months ago

Not fallout but increased vigilance is the expected most significant outcome. Some 5+ years ago or so I listened to a Debian guy's talk about reproducible builds and security; he was urging the audience to be aware of exactly what just happened, in a very detailed manner. One of the details he mentioned was glowies having moved their focal point to individual developers and their tooling & build systems. At least some people who matter have been working on these threats for many years already; maybe more people will start to listen to them, in which case this entire debacle could have a net positive effect in the long run.

ProblemFactory • 4 months ago

> Was the backdoor used on obscure build servers or obscure pieces of build infrastructure somewhere?

And developer machines. The backdoor was live for ~1 month on testing releases of Debian and Fedora, which are likely to be used by developers. Their computers can be scraped for passwords, access keys and API credentials for the next attack.

Macha • 4 months ago

> Are software developers (finally?) going to remove dependencies(now proven to be liabilities!), causing months of refactoring and rewriting, without any other progress?

We've been here before, with e.g. event-stream and colors on npm. So I don't think it will change much. Except maybe people will stop blaming it on JS devs being script kiddies in their mind, when they realise that even the traditional world of C codebases and distro packages are not immune.

gammalost • 4 months ago

You can't really remove dependencies in open source. It is so intertwined at this point that doing it would be too expensive for most companies.

I think the solution is to containerize, containerize and then containerize some more times and make it all with zero trust in mind.

rwmj • 4 months ago

Containerizing is entirely the worst response here. Containers, as deployed in the real world, are basically massive binary blobs of completely uncertain origin, usually hard to reproduce, that easily permit the addition of unaudited invisible changes.

(Yes yes, I know there are some systems which try to mitigate this, but I say as deployed in the real world.)

gammalost • 4 months ago

Your application is already most likely a big binary blob of uncertain origin that's hard to reproduce. Containers allow these big binary blobs of uncertainty to at least be protected from each other.

adql • 4 months ago

Pretty much; updating say libssl in a "traditional" system running app, or maybe 2-3 dependent apps fixes the bug.

Put all of them in containers and now every single one needs to be rebuilt with the dep fixed and instead of having one team (ops) responsible, you now need to coordinate half of the company to do so. It's not impossible but in general much more complex, despite containers promising "simpler" operations.

...that being said I don't miss playing whack-a-mole game with developers that do not know what their apps need to be deployed on production and for some retarded reason tested their app on unstable ubuntu while all of the servers run some flavour of stable linux with a bit older libs...

funcDropShadow • 4 months ago

Docker containers are not really a security measure.

gammalost • 4 months ago

It is a security measure. Sure it doesn't secure anything in the container itself. But it secures the container from other containers. Code can (as proven) not be trusted, but the area of effect can be reduced.

65a • 4 months ago

Only with additional hardening between the container and the kernel and hardware itself.

fsflover • 4 months ago

> What if xz contains a hidden buffer overflow or other vulnerability, that can be exploited by the xz file it's decompressing?

If you generalize this problem further, to all packages, then the only reliable solution is security through compartmentalization. On Qubes OS, any file I open, including .jpg and .avi, can't have the access to my private data or attack the admin account for the whole computer. This is ensured by hardware-assisted virtualization.

JoshTriplett • 4 months ago

> the only reliable solution is security through compartmentalization

I hope we get there eventually. Not just for standalone processes, but for individual libraries. A decompression library could run inside a WebAssembly sandbox, with the compressed file as input, the uncompressed file as output, and no other capabilities.

funcDropShadow • 4 months ago

What does this have to do with WebAssembly? That is another runtime that adds complexity. Apple has been sandboxing codecs for a long time. They run in a sandboxed process that is only communicating through stdin and stdout or something similar, if I remember correctly. You can run native code directly. Adding a runtime with a JIT-compiler makes it harder to understand what is going on.

JoshTriplett • 4 months ago

WebAssembly can be run in-process rather than requiring a process switch, and it can be easier to port library code to run inside a WebAssembly sandbox than a completely separate process. Also, sandbox mechanisms for separate processes are not always as robust, since they have to give access to any direct syscalls the process makes, whereas WebAssembly completely insulates a library from any native surface area.

01HNNWZ0MV43FF • 4 months ago

There's AOT wasm too. Firefox uses it to sandbox some stuff. https://hacks.mozilla.org/2021/12/webassembly-and-back-again...

nottorp • 4 months ago

WebAssembly is the new Rust, I think

Hey, no one proposed to rewrite xz in Rust yet! I'm sure that would automatically protect any project from social engineering attacks!

joeyh • 4 months ago

Running xz in a sandbox would not prevent an attack that causes it to modify source code in a .tar.xz that is being streamed through it.

JoshTriplett • 4 months ago

No, it wouldn't, but that wasn't the attack here. And code outside the sandbox could check a checksum of the uncompressed data, to ensure that the decompression can't misbehave.
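A rough sketch of that checksum idea, with Python's built-in `lzma` module standing in for the sandboxed decoder. The function name `decompress_and_verify` and the sample payload are made up for illustration, and the check only helps if the expected digest comes from a channel the attacker does not control.

```python
import hashlib
import lzma


def decompress_and_verify(compressed: bytes, expected_sha256: str) -> bytes:
    """Decompress xz data, then reject the output if its SHA-256 differs."""
    # In a real setup this call would run inside the sandbox (assumption).
    data = lzma.decompress(compressed)
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"checksum mismatch: got {actual}")
    return data


# Hypothetical payload; the digest is computed before compression.
payload = b"int main(void) { return 0; }\n"
trusted_digest = hashlib.sha256(payload).hexdigest()
archive = lzma.compress(payload)

print(decompress_and_verify(archive, trusted_digest) == payload)
```

If the decoder altered even one byte of the output, the digest comparison outside the sandbox would fail and the data would be discarded.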

sergioisidoro • 4 months ago

It's a bit ironic that after a trust attack this person ends the article saying

> I do have a xz-unscathed fork which I've carefully constructed to avoid all "Jia Tan" involved commits.

He may be fully legitimate, and perhaps a famous person in OSS (which I was unfamiliar with), but still ironic :)

KyleSanderson • 4 months ago

There seems to be a fundamental misunderstanding with a lot of these writeups. Are they 100% sure history was not rewritten at any point? Going back in time on the repo prior to listed involvement doesn't do anything as the attacker had full control. Starting from the last signed release prior to their involvement is the only way to actually move this forward (history may be fully lost at this point), the rest is posturing.

mxmlnkn • 4 months ago

Even history rewrites would be visible with Github's new Activity tab, e.g., see the two force-pushes in llama.cpp https://github.com/ggerganov/llama.cpp/activity So, while, yes, git history can be rewritten, commits pushed to Github can effectively never be deleted. Personally, I find this to be a downside. Think, personal information, etc. But, in this case, it is helpful. Of course, the repository is suspended right now, so the Activity cannot be checked.

azornathogron • 4 months ago

While it's certainly possible to rewrite git history, it's tricky to do it without other maintainers or contributors noticing, since anyone trying to pull into an existing local repo (rather than cloning fresh) would be hit with an unexpected non-fast-forward merge.

It seems likely to me that Lasse Collin would have one or more long-standing local working copies.

So IMHO injecting malicious changes back in time in the git history seems unlikely to me. But not strictly impossible.

KyleSanderson • 4 months ago

Based on how this has gone (remember xz has effectively been orphaned for years, and the majority of long-standing setups were using the release archives), unless Lasse has never run any code from Jia (unlikely), I'd consider the entire machine untrusted (keys, etc). Provided the tarballs are still signed from that date, from another immutable source, that's really the only starting point here for rebuilding.

pdw • 4 months ago

In any case Debian has its own archive of every xz-utils version they've used in the past.

rkta • 4 months ago

The attacker had access to the GH mirror of the repo. The original repo remained at https://git.tukaani.org/

fl7305 • 4 months ago

> Are they 100% sure history was not rewritten at any point?

With git, one way to check is if other people still have clones of the xz repository from a time when it was trusted.

If you suspect the repo history has been tampered with, you can check against those copies.

I believe it would be hard to introduce such a history rewrite, since people pulling from the xz repo would start getting git error messages when things don't match up?

I don't know to what degree intentional SHA-1 hash collisions could be used to work around that?

dist-epoch • 4 months ago

You can create pairs of SHA-1 hash collisions, but not a collision for a particular existing SHA-1 hash (the git one).

AtNightWeCode • 4 months ago

People think git is immutable. It is not.

Lichtso • 4 months ago

Yes and no.

A local Git repo can be changed (including its history) however you please. But once you have shared it with others, you can't take that back. If you try to, then others will notice that the hashes mismatch and that their HEAD diffs uncleanly.

I know the term is infamous here, but Git is essentially a blockchain. Each commit has a hash, which is based on the hashes of previous commits, forming a linked list (+ some DAG branching).
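To make the chaining concrete, here is a minimal sketch of how a commit id is derived. It assumes the classic SHA-1 object format (`"<type> <size>\0"` followed by the body) and uses a made-up author identity and timestamps; real commits may carry extra headers such as signatures, which this ignores.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # id of a tree with no entries


def git_object_id(kind, body):
    """Hash an object the way Git does: header, NUL byte, then the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parents, message):
    """Build a simplified commit body; the identity below is invented."""
    who = "A U Thor <author@example.com> 1700000000 +0000"
    lines = [f"tree {tree}"] + [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()


# Two histories that differ only in the root commit's message.
root_a = git_object_id("commit", commit_body(EMPTY_TREE, [], "initial"))
root_b = git_object_id("commit", commit_body(EMPTY_TREE, [], "initial (rewritten)"))

# The child commits are otherwise identical, but each embeds its parent's id.
child_a = git_object_id("commit", commit_body(EMPTY_TREE, [root_a], "second"))
child_b = git_object_id("commit", commit_body(EMPTY_TREE, [root_b], "second"))

print(child_a != child_b)
```

Because the parent id is part of the hashed body, rewriting any ancestor changes the id of every descendant, which is what existing clones would trip over.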

The_Colonel • 4 months ago

> If you try to, then others will notice that the hashes mismatch and that their HEAD diffs uncleanly.

So it relies on a human noticing and acting upon it. People not noticing backdoors being merged into the project is kinda the source of this problem.

craftkiller • 4 months ago

It's a Merkle Tree. They were invented 3 years before blockchains: https://en.wikipedia.org/wiki/Merkle_tree

Lichtso • 4 months ago

It also uses a Merkle tree to compress the snapshot versions associated with commits. But the actual commit structure builds on top of that. A pure Merkle tree or forest would only give you a set of overlapping snapshots, without any directionality. So, I think it is fair to call it a blockchain as well.

ptx • 4 months ago

Well, it is and it isn't: It has mutable pointers (branches and tags) to immutable nodes in a graph (commits).

fl7305 • 4 months ago

Can you elaborate? Are you thinking of intentional SHA-1 hash collisions? Would that work in practice?

AtNightWeCode • 4 months ago

The history. Every time something like this attack happens people think they can read the complete git history in the repo.

fl7305 • 4 months ago

If some commits are signed by people you trust, can the chain before that still be compromised?

smartmic • 4 months ago

Concerning history rewrite, it makes sense to point to Fossil and its major difference to Git:

https://fossil-scm.org/home/doc/trunk/www/fossil-v-git.wiki#...

There is also a link to "Is Fossil a Blockchain?", an interesting read because the term was mentioned elsewhere in this thread.

logro • 4 months ago

Trusting anything from that actor is full on ignorant, let alone "a new decoder". It's insane.

iso8859-1 • 4 months ago

Trusting people in general is inadvisable. I haven't trusted anyone for years and I am richer than ever.

codetrotter • 4 months ago

You probably still rely on trusting others a lot more than you realize.

If you really, really, didn’t rely on trusting anyone, I don’t even see how it would be possible to exist on earth.

jiripospisil • 4 months ago

You're trusting millions of people just to be able to write this comment.

logro • 4 months ago

Those two topics don't have much in common: trusting a state-level hacker vs. trusting people in general.

moomin • 4 months ago

> Hopefully, Lasse Collin will consider these possibilities and address them in his response to the attack.

Here's the thing: Lasse Collin was overloaded back in 2021. I've no particular reason to believe that isn't still the case. He needs help. Dealing with this solo is an incredible amount of work. Also, he needs help from a verifiably trustworthy source, verifiable in a way that doesn't require a lot of effort. In practice, that almost certainly means help from a major open source company.

I seriously doubt that's going to happen, because the people who really need to learn this lesson won't, because it's probably not in their financial interest to realize that supply chain problems start with them doing things on the cheap. While we keep on running the world's infrastructure like XKCD 2347 every so often everything's going to topple over.

user3939382 • 4 months ago

> Also, he needs help from a verifiably trustworthy source, verifiable in a way that doesn't require a lot of effort.

PGP signing parties?

moomin • 4 months ago

I remember when they were big. People signed anyone’s key, didn’t need to know them. Yes, sensible people thought this was an issue. Still happened.

j16sdiz • 4 months ago

... but `xz` is pretty much feature complete to me.

Lasse Collin was doing bug-fix-only release just fine.

adql • 4 months ago

The start of the attack was a few fake accounts trying to shame the maintainer for not "developing" it constantly, pushing him to give maintainer rights to someone else.

And there wasn't really anyone to say "nope, it's fine, fuck off"

FdbkHb • 4 months ago

> And there wasn't really anyone to say "nope, it's fine, fuck off"

That's because people for whom the project is working fine, and who haven't hit any bugs, have no reason to go to the mailing list, forum, or whatever other communication channel the project has.

One has to understand and keep in mind that places that can gather feedback will invariably attract more of the negative kind than the positive kind because people who are happy are not motivated to say they're happy. Those people wouldn't even know people were complaining about xz.

There's thousands of libraries/independent software projects installed on any computer. No one has the time to check the place of all those software projects and go there just to say "hey, I'm happy, no need to change anything, thanks?", right?

People who are discontent with something on the other hand are sure to be vocal about it. But just because they're the most vocal doesn't mean they are the majority of your users.

Dalewyn • 4 months ago

I can't help but feel open source's responses ultimately don't address the root of the problem.

Yeah okay, reverting to 5.4.6 or some version from over 2 years ago might "solve" the immediate problem that is the backdoor, but it's not going to solve anything else.

More specifically, I've not heard so much as a rumor that any of the dependents will contribute time and manpower to the project they rely so heavily on. I find it amusing that it was someone from Microsoft, a company reviled by a lot of the open source (and particularly FOSS) community, who brought this problem to light.

Producing something needs time and manpower, and time and manpower ultimately are not free (both beer and libre).

jmclnx • 4 months ago

>Yeah okay, reverting to 5.4.6 or some version from over 2 years ago might "solve" the immediate problem that is the backdoor, but it's not going to solve anything else.

The author suggested going back to 5.3.x, and I tend to agree. From what I read, "Jia Tan" had a hand in 5.4.x; if true, I would revert to a version earlier than 5.4.x. "Jia Tan" proved to be quite skilled at obfuscation.

imjonse • 4 months ago

I am wondering if the person who sent the patch to disable systemd reliance on lzma shortly after the release of the backdoored xz knew about the plan. Maybe an agent of a competing entity?

skywhopper • 4 months ago

My gut instinct is that xz needs to be rolled back to its pre-attack state, but obviously that would probably also reintroduce some bugs and likely break some things. Still, very curious to see some analysis on the impact of doing so because, as this article points out, xz is in a critical path for lots of system-level processes.

fullstop • 4 months ago

There were symbol changes in recent releases, and things like apt link to liblzma. If liblzma were downgraded without also updating apt at the same time you could be left with a non-functioning apt.

geggo98 • 4 months ago

That’s a good start. In the long run probably three things are necessary:

1) Writing critical software in a language that protects better against such exploits. Might be Rust, Go, perhaps also C# and Nim.

2) Making reproducible builds the norm, starting from the original source code repositories (e.g., based on a Git hash).

3) Making maintainers more resilient against social attacks. This means more appreciation, fewer demands, and zero tolerance for abuse. If the maintainer can be pressured, I am at risk.

The last one is probably the most difficult.

bilekas • 4 months ago

> It feels good to not need to worry about dpkg and tar. I only plan to maintain this fork minimally, eg security fixes.

This is exactly the problem in the first place, lack of support for maintainers.

OP themselves say "I will only minimally maintain this fork". Okay, but it's so easy in hindsight to criticize what has happened.

> Hopefully Lasse Collin will consider these possibilities and address them in his response to the attack.

I can't even imagine how he's feeling these days.

dmitrygr • 4 months ago

> I can't even imagine how he's feeling these days.

None of this is his problem or fault. I see no reason he should feel anything about this. Keeping everyone safe was never his job. He wrote code and gave it away for free. That should be enough.

david_draco • 4 months ago

anyone who has ever been pickpocketed or robbed or worse will know reason and feelings are different things

bilekas • 4 months ago

> I see no reason he should feel anything about this.

Absolutely agree, but from the sounds of the emails at least, he was going through a bad time then, and nobody feels good when they realise they were taken advantage of.

ducktective • 4 months ago

I'm here wondering why big tech companies that have Too Much to Lose didn't already massively fund a project that freaking sshd depended on (through systemd).

Like, how does it hurt Google to assign 100 people to review and investigate commits of some project as basic and fundamental as a compression tool?

WesolyKubeczek • 4 months ago

Poke at any large company at all, and you'll find that their in-house critical fundamental infrastructure thing is chronically underfunded, understaffed, bug-ridden, everyone is worried but no budget is ever approved.

ghaff • 4 months ago

Within a given part of an organization, just about everyone thinks they're underresourced and understaffed.

dlandau • 4 months ago

That's exactly what they do, though: https://cloud.google.com/assured-open-source-software/docs

arp242 • 4 months ago

Google also does code reviews for some commonly used projects (or maybe that's part of the same thing? I don't know). I went through that last year with one of my Go libraries.

The idea is good, but the entire process is so bureaucracy-heavy and time-consuming that I found it both frustrating and entertaining in equal parts; like something out of Brazil (the film, not the country). So many emails, so many video meetings, so many people involved, so much talking. And all for looking at a 4,500 line Go library.

"Here's the code; just clone and look at it, and let me know if you find something"... It's not like you need my permission to do any of this *shrug*.

dlandau • 4 months ago

Brazil the country is also known for onerous bureaucracy

galangalalgol • 4 months ago

MS funded people and pipelines to analyze it. Jia Tan convinced them to disable the fuzzing that was designed specifically to find malicious behavior, using social engineering, and one MS engineer did find it.

mkj • 4 months ago

What were the MS analysis projects? The fuzzing was google-sponsored.

galangalalgol • 4 months ago

You are correct. The article I read led me to conflate the Azure engineer's valgrind-triggered investigation with oss-fuzz, which is, as you say, a Google effort.

01HNNWZ0MV43FF • 4 months ago

Very shrimple https://en.wikipedia.org/wiki/Tragedy_of_the_commons

junon • 4 months ago

sshd didn't depend on it, to be fair. Not officially at least.

Maledictus • 4 months ago

Is XZ embedded affected by any of this?

squarefoot • 4 months ago

Good question. Embedded systems are harder, sometimes impossible, to upgrade, and there are chances a backdoor in a small board inside an appliance that doesn't offer easy physical access could take years before it is found and removed.

fsniper • 4 months ago

It's like security 101. If a system has been infiltrated, you can't trust any part of it. So it's better to discard any part that has been reached or possibly affected.

Perhaps the correct action is to distrust xz/lzma, or any source code this team has had control over, and switch to alternatives. If there are no alternatives, start some.

usrusr • 4 months ago

Hiding more backdoors in the library would only increase the risk of getting discovered. Care is certainly advised on the source level, but I'd leave the paranoia to the state of systems where the code has run.

From the attackers' perspective, what they'd want to do is use their project infiltration success as little as possible, only enough to squeeze in other backdoors completely unrelated to xz. But that's all operations, not development.

iso8859-1 • 4 months ago

What are your definitions of 'system' and 'any part'? Any big company has been breached at some point. They don't throw away all their hardware every time, even though it was connected. You have to draw the line somewhere.

You're assuming a world with separate hardware and software. That's not the case any more. We have closed-source firmware running everywhere and no way to verify what's running.

fsniper • 4 months ago

Sure, that's a problem for the threat assessment process. And I totally agree that in today's world software, hardware and wetware are too interconnected. And that's another threat vector for these kinds of attacks.

In this case, is the whole git repo a threat? Or just the manually created distribution files? The threat actors' reach defines that. As time passes, we see that their reach was not very limited. They even reached other software with patches. So that assessment should be done.

nottorp • 4 months ago

> If a system has been infiltrated, you can't trust any part of it. So it's better to discard any part that has been reached or possibly affected.

systemd! let's discard systemd!

ptx • 4 months ago

They just added an example to the documentation[0] of how to implement the sd_notify protocol without linking to libsystemd, so a little bit of discarding systemd (or at least parts of it) does seem to be part of the solution.

[0] https://github.com/systemd/systemd/pull/32030/files
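For a sense of how small the protocol is, here is a minimal sketch of the readiness notification. It is not the example from the linked PR; it only assumes the documented behaviour that the service manager passes a datagram socket address in `NOTIFY_SOCKET` (a leading `@` meaning an abstract socket) and expects a `READY=1` message on it.

```python
import os
import socket


def sd_notify(state: str = "READY=1") -> bool:
    """Send a state string to the service manager; return False if not supervised."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a manager that wants notifications
    if path.startswith("@"):
        path = "\0" + path[1:]  # abstract-namespace socket (Linux only)
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

A daemon would call `sd_notify()` once its listening sockets are set up. Nothing here needs libsystemd, and so nothing here pulls liblzma into the process.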

cb321 • 4 months ago

People have been bandying about "10 lines of C", but I'm curious if you know why the protocol is not "2 characters" of shell, namely ":>PATH" (ok, ok, PATH is probably something like /run/serviceName/I-B-ready). At the user (i.e. service daemon) -level this seems much simpler. (EDIT: and systemd would unlink the file as soon as it "gets the message", of course.)

There's just a 40 year culture of using some "official" lib to implement socket protocols - even if the docs suggest you roll your own. I feel like file creation escapes that "reach-for-the-official-lib TCP/UDP/datagram" culture.

It's probably not harder for systemd either if they just use/require the Linux inotify and incorporate that into its select or poll or whatever. I mean, if they wanted to be portable to non-inotify kernels some timeouts/stat-loop would be an ok fallback that would probably be rarely-to-never needed.

It sounds like it's not even hard to add this simpler channel in after the fact just as an alternative option for `whateverd` and then deprecate the datagram one for 10 years (if they even care to).

lrem • 4 months ago

But this suggests reimplementing xz/lzma. Which would cost money. Hence, won't be done.

dspillett • 4 months ago

> But this suggests reimplementing xz/lzma.

If there is a known good copy of the repo from before the attacker had sufficient access to alter history, then that is an acceptable starting point.

From there you look at each update since and assess what they do to decide if you want to keep (as they are valid improvements/fixes) or discard them. If some are discarded, then later ones that are valid may need further work to integrate them into the now changed codebase. Similar to Debian assessing upstream security patches to the latest version to possibly back-port them to the version they have in stable, when there is significant disparity (due to a project being much faster moving than Debian:Stable).

As xz/xzutils is a relatively stable package, with very few recent changes, this should be quite practical. A full rewrite shouldn't be needed at all here.

WesolyKubeczek • 4 months ago

> If there is a known good copy of the repo from before the attacker had sufficient access to alter history, then that is an acceptable starting point.

I heard someone calling themselves “Honest Ivan” has just the thing, totally trustworthy.

jiripospisil • 4 months ago

But there are alternatives, most notably zstd.

WesolyKubeczek • 4 months ago

It's a different algorithm made for a different purpose.

kzrdude • 4 months ago

sadly, the zstd cli tool links to lzma right now (as installed by some distros) :/

fsniper • 4 months ago

a half-arsed search resulted in this half-baked rust library: https://github.com/gendx/lzma-rs

nomilk • 4 months ago

It seems detecting holes in Jia’s code could be extremely difficult. Given the stakes, as a precaution, would it be viable to simply wipe and rewrite (from scratch) the last ~2 years of commits to xz?

JetSpiegel • 4 months ago

Why not add another layer to the tinfoil hat?

How do we know "Jia Tan" is not a Facebook op to "nudge" people to switch to Zstandard?

Macha • 4 months ago

Yeah, it's left me a little disappointed in Arch in particular that they didn't follow the lead of Debian and Fedora and revert to a much older version, instead just building 5.6.1 from the git repo and basically defended it as "the hacked build script checked for dpkg/rpm anyway".

diggan • 4 months ago

Is this what you're referring to?

> Regarding sshd authentication bypass/code execution

> Arch does not directly link openssh to liblzma, and thus this attack vector is not possible. You can confirm this by issuing the following command:

> However, out of an abundance of caution, we advise users to remove the malicious code from their system by upgrading either way. This is because other yet-to-be discovered methods to exploit the backdoor could exist.

https://archlinux.org/news/the-xz-package-has-been-backdoore...

I'm not finding anything from Arch mentioning dpkg/rpm, the linked article above is the latest article about the xz compromise from the Arch homepage.

Macha • 4 months ago

- https://bbs.archlinux.org/viewtopic.php?pid=2160841#p2160841

- https://gitlab.archlinux.org/archlinux/packaging/packages/xz...

- There were some comments in the middle of the giant openwall mailing list thread I can't find now because they're in the middle of 30,000 replies

antoinealb • 4 months ago

Arch Linux is not vulnerable to this specific attack, which requires sshd to be linked to liblzma. This link is provided by out-of-sshd patches, that Arch does not apply to their build.

cowsandmilk • 4 months ago

The point here is there is uncertainty in all commits by Jia Tan, Arch’s focus is on this specific hack, but are there other vulnerabilities in the hundreds of commits to the git repo from the same author?

skywhopper • 4 months ago

But as this article points out, liblzma is used in other crucial processes, and is generally trusted, often probably being run as root. The known bad actor contributed lots of code to xz that isn’t involved in the SSH backdoor. To assume it’s all innocuous would be truly foolish.

TheTxT • 4 months ago

Arch tries to always be as current as possible, for better or for worse. So this definitely makes sense for arch

k3vinw • 4 months ago

Wow. So for the xz package it looks like they changed the upstream to this git repo (edit: the personal repo of the original maintainer, Lasse Collin), which still contains Jia Tan’s commits: https://git.tukaani.org/?p=xz.git

tl;dr they re-enabled the sandboxing previously disabled by Jia Tan.

rpigab • 4 months ago

What if over 80% of all open source projects are secretly sleeper agents for various malicious actors, states, terrorists and whatnot, and they pretend to give us precious software updates for free so they can attack later?

What if proprietary software is running the same way, except they don't even give you free updates and you can't audit the source code and have to trust them when they push updates?

What if your mother gave birth to you just so she can slap you in the face when you're 30?

Yeah, we can go very far, but in this moment, xz is under so much scrutiny that in 2-4 weeks, I'd trust it with my life unless the big orgs looking at it issue more reverts (hence, the delay). So if there are issues, they're everywhere else.

rfoo • 4 months ago

Nit-picking but, eh, png does not use lzma at all.

> PNG compression method 0 (the only compression method presently defined for PNG) specifies deflate/inflate compression with a sliding window of at most 32768 bytes. Deflate compression is an LZ77 derivative used in zip, gzip, pkzip, and related programs.

jwilk • 4 months ago

I don't think joeyh wanted to imply that PNG uses liblzma. PNG is just a convenient place to put opaque binary stuff that'd trigger an xz compression bug.

rfoo • 4 months ago

I still don't understand how that would work. The post said:

> Let's say they want to target gcc. Well, gcc contains a lot of documentation, which includes png images. So they spend a while getting accepted as a documentation contributor on that project, and get added to it a png file that is specially constructed, it has additional binary data appended that exploits the buffer overflow. And instructs xz to modify the source code that comes later when decompressing gcc.tar.xz.

It says "when decompressing", and I would imagine that such a bug needs a specifically constructed lzma stream to trigger. If you want to do it by changing a source file (a png here), you need to make "second-order" bugs: i.e. the compressor needs to output a broken lzma stream which, when later decompressed, would exploit (not simply cause) a memory corruption bug. This is too brittle [1] and is very likely to be detected.

[1] Disclaimer: I'm not an expert in writing backdoors. I consider myself reasonably competent for writing exploits, and I've written deliberately buggy programs (for CTFs) before.

j16sdiz • 4 months ago

Consider this backdoor:

- if the decompressed stream contain a magic keyword, run the rest of the file as x86-64 binary.

Now you just need any opaque binary file to host the payload. A PNG works fine, because most decoders don't care about extra bytes at the end.
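The "extra bytes at the end" part is easy to check for. A sketch, assuming the standard PNG layout (8-byte signature, then chunks of length, type, data, CRC, ending with `IEND`); the helper names and the tiny 1x1 sample image are made up for illustration, and CRCs are not validated.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def trailing_bytes(png: bytes) -> bytes:
    """Return whatever follows the IEND chunk (empty for a clean file)."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if kind == b"IEND":
            return png[pos:]
    raise ValueError("no IEND chunk found")


# Minimal 1x1 grayscale image built by hand for the demonstration.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
clean = (PNG_SIGNATURE + chunk(b"IHDR", ihdr)
         + chunk(b"IDAT", zlib.compress(b"\x00\x00")) + chunk(b"IEND", b""))

print(trailing_bytes(clean))
print(trailing_bytes(clean + b"hidden"))
```

A repository check along these lines would flag appended data, though it does nothing about payloads hidden inside otherwise valid chunks.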

joeyh • 4 months ago

When decompressing gcc.tar.xz, which contains foo.png followed by main.c, the decompressor is instructed by the hidden data in the png how to alter the code.

GoblinSlayer • 4 months ago

The build script decoded precompiled backdoor code from a binary test file that wasn't really an archive, but was encrypted with a Caesar cipher. Any blob can be used like this as a trivial steganographic container.

rfoo • 4 months ago

Any code that extracts data from such a blob would look very suspicious.

lakomen • 4 months ago

I'm disappointed, to put it very mildly, in how Arch Linux handled the matter. They still use version 5.6.1 and assume that switching from GitHub to the repo hosted by Lasse fixes the issue. They say "our sshd isn't compromised". But as the author of this article wrote, who knows what else might be affected? There's a post on the Arch Linux forum which was closed by ewaller, an administrator account, with a weird reason: that the thread was only meant to inform people. But it got locked just when people started calling out the malpractice of the Arch Linux maintainers.

To me, this is very suspicious.

lispm • 4 months ago

Why not get rid of xz completely? How about using a simpler piece of software, which could be maintained by more people?

adql • 4 months ago

> Why not get rid of xz completely?

... you mean the compression package used by most big distros to make their packages? Do you really need to ask?

> How about using a simpler piece of software, which could be maintained by more people?

It's not a complex piece of software. The lib itself is ~15k lines of code.

It does its job well and it needs little work. It didn't have any outstanding bugs lingering unfixed for years.

Complexity has nothing to do with the problem; it's just... uninteresting enough that there is no reason to contribute.

imjonse • 4 months ago

Complexity is one of the root causes of the problem. The script was added via messy autoconf scripts inscrutable to most people. That is decades old tech which has alternatives.

lispm • 4 months ago

I've seen for example here a lot of issues raised:

https://www.nongnu.org/lzip/xz_inadequate.html

simoncion • 4 months ago

> ... you mean compression package used by most big distros to make their packages? Do you really need to ask

Because of the existence of gzip, zlib, bzip2, and many others, it's trivial to drop xz.

So yeah, why not get rid of xz completely?

WesolyKubeczek • 4 months ago

From reading comments here alone, I can predict the following (unhealthy) effects on software at large:

First, huge scaremongering like this writeup. It all hinges on a notion that who knows who is writing walls of who knows what code, and the code and its purpose is absolutely impenetrable, inscrutable by anyone else at all. It's not true, as evidenced by analyses of this hack alone and by the reverse engineering community at large. Of course, it requires doing what Andres has done, meaning, rolling up your sleeves and actually reading the source and trying to understand it, and not just hoping someone else will do it. Whatever one person tangles, another can always untangle.

Then, there is going to be a witch hunt and people jumping on any innocent change with pitchforks (already happening, satanic panic over ibus is in the nearby discussion). The code review theater will be in full swing. Previously, people were conserving their brain energy (or masking their incompetence) by skimming walls of changes and stamping LGTM on them based on how well-formed they had been, how long they had known the author, and how the builds were not broken. Now it will become extremely hard to get any changes done at all, because the new mental shortcuts will be: too long, didn't approve; explanations too complex; just plain Reviewer Says No; and so on. Any sloppy PR denial will be followed by patting oneself on the back: look at me, I have just thwarted a KGB agent.

Some people who think they know a lot, without basing such assertions in reality, will try to become overnight "wonder experts in security", barking at every shell script they don't understand, or every piece of generated text, like generated Makefiles.

Vulnerabilities akin to CVE-2022-3786 and CVE-2022-3602, which got introduced by writing a whole new email address parser from scratch (which IIRC also got checked in as a whole wall of code at once, and I read someone blaming exactly this as the culprit), will lead to questioning by police at least once.

Automated codebase scans with bogus reports, like the ones Daniel Stenberg wrote about in [0], will become even more common. Everyone will just jump on any "unsafe" function call without actually understanding its context, and will keep pestering authors to "make a fix" because potentially something (gasp!) may happen.

Later everyone will get tired of this charade, and everything will come back to "normal", probably with added processes and red tape to make FOSS maintainership even more of a liability than it is today. Nothing will be done at all to make it less of a burden.

[0] https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-f...

cletus • 4 months ago

The US spends $30-60 billion a year on agricultural subsidies [1]. This is controversial for all the obvious reasons. Without subsidies we'd waste less food, but overproduction of food is an intentional objective of these programs. Why? Because if there's a major drought or crops are lost to ice or snow or flooding, Americans won't starve. It's why we have things like the US government having a reserve of over a billion pounds of cheese [2].

People not starving is a national security interest.

It's getting to the point where the software we rely on is also a national security interest. The US government should be paying to maintain and improve this software. Security risks in Linux and core packages threaten to shut down key infrastructure.

Paying developers to maintain open source projects that are actually used could be an incredibly effective use of tax dollars.

[1]: https://usafacts.org/topics/agriculture/#581290c9-a960-49aa-...

[2]: https://www.deseret.com/2022/2/14/22933326/1-4-billion-pound...

Gormo • 4 months ago

It's not valid to regard the potential of negative consequences in any sphere of human life as a "national security risk".

It's not valid to presume that the only way to mitigate risks is through top-down political intervention.

And it's absolutely not valid to presume that making FOSS communities dependent on political subsidies would not have much worse and much longer term consequences than the problems we are trying to mitigate.

In fact, it's entirely possible that the political intermediaries who control the purse-strings would have an even greater capacity to introduce their own backdoors or otherwise compromise security in pursuit of their own ambitions.

What you're proposing here might well represent trying to keep out one set of threat actors by handing the keys to the castle over to another set of threat actors. And this would be on top of all the other problems it would cause: convergence toward homogeneous monocultures, project priorities being distorted by political incentives, vested interests using political influence to suppress competition from FOSS projects, etc.

It's worth pointing out that the swift detection and remediation of the xz backdoor by the community, almost immediately after the threat actor pulled the trigger on their two-year-long con, represents a resounding success of the FOSS "many eyes" model, and it's not clear what politicians throwing money around would add to the equation.

acdha • 4 months ago

Your argument falls apart when you remember that the U.S. federal, state, and local governments are critically dependent on open source software – not just directly in things like Linux servers or Chrome/Edge/Firefox but also open source components used in appliances or compiled into commercial software. It is quite reasonable to argue that even a narrow approach of improving only components they run would be justifiable on those grounds and it’d be a tiny part of, say, NIST’s budget to fund developers directly or to pay some group like the Linux or Apache foundations to support an open source maintainership team.

Gormo • 4 months ago

Organizations of all types -- government and otherwise -- are dependent on a wide variety of externally-sourced solutions for mission-critical operations. They can and do develop their own processes for testing and vetting potential solutions against their own criteria for performance, reliability, maintainability, and security.

Government orgs can and do contribute the results of the work they do in this regard upstream to FOSS projects. This has never not been the case, and when government-employed developers release the work they do to meet their own security requirements to the broader community, everyone benefits.

But this is drastically different from the scenario that the preceding poster was proposing, in which government officials would assume effective responsibility for the entire project, not just act as participants in the FOSS community.

That proposal would invert the situation, and change it from government devs adhering to the norms and conventions of the community to the community adhering to the rules and priorities defined by the government, which is where the negatives I outlined above would come into play.

illiac786 • 4 months ago

I fail to see how you addressed the previous comment here. You ignored 80% of the points made.

Donations are fine, but they need to be “no strings attached”, otherwise I agree with the GP that the risk of weaponising FOSS may become even greater.

I do agree that the US government is critically dependent on FOSS by now though. But “why throw money at it if it ain’t broken?” is the prevalent mentality, especially when everyone can have their own definition of “broken”…

acdha • 4 months ago

unethical_ban • 4 months ago

It's worth pointing out that we got lucky with a savvy tester stumbling on the backdoor in the dark. It wasn't anyone's job to find this backdoor, and arguably if it had been designed just a little better, no one would have noticed.

I wouldn't have gov't be maintainers of the main repos, but either be assigned to vetting critical repos, or mirroring them and co-maintaining copies endorsed for security (if I agreed gov't should be involved).

The ultimate question is: are our critical systems (kernel, systemd, core userland) safe *enough* against future similar attacks that we are okay trusting the global economy to another lucky valgrind test?

You incorrectly implied that a national security risk is "the potential of negative consequences in any sphere of human life". If we didn't get lucky, this could have been an economic and human catastrophe. That's not an inconvenience.

Gormo • 4 months ago

> It's worth pointing out that we got lucky with a savvy tester stumbling on the backdoor in the dark.

I believe in Bayesian probability a lot more than I believe in luck. The fact that a random sysadmin investigating performance issues was able to rapidly unravel this whole thing is at least a minimal indicator that either (a) this particular attack was largely incompetent, and any of the many "savvy testers" in the community would have uncovered it in short order, but more sophisticated attacks might remain undetected, or (b) this was a sophisticated attack, and the community is generally resilient to this form of infiltration.

Spooky23 • 4 months ago

I think we’re lucky that the compromise was ham fisted enough to be trivially detectable. I’m sure other, better attacks exist, implemented by smarter actors.

Software is a national security threat because IT is increasingly dependent on computing. Plenty of civilian tech is being used in Ukraine to good effect.

Smart stakeholders are going to watch where stuff is used and target the supply chain. It may be Russians, Mideast stakeholders, or domestic extremists.

Software is to 2024 what a truck bomb was in 1994.

chjj • 4 months ago

I'm not sure why people keep misidentifying the problem as "lack of funding". Lasse Collin was doing fine as a maintainer up until Jia Tan showed up. He was psy-op'd into believing there was a crowd of angry people eagerly awaiting a new release when there wasn't. No real person was unhappy with the way he'd been maintaining xz.

Cthulhu_ • 4 months ago

Funding aside, single individuals being responsible for software is not a good thing, see bus factor.

Gormo • 4 months ago

In fact, there was no issue with Lasse Collin maintaining xz as a single individual, and creating a false impression to the contrary was the primary tactic used by the antagonist to gain access to the project.

rebolek • 4 months ago

Fortunately there was Jia Tan to help him. /s

diggan • 4 months ago

I'm not saying money would absolutely fix the issue, but I could also see it helping. If Collin was approached by a government that said "Hey, the thing you're maintaining is important, if you want, we'll fund 2 additional full-time maintainers that can contribute based on your guidance", maybe Collin would be in a better position to ensure the Jia Tan contributions were genuine and proper.

chjj • 4 months ago

There's probably a number of things that could improve the situation. Mindlessly throwing money and government at a problem almost never improves things.

Which government bureaucracy decides how much Lasse Collin should be paid? Based on what metrics? This is a giant can of worms.

rebolek • 4 months ago

If I were the maintainer and was approached by the government telling me, "hey, here are two folks who're going to be two new full-time maintainers and we're funding them", I certainly would be worried.

diggan • 4 months ago

Similarly, if the government approached me and said "Here, embed this black-box binary into your build process", I'd be worried too. But luckily, no one suggested this, nor what you described :)

irdc • 4 months ago

Getting help for mental health issues is a whole lot easier if you have the money.

lenerdenator • 4 months ago

IIRC the xz package was maintained by an individual in a place with at least some socialized healthcare, but correct me if I'm wrong (I'm not trying to be snarky here, please do).

dboreham • 4 months ago

Perhaps socialized medicine is the solution.

Gormo • 4 months ago

Socialized or politicized?

transportgo • 4 months ago

But if he had been paid, he might not have given up control.

chjj • 4 months ago

Okay, sure, but rather than trying to solve a problem by throwing money at it (or worse, trying to solve it with government intervention), maybe it's better to think of other mitigations.

For example, maybe developers need to be made aware of potential psyops by attackers (the publicity surrounding this issue probably made some progress on that front).

udev4096 • 4 months ago

He never mentioned any financial problems; it was more about his mental health.

mwcampbell • 4 months ago

throw4847285 • 4 months ago

Agricultural subsidies are a terrible example because they're so corrupt. The cheese reserve doesn't exist because it's gonna protect Americans from starvation. It exists because the dairy industry massively overproduces. Only a small amount gets converted into cheese. Millions of gallons just get dumped[1]. The dairy industry is entirely unsustainable, especially given that the government keeps the price of milk low because it's considered vital for children's development, which is dairy lobby propaganda with no basis in fact. And these massive subsidies don't even help small farmers. The only way to keep up with the absurd artificial demand is massive factory farming of genetically engineered super cows bred in a lab to produce as much milk as possible, with a life expectancy a third that of a normal cow.

[1]: https://www.wsj.com/articles/americas-dairy-farmers-dump-43-...

(I guess this isn't really relevant to the OP, but I recently read a fantastic book about the dairy industry and now I can't shut up).

jart • 4 months ago

Why are you advocating for the government to take control of open source projects? Is anyone here naïve enough to believe that, after being persuaded of the national security interests of these projects, they're just going to hand over money to the random people who maintain them to keep doing what they're doing? The U.S. Govt isn't Santa Claus. Look at what they did to the farming industry. Most people used to be farmers and now there aren't many farms at all, since most of it's being done by big companies. Applying that idea to open source means the government would use regulation to prevent community developers from having their software used in production, and all future work on open source code would have to be done by engineers at big tech companies. In many ways that's already the de facto system we have today. So if you get the government involved, it'll just become law, and the lone wolves in open source who big tech doesn't want to hire will be fined, sent to jail, etc. Read "Everything I Want To Do Is Illegal" by Joel Salatin.

vb-8448 • 4 months ago

It seems very stupid to me.

How long will it be before the government starts pressuring these maintainers to do the things the government wants?

Gormo • 4 months ago

For example, introducing their own backdoors.

sofixa • 4 months ago

The EU realised this like a decade ago, and has had a couple of programs around it that have been small steps in the right direction, but not enough: EU-funded bug bounty programs, grants, and mandates that the EU should use specific open source tooling for specific needs (e.g. VLC).

lenerdenator • 4 months ago

Doesn't .gov do a bit of that already?

Part of the problem is, no one knows who's really working on what. If you asked some of the most knowledgeable people in the GNU/Linux ecosystem who maintained xz before last week, there's a real chance they couldn't have told you, not without some investigating first. And that would only have gotten them a name, not the maintainer's personality, resources situation, etc.

There needs to be a census of sorts over the stuff that goes into the GNU/Linux ecosystem to see who needs what.

sylware • 4 months ago

It may mean paying for the maintenance of reference software, network protocols, and file formats that are minimal (including the SDK) but able to do a good enough job, and ultra stable over time.

That excludes nearly all software out there (even open source; closed source is de facto excluded), because most "developers" are only a bunch of scammers heavy on planned obsolescence.

That includes software "maintained" by the academic sector, like you have with the MIT Media Lab and the nice "donations" from Bill Gates (probably to steer it the way he wanted; I would not be surprised if this were not unrelated to C++ in GCC... one of the biggest mistakes in open source software), as revealed in the Epstein files.

To say the least, it is far from ez. If you are an _honest_ dev, you know it is excruciatingly hard to justify a permanent income.