
Does X cause Y? An in-depth evidence review (2021)

232 points | 28 days ago | cold-takes.com
levocardia27 days ago

Seems very dismissive and unaware of recent advances in causal inference (cf other comments on Pearl). Putting "throw the kitchen sink at it" regression a la early 2000s nutritional research (which is indeed garbage in garbage out) in the same category as mendelian randomization, DAGs, IP weighting, and G-methods is misleading. I do worry that some of these EA types dive head-first into a random smattering of google scholar searches with no subject matter expertise, find a mess of studies, then conclude "ah well, better just trust my super rational bayesian priors!" instead of talking with a current subject matter expert. Research -- even observational research -- has changed a lot since the days of "one-week observational study on a few dozen preschoolers."

A more general observation: If your conclusion after reading a bunch of studies is "wow I really don't understand the fancy math they're doing here" then usually you should do the work to understand that math before you conclude that it's all a load of crap. Not always, of course, but usually.

Recursing27 days ago

> I do worry that some of these EA types dive head-first into a random smattering of google scholar searches with no subject matter expertise, find a mess of studies, then conclude "ah well, better just trust my super rational bayesian priors!" instead of talking with a current subject matter expert. Research -- even observational research -- has changed a lot since the days of "one-week observational study on a few dozen preschoolers."

EA types spend a lot of time talking with subject matter experts, see e.g. https://www.givewell.org/international/technical/programs/vi...

Xen9 27 days ago

This is the problem of gnosis vs doxa. Regardless of one's intelligence, whether the priors (or rather posteriors, since at least I as a Bayesian thinker would really look at paper X, then look at who made it (and how much funding they had), where, and the context (is this obvious marketing?), and compute from an overly safe approximation that it's BS; that actual probability is what you call priors) actually are reliable is something you cannot know, because you cannot know what you cannot know (& Postman).

So then you are left with your hands in the air, trying to get others to internalize a discussion culture where you can, as a normal thing, claim to be (or are by default assumed to be) probably superficial.

Even if someone knows a lot about foundational things like philosophy & what they want etc., statements have a certain feel to them regardless of mathematical truth, and at some point you will do the cost-benefit calculation that you might be superficial, but that you should learn about other topics first, because you know you can gain more valuable information from them in less time.

---

Pearl's causality, as far as I can see, is best modelled with cdr/car + NBG as embedded agency foundations + computability in the mix + signal theory (time as discrete evolution from one state in the chain to another; signal theory is relevant when you have multiple agents, or the environment & PARTS of you run at different clocks, or something more complex), i.e. as part of formal embedded agency. It doesn't feel too meaningful without that, except epistemologically (close to "what can we know?"-type questions), where causality might be a good lens, especially to filter inquiries.

soerxpso27 days ago

> If your conclusion after reading a bunch of studies is "wow I really don't understand the fancy math they're doing here" then usually you should do the work to understand that math before you conclude that it's all a load of crap.

While this is true, putting the onus on the reader to understand a lot of advanced math makes it easy to avoid scrutiny by increasing the complexity of your math such that the only people who are ever going to be able to critique you are the intersection between PhD-level mathematicians and experts in the field your paper actually pertains to. Anyone can just say whatever they want and assure you that they must be right because they know more math than everyone else who's interested in that problem.

Instead of, "Understand the math before you conclude that it's all a load of crap," I would say, if it's an unreasonable level of complexity for the particular problem and you can't find a large body of other papers doing something similar with the same problem, just ignore it.

t_mann27 days ago

We don't even need to go into the 2000s. The author openly dismisses the Generalized Method of Moments (published in 1982 by Lars Hansen [0]) as a 'complex mathematical technique' that he's 'guessing there are a lot of weird assumptions baked into', the main evidence being that he 'can't really follow what it's doing'. He also admits that he has no idea what control variables are or how to explain linear regression. It's completely pointless trying to discuss the subtleties of how certain statistical techniques try to address some of his exact concerns; it's clear that he has no interest in listening, won't understand, and will just take that as further evidence that it's all just BS. This post is a rant best described as Dunning-Kruger on steroids. I have no idea how this got 200 points on HN and can only advise anyone who reads here first to spare themselves the read.

[0] edit: Hansen was awarded the Nobel Memorial Prize in Economics in 2013 for GMM, not that that means it can't fail, but clearly a lot of people have found it useful.

MichaelDickens27 days ago

I think you are significantly misrepresenting what the author said. He didn't say he has no idea what control variables are. What he said is:

> The "controlling for" thing relies on a lot of subtle assumptions and can break in all kinds of weird ways. Here's[1] a technical explanation of some of the pitfalls; here's[2] a set of deconstructions of regressions that break in weird ways.

[1] https://journals.plos.org/plosone/article?id=10.1371/journal...

[2] https://www.cold-takes.com/phil-birnbaums-regression-analysi...

To me this seems to demonstrate a stronger understanding of regression analysis than 90+% of scientists who use the technique.

groby_b27 days ago

> He didn't say he has no idea what control variables are

He did say exactly that.

> They use a technique called regression analysis that, as far as I can determine, cannot be explained in a simple, intuitive way (especially not in terms of how it "controls for" confounders).

That's about as /noideadog as you can get.

roenxi27 days ago

That is unfair, he says...

> "generalized method of moments" approaches to cross-country analysis (of e.g. the effectiveness of aid)

Which is an entirely reasonable criticism. GMM is a complex mathematical process; wiki suggests [0] that it assumes data generated by a weakly stationary ergodic stochastic process of multivariate normal variables. There are a lot of ways that real-world data on aid distribution might be non-ergodic, non-stationary, generally distributed or even deterministic!

Verifying that a paper has used a parameter estimation technique like that properly is not a trivial task even for someone who understands GMM quite well. A reader can't be expected to follow what the implications are from reading a study; there is a strong element of trust.

[0] https://en.wikipedia.org/wiki/Generalized_method_of_moments

t_mann27 days ago

Every statistical model makes assumptions. As a general rule, the more mathematically complex the model, the fewer (or weaker) assumptions are made. That's what the complexity is for. So the criticism 'it looks complex, so the assumptions are probably weird' doesn't make sense.

If as a reader you don't understand a paper (that's been reviewed by experts), then the best thing to conclude is that you're not the target audience, not that the findings can be dismissed.

roenxi 27 days ago

+1

hn_throwaway_99 27 days ago

Yeah, I found this article to be annoying AF, because it seemed to fall into the same traps that he accuses these study authors of falling into in the first place. It seemed by the end of it he was just trying to yell "correlation is not causation!" but in an even smarter, "I am very smart" sort of way.

E.g. I certainly found myself agreeing with his points about observational studies, and there are plenty of real-world examples you can point to where experts have been led astray by these kinds of studies (e.g. alcohol consumption recommendations, egg/cholesterol recommendations, etc.)

But when he talked about his reservations re "the wheat" studies, they seemed really weak to me and semi-bizarre:

1. Regarding "The paper doesn't make it easy to replicate its analysis." I mean, no shit Sherlock? The whole point is that it would be prohibitively expensive or unethical to carry out these real experiments, so we rely on these "natural" experiments to reach better conclusions.

2. "There was other weird stuff going on (e.g., changes in census data collection methods), during the strange historical event, so it's a little hard to generalize." First, this seems kind of hand-wavy (not all natural experiments have this issue), but second and more importantly, of course it's hard to "generalize" these kinds of experiments because their value in the first place is that they're trying to tease out one specific variable at a specific point in time.

3. The third bullet point just seemed like it could be summarized as "news flash, academics like to endlessly argue about shit."

I think the fundamental problem when looking for "does X cause Y", is that in the real world these are complex systems: lots of other things cause Y too (or can reduce its chances), so you're only ever able to make some statistical statement, e.g. X makes Y Z% more likely, on average. But even then, suppose there is some thing that could make Y Z% more likely among some specific sub-population, but make it some percent less likely in another sub-population (not an exact analogy but my understanding is that most people don't really need to worry about cholesterol in eggs, but a sub-population of people is very reactive to dietary cholesterol).

Basically, it feels like the author is looking for some definitive, unambiguous "does X cause Y", but that's not really how complex systems work.

mnky9800n28 days ago

In my own research we are investigating how fluids cause changes in rocks that allow for mineralization of CO2, and we have such problems with confounding variables (not terribly unique, I suppose). One thing we note is that, well, fluid comes from the sky and goes into the ground. Thus, the deeper you go, the less fluid there is, since the pathways from the sky to deep underground become more sparse, and higher pressures are needed to enter these regions, either to overcome capillary pressures in existing fracture zones or to literally break the rock (which is highly unlikely using naturally occurring pressures from fluids from the sky). And so literally everything in all the data sets correlates with depth in some way.

But in what way? Well, this has many dependencies as well: did the rock that absorbed some of the fluids grow in volume because of a chemical change? Are the fluid pathways currently connected? What kind of rock is absorbing the fluids? Are microbes in the fluid absorbing contents from the fluid that would otherwise be used for rock changes? And so you are left with this giant pile of data (tens of terabytes) without a clear connection between fluid and rock interactions, except that there is less fluid from the sky the deeper you go into the rock. This is obvious, but it is also rather unhelpful when trying to understand the other processes that exist.

Of course you might say: have you tried detrending your data? The answer is yes, and to no effect. The simple truth is that this depth dependency interacts in different ways with different systems, and there is no easy way to figure out how it does so for each sub-system, such as the fluid-rock chemistry interactions, the rock fracture mechanics, the subsequent methane and hydrogen that is produced and likely consumed by microbes, etc.

whatshisface27 days ago

Have you tried checking to see if the depth dependency is different in different large-scale geological regions?

mnky9800n27 days ago

That is a really good idea. I have considered it before. A core issue is that most data across different regions is collected and stored in its own unique way (as far as drilling cores go, anyway). So I decided it might be better to develop some data pipelines for this region first, then try to refactor those pipelines to accept data from a lot of different regions. So it's on my TODO list, but it is a lot of work so I haven't gotten to it yet.

Kaotique28 days ago

I think a lot of these kinds of studies are not really about objectively studying a phenomenon but about trying to prove a predetermined point. The study is designed and adjusted until it proves what it should prove. Then it's wrapped in a nice news headline which does away with all the details and subtleties, and used for political or economic gain. Reproducing the results is not interesting and not funded. Other studies then use these results as sources to stack the house of cards even higher. I think this does a lot of harm to science as a whole, because it leads a lot of people to disregard all scientific results.

nkoren28 days ago

Yeah, sadly, I think it's worth having "ulterior motives" on the list.

One of the first times I got interested in reading medical studies was when I saw a bunch of headlines announcing that a randomized controlled trial had proved that echinacea was ineffective for treating respiratory problems. This surprised me, because I'd always been a dogmatic drinker of echinacea tea whenever I had a cold, and had thought that it helped. But then again, I come from a culture of damn dirty hippies, so I was open to being wrong about it. Rather than rely on the headlines, I decided to dig up the study itself.

Here's what the study actually found: that rubbing an echinacea-infused ointment on your wrists has no effect on respiratory health.

Er... yeah, no shit, Sherlock. Literally nobody uses echinacea that way. You've just falsified a total straw-man of a hypothesis, and based on the number of headlines generated off the back of this, I think it's reasonable to presume there was some kind of funded apparatus for disseminating that bogus result.

Ever since then, I've learned not to trust the headlines when it comes to trials, reserving judgment until I've looked at the methodology. When I do, a lot come up short.

kridsdale1 27 days ago

I’ve gotten in the habit of sending study pdf files to Claude, having it write its own Abstract and headline from the rest of the content, then comparing those to the “organic” Abstract and headline.

Xcelerate28 days ago

This is exactly what’s going on in many situations. For any proposed study, you can ask the question “Is there a possible outcome of this study that would have a strong emotional effect on someone?” If the answer is “yes”, then I’d say it’s more likely than not that the study’s results are already compromised in some subtle way.

HPsquared28 days ago

Like a lot of other noble pursuits, scientific enquiry can be corrupted by money.

uniqueuid28 days ago

Oh what fun to discover the horror of causality!

For some areas of research, truly understanding causality is essentially impossible - if well-controlled experiments are impossible and the list of possible colliders and confounders is unknowable.

The key problem is that any causal relation can be an illusion caused by some other, unobserved relation!

This means that in order to show fully valid causal effect estimates, we need to

- measure precisely

- measure all relevant variables

- actively NOT measure all harmful (i.e. falsely correlated) variables

I heartily recommend The Book of Why [1] by Pearl and Mackenzie for a deeper reading, and the "haunted DAG" in McElreath's wonderful Statistical Rethinking.

[1] https://en.wikipedia.org/wiki/The_Book_of_Why

kqr28 days ago

Pearl's Causality is very high on my "re-read while making flashcards" list. It is depressing how hard it is to establish causality, but also inspiring how causality can be teased out of observational statistics provided one dares assume a model on which variables and correlations are meaningful.

uniqueuid28 days ago

"provided one dares assume ..." - that's a great quote which I'll steal in the future if you allow!

Most things we learn about DAGs and causality are frustrating, but simulating a DAG (e.g. with lavaan in R) is a technique that actually helps in understanding when and how those assumptions make sense. That's (to me) a key part of making causality productive.
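
For readers without R/lavaan handy, here is a minimal sketch of the same simulation idea in plain Python/NumPy, on made-up toy data (all variable names and coefficients are invented for illustration): generate data from a known DAG, then see what adjusting for different variables does. Adjusting for a common cause (a fork) recovers the true effect, while adjusting for a mediator (a chain) makes a real effect vanish.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    def ols(y, *cols):
        # OLS with an intercept; returns the coefficient on the first predictor
        X = np.column_stack([np.ones(len(y)), *cols])
        return np.linalg.lstsq(X, y, rcond=None)[0][1]

    # Fork (confounder): Z -> X, Z -> Y, plus a true effect X -> Y of 1.0
    Z = rng.normal(size=n)
    X = 2 * Z + rng.normal(size=n)
    Y = 1.0 * X + 3 * Z + rng.normal(size=n)
    print(ols(Y, X))       # ~2.2: biased by the confounder Z
    print(ols(Y, X, Z))    # ~1.0: adjusting for the confounder recovers the effect

    # Chain (mediator): X -> M -> Y, total effect of X on Y is 1.0, entirely via M
    X2 = rng.normal(size=n)
    M = X2 + rng.normal(size=n)
    Y2 = 1.0 * M + rng.normal(size=n)
    print(ols(Y2, X2))     # ~1.0: the total effect
    print(ols(Y2, X2, M))  # ~0.0: adjusting for the mediator "controls away" the effect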

currymj28 days ago

even if you hit all the assumptions you need to make Pearl/Rubin causality work, and there is no unobserved factor to cause problems, there is still a philosophical problem.

it all assumes you can divide the world cleanly into variables that can be the nodes of your DAG. The philosopher Nancy Cartwright talks about this a lot, but it’s also a practical problem.

shadowgovt27 days ago

And this is even before we get into the philosophical / epistemological questions about "cause."

You can make the argument, from correlative data, that bridges and train tracks cause truck accidents. And more importantly, if you act like they do when designing roadways, you actually will decrease truck accidents. But it strains the common-sense meaning of causality to claim that a stationary object is acting upon a mobile object...

KempyKolibri28 days ago

I’ve heard Miguel Hernán’s “What If” is also excellent, but not got round to reading it.

uniqueuid28 days ago

Yes it's great!

There is also this great book on causality in ML, but it's a much heavier read:

Chernozhukov, V., Hansen, C., Kallus, N., Spindler, M., & Syrgkanis, V. (2025). Causal Inference with ML and AI.

levocardia27 days ago

For a lighter introduction to Hernán’s ideas check out:

"The C-Word: Scientific Euphemisms Do Not Improve Causal Inference From Observational Data" (https://pmc.ncbi.nlm.nih.gov/articles/PMC5888052/)

"Does water kill? A call for less casual causal inferences" (https://pmc.ncbi.nlm.nih.gov/articles/PMC5207342/)

dan_mctree27 days ago

And even if you do know there's causality (eg: the input variable X is part of software that provides some output Y), the exact nature of the causality can be too complex to analyze due to emergent and chaotic effects. It's seldom as simple as: an increase in X will result in an increase in Y

alexpetralia28 days ago

I have reflected on a good definition of causality and would be curious if anyone has thoughts or critiques of it. I am repasting part of my essay below. (https://alexpetralia.com/2023/02/25/statistics-only-gives-co...)

--

Can we nevertheless extract causality from correlation?

I would argue that, theoretically, we cannot. Practically speaking, however, we frequently settle for “very, very convincing correlations” as indicative of causation. A correlation may be persuasively described as causation if three conditions are met:

Completeness: The association itself (R²) is 100%. When we observe X, we always observe Y.

No bias: The association between X and Y is not affected by a third, omitted variable, Z.

Temporality: X temporally precedes Y.

kqr28 days ago

I feel like you have this backwards. In the assignment Y:=2X, each unit of Y is caused by half a unit of X. In the game where we flip a coin at fair odds, if you have increased your wealth by 8× in 3 tosses, that was caused by you getting heads every toss. Theoretically establishing causality is trivial.

The problem comes when we try to do so practically, because reality is full of surprising detail.

> No bias: The association between X and Y is not affected by a third, omitted variable, Z.

This is, practically speaking, the difficult condition. I'm not so convinced the others are necessary (practically speaking, anyway) but you should read Pearl if you're into this!

uniqueuid28 days ago

You are missing one crucial additional condition:

- No colliders have been included in the analysis, which would introduce an appearance of causality that does not exist

dan_mctree27 days ago

You probably also need at least:

- Y does not appear when X does not

- We need an overwhelming sample size containing examples of both X and not X

- The experiment and data collection are trivially repeatable (so that we don't need to rely on trust)

- The experiment, data collection and analysis must be easy to understand and sensible in every way, without leaving room for error

And as another commenter already pointed out: You can't really eradicate the existence of an unknown Z

istjohn27 days ago

Lightning doesn't cause fire because I have observed fire created by matches under a blue sky.

(I've also observed lightning that was not followed by fire. We really need to stop wasting money on lightning rods.)

HPsquared28 days ago

Ruling out all Z is the almost-impossible part. It's hard to prove a negative, especially with incomplete information.

stonemetal12 28 days ago

What of the double slit experiment, where observation changes the outcome? Do we call observation the cause of the outcome?

uniqueuid28 days ago

In general you assume DAGs, i.e. non-cyclical causality. Cyclical relations must be resolved through distinct temporal steps, i.e. u_t0 causes v_t1 and v_t1 causes u_t2. When your measurement precision only captures simultaneous effects of both u on v and v on u you have a problem.

QuantumGood27 days ago

That colliders and confounders have technical definitions is not known by some:

------------------ Confounders ------------------

A variable that affects both the exposure and the outcome. It is a common cause of both variables.

Role: Confounders can create a spurious association between the exposure and outcome if not properly controlled for. They are typically addressed by controlling for them in statistical models, such as regression analysis, to reduce bias and estimate the true causal effect.

Example: Age is a common confounder in many studies because it can affect both the exposure (e.g., smoking) and the outcome (e.g., lung cancer).

------------------ Colliders ------------------

A variable that is causally influenced by two or more other variables. In graphical models, it is represented as a node where the arrowheads from these variables "collide."

Role: Colliders do not inherently create an association between the variables that influence them. However, conditioning on a collider (e.g., through stratification or regression) can introduce a non-causal association between these variables, leading to collider bias.

Example: If both smoking and lung cancer affect quality of life, quality of life is a collider. Conditioning on quality of life could create a biased association between smoking and lung cancer.

------------------ Differences ------------------

Direction of Causality: Confounders cause both the exposure and the outcome, while colliders are caused by both the exposure and the outcome.

Statistical Handling: Confounders should be controlled for to reduce bias, whereas controlling for colliders can introduce bias.

Graphical Representation: In Directed Acyclic Graphs (DAGs), confounders have arrows pointing away from them to both the exposure and outcome, while colliders have arrows pointing towards them from both the exposure and outcome.

------------------ Managing ------------------

Directed Acyclic Graphs (DAGs): These are useful tools for identifying and distinguishing between confounders and colliders. They help in understanding the causal structure of the variables involved.

Statistical Methods: For confounders, methods like regression analysis are effective for controlling their effects. For colliders, avoiding conditioning on them is crucial to prevent collider bias.
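
To make the collider point concrete, here is a small, hypothetical NumPy simulation (variable names invented for illustration, and unlike the smoking example above, the two causes are independent by construction): there is no association overall, yet conditioning on the collider manufactures one.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200_000

    # Two independent causes, and a collider they both lower
    cause_a = rng.normal(size=n)
    cause_b = rng.normal(size=n)                      # independent of cause_a by construction
    quality_of_life = -cause_a - cause_b + rng.normal(size=n)

    print(np.corrcoef(cause_a, cause_b)[0, 1])        # ~0: no association in the full population

    # Condition on the collider: look only at the low quality-of-life stratum
    low_qol = quality_of_life < -1
    print(np.corrcoef(cause_a[low_qol], cause_b[low_qol])[0, 1])  # clearly negative: collider bias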

lenzm27 days ago

If you have to start with apologies then you know, just stop and don't post.

QuantumGood27 days ago

Sure, but someone else did this for me once, using AI, and I found it useful to scan in the moment. I appreciated it and upvoted it.

Like that experience, this was meant as a scannable introduction to the topic, not an exact reference. Happy to hear alternative views, or downvote to give herding-style feedback.

Had I done a short AI-generated summary, it would have been a bit less helpful, but there wouldn't have been downvotes.

Had I linked instead of posted the same AI explanation, there would have been no or fewer downvotes, because many wouldn't click, and some of those that did would find it helpful.

Had I linked to something else, many would not click and read without a summary, both of which could have been AI-created.

I chose to move on and accept a few downvotes. The votes count less than the helpfulness to me. Votes don't mean it helps or doesn't. Many people accept confusion without seeking clarification, and appreciate a little help.

Although I personally do tend to downvote content-free unhelpful Reddit-style comments, I'm not overly fond of trying to massage things to help people manage their feelings when posts are only information, with no framing or opinion content. I understand that there is value in downvotes as herding-style feedback (as PG has pointed out). Yes, I've read the HN guidelines.

I think beyond herding-style feedback downvotes, AI info has become a bit socially unacceptable—okay to talk about it but not share it. But I find AI particularly useful as an initial look at information about a domain, though not trustworthy as a detailed source. I appreciate the footnotes that Perplexity provides for this kind of usage that let me begin checking for accurate details.

laurentlb28 days ago

On a similar note, I enjoyed watching the video: https://youtu.be/mQ56uOkjccg?si=1hpwGqv2dQqLQ-ME (by Nutrition Made Simple!)

It takes a specific topic (here, the health effects of red meat) and explains how each type of study can provide information without proving anything. It helped me a lot in understanding the science related to nutrition, where you never have perfect studies.

KempyKolibri28 days ago

Dismissing all observational study designs out of hand because they can be difficult and easy to perform poorly seems like quite the take.

I see this all the time in people’s interpretation of nutrition research, and they do exactly as this article suggests and fall back to the “intuitive option”, and go onto some woo diet that they eventually give up because they start feeling awful.

I would disagree that observational study designs should be thrown out the window or that it makes sense to, as this article seems to do, lump cross-sectional ecological data in with prospective cohort studies.

Things often “make intuitive sense” only because of these study designs. We used to get kids to smoke pipes to stave off chest infections because it made “intuitive sense” and it’s only because of observational studies that we now believe smoking causes lung cancer.

The direction of evidence from prospective cohort studies to RCTs in the field of nutrition science on intake vs intake shows a 92% agreement. If we take RCTs to be the “gold standard” of evidence that best tracks with reality, it seems a little odd that these deeply flawed observational studies that we should apparently disregard seem to do such a good job coming to the correct conclusions.

https://bmcmedicine.biomedcentral.com/articles/10.1186/s1291...

derbOac28 days ago

It's important to be thoughtful about research interpretation, but I'm kind of tired of kneejerk dismissal of observational studies for a couple of reasons.

First, experiments have their own varieties of horrors. Many are small N, with selective data reporting, and lack external validity — that is, the thing you really want to randomize is difficult or impossible to randomize, so researchers randomize something else as a proxy that's not at all the same. Other times there are complex effects that distort the interpretation of the causal pathway implied by the experiment.

Second, sometimes it's important to show that any association exists. There are cases where it's pretty clear an association is non-existent, based on observational data and covariate analysis. You just don't hear about those because people stop talking about them because of the null effects. So there's a kind of survivorship bias in the way results are discussed, especially in the popular literature.

It's easy to handwave about the limitations of studies; it's much harder to create studies that provide evidence, for logical, practical, and ethical reasons. Why you'd want less information about an important phenomenon isn't clear to me.

mistercow28 days ago

> We used to get kids to smoke pipes to stave off chest infections because it made “intuitive sense” and it’s only because of observational studies that we now believe smoking causes lung cancer.

This is an interesting example, because I don’t know of any studies (although there probably are some, if only old ones) specifically about whether smoking pipes staves off lung infections, but the “intuitive sense” answer has changed because of adjacent evidence. And in this case, it’s not the lung cancer evidence that makes it intuitively unlikely that pipe smoking would be helpful, but a broader understanding of what causes lung infections, and what tobacco smoke contains and doesn’t contain.

KempyKolibri28 days ago

I think that’s a fair critique! Probably would have been better to say that the intuitive position was that smoking was unrelated to lung cancer.

not_kurt_godel28 days ago

It is quite the take indeed; one that I posit resonates most strongly with people whose societal views tend to conflict with the available evidence.

talkingtab28 days ago

I am slowly becoming convinced that studies are in fact cargo-cultism. And there are many, many studies that confirm this.

But about causality. Long ago (old cars) I had a friend who told me that most mornings his car would not start until he opened the hood and wrapped some wires with tape (off with the old tape on with the new). Then the car would start. Every now and then it would take two wraps. Hmmm.

After he demonstrated this, I decided to try to help. I followed the wires that were wrapped. Two of them. To my surprise they were not connected at either end. This was insane, and yet his study - and my own observation - demonstrated that wrapping these two wires which were completely disconnected caused his car to start. Now there is causality for you.

Except that if you have a more complex model of cars, there is a sane explanation. Again, this is an old car with a carburetor. In case you don't know, this is a little bowl of gas that provides a combustible mix of air and gas. If there is too much gas then your car won't work. The mix is controlled by a little float that controls the level of gas in the little bowl. Toilet bowls work on the same principle.

If your float is bad (or you have other issues), your car engine gets too much gas - it's "flooded" - and you have to wait until much of it evaporates. So if you flood your car engine, then go and wrap some wires, it may be that your car will start right up.

So I rebuilt the carburetor and my friend never had that problem again.

The moral of the story is that I had better "model" of how cars work. But in the back of my mind I am aware that my model may be or have been just as deficient. Did you know that we are bombarded from space by an unknown type of neutrino that stops electricity from working unless there is a little pool of some liquid nearby or it is Thursday. I am going to do a study of this.

There are very good reasons to understand how frail our ability to understand causality is. And we are talking simple things here. The scientific method is about EXPERIMENTS. Yes, I did that in bold. Doing things. We have deeply complex situations we need to understand and in my opinion, studies do not help.

gwern28 days ago

> After he demonstrated this, I decided to try to help. I followed the wires that were wrapped. Two of them. To my surprise they were not connected at either end. This was insane, and yet his study - and my own observation - demonstrated that wrapping these two wires which were completely disconnected caused his car to start. Now there is causality for you.

You didn't show causality, though. You never randomized anything. His study and your observation was purely observational. At no point did you open the hood, get ready to wrap the wires, and flip a coin to decide whether to wrap the wires or do a placebo wrapping somewhere else.

Had you done that, you would have found, per your ultimate explanation, that the wrapping made no causal difference: you did the procedure, and either way, the car turned on. Hence, there is no causality for you.

bwfan12328 days ago

imo, the idea of a cause is a logical concept of containment when used in theories. A causes B means the phenomenon represented by A implies the phenomenon represented by B. So, causation is a device of our symbolization and understanding of the world rather than anything fundamentally out there. This is of course a controversial view.

Causality eventually demands a "theory" for full explanatory power and understanding. Theories have premises, involve inference, and have predictions. Otherwise, we get ad-hoc models of phenomena via observations, which is a great start but ends up as an oversimplification. X causes Y, but what caused X, or why did X cause Y and not Z? Models represent phenomena while theories explain them. We start with models, and then our curiosity eventually leads to a theory. See [1] for a great read from a physicist turned quant.

[1] https://www.amazon.com/Models-Behaving-Badly-Confusing-Illus...

yccs27 27 days ago

If I understand it correctly, they randomly decided either to try starting the car immediately or to go wrap the wires first. This absolutely demonstrates causality; they just didn't cleanly separate the different factors which changed.

Your comparison to placebo is very apt: Giving medication to a patient (vs not giving anything) causes them to get better, but it might be the "giving a pill" part instead of the "ingesting medication" part that matters.

swores27 days ago

It doesn't show causality because their decision of when to do it wasn't random (so maybe when in the mood to go straight to wrapping wires they were also in the mood to turn the key slightly more firmly, or...), and also because there's no way of knowing if wrapping the wires was in any way relevant - maybe the vehicle actually just needed X minutes between unlocking and starting, and wrapping wires was a way to spend that time without using a timer, or maybe... who knows what other maybes!

And sure enough, their story concludes with discovering that there was no causality from the wire wrapping at all. It was just about killing time.

schneems27 days ago

I liken this to the experience of playing an old school fighting game in the era before the internet. You would be mashing buttons when suddenly your character would do a power move. Then spend the rest of the day trying to figure out how to reproduce it.

If you could reproduce it, it would usually be intermittent. Eventually you would learn "when I X then my character will Y, but only sometimes."

This is because the real command is a subset of, or a slight variation on, what you thought was correct, which you accidentally do sometimes.

Even when it's ephemeral and seemingly random I still find these things valuable. It's better to be able to reproduce it sometimes instead of never. Answering the question "is doing this better than random?" (P95) can help you throw away a bad hypothesis. Most people don't realize that when they are providing evidence for causality they are competing with random. If they had instead done jumping jacks or said a prayer to the engine gods X times, then the correlation between the wires and the engine might suddenly seem much weaker.

Once you have one hypothesis you can test it against others, and I believe that's powerful, provided it's done systematically and with at least a mild understanding of probability and error. Also, a hypothesis without a theory isn't scientific. Why did your friend wrap the wires to begin with?

It's okay to act at random until we find some effect, but then we also need to take the time to roll back (as you did) and ask "WHY did this happen?", in which case you can begin the process with a fresh hypothesis.

I feel when we are taught the scientific method in elementary school it doesn’t stick for most of us, even engineers. Especially non-engineer folks. It seems at first blush like some truisms strung together, but that simplicity hides very powerful capabilities and subtle edge cases.

shadowgovt27 days ago

I think the most fascinating thing about the practice of science (and this is one of those things I wish I'd realized sooner when learning physics) is that experimental evidence often outstrips theory.

There are all manner of observable, reproducible behaviors in nature that we barely have an explanation for. Those things remain observable and reproducible whether or not we can tell a tidy story about why they happen.

In a very meaningful sense, the local healer applying poultices formulated from generations of experimentation is using science much as the medical doctor is (assuming, of course, they're taking notes, passing on the discoveries, and the results are reproducible). The doctor having tied their results to the "germ theory of medicine" vs. the local healer having tied theirs to "the Earth Mother's energies impregnate the wound" is an irrelevant distinction until (and unless) a need comes along to unify the theory to some other observable outcomes.

atombender27 days ago

That's true for simple things. You don't need to know what the pharmacological mechanism of COX inhibitors is in order to prescribe Advil for a headache. But if you're a scientist trying to make a better Advil you probably need to know how it works.

Doctors routinely prescribe medications that have no randomized clinical trials supporting their use. In those cases, clinical experience replaces trial data; they "know" the drugs work because all the patients have effectively been trial subjects over a span of decades.

Retric28 days ago

> The scientific method is about EXPERIMENTS

IMO the moral of that story is that the S at the end of EXPERIMENTS is more than just repeating the same thing. Fixing the carburetor was the second and vastly more informative experiment, but your friend could also have tried various alternatives to doing exactly what he was doing, which would then have uncovered the time component.

Science digs into problems, so the most important part of meta-analysis, which is often ignored, is asking whether the question even makes sense in a generic context. Just as crucial is narrowing down the effect size, which may be dwarfed by random noise in some experiments, etc.

ramon156 28 days ago

Doesn't this heavily apply to building software as well? E.g. instead of spray-and-pray development we should get a better understanding of the model we're working with.

If my parser gets nulls when it should get non-nulls, then I first need to find where they could potentially even come from, aka get a better understanding of the model I'm working with.

thenoblesunfish28 days ago

As with many things, just understand what you are trying to do.

If you want to predict Y and you know X, you can use data that tell you when they happen together.

If you are trying to cause (or prevent) Y, it's harder. If you can't do experiments (e.g. macroeconomics), it's borderline impossible.

groby_b27 days ago

"a technique called regression analysis that, as far as I can determine, cannot be explained in a simple, intuitive way (especially not in terms of how it "controls for" confounders)"

That sounds very much like a skills issue. Because it can. You call out what you consider might be confounders as independent variables (covariates). You can then use regression analysis to estimate the individual contributions from each confounder, and control for them by essentially filtering out that contribution.

Is reality harder than that? Yes. Much. The world of science isn't 9th grade math, sorry. You are not entitled to understand everything deeply with 5 minutes of mediocre effort.
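
As a rough illustration of that "filtering out" intuition, here is a minimal NumPy sketch on made-up data (names and coefficients are hypothetical): the coefficient on X from a multiple regression that includes Z matches the slope you get after first regressing Z out of both X and Y, which is the Frisch-Waugh-Lovell result.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000

    # Toy data: Z confounds both X and Y; the true effect of X on Y is 0.5
    Z = rng.normal(size=n)
    X = Z + rng.normal(size=n)
    Y = 0.5 * X + 2.0 * Z + rng.normal(size=n)

    def fit(y, *cols):
        # OLS with an intercept; returns all coefficients
        A = np.column_stack([np.ones(len(y)), *cols])
        return np.linalg.lstsq(A, y, rcond=None)[0]

    print(fit(Y, X)[1])      # ~1.5: biased, Z's contribution leaks into X's coefficient
    print(fit(Y, X, Z)[1])   # ~0.5: multiple regression "controlling for" Z

    # Same ~0.5 by literally filtering Z's contribution out of both X and Y first
    A_z = np.column_stack([np.ones(n), Z])
    X_resid = X - A_z @ fit(X, Z)
    Y_resid = Y - A_z @ fit(Y, Z)
    print(fit(Y_resid, X_resid)[1])  # matches the controlled coefficient above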

BugsJustFindMe28 days ago

> Now, a lot of these studies try to "control for" the problem I just stated - they say things like "We examined the effect of X and Y, while controlling for Z [e.g., how wealthy or educated the people/countries/whatever are]." How do they do this? The short answer is, well, hm, jeez.

You mean they don't cluster the data into sets of overlapping bins where the controlled attribute has approximately the same value and then look for the presence of an XY relationship within the bins instead of across them?

Sniffnoy27 days ago

No. What they actually do is run a regression with both X and Z among the independent variables, and then look solely at the coefficients coming from X. (As mentioned in the article.) Including Z as an independent variable alongside X "controls for" it in that the coefficients for X are now supposed to not include any effect from Z (since any Z effect should go into the Z coefficients). How well this works is something I don't know enough to answer.

I don't actually know how the method you suggest compares in the limit of finer bins. It's possible it might only achieve similar results?

KempyKolibri27 days ago

The smaller bins approach is adjustment via stratification.

Good primer on both here: https://www.mynutritionscience.com/p/statistical-adjustment

youainti23 days ago

My understanding is that in the limit, it does the same thing, but with more of a flattened tree representation.
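
A toy simulation (hypothetical data, NumPy only) makes that convergence concrete: bin on Z, estimate the X-Y slope within each bin, and average the bins; as the bins get finer, the stratified estimate approaches the regression-adjusted coefficient.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200_000

    # Toy data: Z confounds X and Y; the true effect of X on Y is 1.0
    Z = rng.normal(size=n)
    X = Z + rng.normal(size=n)
    Y = 1.0 * X + 2.0 * Z + rng.normal(size=n)

    def slope(y, x):
        return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    def fit(y, *cols):
        A = np.column_stack([np.ones(len(y)), *cols])
        return np.linalg.lstsq(A, y, rcond=None)[0]

    print(slope(Y, X))      # ~2.0: biased, Z is ignored entirely
    print(fit(Y, X, Z)[1])  # ~1.0: regression adjustment for Z

    # Stratification: estimate the slope within Z-quantile bins, then average
    for n_bins in (2, 5, 20, 100):
        edges = np.quantile(Z, np.linspace(0, 1, n_bins + 1))
        idx = np.digitize(Z, edges[1:-1])
        slopes = [slope(Y[idx == b], X[idx == b]) for b in range(n_bins)]
        print(n_bins, np.mean(slopes))  # approaches ~1.0 as the bins get finer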

gns24 28 days ago

"A study using a complex mathematical technique claiming to cleanly isolate the effect of X and Y. I can't really follow what it's doing..."

This is a frustrating type of issue. Dismissing something with "I don't understand this, but I don't believe it" isn't the sort of thing I want to be doing. However, I don't have any desire to waste time trying to understand what someone has done (and did they really understand what they were doing themselves?) when it's clear that the effect isn't cleanly isolated in the data and no amount of mathematics is going to change that.

sujumayas28 days ago

Am I the only one thinking, while reading this: "Wait a minute... isn't this article itself some kind of weak 'X then Y'? An observation of many cases, with generalized causality, concluding that he just feels like X should cause Y?" Hahaha. Love the article btw.

spacebanana7 28 days ago

I disagree strongly with this mathematised notion of causality. Two things can be perfectly correlated at all observed points in history without necessarily being causal. There can always be some unknown variable driving change in both.

ibeff28 days ago

That's what the author deals with in the first part of the article on observational studies. Randomized studies don't have that problem.

ekianjo28 days ago

Or they can also not be related at all and just happen by pure coincidence.

ibeff26 days ago

Right but we have the tools to rule that out. That's what the field of statistics deals with. It tells you with mathematical certainty how likely or unlikely the correlation you're observing is to be random.
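
For a concrete (if simplified) version of that claim, a permutation test on made-up data asks exactly this question: how often would causally unrelated (shuffled) data show a correlation at least as strong as the one observed? As the reply below notes, the answer is a probability, not certainty.

    import numpy as np

    rng = np.random.default_rng(3)

    # Hypothetical small sample with a modest true relationship between x and y
    n = 40
    x = rng.normal(size=n)
    y = 0.5 * x + rng.normal(size=n)
    observed_r = np.corrcoef(x, y)[0, 1]

    # Shuffle y to break any real link, and see how often chance alone
    # produces a correlation at least as strong as the observed one
    null_rs = np.array([np.corrcoef(x, rng.permutation(y))[0, 1] for _ in range(10_000)])
    p_value = np.mean(np.abs(null_rs) >= abs(observed_r))
    print(observed_r, p_value)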

ekianjo26 days ago

Statistics never give you certainty. You get probabilities.

ibeff25 days ago

I can't tell if you're intentionally misrepresenting what I said. I said we can tell with certainty "how likely or unlikely" something is, i.e. we can precisely calculate the probability.

ngriffiths27 days ago

And all this before you even get to "how much of an impact of X on Y should there be before it is even close to a bottleneck that's worth actually acting on, and do we think it reaches that threshold?"

daoboy28 days ago

It's layers of abstraction all the way down the light cone.

The causality is always present; we just don't have the processing power to ensure with 100% certainty that all relevant factors are accounted for and all spurious factors dismissed.

winternewt28 days ago

This is what I miss for important subjects: an actual ambitious, reductionist approach where in-depth cause-effect analysis is performed for each individual sample.

einpoklum28 days ago

> I have to say, this all was simultaneously more fascinating and less informative than I expected it would be going in.

Direct quote from the author of this post, and I couldn't agree more, particularly about the post itself.

dang27 days ago

Related. Others?

Does X cause Y? An in-depth evidence review - https://news.ycombinator.com/item?id=30613882 - March 2022 (3 comments)

tomrod27 days ago
jtrn28 days ago

As a clinical psychologist, I find it increasingly frustrating to sift through research studies that fail to meet even the most basic standards of scientific rigor. The sheer volume of studies that claim “X is linked to Y” without properly addressing the correlation-versus-causation fallacy is staggering. It’s not just an oversight—it’s a fundamental flaw that undermines the credibility and utility of psychological research.

If a study is publicly funded, there should be a minimum requirement: it must include at least two research arms—one with an experimentally manipulated variable and a proper control condition. Furthermore, no study should be considered conclusive until its findings have been successfully replicated, demonstrating a consistent predictive effect. This isn't an unreasonable demand; it's the foundation of real science. Yet, in clinical psychology, spineless researchers and overly cautious and/or power-crazed ethics committees have effectively neutered most studies into passive, observational, and ultimately useless exercises in statistical storytelling.

And for the love of all that is scientific, we need to stop the obsession with p-values. Statistical significance is meaningless if it doesn’t translate into real-world impact. Instead of reporting p-values as if they prove anything on their own, researchers should prioritize effect sizes that demonstrate meaningful clinical relevance. Otherwise, we’re left with a field drowning in “statistically significant” noise—impressive on paper but useless in practice.

gloomyday28 days ago

Obsessing with p-values while at the same time shunning replication studies and studies with negative results is a catastrophe. It causes everyone to be confidently wrong way more often than one would think at first.

What worth is a result with p<0.01 when the 10 previous articles with negative results were never actually written?

parpfish28 days ago

another contributing factor is that in psychology (and possibly other fields), it's very hard to make a career doing rigorous, incremental science that results in confident outcomes, because at each step along the way people just say "yeah, sounds about right".

to make a career, you need to discover quirky counterintuitive findings that can be turned into ted talks and 'one weird trick' clickbait. you become a big deal once you start providing fodder for the annoying "well, actually..." guy to drop on people at a dinner party/reddit comment section.

fritzo27 days ago

I like this writing style with unbound variables. Reminds me of Maya Binyam's novel "Hangman", or Kafka's novels.

m3kw9 27 days ago

So say we have a scenario with data points where, when the X ball moves, the white ball also moves, but we're missing direct evidence of whether they actually hit each other or not. They merely correlate in the limited sample. I think this is what most correlations are like: we do not see the atoms directly causing the causation, only a probability.

zkmon28 days ago

There is no causality whatsoever. The perceived causality is built backwards, only to make something appear sensible. Every event in this universe contributes as a cause to every other event in the universe. It's like fluid flow: every molecule of the fluid affects the movement of every other molecule. The world evolves in a fluid motion, not through isolated causal chains.

Matumio27 days ago

If you read the mathematical theory of causality (e.g. Pearl), you'll learn that you must have the ability to make interventions "from outside" (at least in theory) before you can talk about causality. You have to define what is "inside" the system you study.

If you define everything to be "inside", then causality disappears because intervention disappears.

adrian_b28 days ago

Tell that to one who gives you a punch in the face, that there is no causal relationship between his desire to punch you and your bloody nose :-)

zkmon27 days ago

They just get a couple of harder punches back. But you missed the point in your rush to make a dramatic comment. It's not about how someone would interpret the causality or how they react. It is about how a set of events can't be considered an isolated chain of causally related things, disconnected from other things. If you like to think about it in terms of punches, I think you would get lots of them.

bowsamic28 days ago

The Scottish man still speaks it seems

skirge28 days ago

The most important factor in the results of research is personal beliefs, especially in "economics".

Cappor28 days ago

The question of whether X can cause Y remains open and requires further research. The article highlights the importance of thoroughly checking sources and methodology to draw clear conclusions. This is an important step towards a deeper understanding of such relationships.

epidemiology28 days ago

In introductory epidemiology courses you'll usually get the Bradford Hill criteria in the first week or two, which give a good foundation for determining causality in public health. After digging deeper, the entire field of causal inference is revealed.

A healthy respect for the difficulties of determining causality is beneficial. Irrational skepticism that ignores the evidence of strong observational research simply replaces it with... what, exactly? That's how we ended up with a 71-year-old anti-vaccine conspiracist as the health secretary.

Chance-Device27 days ago

Well, of course the conclusion is that you don’t know, Mr. Author. Because the very thing that triggered your interest in the subject of X and Y was that there was no clear cut consensus on the subject. If there were, you wouldn’t have needed to do research at any level of depth at all, because those findings would be well known, and you’d have found them easily through a simple web search.

Instead you were drawn to a topic which seemed ambiguous, which had multiple possible interpretations, multiple plausible angles, and on which nobody could agree. You didn’t explicitly know these things starting out, but they were embedded in the very circumstances which caused you to investigate the subject further.

Yes, determining causation is sometimes hard, but it is also sometimes very easy. However, very easy answers are not interesting ones, and so we find ourselves here.

HPsquared27 days ago

Nice hypothesis, but how do we prove it?

msarrel26 days ago

The variable you really have to worry about is z.

Temporary_31337 28 days ago

And don’t even get me started on A leading to B

aqueueaqueue28 days ago

The old headline: B happens as A happens.

Baby boom as solar panels sales skyrocket.

skyde27 days ago

Is it only me, or does this completely miss all the recent research on causal inference using causal graphical models?

aqueueaqueue28 days ago

So, Bayesian or Frequentist?

uniqueuid28 days ago

Funnily enough that hardly matters here.

Causality is a largely orthogonal problem to frequentist/bayesian - it makes everything harder, not just one of those!

procaryote28 days ago

Causality at least correlates with a lot of problems

uniqueuid28 days ago

Yeah but in this case it's really a wrong way to think about it.

If you have a DAG based on wrong assumptions, it doesn't matter whether you get a point estimate based on null hypothesis thinking or whether you get a posterior distribution based on some prior. The problem is that the way in which you combine variables is wrong, and bayesian analysis will just be more detailed and precise in being wrong.

GuB-42 28 days ago

Does frequentist/bayesian matter to anything but quasi-religious beliefs?

I mean, it's maths: either approach has to give the same results, as they come from the same theory. The Bayes theorem is just a theorem; use it explicitly or not, the numbers will be the same because the axioms are the same.

uniqueuid27 days ago

No, they are linked to beliefs (like anything else), but the canonical forms do differ a lot. Most importantly:

- bayesian methods give you posterior distributions rather than point estimates and SEs

- bayesian methods natively offer prior and posterior predictive checks

- with bayesian methods, it's evidently easier to combine knowledge from multiple sources, which null-hypothesis testing struggles with (best way is probably still meta-analyses)
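
A tiny, hypothetical example of the first point (a posterior distribution rather than a single point estimate), using a conjugate beta-binomial model with invented numbers:

    import numpy as np

    rng = np.random.default_rng(4)

    # Toy binomial data: 7 successes out of 20 trials
    successes, trials = 7, 20

    # Point estimate
    print(successes / trials)  # 0.35

    # Bayesian: with a Beta(2, 2) prior, the posterior is Beta(2 + 7, 2 + 13),
    # a whole distribution we can summarize however we like
    posterior = rng.beta(2 + successes, 2 + trials - successes, size=100_000)
    print(posterior.mean(), np.quantile(posterior, [0.025, 0.975]))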

stickfigure27 days ago

I can't believe nobody has posted the obvious XKCD of relevance yet:

https://xkcd.com/552/

daft_pink28 days ago

After reading this article, it would be really interesting to see research indicating when correlation == causation and when correlation != causation for any given study, what the relevant factors are, and a tool that gives a simple risk assessment of whether there is a causal link or not.