Car Parts Show Us How Some Genetic Stats Mislead
We can “read” genes with ease now, but still can’t say what most of them “mean.” To show why we need clearer “causology” and fitter metaphors, let's scrutinize cars and their parts like we do bodies and genes.
1. We can “read” genes with ease now, but still can’t say what most of them “mean.” Mastering precisely how they “cause” higher-level traits will require clearer “causology” and fitter metaphors.
2. Genes (more precisely, gene products) contribute to fiendishly complex processes that confound the standard stats grinder. To illustrate, imagine scrutinizing cars and their parts like we do bodies and genes in “genome-wide association studies” (GWAS). The details don’t matter here, beyond that a car-GWAS would analyze a car-level trait like fuel efficiency by variations in the properties of all the car’s parts.
3. Consider a car having standard and sporty models. The latter have larger gas-guzzling engines and available pimped-up painted brake calipers. If sporty buyers more often pick red brakes, then statistically speaking red brakes bring greater gas-guzzling “risk,” even though brake paint has no effect on fuel use.
4. If I’m not mistaken (please correct me, stats geeks), no stats-only data wizardry can distinguish such non-causal entanglements (p-values can’t discern “phantom patterns”).
5. Generally, part-level properties can have non-causal and non-random “links” to higher-level traits. And including non-causal factors distorts the statistics (misallocating the variation that seems “explained by,” “accounted for,” or “linked to”). Lacking causal insights, you always run the “red-brake” risk.
6. Regarding metaphors, gene products work more like words than car parts (genes aren’t static “blueprints”). They act via sentence-like structures with collective effects and multiple “meanings.” But we lack the rules (~cellular syntax, gene grammar) for how parts of biology compose life’s activity-sentences.
7. Genes also sort of work like music: Typically “played” in precise synchrony to orchestrate many molecular melodies (simultaneous biochemical sentences) enabling enormous ensemble effects.
8. And life typically has way more moving parts than cars, and more complex transient causal structures. Its traits often have multiple hetero-causal etiologies (roadmaps exhibiting sufficient but not necessary logic). Current stats can’t disentangle hetero-causal effects (larger type-mixed samples often won’t help).
10. Thankfully, fitter thinking is afoot—for instance, geno-pheno mapping (Massimo Pigliucci), better “Laws of Biology” (Kevin Mitchell), Reductionist Bias Corrections (Krakauer), and Causal Structure Modeling (Judea Pearl).
11. Biology and social science need less primarily parts-focused thinking (you can't grasp chess by studying the properties of its pieces alone), and ways to handle different kinds of causes and roles—see Krakauer’s Figure 4, Aristotle’s four causes, Tinbergen's four questions, Marr’s three levels. Much in these fields is more process-or-algorithm shaped (often resisting Occam’s Razor).
12. Related iffy thinking exists far beyond genomics. As mostly practiced, stats presume a flat or “heap” causal structure that’s often ill-suited for process-oriented life, or car making, or even cooking (cooks need step-by-step recipes to turn parts into wholes).
13. Statistical analysis without causal insights often runs the red-brake risk. The habit of adding variables to “control for” factors can misallocate variation (itself often a nonsensical or low-quality quantification).
14. Similar structureless-sausage data risks pervade black box approaches to Big Data and AI.
15. You know that correlation doesn’t imply causation, but AI doesn’t “know” that.
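For the stats geeks, the red-brake risk from points 3 to 5 can be sketched in a few lines of Python. This is only a toy simulation under stated assumptions: the fleet size, the fuel figures, and the share of buyers picking red brakes are all made up for illustration, and the names `simulate_fleet` and `red_brake_gap` are hypothetical. Fuel use is generated from the model alone, so brake color has no causal role by construction.

```python
import random
from statistics import mean


def simulate_fleet(n_per_model=2000, seed=0):
    """Build a toy fleet. All numbers here are invented for illustration."""
    rng = random.Random(seed)
    specs = {
        # base fuel use in L/100 km, and share of buyers choosing red brakes
        "standard": {"base": 6.0, "p_red": 0.10},
        "sporty": {"base": 10.0, "p_red": 0.60},
    }
    fleet = []
    for model, spec in specs.items():
        for _ in range(n_per_model):
            red = rng.random() < spec["p_red"]
            # Fuel use depends on the model (engine) only, never on brake color.
            fuel = rng.gauss(spec["base"], 0.5)
            fleet.append({"model": model, "red_brakes": red, "fuel": fuel})
    return fleet


def red_brake_gap(cars):
    """Mean fuel use of red-brake cars minus mean fuel use of the rest."""
    red = [c["fuel"] for c in cars if c["red_brakes"]]
    other = [c["fuel"] for c in cars if not c["red_brakes"]]
    return mean(red) - mean(other)


fleet = simulate_fleet()

# Parts-only view: pool every car and compare by brake color.
naive_gap = red_brake_gap(fleet)

# Causally informed view: compare by brake color within each model.
stratified_gaps = {
    model: red_brake_gap([c for c in fleet if c["model"] == model])
    for model in ("standard", "sporty")
}

print(f"pooled red-brake gap: {naive_gap:.2f} L/100 km")
for model, gap in stratified_gaps.items():
    print(f"red-brake gap within {model}: {gap:.2f} L/100 km")
```

Pooled together, red-brake cars look markedly thirstier; within each model the gap shrinks to roughly zero. Note that the second comparison only works because we already knew to split by model. That knowledge is a causal insight brought to the data, not something the numbers supply on their own.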
Illustration by Julia Suits, The New Yorker cartoonist & author of The Extraordinary Catalog of Peculiar Inventions