I realise that when a Christian starts out a post about flaws in any part of science by saying “I love science”, some people see that as analogous to someone prefacing a racist joke with the line “I have a black friend, so it’s ok for me to think this is funny.”
I like science – but I think buying into it as a holus-bolus solution to everything is unhelpful. The scientific method involves flawed human agents who sometimes reach dud conclusions. It involves agendas that sometimes make these conclusions commercially biased. I’m not one of those people who think that calling something a “theory” means it’s just a concept or an idea. I’m happy to accept “theories” as “our best understanding of the facts”… and I know the word is used because science has an innate humility that admits its fallibility. These dud conclusions are often ironed out – but it can take longer than it should.
That’s my disclaimer – here are some bits and pieces from two stories I’ve read today…
Science and statistics
It seems one of our fundamental assumptions about science is based on a false premise. The idea that a particular result can be declared a rule because it occurs a “statistically significant” number of times seems to rest on an arbitrary decision made in the field of agriculture eons ago. Picking a null hypothesis and finding an exception is a really fast way to establish theories. It’s just a bit flawed.
“The “scientific method” of testing hypotheses by statistical analysis stands on a flimsy foundation. Statistical tests are supposed to guide scientists in judging whether an experimental result reflects some real effect or is merely a random fluke, but the standard methods mix mutually inconsistent philosophies and offer no meaningful basis for making such decisions. Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.”
Did you know that our scientific approach, which now works on the premise of rejecting a “null hypothesis” based on “statistical significance”, came from a guy testing fertiliser? And we now use it everywhere.
The basic idea (if you’re like me and have forgotten everything you learned in chemistry at high school) is that you start by assuming that something has no effect (your null hypothesis), and if the results you observe would only turn up less than five percent of the time under that assumption, you conclude that the thing actually does have an effect… because you apply statistics to scientific observation. Here’s the story:
While its [“statistical significance”] origins stretch back at least to the 19th century, the modern notion was pioneered by the mathematician Ronald A. Fisher in the 1920s. His original interest was agriculture. He sought a test of whether variation in crop yields was due to some specific intervention (say, fertilizer) or merely reflected random factors beyond experimental control.
Fisher first assumed that fertilizer caused no difference — the “no effect” or “null” hypothesis. He then calculated a number called the P value, the probability that an observed yield in a fertilized field would occur if fertilizer had no real effect. If P is less than .05 — meaning the chance of a fluke is less than 5 percent — the result should be declared “statistically significant,” Fisher arbitrarily declared, and the no effect hypothesis should be rejected, supposedly confirming that fertilizer works.
Fisher’s P value eventually became the ultimate arbiter of credibility for science results of all sorts — whether testing the health effects of pollutants, the curative powers of new drugs or the effect of genes on behavior. In various forms, testing for statistical significance pervades most of scientific and medical research to this day.
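If it helps to see that logic laid out, here’s a rough sketch in Python (my own, with made-up yield numbers – neither article contains any code) of a Fisher-style test: assume “no effect”, compute a P value, and call the result “significant” if it sneaks under that arbitrary 0.05 cut-off.

```python
# Rough sketch of Fisher-style null hypothesis testing (illustrative only):
# compare yields from fertilised and unfertilised plots and reject the
# "no effect" hypothesis if the p-value falls below the 0.05 threshold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Hypothetical yields (tonnes per hectare) for 20 plots of each kind.
unfertilised = rng.normal(loc=5.0, scale=0.8, size=20)
fertilised = rng.normal(loc=5.6, scale=0.8, size=20)  # pretend fertiliser really helps

# Null hypothesis: fertiliser makes no difference to the mean yield.
t_stat, p_value = stats.ttest_ind(fertilised, unfertilised)

print(f"p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: 'statistically significant'.")
else:
    print("Fail to reject the null hypothesis.")
```

Notice that the 0.05 threshold is doing all the work in that last step – which is exactly the arbitrariness the article is complaining about.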
A better starting point
Thomas Bayes, an 18th century clergyman, came up with a better model of hypothesising. It basically involves starting with an educated guess (a prior belief), conducting experiments, and using that premise as a filter for interpreting the results, updating your belief as the evidence comes in. This introduces the murky realm of “subjectivity” into science – so some purists don’t like it.
Bayesians treat probabilities as “degrees of belief” based in part on a personal assessment or subjective decision about what to include in the calculation. That’s a tough placebo to swallow for scientists wedded to the “objective” ideal of standard statistics.
“Subjective prior beliefs are anathema to the frequentist, who relies instead on a series of ad hoc algorithms that maintain the facade of scientific objectivity.”
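To make the contrast concrete, here’s an equally rough sketch (again my own, with invented numbers) of what “degrees of belief” look like in practice: you declare a prior, feed in the evidence, and get back an updated belief rather than a yes/no verdict against a null hypothesis.

```python
# Rough sketch of the Bayesian alternative (illustrative only):
# start with a prior "degree of belief" and update it as evidence arrives.
from scipy import stats

# Prior belief about the chance that a fertilised plot out-yields its
# unfertilised neighbour: Beta(2, 2) says "probably around 50/50, not sure".
prior_alpha, prior_beta = 2, 2

# Hypothetical evidence: in 20 paired plots, the fertilised one won 15 times.
wins, losses = 15, 5

# Updating a Beta prior with binomial data is just addition.
posterior = stats.beta(prior_alpha + wins, prior_beta + losses)

print(f"Posterior mean belief that fertiliser helps: {posterior.mean():.2f}")
print(f"Probability the true win rate exceeds 0.5: {1 - posterior.cdf(0.5):.2f}")
```

The “subjective” bit the purists object to is that choice of prior at the top – someone else could reasonably start from a different one.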
Luckily for those advocating the Bayesian method, it seems, based on separate research, that objectivity is impossible.
Doing science on science
Objectivity is particularly difficult to attain because scientists are apparently prone to rejecting findings that don’t fit with their hypothetical expectations.
Kevin Dunbar is a scientist researcher (a researcher who studies scientists). He has spent a significant amount of time studying the practices of scientists, having been given full access to teams from four laboratories. He read grant submissions, reports and notebooks; he spoke to scientists, sat in on meetings, eavesdropped… his research was exhaustive.
These were some of his findings (as reported in a Wired story on the “neuroscience of screwing up”):
“Although the researchers were mostly using established techniques, more than 50 percent of their data was unexpected. (In some labs, the figure exceeded 75 percent.) “The scientists had these elaborate theories about what was supposed to happen,” Dunbar says. “But the results kept contradicting their theories. It wasn’t uncommon for someone to spend a month on a project and then just discard all their data because the data didn’t make sense.””
It seems the Bayesian model has been taken slightly too far…
The scientific process, after all, is supposed to be an orderly pursuit of the truth, full of elegant hypotheses and control variables. Twentieth-century science philosopher Thomas Kuhn, for instance, defined normal science as the kind of research in which “everything but the most esoteric detail of the result is known in advance.”
You’d think that objective scientists would accept these anomalies and change their theories to match the facts… but the arrogance of humanity creeps in a little at this point: if an anomaly arose consistently, the scientists would blame the equipment, look for an excuse, or dump the findings.
Wired explains:
Over the past few decades, psychologists have dismantled the myth of objectivity. The fact is, we carefully edit our reality, searching for evidence that confirms what we already believe. Although we pretend we’re empiricists — our views dictated by nothing but the facts — we’re actually blinkered, especially when it comes to information that contradicts our theories. The problem with science, then, isn’t that most experiments fail — it’s that most failures are ignored.
Dunbar’s research suggested that the solution to this problem comes through a committee approach, rather than through the individual (which I guess is why peer review is where it’s at)…
Dunbar found that most new scientific ideas emerged from lab meetings, those weekly sessions in which people publicly present their data. Interestingly, the most important element of the lab meeting wasn’t the presentation — it was the debate that followed. Dunbar observed that the skeptical (and sometimes heated) questions asked during a group session frequently triggered breakthroughs, as the scientists were forced to reconsider data they’d previously ignored.
…
What turned out to be so important, of course, was the unexpected result, the experimental error that felt like a failure. The answer had been there all along — it was just obscured by the imperfect theory, rendered invisible by our small-minded brain. It’s not until we talk to a colleague or translate our idea into an analogy that we glimpse the meaning in our mistake.
Fascinating stuff. Make sure you read both stories if you’re into that sort of thing.