Chapter 11: A philosophy of statistics
Every truth...is an error that has been corrected.
Alexandre Kojève (Kojève, 1980; p. 187)
Statistics, as a discipline, does not exist in a vacuum. It is a reflection of our views on science, and thus how it is understood and how it is used depends on what we mean by science. Most statistics texts do not discuss these matters, or if they do, they are perfunctory. But it is important for all involved (statisticians and clinicians) to appreciate their assumptions, and to have some rationale for them.
Cultural positivism
Most doctors and clinicians have an unconscious philosophy of science, imbibed from the larger culture: positivism. Positivism is the view that science is the accumulation of facts. Fact upon fact produces scientific laws. Holding sway through much of the nineteenth and twentieth centuries, the positivistic view of science has seeped into our bones. Beginning in the late nineteenth century, and more definitely after the 1960s, philosophers of science have shown that "facts" do not exist as independent entities; they are tied to theories and hypotheses. Facts cannot be separated from theories; science involves deduction, and not just induction.
The nineteenth-century American philosopher Charles Sanders Peirce, who was a practicing physicist, knew what was involved in the actual practice of science: the scientist has a hypothesis, a theory; this theory might have been based on previous studies, or it might simply be imagined wholecloth (Peirce called this "abduction"); the scientist then tries to verify or refute his theory by facts (either passively through observation or actively through experiment). In this way, no facts are observed without a preceding hypothesis. So facts are "theory-laden"; between fact and theory no sharp line can be drawn (Jaspers, 1997 [1959]).
Verify or refute?
This hypothesis–fact relationship leaves us with a dilemma: in testing our hypotheses, which is more important: verification or refutation? The positivistic view was biased in favor of confirming theories: fact was placed upon fact to verify theories (another name for this view of science is "verificationism"). In the mid twentieth century, Karl Popper rejected positivism by privileging refutation over confirmation: a single negative result was definitive – it refuted a hypothesis – while any positive result was always provisional – it never definitively proves a hypothesis, because it can always be refuted by a negative result. Let us examine Popper's views, and how they apply to different approaches to statistics, more closely.
Karl Popper’s philosophy of science
I think it would be fair to argue that in today's world of science and medical research, the assumed philosophy of science (sometimes explicit) is that of the philosopher Karl Popper (Popper, 1959). Popper sought to provide a deductive definition of science to replace the more traditional inductive definition. In the older view, science seemed to involve the accumulation of facts; the more facts, the more science. The problem with this inductive view can be traced back to David Hume, who showed that this approach could never, with complete certainty, prove anything (see Chapter 10). Popper sought complete certainty for science, and he thought he had it with Einstein's discoveries. Einstein was able to make certain predictions based on his theories; if those predictions were wrong, then his theory was wrong. Only one mistake was required to disprove his entire theory. Popper argued that science could best be understood as an activity whose theories could be definitively disproved, but never definitively proven. The best scientific theories, then, would be those which would make falsifiable
propositions, and, if not falsified, then those theories might be true. Popper singled out Freud and Marx for blame for having claimed to provide scientific theories when in fact their ideas were in no way falsifiable. This approach has become quite popular among modern scientists. Freud and Marx are, in some sense, easy targets; Darwin's theory could just as well be rejected for being unfalsifiable. Ultimately, Popper did not solve the Humean riddle, for Popper's view tells us not which theories are true, but which ones are not.
The limits of refutation
We might summarize that contemporary views of science (heavily influenced by Popper) are focused on hypothesis-testing by refutation. We see this philosophy reflected in statistics, especially in the whole concept of the importance of the p-value and the idea of trying to refute the null hypothesis (see Chapter 7).
My own view is that this refutationism is as wrong as the old verificationism, because no single refutation is definitive. One can have positive results after negative results; what then to make of the original negative results? In statistics, this overemphasis on refutation leads to overuse of p-values, while appropriate appreciation of positive results would lead us to a different kind of statistics (descriptive, effect-size-oriented methods; see Chapter 9).
Charles Peirce’s philosophy of science
This leads to an inductive philosophy of science, like that of Charles Peirce (Peirce, 1958), but not exactly in the traditional sense. Peirce accepted induction as the method of science, acknowledged that it led to increasing probabilities of truth, and argued that these probabilities reached the limits of certainty so closely that it was mathematically meaningless to deny certainty to them at a certain point of accumulated evidence. Peirce also added that this accumulation of near-certain inductive knowledge was a process that spanned generations of scientists and that the community of scientists which added to this fund of knowledge would eventually reach consensus on what was likely to be true based on those data.
Causation again
We can now return to that key philosophical aspect of statistics: the problem of causation. In Chapter 10, I reviewed the basic idea of the eighteenth-century philosopher David Hume, arguing that inductive inference did not lead to absolute certainty of causation. The philosopher Bertrand Russell tried to provide another way of looking at the question with his notion
of "material implication." Russell argued that if A causes B, we are saying that A "materially implies" B. In other words, there is something in A that is also entailed in B (Salsburg, 2001). He distinguished this material implication from the symbolic nature of other logical relationships (such as conjunction – the "and" relationship – or disjunction – the "or" relationship). When we say, "if A, then B," the "if, then" relationship is not purely symbolic, but has some material basis. This was Russell's view; it does not solve the problem of causation but it suggests a way of thinking about causation that entails that the idea is not a matter of purely symbolic logic, but perhaps an empirical matter.
A nal way of thinking about causation – besides Hume’s description of induction, and
Russell’s logical concept of material implication – is a scientic perspective that can be traced

to one of the French founders of nineteenth-century experimental medicine, Claude Bernard
(Olmsted, 1952). Bernard held that we could conclude that A causes B by conducting an
experiment in which all conditions are held constant except A, and showing that B follows.
Such proof of causation then is based on being able to control all factors except one, the ex-
perimental factor. is is, in practice, dicult to do in biology and medicine, and much more
feasible in inorganic sciences such as physics and chemistry. But it can be done. For instance,
we have the technology today to conduct animal studies in which the entire animal genome is
xed beforehand; animals can be genetically bred to produce a certain genetic state and they
can all be identical in that genetic state; then we can control the animals’ environment from
birth until death. In that kind of controlled setting, where all genetic and environmental
factors are controlled, Bernard’s denition of experimental causation may hold.
Such causation is unethical and infeasible with human beings. The closest we get to it is with randomization. As discussed throughout this book, randomization with human beings, though reducing much uncertainty, never reduces all uncertainty, and thus we cannot achieve absolute causation. The importance of randomized clinical trials (RCTs) in getting us very much closer to causation might be highlighted by realizing that they are the closest human approximation to Bernard's experimental causation. Fisher was right in emphasizing the need for RCTs in asserting causation, and Hill was right in recognizing the benefits of other features of research, in addition to experimentation with RCTs, so as to reduce uncertainty even further.
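A rough sense of why randomization stands in for Bernard's ideal of holding everything constant except the experimental factor can be had from a small simulation (a sketch of my own, assuming a hypothetical unmeasured risk factor with 30% prevalence and simple coin-flip allocation): on average the hidden factor is balanced across the arms, though never guaranteed to be balanced in any single trial, which is why the uncertainty is reduced rather than eliminated.

```python
# A minimal sketch, not from the text, of why randomization approximates
# "all conditions held constant except A": on average it balances factors
# we never measured. All numbers are invented for illustration.
import random

random.seed(0)

def simulate_trial(n=200):
    """Randomize n subjects and report the prevalence of an unmeasured
    risk factor in each arm."""
    risk_factor = [random.random() < 0.30 for _ in range(n)]  # hidden confounder, ~30% prevalence
    treated = [random.random() < 0.5 for _ in range(n)]       # coin-flip allocation

    n_treated = sum(treated)
    prev_treated = sum(r for r, t in zip(risk_factor, treated) if t) / max(n_treated, 1)
    prev_control = sum(r for r, t in zip(risk_factor, treated) if not t) / max(n - n_treated, 1)
    return prev_treated, prev_control

imbalances = [pt - pc for pt, pc in (simulate_trial() for _ in range(1000))]
print(f"Mean imbalance in the hidden factor across 1000 trials: "
      f"{sum(imbalances) / len(imbalances):+.3f}")
print(f"Largest single-trial imbalance: {max(abs(d) for d in imbalances):.3f}")
# The average imbalance hovers near zero, but any one trial can still be
# imbalanced -- randomization reduces uncertainty without abolishing it.
```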
The general versus the individual
Another philosophical aspect about statistics is how it reflects the general as opposed to the individual. The Belgian thinker Quetelet recognized the issue in the 1840s; he "knew that individuals' characteristics could not be represented by a deterministic law, but he believed that averages over groups could be so represented." (Stigler, 1986; p. 172.) About half a century later, German philosophers (Wilhelm Windelband and Heinrich Rickert) made this general distinction the basis for their understanding of the nature and limits of science: science consists of general laws; it stops short of the unique and individual. They said there were two kinds of knowledge: nomothetic (science – general, statistical, group-based) and idiographic (individual and unique for each particular case). Science "explained" (Erklären) general laws; philosophy and the humanities "understood" (Verstehen) the unique characteristics of individuals (Makkreel, 1992).
This criticism of statistics, so often used by modern critics of evidence-based medicine (EBM), was present from the very beginning of the effort (in the mid nineteenth century) to apply statistics to human beings (as in experimental psychology), as opposed to limiting it
to mathematics, astronomy, and physics (as had previously been the case). Here is an example from Auguste Comte attacking the statistician Poisson, who in 1835 had suggested there might be legal uses for statistics: "The application of this calculus to matters of morality is repugnant to the soul. It amounts, for example, to representing the truth of a verdict by a number, to thus treat men as if they were dice, each with many faces, some for error, some for truth." (Stigler, 1986; p. 194.)
This history reminds me of an exchange I recently had, one that became somewhat heated, during a symposium in the annual convention of the American Psychiatric Association. I and others had reviewed RCTs showing that antidepressants were hardly effective in bipolar depression; one of the discussants, who had previously supported their use, had to bow to the data, but he ended his presentation by declaring forcefully: "Antidepressants may not be as great as we had hoped, but, in the end, your individual experience as a practitioner and that of the patient trumps everything!" Raucous applause followed from the packed audience of clinicians. Fearing that three hours of painstaking exposition of RCT data had just been flushed down a toilet, and perhaps angry about such dismissal of years of daily effort by researchers like me, I wanted to retort: "Only if you don't care about science." But a debate about philosophy of science could not occur then and there.
This is the problem: yes, statistics do not tell you what to do with the individual case, but this does not mean that a clinician should decide what to do out of thin air. The clinician's decisions about the individual case need to be informed, not dictated, by scientific knowledge as established in a general way through statistics.
This insight is present in the great neo-Hippocratic thinkers of modern medicine. Perhaps the best example is William Osler, who always emphasized that medicine was not just a science, but also an art, and that the art of medicine is the art of balancing probabilities (Osler, 1932). If we use the reality of art to negate the necessity of science, we might as well start Galenic bleeding all over again. The art of medicine is, as Osler suggests, in fact, the proper appreciation of the science via a knowledge of statistics: the art of balancing probabilities.
The problem with that colleague's comment was that he was negating the general knowledge of statistics by prioritizing the individual experience of clinicians. The history of medicine, and a rational approach to the philosophy of science, indicates that the prioritization should be the other way around (which is the basic perspective of EBM).
The illogic of hypothesis-testing statistics
When most people use the word "logic," they mean what philosophers call "predicate" logic, meaning discussions of statements about present facts: things that are. However, what may be true in predicate logic – things that are – may not be true for other kinds of logic, such as modal logic – things that possibly or probably are. As noted in Chapter 7, Jacob Cohen's intuition (Cohen, 1994), translated into the language of logic, is that the key problem with hypothesis-testing statistics is that it works in predicate logic, but fails in modal logic.
Logic is important. As a branch of philosophy, it examines whether one's conclusions flow from one's premises. Logic is an important method, because no matter what the content of one's views, if the logical structure of an argument is invalid, then the whole argument is faulty. We may or may not agree with the content of any statement (the world is round; the world is flat), but we should all be able to agree on the logic of any claim that if X is true, then Y must be true. If an argument is illogical, then it can simply be dismissed.
Now let's see why hypothesis-testing statistics is illogical. Predicate logic applied to hypothesis-testing statistics would be as follows:
If the null hypothesis [NH] is correct, then these data cannot occur.
These data have occurred.
Therefore, the null hypothesis is false.
This argument is logically valid; but it becomes invalid once it is turned into a statement of probability:
If the null hypothesis [NH] is correct, then these data are highly unlikely.
These data have occurred.
Therefore, the null hypothesis is highly unlikely.
I have italicized the differences where we have moved from statements of fact to statements of probability. The falsity of this transition becomes clear once we use examples. Using predicate logic:
If a person is a Martian, then he/she is not a member of Congress.
This person is a member of Congress.
Therefore, he/she is not a Martian.
This logic of facts is valid; but the logic of probability is invalid:
If a person is an American, then he is probably not a member of Congress.
This person is a member of Congress.
Therefore, he is probably not American.
(Pollard and Richardson, 1987)
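The Congress example can be given rough numbers (an illustrative sketch, assuming about 535 members of Congress, 330 million Americans, and 8 billion people in the world) to show why the probabilistic syllogism fails: the premise "an American is very unlikely to be a member of Congress" is true, yet by Bayes' rule a member of Congress is almost certainly American, not "probably not American."

```python
# A minimal sketch, not from the text, putting rough numbers on the Congress
# example. Population figures are approximations used only for illustration.
POPULATION_US = 330_000_000       # Americans (approximate)
MEMBERS_OF_CONGRESS = 535         # all of whom are, in fact, American
POPULATION_WORLD = 8_000_000_000  # world population (approximate)

# The "unlikely" premise: if a person is American, being in Congress is rare.
p_congress_given_american = MEMBERS_OF_CONGRESS / POPULATION_US
print(f"P(member of Congress | American) = {p_congress_given_american:.7f}")

# The fallacious conclusion would be: this person is in Congress, therefore
# he is probably not American. Bayes' rule gives the opposite answer:
p_american = POPULATION_US / POPULATION_WORLD
p_congress = MEMBERS_OF_CONGRESS / POPULATION_WORLD  # only Americans serve
p_american_given_congress = p_congress_given_american * p_american / p_congress
print(f"P(American | member of Congress) = {p_american_given_congress:.2f}")  # 1.00

# The analogy to significance testing: a small P(data | null hypothesis) --
# the p-value -- does not by itself make P(null hypothesis | data) small.
```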
Cohen calls this logical fallacy "the illusion of attaining improbability," and if true, which appears to be the case, it undercuts the very basis of hypothesis-testing statistics, and thereby, the vast majority of medical research. The whole industry of p-values comes tumbling down.
Inductive logic
Medical statistics are based on observation, and thus they are a species of induction. Induction, in turn, is philosophically complex. It turns out that one cannot easily infer causation from observation, and that the logic of our hypothesis-testing methods is faulty. What are we to do?
Once again, the answer seems to be to give up our theories and return more closely to our observation. The more we engage in descriptive statistics, the farther away we get from hypothesis-mongering, the closer we are to a conceptually sound use of statistics. We can quantitate without over-speculating.
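What quantitating without over-speculating might look like in practice can be sketched briefly (the change scores below are invented for illustration): a small sample is summarized descriptively, and the width of the confidence interval states the imprecision plainly, with no hypothesis test attached.

```python
# A minimal sketch, not from the text: a purely descriptive summary of a
# small hypothetical sample. The data are invented for illustration.
from math import sqrt
from statistics import mean, stdev

# Hypothetical change scores from a small pilot study (n = 8)
change_scores = [4.0, -1.0, 6.5, 2.0, 3.5, 0.5, 5.0, 1.5]

n = len(change_scores)
sample_mean = mean(change_scores)
standard_error = stdev(change_scores) / sqrt(n)

# 95% confidence interval using the t critical value for 7 degrees of freedom
t_crit = 2.365
lower = sample_mean - t_crit * standard_error
upper = sample_mean + t_crit * standard_error

print(f"Mean change: {sample_mean:.1f} "
      f"(95% CI {lower:.1f} to {upper:.1f}, n = {n})")
# The wide interval states the imprecision plainly; no p-value, and no claim
# beyond what the observations themselves support.
```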
I hope some day to be able to publish research studies on small sample sizes where the results can be accepted as they are, with the main limitation of imprecision, but without the irrelevant claim that they can only be "hypothesis-generating" as opposed to "hypothesis-testing." Science is not about hypothesis-testing or hypothesis-generating; it is about the complex interrelation between theory and fact, and the gradual accumulation of evidence for or against any scientific hypothesis. Perhaps we can then get beyond the logical fallacies so rampant in statistical debates, so closely related to the lament of a philosopher: "All logic texts are divided into two parts. In the first part, on deductive logic, the fallacies are explained; in the second part, on inductive logic, they are committed." (Cohen, 1994.)