What follows is a description of various views on inductive inference and methods for inferring general theories as they have developed from the scientific revolution to modern times. Later, the development of methods for discovering causal relationships will be discussed.
A strong influence on contemporary methodology is interdisciplinary research. In the twentieth century, the question of how we can use observations to attain empirical knowledge became the subject of research in a number of disciplines, such as statistics, econometrics, and computer science. Modern philosophy of method continues to contribute to and draw on developments in related disciplines.
Another strong influence on contemporary methodology arises from studies of the history of science, which captured the attention of philosophers through the groundbreaking work of Thomas Kuhn (1922–1996) in The Structure of Scientific Revolutions (1962). Kuhn argued that scientific textbook accounts of the history of science as a wholly progressive series of discoveries are false for scientific revolutions. His work suggested that changes of method across revolutions undercut attempts to apply common standards to evaluate prerevolution and postrevolution theories.
Kuhn also criticized the methodological ideas of Karl Popper (1902–1994). Popper had asked what distinguishes ("demarcates") scientific hypotheses from nonscientific hypotheses. He emphasized that science proceeds by testing hypotheses against empirical data, and thus located the characteristic feature of scientific hypotheses in their empirical testability. Popper's basic view of testing a hypothesis against data was to derive predictions from the hypothesis and see whether they matched the data (conjectures and refutations). If the data do not match the predictions, they falsify the hypothesis.
This led Popper to postulate that scientific hypotheses must be falsifiable. Popper's falsifiability criterion has been very influential, arguably more outside of the philosophy of science than inside. Kuhn objected to the falsifiability criterion because, according to him, history shows that scientists do not subject major scientific theories (or paradigms) to falsification. Instead, scientists view a mismatch between theory and data as an anomaly, a puzzle to be resolved by further research. Many philosophers of science took Kuhn's moral to be that logic-based analyses of scientific method cannot capture the dynamics of major scientific change. Scientific revolutions, on this view, are instead determined by complex sociopolitical processes within the scientific community, played out within their specific historical contexts. Modern methodologists aim to avoid both extremes: a context-free universal scientific logic on the one hand, and an entirely context-specific study of particular historical episodes on the other.
Method in the Scientific Revolution
Two topics of inquiry held center stage during the scientific revolution: the traditional problems of astronomy, and the study of gravity as experienced by bodies in free fall near the surface of the earth. Johannes Kepler (1571–1630) proposed that the predictive empirical equivalence that holds in principle between geocentric and heliocentric world systems could be offset by appeal to physical causes (Jardine 1984). He endorsed the appeal by Nicolaus Copernicus (1473–1543) to the advantage his system gained from agreeing measurements of the parameters of the earth's orbit obtained from the retrograde motion phenomena of the several other planets (1596/1981). In his classic marshaling of fit to the impressive body of naked-eye instrument observations of Tycho Brahe (1546–1601), Kepler appealed to this advantage, as well as to qualitative intuitions about plausible causal stories and about cosmic harmony, to arrive at his ellipse and area rules (1609/1992). He later arrived at his harmonic rule (1619/1997). His Rudolphine Tables of 1627 were soon known to be far more accurate than any previously available astronomical tables (Wilson 1989).
Galileo Galilei (1564–1642) described his discovery of Jupiter's moons and exciting new information about our moon in the celebrated report of his telescope observations (1610/1989). His later observations of phases of Venus provided direct observational evidence against Ptolemy's system, though not against Tycho's geoheliocentric system. This was included in his argument for a Copernican heliocentric system in his famously controversial Dialogue (1632/1967).
Galileo's study of gravity faced the challenge that because of complicating factors such as air resistance one could not expect the kind of precise agreement with measurement that was available in astronomy. In his celebrated Two New Sciences (1638/1914), Galileo proposed uniformly accelerated fall as an exact account of idealized motion that would obtain in the absence of any resistant medium, even though the idealization is impossible to actually implement. He argued that the perturbing effects of resistance are too complex to be captured by any theory, but that the considerations he offered, including inclined-plane experiments that minimize the effects of resistance, support his idealized uniformly accelerated motion as the principal mechanism of such terrestrial motion phenomena as free fall and projectile motion.
An important part of what distinguishes what we now characterize as the natural sciences is the method exemplified in the successful application of universal gravity to the solar system. Isaac Newton (1642–1727) characterizes his laws of motion as accepted by mathematicians and confirmed by experiments of many kinds. He appeals to propositions inferred from them as resources to make motion phenomena measure centripetal forces. These give systematic dependencies that make the areal law for an orbit measure the centripetal direction of the force maintaining a body in that orbit, and that make the harmonic law for a system of orbits about a common center, as well as the absence of orbital precession (not accounted for by perturbations) for any such orbit, measure the inverse-square power of the centripetal force. His inferences to inverse-square forces toward Jupiter, Saturn, and the sun from orbits about them are inferences to inverse-square centripetal acceleration fields backed up by such measurements.
Newton's moon-test shows that the length of a seconds pendulum at the surface of the earth and the centripetal acceleration of the moon's orbit count as agreeing measurements of a single earth-centered inverse-square acceleration field. On this basis Newton identified the force maintaining the moon in orbit with terrestrial gravity. His first two rules endorse this inference. Rule number one states "no more causes of natural things should be admitted than are both true and sufficient to explain their phenomena" (Newton 1726/1999, p. 794). Rule number two adds that, therefore, "the causes assigned to natural effects of the same kind must be, so far as possible, the same" (Newton 1726/1999, p. 795).
Newton argues that all bodies gravitate toward each planet with weights proportional to their masses. He adduces a number of phenomena that give agreeing measurements of the equality of the ratios of weight to mass for bodies at equal distances from planets. These include terrestrial pendulum experiments and the moon-test for gravitation toward the earth, as well as the harmonic laws of the orbits about Saturn, Jupiter, and the sun for gravitation toward those bodies. They also include the agreement between the accelerations of Jupiter and its satellites toward the sun, as well as between those of Saturn and its satellites and those of the earth and its moon toward the sun.
His third rule endorses the inference that these all count as phenomena giving agreeing measurements of the equality of the ratios of weight to mass for all bodies at any equal distances from any planet whatsoever. Rule number three states that "those qualities of bodies that cannot be intended and remitted (i.e., qualities that cannot be increased and diminished) and that belong to all bodies on which experiments can be made should be taken as qualities of all bodies universally" (Newton 1726/1999, p. 795).
Newton's fourth rule added that "In experimental philosophy propositions gathered from phenomena by induction should be considered either exactly or very nearly true notwithstanding any contrary hypothesis until yet other phenomena make such propositions either more exact or liable to exceptions" (Newton 1726/1999, p. 796). This rule was added to justify treating universal gravity as an established scientific fact, notwithstanding complaints that it was unintelligible in the absence of a causal explanation of how it results from mechanical action by contact.
Newton's inferences from phenomena exemplify an ideal of empirical success as convergent accurate measurement of a theory's parameters by the phenomena to be explained. In rule four, a mere hypothesis is an alternative that does not realize this ideal of empirical success sufficiently to count as a serious rival.
Rule four endorses provisional acceptance: deviations from currently accepted theory count as higher-order phenomena carrying information to be exploited. This method of successive corrections, guided by theory-mediated measurement, led to increasingly precise specifications of solar system phenomena, backed by increasingly precise measurements of the masses of the interacting solar system bodies.
This notion of empirical success as accurate convergent theory-mediated measurement of parameters by empirical phenomena clearly favors the theory of general relativity of Albert Einstein (1879–1955) over Newton's theory (Harper 1997). Moreover, the development and application of testing frameworks for general relativity are clear examples of successful scientific practice that continues to be guided by Newton's methodology (Harper 1997; Will 1986, 1993). More recent data, such as those provided by radar ranging to planets and lunar laser ranging, provide increasingly precise post-Newtonian corrections that have continued to increase the advantage over Newton's theory that Newton's methodology would assign to general relativity (Will 1993).
In the preface to his Treatise on Light, Christiaan Huygens (1629–1695) provided a nice characterization of the hypothetico-deductive (H-D) alternative to Newton's method:
There will be seen in it demonstrations of those kinds which do not produce as great a certitude as those of Geometry, and which even differ very much therefrom, since whereas the Geometers prove their Propositions by fixed and incontestable Principles, here the Principles are verified by the conclusions to be drawn from them; the nature of these things not allowing of this being done otherwise. It is always possible to attain thereby to a degree of probability which very often is scarcely less than complete proof. To wit, when those things which have been demonstrated by the Principles that have been assumed correspond perfectly to the phenomena which experiment has brought under observation; especially when there are a great number of them, and further, principally, when one can imagine and foresee new phenomena which ought to follow from the hypotheses which one employs, and when one finds that therein the fact corresponds to our prevision.
(Huygens 1690/1962, pp. vi–vii)
Thus the H-D method construes empirical success as success in prediction. The limitation of empirical success to prediction alone has suggested to some philosophers of science that distinguishing between theories that agree on predictions would have to be based on nonempirical criteria.
Predicted Fit to Future Data
Given plausible assumptions about errors in data, a model that fits a given body of data too closely is likely to be tracking random errors in the data in addition to the lawlike phenomenon under investigation. Statisticians refer to this as "overfitting the data." They have designed many criteria to reveal cases in which a simpler model has better expected fit to future data generated by repetitions of an experiment than a more complex model that better fits the data gathered so far. Among philosophers of science, Malcolm Forster and Elliott Sober have appealed to the Akaike Information Criterion to challenge the assumption that fit to past data exhausts the criteria for scientific inference. This criterion is not sufficient to recover Newton's method (Myrvold and Harper 2002). The extent to which other such proposals can recover Newton's method is an open question.
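The trade-off can be illustrated with a toy simulation (an illustrative sketch; the data, polynomial degrees, and noise level are invented for the example). Data generated by a linear law plus noise are fit with a simple model and a deliberately overcomplex degree-30 model, and the Akaike Information Criterion (AIC = 2k + n ln(RSS/n), with k the number of fitted parameters and RSS the residual sum of squares) penalizes the extra complexity:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

rng = np.random.default_rng(0)

# Simulate data from a true *linear* law with Gaussian noise.
x = np.linspace(-1.0, 1.0, 500)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)

def aic(degree):
    """AIC = 2k + n*ln(RSS/n) for a degree-`degree` Chebyshev-series fit."""
    coeffs = C.chebfit(x, y, degree)
    resid = y - C.chebval(x, coeffs)
    rss = float(resid @ resid)
    k = degree + 1  # number of fitted coefficients
    return 2 * k + x.size * np.log(rss / x.size)

scores = {d: aic(d) for d in (1, 30)}
```

The degree-30 model achieves a smaller residual sum of squares on the sample, yet the complexity penalty typically hands the lower (better) AIC score to the simple model that matches the process actually generating the data.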
Bayesian Methods

Central to Bayesian methods is epistemic probability, a rational agent's degree of belief. A number of arguments have been put forward to defend the probability axioms as coherence conditions for rational degrees of belief, in analogy to the way logical consistency can be taken as a coherence condition for rational acceptance. Dutch book arguments show that an agent whose degrees of belief violate the probability axioms would assign positive expectation to each bet in a system of bets and conditional bets that, if all made together, would result in sure loss. A number of other arguments for this synchronic condition on rational degrees of belief have been advanced (particularly by Frank Plumpton Ramsey, Leonard J. Savage, Abner Shimony, Bas van Fraassen, Richard T. Cox, Irving John Good, and J. Aczél).
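A minimal worked example (with hypothetical numbers, not drawn from the original text) shows how a Dutch book operates: an agent whose degrees of belief in A and not-A sum to more than one will buy each of a pair of bets at what the agent regards as its fair price, yet the pair together loses money in every possible outcome:

```python
# Incoherent degrees of belief: P(A) + P(not-A) = 1.1, violating additivity.
belief = {'A': 0.4, 'not-A': 0.7}

def net_payoff(proposition, a_is_true, stake=1.0):
    """Net result of buying a $1-stake bet on `proposition` at the agent's
    own fair price: collect the stake if the proposition is true, minus
    the price paid up front."""
    wins = a_is_true if proposition == 'A' else not a_is_true
    return (stake if wins else 0.0) - stake * belief[proposition]

# Buying both bets loses 0.1 whether A turns out true or false: a Dutch book.
nets = [net_payoff('A', w) + net_payoff('not-A', w) for w in (True, False)]
```

Whichever way the world turns out, the agent collects exactly 1.0 in winnings but has paid 1.1 in prices, which is the sure loss the coherence arguments exploit.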
David Lewis (1941–2001) provided a diachronic Dutch book argument (published in Teller 1976) to defend the Bayesian conditionalization learning model, according to which assigning new degrees of belief given by P′(B) = P(B & A)/P(A) is the appropriate response to a learning experience in which the total relevant empirical input is to accept A as new evidence. In 1984 van Fraassen (1941–) extended this diachronic Dutch book argument to defend a condition he called reflection. His proposal to treat the reflection condition as a constraint on degrees of belief that could be counted as rational has led to much controversy.
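The conditionalization rule is easy to state computationally. The following sketch (with made-up prior numbers) updates a distribution over four possible worlds on learning that A is true, so that the new degree of belief in B is P′(B) = P(B & A)/P(A):

```python
# Prior degrees of belief over four "worlds" (truth-value combinations).
prior = {('A', 'B'): 0.2, ('A', 'not-B'): 0.3,
         ('not-A', 'B'): 0.4, ('not-A', 'not-B'): 0.1}

def conditionalize(p, evidence):
    """P'(w) = P(w)/P(evidence) for worlds where the evidence holds, else 0."""
    p_evidence = sum(pr for world, pr in p.items() if evidence in world)
    return {world: (pr / p_evidence if evidence in world else 0.0)
            for world, pr in p.items()}

posterior = conditionalize(prior, 'A')  # learn that A is true
p_b_given_a = posterior[('A', 'B')]     # P'(B) = P(B & A)/P(A) = 0.2/0.5 = 0.4
```

Worlds incompatible with the evidence drop to probability zero, and the remaining probabilities are rescaled so that they again sum to one.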
One central Bayesian theme has been to investigate conditions under which evidence leads to convergence of opinion. Bruno de Finetti (1906–1985) specified conditions that would lead Bayesian agents, who update by repeated conditionalization on the outcomes of the same observations, to converge toward agreement in their degrees of belief, however otherwise divergent their prior degrees of belief may have been (1937/1980). Brian Skyrms (1990) has given what is probably the most general possible version of de Finetti's condition for convergence.
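This kind of convergence can be simulated (an illustrative sketch; the priors, grid, and coin bias are all assumed for the example): two agents who start with very different priors over a coin's bias, but conditionalize on the same sequence of flips, end up with nearly identical posteriors:

```python
import random

random.seed(1)
grid = [i / 100 for i in range(1, 100)]  # candidate values for the coin's bias

# Two agents with very different prior degrees of belief over the bias.
uniform = {h: 1.0 / len(grid) for h in grid}          # flat prior
skewed = {h: h ** 4 for h in grid}                    # heavily favors high bias
total = sum(skewed.values())
skewed = {h: p / total for h, p in skewed.items()}

def conditionalize(p, heads):
    """Update a distribution over the bias on one observed flip."""
    post = {h: pr * (h if heads else 1.0 - h) for h, pr in p.items()}
    z = sum(post.values())
    return {h: pr / z for h, pr in post.items()}

true_bias = 0.3
for _ in range(2000):                  # both agents see the *same* flips
    flip = random.random() < true_bias
    uniform = conditionalize(uniform, flip)
    skewed = conditionalize(skewed, flip)

mean_u = sum(h * p for h, p in uniform.items())
mean_s = sum(h * p for h, p in skewed.items())
# After many shared observations, the two posterior means nearly agree
# and both sit close to the true bias.
```

The shared likelihoods eventually swamp the divergent priors, which is the intuitive content of de Finetti-style convergence results.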
In 2003 Wayne Myrvold (1963–) argued that, for Bayesians, the degree to which a hypothesis unifies phenomena contributes to the degree to which these phenomena support the hypothesis. This suggests that Bayesians can recover important aspects of Newton's method. It may well be that investigating the representation of Newton's method of provisional acceptance in a Bayesian model will result in enriching the Bayesian framework to make it offer more resources for illuminating scientific method.
Causation, Correlation, Experimentation
In his famous methods of experimental inquiry (1843), John Stuart Mill (1806–1873) combined ideas about causal inference previously proposed by John Duns Scotus (1265/66–1308), William Ockham (1280–1349), and Francis Bacon (1561–1626). The work of twentieth-century statisticians such as Jerzy Neyman (1894–1981), Karl Pearson (1857–1936), and Ronald A. Fisher (1890–1962) addressed two major shortcomings of Mill's methods.
First, Mill assumed that we would observe deterministic causal relationships: Given the cause, the effect must follow every time. However, in a complex situation we typically do not have a complete specification of all operative causes, so we expect to observe trends rather than necessary relationships. For example, although smoking causes lung cancer, it does not do so in every person, because people's physiology varies. Rather, what we observe is a strong association between smoking and lung cancer: Among smokers, the incidence of lung cancer is much higher than among nonsmokers. To define precisely the intuitive notion of "strong association," statisticians developed the concept of correlation, which defines degrees of association (DeGroot 1975).
A second deficiency in Mill's methods is that they fail in the presence of common causes (confounders in statistical terminology). For example, suppose we observe that children who play violent video games are more prone to aggressive behavior than children who do not. Mill's logic would lead us to infer that playing violent video games causes aggressive behavior. But another possibility is that the correlation is due to personality traits: children with an aggressive nature are drawn to violent video games and tend toward aggressive behavior; a preference for violent video games does not cause the behavior, but is merely a symptom of preexisting aggressive tendencies. If this alternative explanation is true, then Mill's methods lead us to the wrong conclusion. The policy implications are significant: If there is a direct causal relationship between video games and aggressive behavior, we expect to reduce aggressive behavior by restricting the availability of video games. But if personality is the underlying common cause of both, restricting access to video games should not decrease aggressive behavior.
A great advance for the problem of unobserved common causes was Fisher's revolutionary idea of the randomized experiment. Suppose that we have the ability to randomly assign half of a group of children to playing violent video games (the treatment group) and the other half to playing something else (the control group). For example, we might flip a coin for each participating child to make the assignment. Then we expect that personality traits, such as a tendency to aggression, would be randomly distributed in each half so that the children playing the video games would, on average, have no more aggressive personalities than the children playing something else. Under those circumstances, if we still find that significantly more of the video game players engage in aggressive behavior than the children playing something else, we can infer a direct causal relationship.
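A small simulation (with assumed numbers) illustrates why randomization works: a latent trait, standing in for aggressive personality, ends up nearly equally distributed between treatment and control groups, because the coin flip that assigns each child is blind to the trait:

```python
import random

random.seed(0)

# Each child carries a latent aggression score (assumed standard normal).
children = [random.gauss(0.0, 1.0) for _ in range(10_000)]

treatment, control = [], []
for trait in children:
    # Coin-flip assignment, independent of the child's traits.
    (treatment if random.random() < 0.5 else control).append(trait)

mean_gap = (sum(treatment) / len(treatment)) - (sum(control) / len(control))
# The latent trait is balanced across groups up to sampling noise, so any
# remaining difference in outcomes can be attributed to the treatment.
```

With thousands of participants the gap between group means shrinks toward zero, which is exactly the guarantee that lets the experimenter attribute a remaining outcome difference to the treatment itself.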
The idea of using randomization to rule out unobserved common causes has been applied in countless practical problems of causal inference, from clinical studies of the effectiveness of medical treatments to experiments for agricultural methods. It has been a most effective tool for addressing the problem of unobserved common causes that besets many of the traditional philosophical proposals for causal inference.
The power of randomization is available only when we have the ability to experimentally create the conditions we wish to investigate. In many settings of interest, we cannot perform experiments but can only passively gather data (these are called "observational studies" in statistics). A prominent physical science based on passive observation is astronomy. Many examples occur in the social sciences and economics. For instance, an economist cannot randomly assign inflation rates to various countries to study how inflation affects employment. A recent set of examples comes from computer science: while many companies gather vast amounts of data about their customers and the transactions they engage in, they rarely have the ability to assign customers randomly to various conditions (e.g., household income).
Philosophers continued to refine their understanding of the relationship between correlation and causation in nonexperimental settings. The work of Hans Reichenbach (1891–1953), published in 1956, was seminal. Reichenbach expounded the common cause principle: roughly, for every correlation between two events A and B, there is some causal explanation positing either that one is a cause of the other (e.g., A causes B) or that A and B share a common cause. Reichenbach argued that the assumption that significant associations or correlations have causal explanations is deeply ingrained in our scientific and everyday reasoning. Another important concept of Reichenbach's is screening off, which captures the distinction between immediate and intermediate causes in terms of correlations.
For example, suppose that tar content in lungs is the direct cause of cancer, while smoking directly causes tar to accumulate in the lungs and thereby indirectly causes lung cancer. Then we would observe a correlation between smoking and lung cancer; but knowing the tar content of a subject's lungs would make smoking irrelevant to predicting lung cancer. By contrast, even if we knew whether a subject smokes, the tar content of the subject's lungs would still be relevant to, or correlated with, the subject's getting lung cancer. In Reichenbach's terms, information about tar content screens off information about smoking from conclusions about lung cancer. Because tar content screens off smoking from lung cancer, but not vice versa, Reichenbach suggested that such evidence rules out smoking as a direct cause of lung cancer and allows us to infer that the effects of smoking are mediated through tar in the lungs.
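Screening off can be demonstrated by simulating the hypothesized causal chain smoking → tar → cancer (the probabilities below are invented for the illustration): smoking and cancer are strongly associated overall, but among subjects with the same tar status the association disappears:

```python
import random

random.seed(42)

# Simulate the chain: smoking -> tar in lungs -> cancer (invented numbers).
population = []
for _ in range(100_000):
    smokes = random.random() < 0.5
    tar = random.random() < (0.8 if smokes else 0.1)    # smoking causes tar
    cancer = random.random() < (0.3 if tar else 0.05)   # tar causes cancer
    population.append((smokes, tar, cancer))

def cancer_rate(rows):
    return sum(1 for _, _, c in rows if c) / len(rows)

smokers = [r for r in population if r[0]]
nonsmokers = [r for r in population if not r[0]]
gap_overall = cancer_rate(smokers) - cancer_rate(nonsmokers)

# Condition on tar: compare smokers and nonsmokers who both have tar.
gap_given_tar = (cancer_rate([r for r in smokers if r[1]])
                 - cancer_rate([r for r in nonsmokers if r[1]]))
# gap_overall is large; gap_given_tar is near zero: tar screens off smoking.
```

Conditioning on tar content collapses the smoking-cancer association, which is the statistical signature Reichenbach used to distinguish direct from mediated causes.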
The philosophers of science Peter Spirtes, Clark Glymour, and Richard Scheines developed Reichenbach's ideas about the relationships between correlation and causation using the framework of causal graphs, or diagrams (Spirtes, Glymour, and Scheines 1993). A causal graph is an intuitive representation of causal relationships, in which direct causes are connected to their effects by arrows pointing from cause to effect.
Using the language of causal graphs, Spirtes, Glymour, and Scheines gave a precise formulation of Reichenbach's precept that direct causes screen off indirect ones, known as the Markov condition (I-map in computer science terminology). The common cause principle—that there is no correlation without causation—can be formulated as another principle about diagrams, termed faithfulness (perfect I-map in computer science terminology). Given these principles relating causation and correlation, it is possible to characterize when valid inferences about causal relationships can be drawn from passive observation of associations. The theory is powerful and precise enough to develop computer programs that perform these inferences automatically (the TETRAD system, for instance). With such a program, we can analyze the kind of large datasets that we find in practice, realizing the vision of Bacon and Mill of applying causal inference methods to extensive observation histories.
In computer science, causal diagrams (often called Bayes nets) have been firmly established as a scheme to capture and reason about associations and causal relationships, giving rise to thriving commercial developments with many practical applications (Pearl 1988, 2000). Econometrics, the study of statistical methods for economic problems, has a rich tradition of developing methods for nonexperimental causal inference going back to the early twentieth century (path diagrams and structural equation models). It turns out that many of these ideas and techniques can be seen as instances of causal diagram methods (Pearl 2000). While the theory of causal inference from passive observation is not yet as firmly established as the methodology based on randomization, at the beginning of the twenty-first century we see a common framework emerging, shared and sustained by philosophy, computer science, and economics.
Aczel, J. Lectures on Functional Equations and Their Applications. New York: Academic Press, 1966.
Cox, R. The Algebra of Probable Inference. Baltimore, MD: Johns Hopkins Press, 1961.
de Finetti, B. "Foresight: Its Logical Laws, Its Subjective Sources" (1937). Translated by H. E. Kyburg and H. Smokler. In Studies in Subjective Probability. Huntington, NY: Kreiger, 1980.
DeGroot, Morris H. Probability and Statistics. Reading, MA: Addison-Wesley, 1975.
Earman, J. Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory. Cambridge, MA: MIT Press, 1992.
Forster, M., and E. Sober. "How to Tell When Simpler, More Unified, or Less Ad Hoc Theories Will Provide More Accurate Predictions." British Journal for the Philosophy of Science 45 (1994): 1–35.
Galilei, G. The Sidereal Messenger (1610). Translated by A. Van Helden. Chicago: University of Chicago Press, 1989.
Galilei, G. Two New Sciences (1638). Translated by H. Crew and A. De Salvio. New York: Dover, 1914.
Good, I. J. Probability and The Weighing of Evidence. London: Griffin, 1950.
Harper, W. L. "Isaac Newton on Empirical Success and Scientific Method." In The Cosmos of Science: Essays of Exploration, edited by J. Earman and J. D. Norton. Pittsburgh, PA: University of Pittsburgh Press, 1997.
Harper, W. L., and C. A. Hooker, eds. Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science. Vol. 1. Dordrecht; Boston: D. Reidel, 1976.
Huygens, C. Treatise On Light (1690). Translated by S. P. Thompson. New York: Dover, 1962.
Jardine, N. The Birth of History and Philosophy of Science: Kepler's "A Defense of Tycho against Ursus." Cambridge, U.K.: Cambridge University Press, 1984.
Jeffreys, H. Scientific Inference. 3rd ed. Cambridge, U.K.: Cambridge University Press, 1973.
Kepler, J. The Harmony of The World (1619). Translated by E. J. Aiton, A. M. Duncan, and J. V. Field. Philadelphia: American Philosophical Society, 1997.
Kepler, J. New Astronomy (1609). Translated by W. H. Donahue. New York: Cambridge University Press, 1992.
Kepler, J. The Secret of the Universe (1596). Translated by A.M. Duncan. New York: Abaris Books, 1981.
Kyburg, H. E. Science & Reason. New York: Oxford University Press, 1990.
Myrvold, W. C. "A Bayesian Account of The Virtue of Unification." Philosophy of Science 70 (2003): 399–423.
Myrvold, W. C., and W. Harper. "Model Selection, Simplicity, and Scientific Inference." Philosophy of Science 69 (2002): 135–149.
Pearl, J. Causality: Models, Reasoning, and Inference. Cambridge, U.K.: Cambridge University Press, 2000.
Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann, 1988.
Popper, K. R. Conjectures and Refutations: The Growth of Scientific Knowledge (1962). New York: Harper & Row, 1963.
Popper, K. R. The Logic of Scientific Discovery (1959). New York: Harper & Row, 1965.
Ramsey, F. P. "Truth and Probability." In The Foundations of Mathematics and Other Logical Essays. London: Kegan Paul, 1931.
Reichenbach, H. The Direction of Time. Berkeley: University of California Press, 1956.
Reichenbach, H. The Theory of Probability. London: Cambridge University Press, 1949.
Savage, L. J. Foundations of Statistics. New York: Wiley, 1954.
Shimony, A. Search for a Naturalistic World View: Scientific Method and Epistemology. Vol. 1. Cambridge, U.K.: Cambridge University Press, 1993.
Skyrms, B. The Dynamics of Rational Deliberation. Cambridge, MA: Harvard University Press, 1990.
Spirtes, P., C. Glymour, and R. Scheines. Causation, Prediction, and Search. New York: Springer-Verlag, 1993.
Teller, P. "Conditionalization, Observation, and Change of Preference." In Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science. Vol. 1, edited by W. L. Harper and C. A. Hooker, 205–257. Dordrecht; Boston: D. Reidel, 1976.
van Fraassen, B. "Belief and The Will." Journal of Philosophy 81 (1984): 235–256.
van Fraassen, B. "Calibration: A Frequency Justification for Personal Probability." In Physics, Philosophy, and Psychoanalysis: Essays in Honor of Adolf Grünbaum, edited by R. S. Cohen and L. Laudan. Dordrecht: Reidel, 1983.
Will, C. M. Theory and Experiment in Gravitational Physics. 2nd ed. Cambridge, U.K.: Cambridge University Press, 1993.
Will, C. M. Was Einstein Right? Putting General Relativity to the Test. New York: Basic Books, 1986.
Wilson, C. "Predictive Astronomy in The Century after Kepler." In The General History of Astronomy: Planetary Astronomy from the Renaissance to the Rise of Astrophysics, Part A: Tycho Brahe to Newton. Vol. 2, edited by R. Taton and C. Wilson. Cambridge, U.K.: Cambridge University Press, 1989.
William Harper (2005)
Oliver Schulte (2005)
"Scientific Method." Encyclopedia of Philosophy. Encyclopedia.com. Retrieved August 20, 2018 from http://www.encyclopedia.com/humanities/encyclopedias-almanacs-transcripts-and-maps/scientific-method