The Behavioral Approach to Diplomatic History
J. David Singer
When it comes to the ability to understand and predict events of importance, students and practitioners of American diplomacy manifest a fair degree of ambivalence. On the one hand, we find many bold efforts to explain why certain events unfolded as they did, and, on the other, we find frequent statements to the effect that these phenomena are so complex as to defy comprehension. According to Henry Kissinger, one of the more celebrated practitioner-scholars, such understanding is often "in the nature of things… a guess." Or, as Robert Bowie put it, "The policymaker works in an uneasy world of prediction and probability." And George F. Kennan put it still another way: "I can testify from personal experience that not only can one never know, when one takes a far-reaching decision in foreign policy, precisely what the consequences are going to be, but almost never do these consequences fully coincide with what one intended or expected."
While there is truth in these statements, such uncertainty may not necessarily inhere in the phenomena we study. It may well be, rather, in the ways in which that study is conducted. At the risk, then, of suggesting that students of diplomatic history—American and otherwise—have plied their trade with less than a full bag of tools, this essay addresses a number of ways in which the behavioral approach might usefully supplement the more traditional procedures.
The term "behavioral approach" is not meant to suggest that we should pay more attention to the behavior of individuals, factions, and states than to their attributes and relationships or to the regional and global environment within which such behavior occurs. If anything, diplomatic history seems to be overly attentive to behavioral phenomena, and insufficiently attentive to the background conditions and ecological constraints within which these phenomena occur. Normally, the behavioral sciences include psychology, anthropology, sociology, economics, and political science, but the range of disciplines embraced can be less interesting than the range of methods, concepts, and findings that might be borrowed from those who labor in those particular vineyards.
SOME PURPOSES OF HISTORICAL RESEARCH
One way to examine those possibilities would be in the context of the various purposes and goals that diplomatic historians might set for themselves. For some, the purpose of research is to locate and present the facts alone: What happened, in what sequence, under what conditions, and who was involved? Others go a step further and try to put those facts into graceful narrative. More typically, we seek not only to tell the story, but to do so in an interpretive fashion. This involves both a selection from among all the facts and an interpretation of them. In interpretive history, once we are persuaded as to the facts, we make certain inferences from them: causes, motives, and likely consequences, as well as missing facts.
Some historians (even some diplomatic historians) consider these missions too modest, and tend to be more ambitious. Among these, there are the "grand theorists," who offer up wide-ranging interpretations of several sets of events, telling us just what it all means, in terms reaching from the plausible to the outrageous. A growing number are, however, beginning to redefine their mission, albeit in a less pretentious direction. Instead of offering sweeping inferences from a limited and selected set of facts, these historians are moving toward the generation of knowledge that may be not only more complex, but more useful than that to which we have been accustomed.
TYPES OF KNOWLEDGE AND RELATED METHODS
The most distinctive characteristic of the behavioral approach is its emphasis on reproducible knowledge. This approach does not belittle or ignore knowledge and evidence of a more intuitive or subjective sort, but it does recognize the very real limits of such knowledge. Without insights and suspicions as to certain historical patterns, there would be no place to begin, no hypotheses to test, and no theoretical models to formulate. But in recognizing the impermanence and contestability of subjective knowledge, the behavioral approach seeks methods that might avoid some of those liabilities. These methods are of several types and are best understood in connection with the types of knowledge sought.
Historical knowledge may be distinguished by two very different sets of criteria. The first are essentially theoretical and substantive in nature: Are we indeed getting at the relevant combination of variables in our search for explanation? The second are epistemological: Assuming that we are on a promising substantive and theoretical path, what is the quality of knowledge that we think has been acquired or that we hope to acquire? Leaving the matter of the relevance of our knowledge aside for the moment, we can focus on the qualitative dimensions of our knowledge. One possible way of evaluating the quality of historical knowledge is to first reduce it to its component assertions or propositions, translate these (if need be) into clear and operational language, and then ascertain where each such proposition or cluster of propositions falls along each of three dimensions.
The first, or accuracy, dimension reflects the degree of confidence that the relevant scholarly community can have in the assertion at a given point in time; this confidence level is basically a function of the empirical or logical evidence in support of the proposition, but may vary appreciably both across time and among different scholars and schools of thought at any particular moment. The second qualitative dimension reflects the generality of the proposition, ranging from a single fact assertion (of any degree of accuracy) to an assertion embracing a great many phenomena of a given class. Third is the existential-correlational-explanatory dimension: Is the assertion essentially descriptive of some existential regularity, is it correlational, or is it largely explanatory? With these three dimensions, an epistemological profile of any proposition or set of propositions can be constructed and a given body of knowledge can be classified and compared with another, or with itself over time.
For many, the objective is to move as rapidly as possible on all three dimensions. We seek propositions in which the most competent, skeptical, and rigorous scholars can have a high degree of confidence, although these propositions may have originally been put forth on the basis of almost no empirical evidence at all. They will be propositions that are highly "causal" in form, although they may have been built up from, and upon, a number of propositions that come close to being purely descriptive. And they will be general rather than particular, although the generalizations must ultimately be based on the observation of many particular cases. As to the accuracy dimension, a proposition that seems nearly incontrovertible for decades may be overturned in a day, one that is thought of as preposterous may be vindicated by a single study or a brilliant insight, and those that have stood at the "pinnacle of uncertainty" (that is, a subjective probability of 0.5) may slowly or quickly be confirmed or disconfirmed. Moreover, a statement may enjoy a good, bad, or mixed reputation not because of its inherent truthfulness or accuracy, but merely because it is not in operational language and is therefore not testable.
Shifting from the degree-of-confidence dimension to that of generality, the assertion (of whose accuracy we are extremely confident) that World War I began on 29 July 1914 is less general than the assertion that more European wars of the past century began in the months of April and October than in others, and this in turn is less general than the assertion (which may or may not be true) that all wars since the Treaty of Utrecht have begun in the spring or autumn. Theory (defined here as a coherent and consistent body of interrelated propositions of fairly high confidence levels) must be fairly general, and no useful theory of any set of historical events can be built upon, or concerned only with, a single case. As Quincy Wright reminds us: "A case history, if composed without a theory indicating which of the factors surrounding a conflict are relevant and what is their relative importance, cannot be compared with other case histories, and cannot, therefore, contribute to valid and useful generalizations. On the other hand, a theory, if not applied to actual data, remains unconvincing." (In the same article, he also noted, "Comparison would be facilitated if quantifications, even though crude, are made whenever possible.")
Existential Knowledge and Data-Generating Methods. When we leave the accuracy and the generality dimensions and turn to the third proposed dimension along which a piece or body of knowledge may be described, we run into greater conceptual difficulty. A useful set of distinctions is that among existential, correlational, and explanatory types of knowledge. Existential knowledge is essentially a data set, or string of highly comparable facts. If, for example, we are told that one army had 1,248 men killed or missing in a given battle and that the enemy had "also suffered heavily," we would have something less than data. Similarly, statements that the United States has had two separate alliances with France since 1815, running a total of forty-seven years, and that American alliances with England and Russia have been nearly the same in number and longevity as those with France, would also be something less than data. That is, data provide the basis for comparison and generalization across two or more cases, situations, nations, and so on, and permit the generation of existential knowledge.
Of course, existential knowledge would not be very useful to the diplomatic historian if restricted only to phenomena that are readily quantified. Most of the interesting phenomena of history are of the so-called qualitative, not quantitative, variety, and it is usually assumed that the world's events and conditions are naturally and ineluctably divided into those two categories. Yet many phenomena that are thought to be "qualitative in nature" at a given time turn out to be readily quantifiable at a later date. In the physical world, examples might range from the difference between yellow and orange to the amount of moisture in the air; these were originally believed to be qualitative concepts. In the biological world, one thinks of metabolic rate or genetic predispositions. Likewise, in the world of social phenomena a good many allegedly qualitative phenomena turn out to be quite quantitative. Some illustrations might be the "linguistic similarity" of two nations, the extent to which nations gain or lose diplomatic "importance" after war, the changing "cohesion" of work groups, or the national "product" of given economies.
It is one thing to think of a way to measure or quantify a phenomenon that has been considered nonquantifiable and quite another thing to demonstrate that the measurement is a valid one. That is, we may apply the same measuring procedure to the same phenomenon over and over, and always get the same score; that demonstrates that our measure is a reliable one. But there is no way to demonstrate that it is a valid one—that it really gets at the phenomenon we claim to be measuring. The closest we come to validation of a measure (also known as an index or indicator) is a consensus among specialists that it taps what it claims to be tapping, and that consensus will rest upon (a) the "face validity" or reasonableness of the claim; (b) the extent to which it correlates with a widely accepted alternative indicator of the same phenomenon; and (c) the extent to which it predicts some measurable outcome variable that it is—according to an accepted theoretical argument—supposed to predict.
Quantification, however, may take a second, and more familiar, form. That is, in addition to assigning a numerical value to each observation of a given phenomenon, one can quantify by merely (a) assigning each such case or observation to a given nominal or ordinal category, and then (b) counting the number of observations that fall into each such category. The nominal category pertains to a set of criteria that are used to classify events and conditions; an ordinal category refers to the criteria used to rank them. To illustrate, generalizing about the American propensity to form alliances might require distinguishing among defense, neutrality, and entente commitments. Once the coding rules have been formulated and written down in explicit language (with examples), a person with limited specific knowledge could go through the texts and contexts of all American alliances and assign each to one of those three categories.
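The classify-then-count procedure just described can be sketched in a few lines of Python. This is only an illustration: the three category labels come from the text, but the case identifiers, the assigned codes, and the helper name `tally` are invented placeholders, not historical data or an established coding scheme.

```python
from collections import Counter

# The three nominal categories named in the text.
CATEGORIES = ("defense", "neutrality", "entente")

# Invented placeholder records (case id, assigned category); in practice
# these would come from a coder applying written coding rules to each text.
coded_alliances = [
    ("alliance_01", "defense"),
    ("alliance_02", "entente"),
    ("alliance_03", "defense"),
    ("alliance_04", "neutrality"),
    ("alliance_05", "defense"),
]

def tally(records):
    """Count how many coded cases fall into each nominal category."""
    counts = Counter({category: 0 for category in CATEGORIES})
    for case_id, label in records:
        if label not in CATEGORIES:
            # A code outside the scheme signals a coding error, not a new class.
            raise ValueError(f"{case_id}: unknown category {label!r}")
        counts[label] += 1
    return dict(counts)

alliance_counts = tally(coded_alliances)
print(alliance_counts)
```

Rejecting any label outside the agreed scheme is one simple way to keep the coding operational: a second coder using the same rules should be able to reproduce the same counts.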
The same could be done, for example, if one wanted to order a wide variety of foreign policy moves and countermoves, in the context of comparing the effects of different strategies upon the propensity of diplomatic conflicts to escalate toward war. The judgments of a panel of experts could be used to ascertain which types of action seem to be high, medium, or low on a conflict-exacerbating dimension. The earlier distinction between the reliability and validity of measures is quite appropriate here. There might be almost perfect agreement among experts that economic boycotts are higher on such a dimension than ultimata, since the latter are merely threats to act. But if one examined a set of diplomatic confrontations and found that those in which boycotts were used seldom ended in war, whereas those characterized by ultimata often did end in war, one might be inclined to challenge the validity of the ordinal measure.
So much, then, for existential knowledge. Whether merely acquired in ready-made form from governmental or commercial statistics, or generated by data-making procedures that are highly operational and reproducible, propositions of an existential nature are the bedrock upon which we can build correlational and explanatory knowledge.
Correlational Knowledge and Data Analysis Methods. Although many diplomatic historians will be quite content to go no further than the acquisition of existential knowledge, there will be others who will not only want to generalize, but also to formulate and test explanations. To do so, it is necessary to begin assembling two or more data sets and to see how strongly one correlates with the other(s). Correlation or covariation may take several forms and may be calculated in several ways, depending on whether the data sets are in nominal, ordinal, or interval (that is, cardinal number) form.
In general, a correlational proposition is one that shows the extent of coincidence or covariation between two (or more) sets of numbers. If these sets of numbers are viewed as the varying or fluctuating magnitudes of each variable, the correlation between them is a reflection of the extent to which the quantitative configuration of one variable can be ascertained when the configuration of the other is known. Or, in statistical parlance, the coefficient of correlation, which usually ranges from +1.00 to –1.00, indicates how accurately one can predict the magnitudes of all the observations in one data set once one knows the magnitudes in the other set of observations. Even though the measured events or conditions occurred in the past, we still speak of "prediction," since we know only that those phenomena occurred, but do not know the strength of association until the correlation coefficient has been computed.
Another way to put it is that the correlation between two sets of data is a measure of their similarity, whether they are based on pairs of simultaneous observations or ones in which variable Y was always measured at a fixed interval of time after each observation or measurement of variable X. If they rise and fall together over time or across a number of cases, they are similar, and the correlation between them will be close to +1.00; but if Y rises every time X drops, or vice versa, they are dissimilar, giving a negative correlation of close to –1.00. Finally, if there is neither a strong similarity nor dissimilarity, but randomness, the correlation coefficient will approach zero.
There are many different measures or indices of correlation, usually named after the scholar who developed and applied them, but two of them can serve as good examples. Although any correlation coefficient can be calculated with pencil and paper or a calculator, the most efficient method is the computer, which can be programmed so that it can automatically receive two or more sets of data along with instructions as to which correlation formula to use, and almost instantaneously produce coefficient scores.
Looking, then, at the very simple "rank order" correlation, we note that it is used to calculate the similarity or association between two sets of ranked data. It is particularly appropriate when we can ascertain only the orderings, from high to low or top to bottom, of two data sets and cannot ascertain with much confidence the distances or intervals between those rank positions. The rank order statistic is also especially appropriate for checking the validity of two separate measures or indicators and ascertaining whether they "get at" the same phenomena.
To illustrate, if we suspect that a fairly good index of a nation's power is simply the absolute amount of money it allocates to military preparedness—regardless of its population, wealth, or industrial capability—we might investigate how strongly that index correlates with an alternative measure. And, since power is itself a vague and elusive concept, we might decide to derive the second measure by having the nations ranked by a panel of diplomatic historians. When these two listings—one based on a single, simple index and the other based on the fallible human judgments of scholarly specialists—are brought together, we then compute the rank order correlation between them. The results of any such computation can in principle, as noted earlier, range from +1.00 to –1.00, with 0.00 representing the midpoint. If there is absolutely no pattern of association between the two rankings, we say there is no correlation, and the figure would indeed be zero. Further, if each nation has exactly the same rank position in both columns, the rank order correlation between the two variables is +1.00, and if the orders are completely reversed (with the nation at the top of one column found at the bottom of the other, and so on), it would be –1.00. None of these three extreme cases is likely to occur in the real world, of course, and on a pair of variables such as these, a rank order correlation of approximately +0.80 is pretty much what we would expect to find when the computation has been done.
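A minimal sketch of such a computation follows, assuming the rank order statistic intended is Spearman's coefficient in its simple no-ties form, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between a nation's two rank positions. The five nations and both rankings are invented for illustration.

```python
def rank_order_correlation(ranks_a, ranks_b):
    """Spearman's rho for two rankings of the same cases, assuming no ties."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of equal length, at least 2 cases")
    # Sum of squared differences between each case's two rank positions.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Invented ranks for five hypothetical nations (1 = highest).
expenditure_rank = [1, 2, 3, 4, 5]   # ranked by military expenditure
panel_rank = [2, 1, 3, 5, 4]         # ranked by a panel of historians

rho = rank_order_correlation(expenditure_rank, panel_rank)
print(round(rho, 2))
```

With these made-up rankings, in which two adjacent pairs of nations trade places, the coefficient comes out at about +0.80. Identical rankings give +1.00 and fully reversed rankings give –1.00, as the text describes.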
The above example illustrates how a rank order correlation might be used to estimate the similarity between two different rankings. While a high positive correlation would increase confidence in the validity of military expenditure levels as a measure of power, we assumed no particular theoretical or causal connection between the two data sets. Now, however, suppose that we believed (that is, suspected, but did not know with very much confidence) that the war-proneness of a nation was somehow or other a consequence of its level of industrialization. If we only know how many wars a nation has been involved in during a given number of decades, we have a rather crude indicator of its war-proneness. Such a number does not discriminate between long and short wars; wars that led to a great many or very few fatalities; and wars that engaged all of its forces or only a small fraction. Thus, we would be quite reluctant to say that a nation that fought in eight wars is four times more war-prone than one that experienced only two military conflicts in a given period. We would even be reluctant to say that the difference between two nations that participated in six and four wars respectively is the same as that between those nations that fought in seven and five wars. In sum, we might be justified in treating such a measure of war-proneness as, at best, ordinal in nature.
Suppose, further, that our measure of industrialization is almost as crude, based, for example, on the single factor of iron and steel production. Even though we might have quite accurate figures on such production, we realize that it is a rather incomplete index, underestimating some moderately powerful nations that have little coal or ore and therefore tend to import much of their iron and steel. In such a case, we would again be wise to ignore the size of the differences between the nations and settle for only a rank order listing. Depending on the magnitude of the resulting coefficient of correlation between these two rank orderings, we could make a number of different inferences about the relationship between industrialization and war-proneness.
Suppose now that we were working with much better indices than those used in the two illustrations above, and that we could measure our variables with considerably greater confidence. That is, we now have a basis for believing that our indicators or measures are not only valid (and that has no bearing on the statistical tests that can be applied to a variable) but reliable and quite precise. If one variable were the amount of money spent for the operation of IGOs (intergovernmental organizations) in the international system each half-decade, and the other were the number of nation-months of war that ended in each previous half-decade, and such interval scale data appeared to be very accurate, we could employ a more sensitive type of statistical test, such as Pearson's product moment correlation.
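To make the contrast between the two coefficients concrete, the sketch below computes a product moment correlation on invented interval data and a rank order correlation on the same data reduced to ranks. The numbers, the variable names, and the helpers `pearson_r` and `to_ranks` are assumptions made for illustration; nothing here is drawn from actual IGO budgets or war records.

```python
from math import sqrt

def pearson_r(x, y):
    """Product moment correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)

def to_ranks(values):
    """Replace each value by its rank (1 = largest); assumes no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

# Invented interval-scale observations for five hypothetical periods.
war_months = [1, 2, 3, 4, 5]
igo_budget = [1, 2, 3, 4, 50]   # same ordering, one very large value

r_interval = pearson_r(war_months, igo_budget)
# Rank order correlation computed as the product moment correlation of ranks.
r_ranks = pearson_r(to_ranks(war_months), to_ranks(igo_budget))
print(round(r_interval, 2), round(r_ranks, 2))
```

Because both lists are ordered identically, the rank-based coefficient is +1.00, while the product moment coefficient is noticeably lower: it registers the uneven spacing that the ranks discard.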
The reason that a product moment type of correlational analysis is more sensitive is that its computation does not—because it need not—ignore the magnitude of the differences between the rank positions on a given pair of listings. Whereas rank order data merely tell us that the nation (or year, or case, or observation) at one position is so many positions or notches above or below another, interval scale data tell us how much higher or lower it is on a particular yardstick. The magnitude of those interrank distances carries a lot of useful information, and when the data are of such a quality to give us that additional information, it is foolish to "throw it away." Thus, when the measures of the variables permit, we generally use a product moment rather than a rank order correlation. As we might expect, certain conditions regarding the normality of the distributions, independence of the observations, randomness of the sample, and so on, must be met before we can use this more sensitive measure of statistical association. Once we have computed the rank order or product moment correlation coefficient between any two sets of measures, several inferences about the relationship between the variables become possible, providing that one additional requirement is met. If the correlation score is close to zero, we can—for the moment, at least—assume that there is little or no association between the variables and tentatively conclude that (a) one measure is not a particularly good index of the other (when validation of a measure is our objective), or (b) that one variable exercises very little impact on the other (when a correlational proposition is our objective). If, however, the correlation coefficient is about 0.50 or higher, either positive (+) or negative (–), we would want to go on and ask whether the above-mentioned requirement has been met.
That requirement is that the correlation be high enough to have had a very low probability of occurring by chance alone. That is ascertained by computing (or looking up in a standard text) the statistical significance of the correlation. When we have very few pairs of observations (or cases) in our analysis, even a correlation as high as 0.90 can occur by sheer chance. And when we have a great many cases, even a figure as low as 0.30 can be statistically significant. To illustrate with what is known as the Z-test, statisticians have computed that a product moment correlation would have to be as high as 0.65 if the association between 12 sets of observations were to be thought of as having only a 1 percent probability of being mere coincidence. Conversely, if there were as many as 120 cases, they calculate that a correlation as low as 0.22 would also have only a 1 percent probability of being mere coincidence. In statistical parlance, we say that for a given number of cases, a given correlation score is "significant at the 1 percent (or 2 percent or 5 percent) level."
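The dependence of significance on the number of cases can be sketched as follows. The text does not say which form of Z-test produced its figures, so this example assumes one common choice: Fisher's z transformation of the coefficient with a one-tailed test against a true correlation of zero. The function name `chance_probability` is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def chance_probability(r, n):
    """Approximate one-tailed probability of a correlation at least as large
    as r arising by chance in n paired observations, using Fisher's z
    transformation (an assumed method, valid only as an approximation)."""
    if n < 4:
        raise ValueError("the approximation needs at least 4 observations")
    z = atanh(r) * sqrt(n - 3)          # standardized test statistic
    return 1 - NormalDist().cdf(z)      # upper-tail area of the normal curve

# The two cases cited in the text, plus two contrasts.
p_12_cases = chance_probability(0.65, 12)
p_120_cases = chance_probability(0.22, 120)
p_few_high = chance_probability(0.90, 4)     # very few cases, high r
p_many_low = chance_probability(0.30, 120)   # many cases, modest r

for label, p in [("r=0.65, n=12", p_12_cases), ("r=0.22, n=120", p_120_cases),
                 ("r=0.90, n=4", p_few_high), ("r=0.30, n=120", p_many_low)]:
    print(f"{label}: p = {p:.4f}")
```

Under these assumptions, both of the text's cases come out at roughly the 1 percent level, a correlation of 0.90 on only four cases does not reach even the 5 percent level, and a correlation of 0.30 on 120 cases is comfortably significant.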
Once we have ascertained that the strength of a given correlation, as well as its statistical significance, is sufficiently high (and the evaluation of "sufficiently" is a complex matter, still debated by statisticians and scientists), we can then go on to make a number of inferences about the predictive or the explanatory association between the variables being examined. The nature of those inferences and the justification for them is explored in the next section. Suffice it to say here that when two variables are strongly correlated, and one of them precedes the other in time, we have a typical form of correlational knowledge but are not yet able to say very much of an explanatory nature.
Explanatory Knowledge and Causal Inference. It should now be quite clear that operational classification and enumeration, combined with statistical analysis of the resulting data sets, can eventually produce a large body of correlational knowledge. Further, it should be evident that correlational knowledge can indeed provide a rather satisfactory basis for foreign policy prediction, despite the limitations noted above. But the major limitation lies in the difference between predictions based on correlations from the past, and predictions based on theories. Without a fairly good theory (which, it will be recalled, is more than either a hunch or a model), our predictions will often be vulnerable on two counts.
First, there is the problem that has often intrigued the philosopher of science and delighted the traditional humanist. If the decision makers of nation A have a fair idea what predictions are being made about them by the officials of nation B, they can often confound B by selecting a move or a strategy other than the one they think is expected. A good theory, however, has built into it just such contingencies, and can often cope with the "we think that they think that we think, etc." problem. Second, a good theory increases our ability to predict in cases that have no exact (or even approximate) parallel in history. That is, it permits us to first build up—via the inductive mode—a general set of propositions on the basis of the specific cases for which we do have historical evidence, and then to deduce—from the theory based on those general propositions—back down to specific cases for which we do not have historical evidence.
If theories are, then, quite important in the study of foreign policy, how do we go about building, verifying, improving, and consolidating them? To some extent, the answer depends on one's definition of a theory, and the word has, unfortunately, disparate meanings. To the layman, a theory is often nothing more than a hunch or an idea. Worse yet, some define theory as anything other than what is real or pragmatic or observable; hence the expression that such and such may be true "in theory, but not in practice." The problem here is that—and this is the second type of definition—a number of scientists also imply that same distinction by urging that a theory need not be true or accurate, as long as it is useful. To be sure, many theories do turn out to be useful (in the sense that they describe and predict reality) even though they are built upon assumptions that are not true. One example is in the field of economics, where some very useful theories rest on the assumption that most individuals act on the basis of purely materialistic, cost-versus-benefit calculations. We are fairly certain that a great many decisions are made on the basis of all sorts of noneconomic and nonrational considerations, but, somehow or other, the market or the firm nevertheless tends to behave as if individual shoppers, investors, and so on do make such calculations. The important point here is that the theory itself is not out of line with reality, but that the assumptions on which it rests may be untrue without weakening the predictive power of the theory.
This leads to the need for distinguishing between theories that are adequate for predictive purposes and those of a more comprehensive nature that seek to not only predict, but to explain. While the dividing line between them is by no means sharp and clear, we can nevertheless make a rough distinction between those theories that are supposed to tell us what happens, or will happen under certain conditions, and those that tell us why it happens. Even in economics, it is recognized that the predictive power of its major theories can be improved, and their explanatory adequacy markedly enhanced, by looking into and rectifying the psychological or other assumptions on which they rest.
Thus, even though short-run needs may be served by theories that are merely predictive, the concern here is with theories that are capable of explaining why certain regularities (and deviations therefrom) are indeed found in human affairs. To repeat the definition suggested earlier, a theory is a logically consistent body of existential and correlational propositions, most of which are in operational and testable form, and many of which have been tested and confirmed. This definition requires that all of the propositional components in the theory be, in principle at least, true; further, if the theory is to explain why things occur as they do, the propositions underlying it must also be true. Given these stringent requirements, small wonder that there is so little in the way of explanatory theory in the social sciences.
SOME BEHAVIORAL CONCEPTS
Shifting now from some of the methods associated with the behavioral approach, one of the more serious obstacles to a richer and more subtle understanding of diplomatic history may well be the rather restricted set of concepts used in seeking to put together predictive and explanatory models. To a considerable extent, concepts are limited to those used by the practitioners, their spokesmen, and the journalists who cover diplomatic events. Are there in the behavioral science literature some concepts that might provide new insights or suggest more powerful ways of thinking about diplomatic history?
First, there are several conceptual schemes that have developed to such a degree that they might qualify under the rubric of "theories"; indeed, they are so labeled by many of those from whose disciplines they emerge. Perhaps most promising is that set of notions that are called general systems theory. Proceeding from the assumption that there are structural similarities in different fields, and correspondences in the principles that govern the behavior of entities that are intrinsically widely different, this approach seeks to identify those similarities and correspondences (as well as dissimilarities) that might be found in the universes of all the scientific disciplines. In its search for an integrated theory of behavior, the general systems approach postulates the existence of a system, its environment, and its subsystems. Some of the key concepts employed are feedback, homeostasis, network, entropy, and information, reflecting a considerable intellectual debt to cybernetics. By thinking of the states as subsystems within the international system, which in turn has a particular environment of physical and social dimensions, we are provided with a rather fruitful taxonomy that suggests, in turn, a fascinating array of hypotheses. Within the same context, the idea of homeostasis is particularly suggestive to those concerned with balance, stability, and equilibrium in the international system.
Another set of concepts that seems to offer real promise is that employed in the theory of games. The clearest model postulates two or more players (individuals, groups, states, coalitions) pursuing a set of goals according to a variety of strategies. If the goals are perceived by the players as incompatible, that is, only one player may win, we have a so-called "zero-sum" or win-lose game, with the players tending to utilize a "minimax" strategy. If, however, they perceive a possible win-win outcome, their strategies tend to deviate sharply from the conservative minimax pattern, in which they place prime emphasis on minimizing their maximum losses. The appropriateness of such a model for an enduring rivalry seems rather evident.
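The minimax logic of the zero-sum game can be made concrete with a small sketch. The payoff matrix and the function names below are hypothetical, chosen purely for illustration and not drawn from any study cited in this essay. For each player in a two-player zero-sum game, the code finds the pure strategy whose worst possible outcome is the least damaging.

```python
from typing import List, Tuple

# Hypothetical payoffs to state A in a two-player zero-sum game.
# Rows are A's strategies, columns are B's strategies.
# Because the game is zero-sum, B's payoff is the negative of A's.
PAYOFFS: List[List[int]] = [
    [2, -1, 0],
    [1, 1, 3],
    [-2, 0, 4],
]


def maximin_row(payoffs: List[List[int]]) -> Tuple[int, int]:
    """Row player's security strategy: the row whose worst payoff is largest."""
    worst_cases = [min(row) for row in payoffs]
    best = max(range(len(payoffs)), key=lambda i: worst_cases[i])
    return best, worst_cases[best]


def minimax_column(payoffs: List[List[int]]) -> Tuple[int, int]:
    """Column player's security strategy: the column whose largest loss is smallest."""
    n_cols = len(payoffs[0])
    worst_cases = [max(row[j] for row in payoffs) for j in range(n_cols)]
    best = min(range(n_cols), key=lambda j: worst_cases[j])
    return best, worst_cases[best]


def has_saddle_point(payoffs: List[List[int]]) -> bool:
    """True when both security levels coincide, so pure strategies suffice."""
    return maximin_row(payoffs)[1] == minimax_column(payoffs)[1]


if __name__ == "__main__":
    print(maximin_row(PAYOFFS))       # (1, 1)
    print(minimax_column(PAYOFFS))    # (1, 1)
    print(has_saddle_point(PAYOFFS))  # True
```

This sketch considers pure strategies only. When the two security levels differ, no saddle point exists and the minimax solution requires mixed (randomized) strategies, which the sketch does not compute.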
We now turn from these very general conceptual schemes to some of the more limited concepts found in the specific behavioral disciplines. Looking first at psychology, we find that learning theory, stimulus-response theory, and the concepts associated with reinforcement offer a wide range of models that can be adapted and modified and could ultimately shed useful light on diplomatic influence, a central aspect of international relations. For example, is a major power more likely to shape the policies of a weaker neighbor by punishment, reward, denial, threat, promise, or calculated detachment? Or, in seeking to explain the way in which public opinion in a given state ultimately influenced a certain policy decision, we might find some valuable suggestions in reference-group theory, the concepts of access and role-conflict, or some of the models of communication nets. To take another problem area, if one were concerned with the emerging attitudinal characteristics of the international environment, such notions as acculturation, internalization, relative deprivation, self-image, or consensus might prove to be highly productive.
Or consider the discipline of sociology, from which many contemporary researchers in foreign affairs have borrowed heavily. If we seek to better understand the foreign policy of the United States or any other nation, we may want to think of the international system (regional or global) as similar to other social systems, but with national states—rather than individuals or groups—as the component units. Such systems manifest certain characteristics, and as these change, the behavior of the component units might also be expected to change. For example, certain social systems are highly stratified at certain times, in the sense that people who rank high on wealth are also high on education, prestige, and political power. Under such conditions, one might expect more conflict because the underdogs are deprived on every dimension. Might it also be that when the international system is highly stratified—with a few nations ranking at the top in wealth, resources, population, military capability, industrial output, and diplomatic status—the likelihood of sharp conflict goes up?
Remaining with sociological concepts, but shifting down from the systemic to the unit level of aggregation, we note that certain individuals tend to be much more mobile than others, and as a result may acquire power more easily, or perhaps experience more conflict. That is, lateral mobility (by which is meant the rate at which individuals move in and out of certain cliques or associations) may also apply to nations, reflecting the rate at which they move in and out of blocs, alliances, or international organizations. Similarly, rapid vertical (upward or downward) mobility might be expected to get nations, as well as individuals, into more conflict than if they occupied a constant niche or moved up or down very gradually.
In the same vein, the concept of status inconsistency and its relationship to "deviant" behavior might merit closer examination. For example, if an individual ranks high on education or some other status-relevant dimension but low on political influence, he should, according to some sociologists, show a fair amount of deviant behavior. Do nations that rank high in certain prestige or status dimensions but low in power manifest more odd and unpredictable behavior than those that are status-consistent?
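One simple way to turn status inconsistency into something measurable is to rank each nation on two dimensions and take the gap between the two ranks. The sketch below rests on assumptions throughout: the nation labels, the prestige and power scores, and the choice of an absolute rank difference as the index are hypothetical illustrations, not a measure taken from the literature discussed here.

```python
from typing import Dict


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank nations from 1 (highest score) downward; assumes no tied scores."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def status_inconsistency(prestige: Dict[str, float],
                         power: Dict[str, float]) -> Dict[str, int]:
    """Absolute gap between a nation's prestige rank and its power rank."""
    prestige_rank = ranks(prestige)
    power_rank = ranks(power)
    return {name: abs(prestige_rank[name] - power_rank[name]) for name in prestige}


# Hypothetical scores for four unnamed nations.
PRESTIGE = {"A": 90, "B": 70, "C": 50, "D": 30}
POWER = {"A": 80, "B": 20, "C": 60, "D": 40}

if __name__ == "__main__":
    print(status_inconsistency(PRESTIGE, POWER))
    # {'A': 0, 'B': 2, 'C': 1, 'D': 1}
```

In this invented example, nation B ranks second in prestige but last in power, so it shows the largest gap. Whether such a gap is associated with more erratic behavior is exactly the empirical question the paragraph above raises.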
As an example from the discipline of economics, consider the concepts of monopoly and oligopoly, reflecting the extent to which a given market is dominated by one firm or a handful of firms. The concentration of economic power may have its parallel at the international level, with a regional, functional, or global system manifesting a high degree of concentration as one or two nations enjoy most of the trade, industrial output, energy consumption, or military might in that system. Such high concentration, among firms or among nations, could be quite profound in its effects on such phenomena as conflict and cooperation, vertical mobility or stagnation, or the formation and dissolution of coalitions.
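Concentration of this kind can be expressed as a single number. The sketch below uses the Herfindahl-Hirschman index, a standard measure of market concentration in economics; applying it to national shares of capability is an assumption made here for illustration, and the share figures are invented. The index is the sum of squared shares: it equals 1 when one unit holds everything and falls to 1/n when n units hold equal shares.

```python
from typing import Sequence


def herfindahl(values: Sequence[float]) -> float:
    """Sum of squared shares; values may be raw amounts or percentages."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return sum((value / total) ** 2 for value in values)


# Hypothetical shares of some capability (for example, industrial output)
# held by the four members of an imagined regional system.
CONCENTRATED = [50, 30, 10, 10]
EQUAL = [25, 25, 25, 25]

if __name__ == "__main__":
    print(round(herfindahl(CONCENTRATED), 2))  # 0.36
    print(round(herfindahl(EQUAL), 2))         # 0.25
```

Tracking such an index over time for a given system would allow one to ask whether periods of rising or falling concentration coincide with more conflict, which is the kind of hypothesis the paragraph above suggests.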
The range and variety of concepts that have been developed in the behavioral sciences is impressive indeed, as is the extent to which those concepts have often helped to differentiate, clarify, synthesize, or explain phenomena that had hitherto been quite baffling.
SOME BEHAVIORAL FINDINGS
Turning to the third possible sector in which the behavioral science approach might enhance our comprehension of diplomatic history, let us consider briefly some of the findings that emerge from these disciplines. By findings, we mean either existential or correlational propositions that seem to enjoy some standing in their home disciplines, and on the basis of which explanatory theories might be articulated.
One can hardly exaggerate the importance of these findings for diplomatic historians, and, of course, for practitioners. That is, those interested in foreign policy rest many of their interpretations, analyses, and predictions on behavioral science propositions that may or may not be accurate. First, they often extrapolate from the individual to the group or national level of aggregation, assuming that what holds for the individual will also hold for the collectivity. Such extrapolation is defensible for purposes of speculation and hypothesis only. That is, in the absence of evidence to the contrary, it is probably economical to assume that if, for example, individuals tend to be more cooperative in the face of reward rather than in the face of punishment, so will corporations or nations.
On the other hand, there are some fundamental differences between individuals and collectivities. The primary difference, of course, is that individuals (or, more precisely, rational, intelligent, and informed ones) can be thought of as purposive, problem-solving entities, trying to maximize their particular values. Collectivities, on the other hand, exactly because they are made up of such individuals, each pursuing a mix of private and public goals, cannot be so conceived. The group or organization will, almost inevitably, pursue a range of goals reflecting a compromise and amalgam of the often incompatible goals of its more powerful individuals and subgroups. Thus, it is essential to be sufficiently familiar with the findings of such microsocial disciplines as psychology and the macrosocial ones of economics, sociology, and political science and to know something of the discontinuities between the individual and the collective levels of aggregation.
The second way in which we rest analyses and predictions on behavioral science findings is more direct, with many models depending heavily upon the accuracy of assumptions about individual and collective behavior. This dependence is quite heavy, whether the focus is upon public opinion, elite recruitment, executive-legislative relationships, bureaucratic responsibility for policy execution, or the decision process itself. In each of these areas of activity, individuals and groups, with considerable propensities toward regular and consistent behavior, are playing key roles, and to the extent that analysts are unaware of the findings that reflect those regularities and consistencies, the accuracy and completeness of their analyses are seriously limited.
Rather than select some limited number of existential and correlational propositions from the behavioral sciences and summarize the evidence in support or contravention, we can turn for a large number of these findings to the general source International Encyclopedia of the Social Sciences (Sills, 1979, which replaced the 1930–1935 and 1967 editions). Each section of the encyclopedia is written by a leading authority, and virtually all topics in the field are covered. Embracing nearly a dozen disciplines, however, rather than only part of one, it runs to sixteen volumes plus an index.
In the Encyclopedia, one finds summaries of the existential and correlational knowledge on such concepts as acculturation, aggression, anxiety, avoidance, business cycles, charisma, coalition formation, cognitive dissonance, mass communication, cybernetics, conformity, conditioning, conflict, cultural diffusion, decision making, defense mechanisms, demography, deviant behavior, diplomacy, disarmament, dominance, dreams, ecology, economic equilibrium, elites, ethnology, ethology, evolution, family structure, fatigue, fertility, forgetting, frustration, geopolitics, gestalt, motivation, homeostasis, identity, ideology, imperialism, income distribution, influence, inflation, interest groups, interpersonal interaction, kinship, land tenure, language, leadership, learning, legitimacy, loyalty, migration, social mobility, monopoly, norms, national character, neurosis, oligopoly, pacifism, paranoid reactions, perception (ten separate articles), personality, persuasion, pluralism, prejudice, prestige, propaganda, psychoanalysis, public opinion, punishment, race relations, reciprocity, reference groups, response set, roles, sanctions, self-image, sex differences, social stratification, stereotypes, stress, sympathy and empathy, thinking, traits, utilitarianism, utility, voluntary associations, voting, wages, war, and worship. (One also finds in the encyclopedia articles on such methodological matters as content analysis, contingency table analysis, curve-fitting, experimental design, multivariate analysis, statistical distributions, factor analysis, field work, forecasting, game theory, historiography, hypothesis testing, index construction, statistical inference, Markov chains, observation, panel studies, probability, rank correlation, scaling, simulation, spectral analysis, survey analysis, time series, typologies, and validity.)
Another very general source, although seriously outdated, is Human Behavior: An Inventory of Scientific Findings (Berelson and Steiner, 1964). After discussing the six most frequently cited procedures for generating the findings they report, the compilers go on to summarize what they consider to be the more interesting propositions to have emerged from research in the behavioral sciences. The substantive topics covered are behavioral development (meaning biological, emotional, and cognitive change as individuals mature); perceiving; learning and thinking; motivation; the family; small face-to-face groups; organizations; institutions; social stratification; ethnic relations; mass communications; opinions, attitudes, and beliefs; the society; and culture.
There are also collections of articles summarizing the correlational and explanatory knowledge in the specific disciplines or problem areas. Among the more relevant are Handbook of Developmental Psychology (Wolman, 1982); Handbook of Personality Theory and Research (Pervin, 1999); Handbook of Psychiatry (Solomon and Patch, 1974); Small Group Research: A Handbook (Hare, 1994); Handbook of Social Psychology (Gilbert, Fiske, and Lindzey, 1998); and World Handbook of Political and Social Indicators (Taylor and Jodice, 1983).
There are two other—if dated—anthologies that not only summarize a good many concepts and findings from these related disciplines, but select and organize the articles on the basis of their applicability to specific topics in international affairs—Man and International Relations (Zawodny, 1966) and Human Behavior and International Politics (Singer, 1965).
Two collections that bring together the findings of research in foreign policy and international politics are Beyond Conjecture in International Politics: Abstracts of Data-Based Research (Jones and Singer, 1972) and Empirical Knowledge on World Politics (Gibbs and Singer, 1993). In these, the compilers attend only to published articles in English that generate, or rest upon, reproducible evidence. No effort is made to interpret, integrate, or evaluate the 300 or so studies that are covered, but they are very systematically arranged. Further, each is abstracted in accordance with a checklist that includes the following: query, spatial-temporal domain, outcome variable, predictor variable(s), data sources, data-making operations, data preparation and manipulation, data analysis procedure, findings, and related research. In addition, there is a recent compilation that brings together the ideas and research findings of both the behavioral scientists and the diplomatic historians in the very useful three-volume collection Encyclopedia of Violence, Peace, and Conflict (Kurtz, 1999).
When the first edition of the Encyclopedia of American Foreign Policy was published in 1978, the behavioral movement was just getting under way in the foreign policy and world politics fields. There were relatively few data-based findings to report. Since then, the number of scholars working in the behavioral science mode has risen from a mere dozen or so to about two hundred worldwide; these scholars have written perhaps four hundred articles and books, almost exclusively in English and largely designed to help account for war. Most of these have been summarized, and modestly integrated, in Nations at War: A Scientific Study of International Conflict by Daniel S. Geller and J. David Singer (1998).
World War I and the Iran-Iraq War of 1980–1988 can serve as two examples to illustrate the extent to which individual cases conform to the patterns that emerge from the many studies that have looked at the effects of only two or three variables at a time across many historical cases since 1816. In the case of the Iran-Iraq War, there are the following specific instances of the more general patterns found in the larger literature: geographical contiguity, the absence of joint democratic regimes, the absence of joint advanced economies, a rapid shift in the joint relative capabilities, and, finally, the existence of an enduring rivalry characterized by seventeen militarized disputes during the half century run-up to war. Similarly, the case of World War I is marked by quite a few of the more general statistical findings: major powers on both sides, contiguity, shifting capabilities and an unstable balance, highly autocratic regimes on both sides, and, again, the longstanding rivalries.
There are several ways in which one might react to the foregoing information and suggestions. One might, for example, paraphrase that observer who told us that "history is bunk" and assert that "social science is bunk." Less frivolously, one might see little value in trying to apply the behavioral sciences to the study of diplomatic history, concluding that the investment will far exceed the likely gain. For those who conclude otherwise, it may nevertheless appear to represent a radical break with traditional style, and thus one that should not be taken lightly.
Not only can we benefit considerably by attending to the behavioral sciences, but to do so represents only a logical expansion of practices and procedures that for decades have been the stock-in-trade of historians. First, we note that the scientific method has been utilized for centuries in the solution of all sorts of physical and biological problems. But for a variety of reasons, ranging from religious taboos and superstition to the allegedly greater complexity of social phenomena, we have shied away from (if not vigorously resisted) its application to the study of social problems. That orientation has, however, been gradually eroded, partly through the work of courageous and creative scholars and partly because of the increasingly obvious need to replace folklore with knowledge.
In addition to the fact that social science is merely an extension of a given intellectual style already well established in the study of physical and biological phenomena, it is also quite nonrevolutionary in that it is little more than an extension of certain problem-solving processes that have always been used. While it is clearly an extension, the fact is that human beings have used a combination of logic and sensory observation for centuries in coping with social problems. In trying to understand what people did under certain conditions and why they did it, philosophers, kings, merchants, and soldiers have often employed a rudimentary form of scientific method. That is, they have tried (a) to identify and classify a variety of social events and conditions; (b) to ascertain the extent to which they occurred together or in sequence; and (c) to remember those observed co-occurrences.
But since they seldom have used explicit criteria in classification, they often placed highly dissimilar events and conditions in the same category; and since they seldom used constant criteria, they often forgot which criteria they had used for earlier classifications, with the same garbled results. Moreover, because one could not put social events on a scale, or measure the length and breadth of a social condition, their basic belief that social phenomena were not tangible, and therefore not measurable, was reinforced. This failure to measure and scale further reinforced the philosophic notion that whereas physical (and later, biological) phenomena were inherently quantitative, those of a social nature were inherently qualitative. Given this widespread belief, there was little effort to develop either the instruments of observation or the tools of measurement.
For centuries, then, social phenomena could be studied in a no more reliable or accurate fashion than if physical ones were studied without yardsticks, balance scales, or telescopes. To put it another way, the primitive essentials of scientific method were used but the critical refinements were ignored. Instead of aiding and enhancing their natural capacities to observe, remember, and reason, observers made a virtue of these very frailties and inadequacies by arguing that the incomprehensibility of social phenomena was inherent in the events and conditions themselves, rather than in the grossly inadequate methods used in that effort to comprehend. Modern social science, then, is nothing more than an application of methods already found useful in the other sciences and an extension and refinement of the basic methods always used. As in the familiar cliché, we have been "speaking prose" all along, but prose of a rather poor quality.
To be sure, the study of foreign relations remains as it was in the 1970s and 1980s. But this is not necessarily good news. First, there was the lively and early interest in the behavioral sciences approach among certain scholars and practitioners. In the early days of the peace research movement, for instance, one found copies of the Journal of Conflict Resolution and the Journal of Peace Research on the desks and shelves in certain self-selected offices in the Departments of State and Defense. Further, such agencies as the Advanced Research Projects Agency or the Office of Naval Research were practically caught up in the early enthusiasm of the 1960s for computer simulation, game theory analyses, or even the wide-ranging survey research and field interview strategies, as in the U.S. Army's Project Camelot. And the years following the Cuban missile crisis also saw moderate levels of involvement between U.S. and Soviet groups around a variety of conflict resolution conferences and field studies in Washington, New York, London, Moscow, and Ann Arbor, Michigan. But worth noting is, first, the relatively limited reflection of these interests in the scholarly literature of diplomatic and military history, and, second, the impact of U.S. intervention in the Vietnam War. By the early 1970s, the behavioral science enthusiasm had pretty much disappeared from both the policy and academic scene, with almost no residue in the scholarly literature.
Worse yet, with the demise of the Cold War, the early curiosity and experimentalism of the Cold War–Cuban missile crisis–Vietnam epoch was gradually replaced with a nouveau vague interest in approaches that were not only nonscientific but explicitly and ideologically antiscientific. For reasons not yet clearly evident, the collapse of the Soviet Union in the late 1980s culminated in the flowering of a scholarly literature of remarkable vitality and intellectual vacuity. Reference, of course, is to the postmodernist movement, embracing such variations as poststructural, postpositivist, and, perhaps, postbehavioral. These orientations are found primarily in the humanities and those of the more humanistic social sciences, including, of course, history. In addition to the appearance of a new vocabulary in which words like "discourse," "contested," and "social construction" figure prominently, it is not surprising that the discipline of history is paying less attention to diplomatic and military phenomena and more to gender, race, and social class. While such variables were admittedly underrepresented in the study of foreign affairs and world politics during the twentieth century, this radical shift in both theoretical and methodological orientation hardly bodes well for the future of the discipline.
Some might suggest that none of this matters a great deal, given how modest has been the contribution of the behavioral sciences. Scholars such as John L. Gaddis (1992) have gone out of their way to remind us that with all of the modern scientific paraphernalia in their toolbox, the behaviorists utterly failed to predict either the Soviet collapse or the end of the Cold War. Three responses seem appropriate. First, the behavioral science researchers in international politics were not alone in being asleep at the switch. Second, those who should have been alert to the Soviet demise were the specialists in the Cold War, Kremlinology, and contemporary diplomacy. Third, there were some who did indeed predict the end of the Cold War (as J. David Singer did in his 1986 article "The Missiles of October—1988: Resolve, Reprieve, and Reform").
In sum, it is very difficult to quarrel with Robin G. Collingwood's early recognition (1922) of the intellectual similarity between history and science:
The analysis of science in epistemological terms is thus identical with the analysis of history, and the distinction between them as separate kinds of knowledge is an illusion…. When both are regarded as actual inquiries, the difference of method and of logic wholly disappears…. The nineteenth century positivists were right in thinking that history could and would become more scientific.
BIBLIOGRAPHY
Barbieri, Katherine. Trade and Conflict. Ann Arbor, Mich., 2000. A careful and systematic examination of the historical pattern between trade and conflict.
Berelson, Bernard, and Gary A. Steiner. Human Behavior: An Inventory of Scientific Findings. New York, 1964. Although dated, this inventory summarizes much of the flowering research of the post–World War II period.
Brams, Steven J. Theory of Moves. New York and Cambridge, 1994. One of the richest and most accessible works on the game theory approach for the nonmathematical scholar.
Brecher, Michael, and Jonathan Wilkenfeld. A Study of Crisis. Ann Arbor, Mich., 1997. An original and ambitious analysis of most diplomatic crises since 1919.
Bremer, Stuart A., and Thomas R. Cusack, eds. The Process of War: Advancing the Scientific Study of War. Amsterdam, 1995. Papers from a Correlates of War conference emphasizing the strengths of thinking of war as part of a process.
Bueno de Mesquita, Bruce, and David Lalman. War and Reason: Domestic and International Imperatives. New Haven, Conn., 1992. A strong presentation of the rational choice interpretation of the decisions for war.
Collingwood, Robin G. "Are History and Science Different Kinds of Knowledge?" Mind (1922). A prediction that historians would begin to embrace the procedures of scientific method.
Cusack, Thomas R., and Richard J. Stoll. Exploring Realpolitik: Probing International Relations Theory with Computer Simulation. Boulder, Colo., 1990. An exemplary examination of how computerized simulations of international interactions can shed light on underlying dynamics.
Diehl, Paul F., and Gary Goertz. War and Peace in International Rivalry. Ann Arbor, Mich., 2000. The most comprehensive investigation into the dynamics of international rivalries and the propensity to go to war.
Gaddis, John L. "International Relations Theory and the End of the Cold War." International Security 16 (1992): 5–58. A widely cited article that sought to understand Western failures to predict the end of the Cold War.
Geller, Daniel S., and J. David Singer. Nations at War: A Scientific Study of International Conflict. Cambridge and New York, 1998. A thorough codification of data-based findings on the correlates of international war.
Gochman, Charles S., and Alan Ned Sabrosky, eds. Prisoners of War? Nation-States in the Modern Era. Lexington, Mass., 1990. An anthology of papers summarizing data-based findings on war as of the late 1980s.
Goldstein, Joshua S. Long Cycles: Prosperity and War in the Modern Age. New Haven, Conn., 1988. Examines the concept of cycles of a half-century or longer and how war and economic cycles correlate.
Houweling, Henk, and Jan G. Siccama. Studies of War. Dordrecht, Netherlands, 1988. Exemplary papers testing certain correlates of war models.
Huth, Paul K. Standing Your Ground: Territorial Disputes and International Conflict. Ann Arbor, Mich., 1996. Models and findings on the role of territorial disputes in the onset of war.
Janis, Irving L. Victims of Groupthink: A Psychological Study of Foreign-Policy Decisions and Fiascoes. Boston, 1972. Applies psychological findings to foreign policy decision making and its high error rate.
King, Gary, Robert O. Keohane, and Sidney Verba. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton, N.J., 1994. A broad examination of the quantitative-qualitative relationship in political science.
Lemke, Douglas, and Jacek Kugler, eds. Parity and War: Evaluations and Extensions of the War Ledger. Ann Arbor, Mich., 1996. Data-based studies of the impact of material capabilities of states, and rates of change therein, on the probabilities of war.
Leng, Russell J. Bargaining and Learning in Recurring Crises: The Soviet-American, Egyptian-Israeli, and Indo-Pakistani Rivalries. Ann Arbor, Mich., 2000. Examines which strategies make crises more or less escalation prone.
Maoz, Zeev. Paradoxes of War: On the Art of National Self-Entrapment. Boston, 1990. Fascinating discussion of the ways foreign policy decision makers generate outcomes quite different from those preferred.
Midlarsky, Manus I. Handbook of War Studies II. Ann Arbor, Mich., 2000. Excellent articles reviewing some of the central issues in the study of war and peace.
Morgan, T. Clifton. Untying the Knot of War: A Bargaining Theory of International Crises. Ann Arbor, Mich., 1994. Use of bargaining and game theory to understand the ways in which conflicts can lead to war.
Ray, James Lee. Democracy and International Conflict: An Evaluation of the Democratic Peace Proposition. Columbia, S.C., 1995. Evaluation of the evidence for and against the role of democratic regimes in reducing the incidence of war.
Russett, Bruce M. Grasping the Democratic Peace: Principles for a Post-Cold War World. Princeton, N.J., 1993. Systematic evaluation of the democratic peace proposition.
Singer, J. David, ed. The Correlates of War II: Testing Some Realpolitik Models. New York, 1980. Anthology of data-based studies of some correlates of international war.
——. "The Missiles of October—1988: Resolve, Reprieve, and Reform." Scandinavian Journal of Development Alternatives 5, no. 2 (1986): 5–13. A science fiction prediction that the Cold War would end in 1988.
Stam, Allan C., III. Win, Lose, or Draw: Domestic Politics and the Crucible of War. Ann Arbor, Mich., 1996. A suggestive, rare, valuable look into the role of domestic politics in decisions for war and peace.
Thompson, William R. On Global War: Historical-Structural Approaches to World Politics. Columbia, S.C., 1988. A long cycle historical interpretation of economic and strategic factors in war and the evolving hierarchy in the global system.
Vasquez, John A. The War Puzzle. Cambridge and New York, 1993. An influential effort to assemble and interpret data-based findings on the explanation of war.
Wayman, Frank W., and Paul F. Diehl, eds. Reconstructing Realpolitik. Ann Arbor, Mich., 1994. Collection of studies that scrutinized some of the more widely believed realpolitik models.
Wright, Quincy. "Design for a Research Proposal on International Conflicts." Western Political Quarterly 10 (1957). Some early proposed research foci by one of the pioneers of scientific research on world politics.
See also Decision Making; Public Opinion.
"THE NEXT ASSIGNMENT"
"[H]istorians, having dedicated their lives to the exploration and understanding of the past, are apt to be suspicious of novelty and ill-disposed toward crystal-gazing. In the words of my distinguished predecessor, they lack the 'speculative audacity' of the natural scientists, those artisans of brave hypotheses. This tendency on the part of historians to become buried in their own conservatism strikes me as truly regrettable. What basically may be a virtue tends to become a vice, locking our intellectual faculties in the molds of the past and preventing us from opening new horizons as our cousins in the natural sciences are constantly doing. If progress is to be made we must certainly have new ideas, new points of view, and new techniques."
From William L. Langer, "The Next Assignment," American Historical Review 63, no. 2 (1958): 283–304.