Measurement and Measurement Theory

Metrology in general, and measurement theory in particular, have grown from various roots into fields of great diversity in the natural and social sciences, engineering, commerce, and medicine. Informally, and in its widest empirical sense, a measurement of a property, exhibited by stereotype objects in variable degrees or amounts, is an objective process of assigning numbers to the objects in such a way that the order structure of the numbers faithfully reflects that of the degrees or amounts of the measured property. Measuring instruments with pointers and calibrated scales for reading are the basic empirical means by which numerical assignments are realized. Abstractly, a particular way of assigning numbers as measures of extents of a property in objects is called a quantity scale. In the natural sciences, the results of measurement on a quantity scale are expressed in the form of denominate numbers, each composed of a numerical value (magnitude) and a physical unit. Nominalists support the view that the results of measurement are not denominate numbers but numerals and perhaps other symbols.

Classical Temperature Measurement

To illustrate this morass of preliminary definitions, consider classical temperature measurement. Temperature is a local thermodynamic property of physical substances, linked to the transfer of thermal energy (heat) between them. From the standpoint of statistical mechanics, heat in a physical substance is a macroscopic manifestation of the random motion of the constitutive atoms or molecules. An increase of temperature in the substance matches the increase of rate of molecular motion, so that temperature can be rigorously conceived as a measure of the kinetic energy of molecules.

It is important to emphasize that classical temperature measurement does not depend on any of these deep underlying physical theories. In 1592 Galileo Galilei was able to measure temperature in a theory-independent way, using the contraction of air that drew water up a calibrated tube. Approximately a century later, Daniel G. Fahrenheit invented the mercury-in-glass thermometer, again without understanding energy conservation laws that were discovered and firmly established only after 1850. These remarks, however, are not all that obvious and must be taken with a grain of salt. Precise construction of thermometers and their calibration certainly relies on theories of heat and the correct representation of (freezing and boiling) reference points. Immediately a foundational question arises: Is measurement theory-laden? The answer to this question is subtle and depends on how measurement is modeled. Because modeling of numerical quantification of measurable properties makes no commitments to and assumptions about quantitative laws and substantive scientific theories, a straight answer must be in the negative. However, measurement theory addresses many issues that go well beyond the construction of quantity scales, including prominent relationships among quantity scales of measurable properties, studied by well-established scientific theories.

From the inception of quantifying temperature and other variable properties, the concept of measurement has proved to be a steady source of methodological difficulties. For example, it would be false to conclude that today it was twice as warm as yesterday because today the local temperature at noon was a balmy ninety degrees and it was only forty-five degrees yesterday. The inference may appear correct because on the Fahrenheit scale indeed 90°F = 2 × 45°F. But to the opposite effect, a meteorologist equipped with a Celsius thermometer observed at the same site that the temperature today was 32.2°C and that it was 7.2°C yesterday, inferring that today's temperature was approximately 4.5 times higher than yesterday's. Based on the familiar conversion formula $b = \tfrac{5}{9}(a - 32)$ from $a$ degrees Fahrenheit to $b$ degrees Celsius, the meteorologist quickly obtains the equalities 32.2°C = 90°F and 7.2°C = 45°F (to one decimal place), further corroborating that today's temperature on the Celsius scale is not twice as high as it was yesterday. Simple physical experiments show that it is not meaningful to make scale-independent comparative statements of the form above, "yesterday was n times as warm as today," if the temperature is measured traditionally on an interval scale (including Celsius, Fahrenheit, and Réaumur) in the sense of Stanley Smith Stevens (1960) and the definition recalled below. Science has little use for observational statements whose truth depends on the choice of quantity scales. In all cases of quantitative observation, the main interest is in those measurement data that are invariant under scale transformations. Louis Narens discusses many other examples of a similar nature in his Theories of Meaningfulness (2002).
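The arithmetic behind this example can be checked with a short sketch. The function name and variable names below are illustrative choices, not part of the original discussion; only the conversion formula and the two readings come from the text.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius: C = 5/9 * (F - 32)."""
    return 5.0 / 9.0 * (f - 32.0)

# Noon readings from the example, in degrees Fahrenheit.
today_f, yesterday_f = 90.0, 45.0
today_c = fahrenheit_to_celsius(today_f)
yesterday_c = fahrenheit_to_celsius(yesterday_f)

# The ratio of the two readings depends on the scale that is used.
ratio_f = today_f / yesterday_f
ratio_c = today_c / yesterday_c
print(f"Fahrenheit ratio: {ratio_f:.2f}")
print(f"Celsius ratio:    {ratio_c:.2f}")

# The order comparison "today was warmer" agrees on both scales.
same_order = (today_f > yesterday_f) == (today_c > yesterday_c)
print(same_order)
```

The Fahrenheit ratio is exactly 2, while the Celsius ratio comes out near 4.46 with unrounded values, so the "n times as warm" claim does not survive the change of scale, whereas the plain ordering of the two days does.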

A performance of any empirical observation is usually a complex activity that is impossible and (fortunately) unnecessary to report completely. The structure of a measurement-based observation that an experimenter is able to extract and analyze formally with some success is best captured by a measurement model. For example, in the simplest and best-known physical situation of temperature measurement, the experimenter assumes that the temperature-bearing entities (e.g., substances in vessels) can, at least conceptually, be identified and distinguished one from another, and then appropriately labeled or described. As is common in other branches of mathematics, the experimenter next conceives of collecting such labels or mathematical descriptions of substances into a set, to be called a measurement domain and denoted $M$. Because this domain furnishes a mathematical basis for modeling the scale structure of measurable properties, care must be exercised in its selection. To simplify the preceding pedantic language, in what follows the discussion will often refer to $M$ as a domain of substances, objects, or events, when in actuality we mean a set of their mathematical labels or descriptions.

Galileo and Fahrenheit were able to order effectively many substances at given time instants in accordance with their exhibited degrees of the temperature property, here denoted $t$, without recourse to any antecedently established thermodynamical theories. This suggests that the scaling model of temperature measurement should be based on a designated comparative relation $\precsim_t$, where the associated atomic formula "$x \precsim_t y$" is meant to express that substance $y$ is at least as warm as substance $x$, for all substances $x$ and $y$ belonging to the underlying domain $M$.

The Measurement Model

The resulting deceptively simple measurement model, commonly symbolized by the ordered pair $(M, \precsim_t)$, captures the ordering of substances with respect to degrees of their temperature property $t$ at a specified time instant. It should be clear that a similar model can be used to characterize the comparison of substances with respect to their mass property. In many measurement-theoretic applications, the foregoing comparative relation $\precsim_t$, henceforth abbreviated to $\precsim$, enjoys the following pair of measurability properties for all elements $x$, $y$, and $z$ in the given domain $M$:

(i) Transitivity: If $x \precsim y$ and $y \precsim z$, then $x \precsim z$.

(ii) Connectedness: $x \precsim y$ or $y \precsim x$.

We associate with every comparative relation $\precsim$ a canonical indiscernibility equivalence relation $\sim$, defined by
$$x \sim y \quad\text{iff}\quad x \precsim y \ \text{ and } \ y \precsim x$$
for all $x$ and $y$ in $M$. Here the notation "iff" is a standard abbreviation for "if, and only if." Under the foregoing intended interpretation, the atomic formula "$x \sim y$" encodes the fact that substances $x$ and $y$ have the same degree of temperature. It should be obvious that the relation $\sim$ partitions the domain $M$ into equivalence classes of substances, where each class contains precisely those substances whose degrees of temperature coincide.
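For a finite domain, the two measurability properties and the induced partition can be checked mechanically. The following is a minimal sketch under stated assumptions: the four substances and their comparison outcomes are hypothetical, and the `tier` dictionary is only a compact way of storing those outcomes, not a quantity scale taken from the text.

```python
from itertools import product

# Hypothetical outcomes of pairwise comparisons, stored compactly as tiers:
# a lower tier means "less warm". The tiers are bookkeeping only.
tier = {"ice water": 0, "tap water": 1, "tea": 2, "coffee": 2}
domain = list(tier)

def at_most_as_warm(x, y):
    """The comparative relation: y is at least as warm as x."""
    return tier[x] <= tier[y]

def is_transitive(domain, leq):
    """Check: whenever x <= y and y <= z hold, x <= z holds as well."""
    return all(leq(x, z)
               for x, y, z in product(domain, repeat=3)
               if leq(x, y) and leq(y, z))

def is_connected(domain, leq):
    """Check: any two elements are comparable in at least one direction."""
    return all(leq(x, y) or leq(y, x) for x, y in product(domain, repeat=2))

def indiscernibility_classes(domain, leq):
    """Group elements x, y for which both x <= y and y <= x hold."""
    classes = []
    for x in domain:
        for cls in classes:
            representative = cls[0]
            if leq(x, representative) and leq(representative, x):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

classes = indiscernibility_classes(domain, at_most_as_warm)
print(is_transitive(domain, at_most_as_warm))
print(is_connected(domain, at_most_as_warm))
print(classes)
```

In this toy domain, tea and coffee fall into the same class because each is at least as warm as the other, while the other two substances form singleton classes.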

At this point we may ask: What are measurement models good for and how do we know that they are adequate? In measurement theory, measurement models have four basic functions: upholding numerical representation, specifying the uniqueness of representation, capturing quantitative meaningfulness, and capturing qualitative meaningfulness.

Representational role of measurement models

In their representational role, measurement models provide a mathematical basis for numerical quantification of extents, degrees, or amounts of measurable properties of objects. For example, in the case of temperature measurement, the possibility of numerical quantification of the variable temperature property $t$ comes down to the existence of a quantity scale, rendered precise by a real-valued function, denoted $\Phi: M \to \mathbb{R}$, that assigns to each substance $x$ in $M$ a unique real number $\Phi(x)$ in $\mathbb{R}$ (interpreted as the degree of temperature of substance $x$) in such a way that the numerical order in the host field $(\mathbb{R}, \le)$ of real numbers agrees with the comparative relation $\precsim$ specified in the measurement model. Formally, we have the order-embedding representational condition
$$x \precsim y \quad\text{iff}\quad \Phi(x) \le \Phi(y)$$
for all $x, y$ in $M$. In general, there is no guarantee that an order-embedding function $\Phi$ exists. A major task of representational measurement theory is to find a body of empirically meaningful constraints on the structure of $(M, \precsim)$, usually called the representation axioms, such that they are necessary and sufficient for the existence of a quantity scale (order-embedding function) $\Phi$. The preceding transitivity and connectedness properties are usually included in the collection of representation axioms, but generally they are not sufficient for the existence of a quantity scale. In essence, this is the way the experimenter expects to achieve a theoretically justified passage from qualitative observations ($x$ is $t$-er than $y$) to quantitative data that may be processed further by various computational and statistical means. It should be clear that the foregoing low-complexity measurement model is totally ineffective in characterizing the measurement of television violence, unemployment, and many other highly complicated attributes studied in the social sciences.
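When the domain is finite, transitivity and connectedness do suffice, and a scale can be built by counting. The sketch below is an illustration under assumptions, not a construction taken from the text: the four-element domain and its comparison outcomes are hypothetical, and `counting_scale` assigns to each element the number of elements lying weakly below it.

```python
from itertools import product

# Hypothetical comparison outcomes on a four-element domain, listed as the
# pairs (x, y) for which "y has at least as much of the property as x" holds.
# Here b and c are indiscernible, a lies below them, and d lies above.
domain = ["a", "b", "c", "d"]
weakly_below = {
    ("a", "a"), ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "b"), ("b", "c"), ("b", "d"),
    ("c", "b"), ("c", "c"), ("c", "d"),
    ("d", "d"),
}

def leq(x, y):
    """The qualitative comparative relation, read off the recorded pairs."""
    return (x, y) in weakly_below

def counting_scale(domain, leq):
    """Phi(x) = number of elements y of the domain with y weakly below x."""
    return {x: sum(1 for y in domain if leq(y, x)) for x in domain}

def is_order_embedding(domain, leq, phi):
    """Check the representational condition: x <= y iff phi(x) <= phi(y)."""
    return all(leq(x, y) == (phi[x] <= phi[y])
               for x, y in product(domain, repeat=2))

phi = counting_scale(domain, leq)
print(phi)
print(is_order_embedding(domain, leq, phi))
```

The counting trick relies on finiteness. For uncountable domains further conditions are needed, which is why the two properties alone are generally not sufficient.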

Not surprisingly, quantity scales (if they exist) are seldom unique. We have already seen that two arbitrary temperature measurement scales $\Phi: M \to \mathbb{R}$ (e.g., for Celsius degrees) and $\Phi': M \to \mathbb{R}$ (e.g., for Fahrenheit degrees) are always linked via functional composition of the form $\Phi'(x) = f(\Phi(x))$ for all substances $x$, where $f: \mathbb{R} \to \mathbb{R}$ is an affine (positive linear) permissible transformation, specified by $f(r) = ar + b$ with $a > 0$ for all real numbers $r$. From the standpoint of algebra, the totality of permissible transformations between temperature quantity scales forms a numerical affine group. In general, a property is said to be measured on an interval scale provided that its family of permissible transformations is the affine group. Along similar lines, a property is measured on a ratio scale just in case its family of permissible transformations is the similarity group of all functions $f: \mathbb{R} \to \mathbb{R}$, specified by $f(r) = ar$ with $a > 0$ for all real numbers $r$. So the apparent relativism and arbitrariness in the choice of measurement methods and accompanying quantity scales are factored out by invoking pertinent scale transformations. In addition to guaranteeing the existence of a quantity scale, representation axioms specify the correct group of permissible transformations between scales. Thus if the experimenter intends to draw conclusions about objective temperature values, he or she must consider the associated affine group of scale transformations and ensure that they preserve all numerical relationships of interest.
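The group structure of the permissible transformations can be made concrete in a few lines. This is a sketch under assumptions: the class name `Affine` and its methods are invented for illustration, and the Réaumur conversion (0.8 times the Celsius reading) is brought in as an extra example of an affine map.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Affine:
    """A permissible transformation f(r) = a*r + b with a > 0."""
    a: float
    b: float

    def __post_init__(self):
        if self.a <= 0:
            raise ValueError("a permissible transformation needs a > 0")

    def __call__(self, r):
        return self.a * r + self.b

    def compose(self, other):
        """Return the map r -> self(other(r)), which is again affine."""
        return Affine(self.a * other.a, self.a * other.b + self.b)

    def inverse(self):
        """Return the affine map that undoes this one."""
        return Affine(1.0 / self.a, -self.b / self.a)

celsius_to_fahrenheit = Affine(9.0 / 5.0, 32.0)
fahrenheit_to_celsius = celsius_to_fahrenheit.inverse()
celsius_to_reaumur = Affine(0.8, 0.0)

# Composing two permissible maps yields a third permissible map.
fahrenheit_to_reaumur = celsius_to_reaumur.compose(fahrenheit_to_celsius)
print(celsius_to_fahrenheit(100.0))
print(fahrenheit_to_reaumur(212.0))
```

Closure under composition and the existence of inverses are what make the family a group. A similarity transformation for a ratio scale is the special case with b equal to zero.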

Detection of meaningless observational statements

Measurement models are instrumental in detecting meaningless observational statements; meaningfulness has long been a favorite topic of measurement theorists. We begin with the simplest characterization. Given a binary numerical relation $\rho$ on the real line $\mathbb{R}$, we say that $\rho$ is quantitatively meaningful for the measurement model $(M, \precsim)$ just in case for all quantity scales $\Phi, \Phi': M \to \mathbb{R}$ the equivalence
$$\Phi(x) \mathrel{\rho} \Phi(y) \quad\text{iff}\quad \Phi'(x) \mathrel{\rho} \Phi'(y)$$
holds for all elements $x$ and $y$ in $M$. It is easily seen that this definition automatically generalizes to $n$-place relations. For example, for any pair of temperature scales $\Phi$ (e.g., Celsius) and $\Phi'$ (e.g., Fahrenheit) the equivalence
$$\Phi(x) - \Phi(y) < \Phi(z) - \Phi(w) \quad\text{iff}\quad \Phi'(x) - \Phi'(y) < \Phi'(z) - \Phi'(w)$$
holds for all substances $x$, $y$, $z$, and $w$. The concept of quantitative meaningfulness is extremely useful in determining the applicability of statistical concepts (including sample averages and standard deviation) in the world of measurement data.
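Quantitative meaningfulness on an interval scale can be probed by brute force over a small sample of scale values and affine maps. The sketch below is only a finite check under assumptions, not a proof: the sample values, the three transformations, and the helper names are all chosen for illustration, and exact rational arithmetic is used so that rounding cannot blur the comparison.

```python
from fractions import Fraction
from itertools import product

def affine(a, b):
    """Build the permissible transformation r -> a*r + b (a > 0 assumed)."""
    return lambda r: a * r + b

# Illustrative permissible transformations between interval scales.
transforms = [
    affine(Fraction(9, 5), 32),                 # Celsius to Fahrenheit
    affine(Fraction(5, 9), Fraction(-160, 9)),  # Fahrenheit to Celsius
    affine(3, -7),                              # an arbitrary affine map
]

# Illustrative scale values.
values = [Fraction(v) for v in (-10, 0, 5, 20, 45, 90)]

def difference_less(x, y, z, w):
    """The four-place relation x - y < z - w."""
    return x - y < z - w

def at_least_twice(x, y):
    """The two-place relation x >= 2*y."""
    return x >= 2 * y

def is_invariant(relation, arity):
    """True if no sampled transformation changes the relation's verdict."""
    return all(relation(*args) == relation(*map(f, args))
               for f in transforms
               for args in product(values, repeat=arity))

print(is_invariant(difference_less, 4))
print(is_invariant(at_least_twice, 2))
```

Comparisons of differences pass the check, because the additive constant cancels and the positive factor preserves the inequality. The "at least twice" relation fails on the sample, which matches the verdict on the "n times as warm" statements.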

There is a closely related concept of qualitative meaningfulness that is based on the notion of automorphism. Recall that an order-embedding map $\alpha: M \to M$ of the domain of a measurement model $(M, \precsim)$ to itself is called a measurement automorphism precisely when it is one-to-one and onto. Briefly, a binary relation $\rho$ on the measurement domain $M$ is said to be qualitatively meaningful for the model $(M, \precsim)$ provided that for each measurement automorphism $\alpha: M \to M$ and for all $x$ and $y$ in $M$ the equivalence
$$x \mathrel{\rho} y \quad\text{iff}\quad \alpha(x) \mathrel{\rho} \alpha(y)$$
holds. Less formally, a binary relation $\rho$ on $M$ is measurement-theoretically meaningful for $(M, \precsim)$ if the exact identity of $\rho$-related objects is irrelevant. The only thing that matters is that the objects in $M$ possess the measured property in equal amounts. In general, quantitative and qualitative meaningfulness are not coextensive. The notion of qualitative meaningfulness is important in delineating the class of model-definable relations. It is easy to check that the omnipresent indiscernibility relation $\sim$ is qualitatively meaningful for $(M, \precsim)$.
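On a finite domain the automorphisms can simply be enumerated, which makes the definition easy to test. The following sketch rests on assumptions: the three-element domain, the `tier` bookkeeping for the comparison outcomes, and the relation `names_b_then_a` are hypothetical examples introduced here.

```python
from itertools import permutations, product

# Hypothetical three-element model in which b and c are indiscernible.
tier = {"a": 0, "b": 1, "c": 1}
domain = list(tier)

def leq(x, y):
    """The comparative relation of the model."""
    return tier[x] <= tier[y]

def automorphisms(domain, leq):
    """All one-to-one, onto maps of the domain that are order-embedding."""
    found = []
    for image in permutations(domain):
        alpha = dict(zip(domain, image))
        if all(leq(x, y) == leq(alpha[x], alpha[y])
               for x, y in product(domain, repeat=2)):
            found.append(alpha)
    return found

def qualitatively_meaningful(domain, leq, rho):
    """True if every automorphism preserves the binary relation rho."""
    return all(rho(x, y) == rho(alpha[x], alpha[y])
               for alpha in automorphisms(domain, leq)
               for x, y in product(domain, repeat=2))

def indiscernible(x, y):
    """The indiscernibility relation induced by the comparative relation."""
    return leq(x, y) and leq(y, x)

def names_b_then_a(x, y):
    """A relation that depends on the identity of the objects involved."""
    return (x, y) == ("b", "a")

print(len(automorphisms(domain, leq)))
print(qualitatively_meaningful(domain, leq, indiscernible))
print(qualitatively_meaningful(domain, leq, names_b_then_a))
```

The model has two automorphisms, the identity and the swap of b with c. Indiscernibility survives the swap, while the identity-dependent relation does not, since it holds of the pair (b, a) but fails of its image (c, a).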

Representation axioms

Finally, in addition to securing a quantity scale and its uniqueness (up to permissible transformations), representation axioms of a measurement model can also be viewed as capturing the overall empirical content under consideration, encountered in testing the measurement model's adequacy. In this context, measurement axioms are usually classified into rationality (design) axioms (including transitivity), assumed to be automatically true under the intended interpretation; structural (technical) axioms (e.g., the Archimedean axiom), crucial in establishing powerful representation theorems; and various testable empirical axioms, characterizing (often in a highly idealized way) specific measurement methods.

To appreciate the striking simplicity of measurement models, it is important to realize that these models represent the observational structure of a measurable property in such a way that most of the empirical detail of the actual observation is ignored. Here the experimenter is interested only in a basic abstraction that is based on comparisons of extents of given measurable properties, sufficient for a suitable order-preserving numerical quantification.

Representational Theory of Measurement

Measurement theory in general (as a branch of applied mathematics), and representational measurement theory in particular, are mainly based on work summarized in Foundations of Measurement (vol. 1, 1971) by David Krantz and others; Foundations of Measurement (vol. 2, 1989) by Patrick Suppes and others; and Foundations of Measurement (vol. 3, 1990) by Duncan Luce and others. These authors use a model-theoretic (semantic) conception of empirical theories. In brief, instead of conceiving measurement theory as a deductively organized body of empirical claims, the semantic conception views a theory as a way of specifying a class of set-theoretic relational structures that represent various aspects of reality. The principal objectives of measurement theory are the study of set-based models of measurable properties of empirical objects, maps between them, and the representation of measurement models in terms of convenient numerical structures, with special regard to the relationships between the latter and affiliated quantitative theories of empirical objects.

Representational measurement theory studies many species of measurement models. In his Physics: The Elements, Norman Campbell (1920) noted that in modeling extensive properties (including, e.g., length, area, volume, mass, and electric charge), the above specified order-theoretic measurement model $(M, \precsim)$ has a powerful algebraic enrichment, typically symbolized by $(M, \precsim, \circ)$, where $\circ$ is a binary composition operation on $M$, satisfying the following partially testable empirical conditions for all $x$, $y$, $z$, and $w$ in $M$:

(i) Commutativity: $x \circ y \sim y \circ x$.

(ii) Associativity: $(x \circ y) \circ z \sim x \circ (y \circ z)$.

(iii) Monotonicity: $x \precsim y$ iff $x \circ z \precsim y \circ z$.

(iv) Positivity: $x \precsim x \circ y$ and not $x \circ y \sim x$.

(v) Strong Archimedean property: If $x \precsim y$ and not $x \sim y$, then for any $z$ and $w$ there exists a positive integer $n$ such that $nx \circ z \precsim ny \circ w$, where $nx$ is defined inductively by setting $1x = x$ and $(n + 1)x = nx \circ x$.

In the case of length measurement, the measurement domain $M$ consists of suitable and to some extent idealized length-bearing entities (e.g., straight, rigid rods) that can be properly identified and distinguished one from another. Because length measurement is modeled within a classical framework, relativistic reminders that length is not an intrinsic property of rods but something relational (relative to inertial reference frames) will not be of concern.

To measure length in a basic way, independently of any application of laws, the experimenter operationalizes the comparative "at least as long as" relation $\precsim$ by placing two rods side by side in a straight line, with one end of the rods coinciding, and observing which one extends at the other end. In this manner the experimenter has an effective way of determining whether the relational formula "$x \precsim y$" holds for virtually any pair of rods $x$ and $y$ in $M$. Of course if rod $x$ is a physical part of rod $y$ or is equal to $y$, then the validity of "$x \precsim y$" is accepted by default. The composition $x \circ y$ of rods $x$ and $y$ is understood to be the rod obtained by the operation of placing rods $x$ and $y$ end to end in a straight line. Thus we take the abutted combination of rods $x$ and $y$ to be the whole composite rod $x \circ y$.

We know from David H. Krantz and others (1971, p. 73) that the representation axioms above are necessary and sufficient for the existence of a real-valued, order-embedding, additive scale function $\Phi: M \to \mathbb{R}$, satisfying the representational condition
$$\Phi(x \circ y) = \Phi(x) + \Phi(y)$$
for all $x, y$ in $M$. We see that the representation axioms not only justify a numerical quantification of amounts or extents of measurable properties, they capture the structure of the associated extensive measurement process itself.
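One way to see how comparison and composition alone can yield numbers is to count copies of a unit rod against copies of the rod being measured. The sketch below is an idealized illustration under assumptions and not the construction used in the cited proof: the class `Rod`, its hidden length, and the function names are hypothetical, and the hidden length is consulted only by the two primitive operations, never by the measuring procedure itself.

```python
from fractions import Fraction

class Rod:
    """An idealized rod. The hidden length stands in for physical reality."""
    def __init__(self, hidden_length):
        self._hidden = Fraction(hidden_length)
        if self._hidden <= 0:
            raise ValueError("rods must have positive length")

def at_most_as_long(x, y):
    """Primitive comparison: lay the rods side by side."""
    return x._hidden <= y._hidden

def concat(x, y):
    """Primitive composition: abut the rods end to end in a straight line."""
    return Rod(x._hidden + y._hidden)

def copies(n, x):
    """n x, defined inductively: 1x = x and (n + 1)x = nx composed with x."""
    result = x
    for _ in range(n - 1):
        result = concat(result, x)
    return result

def estimate_scale(x, unit, n):
    """Largest m such that m copies of unit fit within n copies of x, over n."""
    target = copies(n, x)
    m, stack = 0, None
    while True:
        candidate = unit if stack is None else concat(stack, unit)
        if not at_most_as_long(candidate, target):
            break
        stack, m = candidate, m + 1
    return Fraction(m, n)

unit = Rod(1)
x, y = Rod(Fraction(5, 2)), Rod(Fraction(3, 4))
print(estimate_scale(x, unit, 4), estimate_scale(y, unit, 4))
print(estimate_scale(concat(x, y), unit, 4))
```

With n = 4 the estimates for x, y, and their composition are 5/2, 3/4, and 13/4, so the estimated value of the composite rod equals the sum of the parts. Larger n tightens the estimate for rods whose length is not a multiple of 1/n of the unit, and the Archimedean condition is what guarantees that the counting loop stops.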

In his Basic Concepts of Measurement, Brian Ellis (1966) addresses the question of whether the preceding interpretation of the composition operation $\circ$ is intrinsic to physical measurement of length or is perhaps just a convenient convention. Ellis points out that the representation axioms listed above remain valid even if the experimenter uses an orthogonal concatenation of rods. Specifically, this time the composite rod $x \circ' y$ is obtained somewhat artificially as a rod formed by the hypotenuse of the right triangle whose sides are the rods $x$ and $y$. Thus here the experimenter is abutting $x$ and $y$ perpendicularly rather than along a straight line. Not surprisingly, because the operational peculiarities of respective compositions in a straight line versus orthogonally are not visible in the representation axioms, the corresponding enriched measurement models $(M, \precsim, \circ)$ and $(M, \precsim, \circ')$ are measurement-theoretically indiscernible. Ellis holds a conventionalist view of measurement, in the sense that measurable properties do not exist independently of their methods of measurement.
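Ellis's observation can be illustrated numerically by representing each rod by its ordinary length and comparing the two ways of composing. This is a sketch under assumptions: the sample lengths and function names are illustrative, floating-point comparisons use a tolerance, and the remark that squared length is additive under orthogonal composition is simply the Pythagorean theorem, added here as a gloss rather than quoted from Ellis.

```python
import math
from itertools import product

def collinear(x, y):
    """Length of the rod obtained by abutting x and y in a straight line."""
    return x + y

def orthogonal(x, y):
    """Length of the hypotenuse when x and y are abutted at a right angle."""
    return math.hypot(x, y)

def satisfies_basic_axioms(compose, lengths):
    """Check commutativity, associativity, monotonicity, and positivity."""
    for x, y, z in product(lengths, repeat=3):
        if not math.isclose(compose(x, y), compose(y, x)):
            return False
        if not math.isclose(compose(compose(x, y), z),
                            compose(x, compose(y, z))):
            return False
        if (x <= y) != (compose(x, z) <= compose(y, z)):
            return False
        if not x < compose(x, y):
            return False
    return True

def is_additive(scale, compose, lengths):
    """Check scale(x composed with y) = scale(x) + scale(y) on the sample."""
    return all(math.isclose(scale(compose(x, y)), scale(x) + scale(y))
               for x, y in product(lengths, repeat=2))

lengths = [1.0, 2.0, 3.5, 6.0]  # illustrative rod lengths
print(satisfies_basic_axioms(collinear, lengths))
print(satisfies_basic_axioms(orthogonal, lengths))
print(is_additive(lambda x: x, collinear, lengths))
print(is_additive(lambda x: x * x, orthogonal, lengths))
```

Both operations pass the same axiom checks on the sample. Each has its own additive scale, ordinary length for collinear composition and squared length for orthogonal composition, and both scales order the rods identically.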

The technical problem of "$x \circ x$" is circumvented by using an unlimited supply of copies of $x$ (so that $x \circ x \sim x \circ y$, where $x \sim y$) or by passing to a partial composition operation. Ontological objections against using models with infinitely many objects are obvious. Another problem is whether the comparative relation $\precsim$ and composition $\circ$ of a measurement model $(M, \precsim, \circ)$ are directly observable. Scientific realists in particular argue that in general the representation axioms treat the empirical structures of measurement models as something decisively theoretical.

There are several ways to develop a general theory of derived measurement. In some ways the most natural place to start is with the notion of fundamental measurement, covered earlier. A measurable property is said to be fundamental or basic provided that its measurement does not depend on the measurement of anything else. Simply put, a measurement theorist starts with a measurement model $(M, \precsim, \circ)$ of a basic property together with the characterizing representation axioms and then proves the existence and uniqueness of the quantity scale. No other measurement models are needed.

In contrast, a derived measurable property is measured in terms of other previously established quantity scales and measurement models. A classical example in physics is density, measured as a ratio of separate measurements of mass and volume. To avoid conceptual confusion, it is not suggested that a fundamental measurement of density is impossible. When mass and volume are known, there are offsetting advantages to working with a derived notion of density. Another question is whether any measurement is truly basic.
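The density example can be sketched as a computation on two previously established scales. The figures below are rounded, illustrative values rather than data from the text, and the function names are hypothetical. The sketch assumes that mass and volume are both measured on ratio scales, so a change of units multiplies each by a positive constant.

```python
import math

# Illustrative samples: (mass in kilograms, volume in cubic meters).
samples = {"water": (1.00, 0.001), "iron": (7.87, 0.001)}

def density(mass, volume):
    """Derived quantity: the ratio of a mass value to a volume value."""
    return mass / volume

def rescale(samples, mass_factor, volume_factor):
    """Apply similarity transformations (changes of unit) to both scales."""
    return {name: (mass * mass_factor, volume * volume_factor)
            for name, (mass, volume) in samples.items()}

# Densities in kg per cubic meter, then in grams per cubic centimeter.
si = {name: density(*pair) for name, pair in samples.items()}
cgs = {name: density(*pair)
       for name, pair in rescale(samples, 1000.0, 1.0e6).items()}

ratio_si = si["iron"] / si["water"]
ratio_cgs = cgs["iron"] / cgs["water"]
print(si)
print(cgs)
print(math.isclose(ratio_si, ratio_cgs))
```

Changing units multiplies every density by the same positive constant, so the derived scale is itself a ratio scale and the statement that one sample is so many times denser than another does not depend on the units chosen.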

A Brief History of Measuring Devices

It is invariably difficult to trace the origins of measurement devices. Weights and measures were among the earliest tools, invented and used in primitive societies. Ancient measurements of length were based on the use of parts of the human body (e.g., the length of a foot, the span of a hand, and the breadth of a thumb). Time was measured by water clocks, hourglasses, and sundials.

The earliest weights were based on objects frequently weighed (e.g., seeds, beans, and grains). Comparisons of capacities of containers were performed indirectly by filling gourds and vessels with plant seeds (which were later counted) and water. These qualitative measurement methods, used in conjunction with crude balance scales, formed a basis of early commerce. There was an enormous proliferation of local and national measurement systems and units (e.g., Egyptian around 3000 BCE; Babylonian around 1700 BCE; Greek in 500 BCE; and Roman around 100 BCE). Romans adapted the Greek system that was later adopted with local variations throughout Europe as the Roman Empire spread. As these methods of associating numbers with physical objects were growing, it became possible to compare the objects abstractly by comparing the associated numbers and to combine them by manipulating numbers. In the presence of standardized units accepted by the whole community it became possible to replace accidental comparatives of the form "five times the width of my finger" with more universal but still unit-dependent "3.75 inches."

In England in the early thirteenth century, measures and weights (strongly influenced by the Roman system) quickly evolved along the lines of strict standardization. In France, standardization of measures and weights came several centuries later. In 1670 Gabriel Mouton, a French priest, proposed the establishment of a decimalized metrology of weights and measures. The unit of length that was finally decided on was one ten-millionth part of a meridional quadrant of the earth. The weight of a cubic decimeter of distilled water at its maximum-density temperature of 4°C was adopted as the kilogram. (During the second half of the twentieth century there was a shift away from standards based on particular artifacts toward standards based on stable quantum properties of systems.) The adoption of the metric system in France and generally in Europe was slow and difficult, until the International Bureau of Weights and Measures, formed in 1875, recommended the universal adoption of the MKS metric system in European countries, in an agreement that was subsequently signed by seventeen states. In the modern SI (Système International d'Unités) version of the metric system, there are seven base units, one for each of the base quantities (length, mass, time, electric current, thermodynamic temperature, amount of substance, and luminous intensity), from which all other units of measurement are derived.

One impressive feature of modern science is the rapidity with which new measuring instruments are being developed. For example, in the case of time measurement, and starting from imprecise ancient water clocks and hourglasses, people in the Middle Ages built town clocks (maintained by hand) to display local time. In 1656 Christiaan Huygens built the first accurate pendulum clock; less than a century later John Harrison presented the first nautical chronometer. In 1928 Joseph Horton and Warren Marrison built the first quartz crystal oscillator clock. And finally, in 1950, Harold Lyons developed an atomic clock based on the quantum mechanical vibrations of the ammonia molecule. Cesium atomic clocks measure time with an accuracy of $10^{-15}$ seconds.

Experimental science has progressed thanks in great part to the speedy development of highly accurate measuring devices in nearly all branches of science, engineering, and medicine. The symbiotic relationship between theoretical research and measurement methodology continues to be a fundamental factor in the development of science. Philosophically, measurement is important because it provides empirical foundations for the construction of quantitative scientific theories, necessary for reliable prediction and explanation of vast categories of empirical phenomena.

See also Decision Theory; Experimentation and Instrumentation; Quantum Mechanics; Suppes, Patrick.


Campbell, Norman. Physics: The Elements. Cambridge, U.K.: Cambridge University Press, 1920.

Ellis, Brian David. Basic Concepts of Measurement. London: Cambridge University Press, 1966.

Heisenberg, Werner. The Physical Principles of the Quantum Theory. Translated into English by Carl Eckart and Frank C. Hoyt. Chicago: University of Chicago Press, 1930.

Krantz, David H., R. Duncan Luce, Patrick Suppes, and Amos Tversky. Foundations of Measurement. Vol. 1: Additive and Polynomial Representations. New York: Academic Press, 1971.

Luce, R. Duncan, David H. Krantz, Patrick Suppes, and Amos Tversky. Foundations of Measurement. Vol. 3: Representation, Axiomatization and Invariance. New York: Academic Press, 1990.

Narens, Louis. Theories of Meaningfulness. Mahwah, NJ: Erlbaum, 2002.

Stevens, Stanley Smith. "On the Theory of Scales of Measurement." In Philosophy of Science, edited by Arthur Danto and Sidney Morgenbesser. New York: Meridian, 1960.

Suppes, Patrick, David H. Krantz, R. Duncan Luce, and Amos Tversky. Foundations of Measurement. Vol. 2: Geometrical, Threshold and Probabilistic Representations. New York: Academic Press, 1989.

Zoltan Domotor (2005)