
Encyclopedia of Sociology, Vol. 1

CASTE AND INHERITED STATUS

digging graves, handling corpses, and guarding tombs.

The outcasts were formally emancipated in 1871, by the Meiji government (1868–1912). The descendants of Burakumins were identified as ‘‘new common people.’’ However, they continued to face discrimination as their identity could be revealed through the household register system that included the ancestry of all Japanese families. Although now the household register is not made available to the public without the permission of the family, the identity of individuals is frequently revealed when families and employers conduct investigations for marriage purposes and hiring (Ishida 1992).

Rwanda, a country just south of the Equator in Africa that has witnessed violent ethnic conflict, is another example of a system of caste stratification. Before European colonization, political power was concentrated in the hands of the king and the pastoral aristocracy (Tutsi). The Tutsis constituted only about 10 percent of the population. Hutus, the lower caste of agriculturalists, formed the vast majority of the population. The lowest caste, known as Twa, were a small minority and worked as potters, court jesters, and hunters. No intermarriage was permitted between the groups. The Tutsis used their political and military power to maintain the hierarchical system (Southall 1970), particularly in central parts of the country. This hierarchical relationship was later reinforced under colonial rule and lasted until it was brought to an end in the 1950s (Newbury 1988).

As Gerald Berreman (1981) has argued, the blacks in America and South Africa, the Burakumin of Japan, the Dalit of India, and the Hutu and Twa of Rwanda all live in societies that are alike in their structure and in their effect on the life experiences of those most oppressed. However, those at the bottom do not accept their condition willingly. Because the oppressed are often too powerless to revolt openly, structurally similar forms of domination create common forms of infrapolitics, that is, of surreptitious resistance. Rituals of aggression, tales of revenge, use of carnival symbolism, gossip, and rumor are all examples of the strategic forms of resistance that subordinates use in place of open defiance under severely repressive conditions (Scott 1990). Finally, similarities also exist in the political consequences of preferential policies that culturally distinct societies such as the United States and India have adopted to reduce group disparities (Weiner 1983).

REFERENCES

Beidelman, T. O. 1959 A Comparative Analysis of the Jajmani System. Monograph VII of the Association of Asian Studies. Locust Valley, N.Y.: J.J. Augustin.

Berreman, Gerald 1981 Caste and Other Inequities. Delhi: Manohar.

Beteille, Andre 1969 Caste: Old and New. Bombay: Asia Publishing House.

Blair, Harry W. 1980 ‘‘Rising Kulaks and Backward Castes in Bihar: Social Change in the Late 1970’s.’’ Economic and Political Weekly 12 (Jan.):64–73.

Brass, Paul R. 1983 Caste, Faction and Party in Indian Politics. Faction and Party, vol. I. Delhi: Chanakya Publications.

——— 1985 Caste, Faction and Party in Indian Politics. Election Studies, vol. II. Delhi: Chanakya Publications.

Breman, Jan 1974 Patronage and Exploitation. Berkeley: University of California Press.

Dumont, Louis 1970 Homo Hierarchicus. Chicago: University of Chicago Press.

Freeman, James M. 1986 ‘‘The Consciousness of Freedom Among India’s Untouchables.’’ In D. K. Basu and R. Sisson, eds., Social and Economic Development in India. Beverly Hills, Calif.: Sage.

Galanter, Marc 1984 Competing Equalities: Law and the Backward Classes in India. Berkeley: University of California Press.

Gerth, H. H., and C. Wright Mills 1958 From Max Weber: Essays in Sociology. New York: Oxford University Press.

Ghurye, G. S. 1969 Caste and Race in India, 5th ed. Bombay: Popular Prakashan.

Gould, Harold A. 1987 The Hindu Caste System: The Sacralization of a Social Order. Delhi: Chanakya Publications.

Horowitz, Donald L. 1985 Ethnic Groups in Conflict. Berkeley: University of California Press.

Hutton, J. H. 1961 Caste in India, 4th ed. Oxford, Eng.: Oxford University Press.

Ishida, Hiroshi 1992 ‘‘Stratification and Mobility: The Case of Japan.’’ In Myron L. Cohen, ed., Asia Case Studies in the Social Sciences. New York: M.E. Sharpe.

Kolenda, Pauline 1978 Caste in Contemporary India: Beyond Organic Solidarity. Menlo Park, Calif.: Benjamin/Cummings.


Kothari, Rajni 1970 Caste in Indian Politics. New Delhi: Orient Longman.

Leach, E. R. (ed.) 1960 Aspects of Caste in South India, Ceylon, and North-West Pakistan. Cambridge, Eng.: Cambridge University Press.

Mahar, J. M. (ed.) 1972 The Untouchables in Contemporary India. Tucson: University of Arizona Press.

Rudolph, L. I., and S. H. Rudolph 1960 ‘‘The Political Role of India’s Caste Associations.’’ Pacific Affairs 33(1):5–22.

——— 1987 In Pursuit of Laxmi. Chicago: University of Chicago Press.

Scott, James C. 1990 Domination and the Arts of Resistance: Hidden Transcripts. New Haven, Conn.: Yale University Press.

Sheth, D. L. 1987 ‘‘Reservation Policy Revisited.’’ Economic and Political Weekly, vol. 22, Nov. 14.

Southall, Aidan W. 1970 ‘‘Stratification in Africa.’’ In Leonard Plotnicov and Arthur Tuden, eds., Essays in Comparative Social Stratification. Pittsburgh, Pa.: University of Pittsburgh Press.

Srinivas, M. N. 1962 Caste in Modern India. London: Asia Publishing House.

——— 1969 Social Change in Modern India. Berkeley: University of California Press.

Weiner, Myron 1983 ‘‘The Political Consequences of Preferential Policies: A Comparative Perspective.’’ Comparative Politics 16(1):35–52.

Zelliot, Eleanor 1970 ‘‘Learning the Uses of Political Means: The Mahars of Maharashtra.’’ In Rajni Kothari, ed., Caste in Indian Politics. Delhi: Orient Longman.

RITA JALALI

CAUSAL INFERENCE MODELS

NOTE: Although the following article has not been revised for this edition of the Encyclopedia, the substantive coverage is currently appropriate. The editors have provided a list of recent works at the end of the article to facilitate research and exploration of the topic.

The notion of causality has been controversial for a very long time, and yet neither scientists, social scientists, nor laypeople have been able to think constructively without using a set of explanatory concepts that, either explicitly or not, have implied causes and effects. Sometimes other words have been substituted, for example, consequences, results, or influences. Even worse, there are vague terms such as leads to, reflects, stems from, derives from, articulates with, or follows from, which are often used in sentences that are almost deliberately ambiguous in avoiding causal terminology. Whenever such vague phrases are used throughout a theoretical work, or whenever one merely states that two variables are correlated with one another, it may not be recognized that what purports to be an ‘‘explanation’’ is really not a genuine theoretical explanation at all.

It is, of course, possible to provide a very narrow definition of causation and then to argue that such a notion is totally inadequate in terms of scientific explanations. If, for example, one defines causation in such a way that there can be only a single cause of a given phenomenon, or that a necessary condition, a sufficient condition, or both must be satisfied, or that absolute certainty is required to establish causation, then indeed very few persons would ever be willing to use the term. Indeed, in sociology, causal terminology was almost deliberately avoided before the 1960s, except in reports of experimental research. Since that time, however, the notion of multivariate causation, combined with the explicit allowance for impacts of neglected factors, has gradually replaced these more restrictive usages.

There is general agreement that causation can never be proven, and of course in a strict sense no statements about the real world can ever be ‘‘proven’’ correct, if only because of indeterminacies produced by measurement errors and the necessity of relying on evidence that has been filtered through imperfect sense organs or fallible measuring instruments. One may accept the fact that, strictly speaking, one is always dealing with causal models of real-world processes and that one’s inferences concerning the adequacy of such models must inevitably be based on a combination of empirical evidence and untested assumptions, some of which are about underlying causal processes that can never be subject to empirical verification. This is basically true for all scientific evidence, though the assumptions one may require in making interpretations or explanations of the underlying reality may be more or less plausible in view of supplementary information that may be available. Unfortunately, in the social sciences such supplementary information is likely to be of questionable quality, thereby reducing the degree of faith one has in whatever causal assertions have been made.

In the causal modeling literature, which is basically compatible with the so-called structural equation modeling in econometrics, equation systems are constructed so as to represent as well as possible a presumed real-world situation, given whatever limitations have been imposed in terms of omitted variables that produce unknown biases, possibly incorrect functional forms for one’s equations, measurement errors, or in general what are termed specification errors in the equations. Since such limitations are always present, any particular equation will contain a disturbance term that is assumed to behave in a certain fashion. One’s assumptions about such disturbances are both critical for one’s inferences and also (for the most part) inherently untestable with the data at hand. This in turn means that such inferences must always be tentative. One never ‘‘finds’’ effects, for example, but only infers them on the basis of findings about covariances and temporal sequences and a set of untested theoretical assumptions. When such assumptions are hidden from view, both the social scientist and his or her readers may be seriously misled to the degree that the assumptions are also incorrect.

In the recursive models commonly in use in sociology, it is assumed that causal influences can be ordered, such that one may designate an X1 that does not depend on any of the remaining variables in the system but, presumably, varies as a result of exogenous causes that have been ignored in the theory. A second variable, X2, may then be found that may depend upon X1 as well as a different set of exogenous factors, but the assumption is that X2 does not affect X1, either directly or through any other mechanism. One then builds up the system, equation by equation, by locating an X3 that may depend on either or both of X1 or X2, plus still another set of independent variables (referred to as exogenous factors), but with the assumption that neither of the first two X’s is affected by X3. Adding still more variables in this recursive fashion, and for the time being assuming linear and additive relationships, one arrives at the system of equations shown in equation system 1,

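\[
\begin{aligned}
X_1 &= \varepsilon_1 \\
X_2 &= \beta_{21}X_1 + \varepsilon_2 \\
X_3 &= \beta_{31}X_1 + \beta_{32}X_2 + \varepsilon_3 \\
&\;\;\vdots \\
X_k &= \beta_{k1}X_1 + \beta_{k2}X_2 + \beta_{k3}X_3 + \cdots + \beta_{k,k-1}X_{k-1} + \varepsilon_k
\end{aligned}
\tag{1}
\]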

in which the disturbance terms are represented by the εi and where for the sake of simplicity the constant terms have been omitted.

The essential property of recursive equations that provides a simple causal interpretation is that changes made in any given equation may affect subsequent ones but will not affect any of the prior equations. Thus, if a mysterious demon were to change one of the parameters in the equation for X3, this would undoubtedly affect not only X3 but also X4, X5, through Xk, but could have no effect on either of the first two equations, which do not depend on X3 or any of the later variables in the system. As will be discussed below, this special property of recursive systems does not hold in the more general setup involving variables that may be reciprocally interrelated. Indeed, it is this recursive property that justifies one’s dealing with the equations separately and sequentially as single equations. The assumptions required for such a system are therefore implicit in all data analyses (e.g., log-linear modeling, analysis of variance, or comparisons among means) that are typically discussed in first and second courses in applied statistics.
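The equation-by-equation treatment described above can be sketched with a small simulation. This is only an illustration under stated assumptions: the three-variable system, the coefficient values B21, B31, and B32, the sample size, and the helper `cross` are all hypothetical and do not come from the article. Data are generated in causal order with independent disturbances, and each equation is then estimated separately by least squares (without constant terms, as in equation system 1).

```python
import random

random.seed(0)
N = 20000
# Hypothetical structural coefficients (assumed for illustration only).
B21, B31, B32 = 0.6, 0.3, 0.5

# Generate the recursive system in causal order; disturbances are independent.
x1 = [random.gauss(0, 1) for _ in range(N)]
x2 = [B21 * a + random.gauss(0, 1) for a in x1]
x3 = [B31 * a + B32 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]


def cross(u, v):
    """Mean cross-product of two equally long sequences (no constant term)."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


# Equation for X2: one regressor, so the slope is a ratio of cross-products.
b21_hat = cross(x1, x2) / cross(x1, x1)

# Equation for X3: two regressors; solve the 2x2 normal equations by Cramer's rule.
s11, s12, s22 = cross(x1, x1), cross(x1, x2), cross(x2, x2)
s1y, s2y = cross(x1, x3), cross(x2, x3)
det = s11 * s22 - s12 * s12
b31_hat = (s1y * s22 - s12 * s2y) / det
b32_hat = (s11 * s2y - s12 * s1y) / det

print(round(b21_hat, 2), round(b31_hat, 2), round(b32_hat, 2))
```

Because the simulated disturbances really are mutually independent, the separate estimates should land close to the assumed coefficients; the sketch says nothing about what happens when that assumption fails.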

Assumptions are always critical in causal analyses or (what is often not recognized) in any kind of theoretical interpretation of empirical data. Some such assumptions are implied by the forms of one’s equations, in this case linearity and additivity. Fortunately, these types of assumptions can be rather simply modified by, for example, introducing second- or higher-degree terms, log functions, or interaction terms. It is a mistake to claim, as some critics have done, that causal modeling requires one to assume such restrictive functional forms.
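As one hedged illustration of such a modification, the equation for X3 in equation system 1 could be extended with a squared term and a product (interaction) term. The coefficients \gamma_1 and \gamma_2 are hypothetical symbols introduced here and are not notation used in the article:

\[
X_3 = \beta_{31}X_1 + \beta_{32}X_2 + \gamma_1 X_1^2 + \gamma_2 X_1 X_2 + \varepsilon_3
\]

The equation is no longer linear and additive in the X's, but it remains linear in its parameters, so it keeps its place in the recursive ordering and the same assumptions about the disturbance term still have to be made.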

Far more important are two other kinds of assumptions: those about measurement errors and those concerning the disturbance terms representing the effects of all omitted variables. Simple causal modeling of the type represented by equation system 1 requires the naive assumption that all variables have been perfectly measured, an assumption that is, unfortunately, frequently ignored in many empirical investigations using path analyses based on exactly this same type of causal system. Measurement errors require one to make an auxiliary set of assumptions regarding both the sources of measurement-error bias and the causal connections between so-called true scores and measured indicators. In principle, however, such measurement-error assumptions can be explicitly built into the equation system and empirical estimates obtained, provided there are a sufficient number of multiple indicators to solve for the unknowns produced by these measurement errors, a possibility that will be discussed in the final section.

In many instances, assumptions about one’s disturbance terms are even more problematic but equally critical. In verbal statements of theoretical arguments one often comes across the phrase ‘‘other things being equal,’’ or the notion that in the ideal experimental design all causes except one must be literally held constant if causal inferences are to be made. Yet both the phrase ‘‘other things being equal’’ and the restrictive assumption of the perfect experiment beg the question of how one can possibly know that ‘‘other things’’ are in fact equal, that all ‘‘relevant’’ variables have been held constant, or that there are no possible sources of measurement bias. Obviously, an alert critic may always suggest another variable that indeed does vary across settings studied or that has not been held constant in an experiment.

In recursive causal models this highly restrictive notion concerning the constancy of all possible alternative causes is relaxed by allowing for a disturbance term that varies precisely because these alternative causes are not all constant. But if so, can one get by without requiring any other assumptions about their effects? Indeed not. One must assume, essentially, that the omitted variables affecting any one of the X’s are uncorrelated with those that affect the others. If so, it can then be shown that the disturbance term in each equation will be uncorrelated with each of the independent variables appearing on the right-hand side, thus justifying the use of ordinary least-squares estimating procedures. In practical terms, this means that if one has had to omit any important causes of a given variable, one must also be willing to assume that they do not systematically affect any of its presumed causes that have been explicitly included in the model. A skeptic may, of course, be able to identify one or more such disturbing influences, in which case a modified model may need to be constructed and tested. For example, if ε3 and ε4 contain a common cause that can be identified and measured, such a variable needs to be introduced explicitly into the model as a cause of both X3 and X4.

Perhaps the five-variable model of Figure 1 will help the reader visualize what is involved. To be specific, suppose X5, the ultimate dependent variable, represents some behavior, say, the actual number of delinquent acts a youth has perpetrated. Let X3 and X4, respectively, represent two internal states, say, guilt and self-esteem. Finally, suppose X1 and X2 are two setting variables, parental education and delinquency rates within the youth’s neighborhood, with the latter variable being influenced by the former through the parents’ ability to select among residential areas.

The fact that the disturbance term arrows are unconnected in Figure 1 represents the assumption that they are mutually uncorrelated, or that the omitted variables affecting any given Xi are uncorrelated with any of its explicitly included causes among the remaining X’s. If ordinary least squares is used to estimate the parameters in this model, then the empirically obtained residuals ei will indeed be uncorrelated with the independent X’s in their respective equations, but since this automatically occurs as a property of least-squares estimation, it cannot be used as the basis for a test of our a priori assumptions about the true disturbances.
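The point that least-squares residuals cannot be used to test assumptions about the true disturbances can be sketched with a small simulation. Everything in it is a hypothetical setup chosen for illustration: an omitted variable `u` is built into both the included cause `x` and the true disturbance `eps`, so the key assumption is violated by construction, and the true slope is set to 1.0.

```python
import math
import random

random.seed(1)
N = 20000
TRUE_SLOPE = 1.0  # hypothetical value, assumed for illustration

u = [random.gauss(0, 1) for _ in range(N)]    # omitted common cause
x = [a + random.gauss(0, 1) for a in u]       # included cause, contaminated by u
eps = [a + random.gauss(0, 1) for a in u]     # true disturbance, also contains u
y = [TRUE_SLOPE * a + e for a, e in zip(x, eps)]


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def corr(a, b):
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))


# Ordinary least squares for y on x, with a constant term.
slope = cov(x, y) / cov(x, x)
intercept = mean(y) - slope * mean(x)
resid = [q - intercept - slope * p for p, q in zip(x, y)]

corr_true = corr(x, eps)     # correlation of x with the TRUE disturbance
corr_resid = corr(x, resid)  # correlation of x with the OLS residual

print("estimated slope:", round(slope, 2))
print("corr(x, true disturbance):", round(corr_true, 2))
print("corr(x, OLS residual):", round(corr_resid, 6))
```

In this setup the residuals are uncorrelated with `x` by construction of the estimator, while the true disturbance is clearly correlated with `x` and the estimated slope is pulled away from the assumed value. The residuals therefore give no warning that the assumption has failed.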

If one is unwilling to accept these assumptions about the behavior of omitted variables, the only way out of this situation is to reformulate the model and to introduce further complexities in the form of additional measured variables. At some point, however, one must stop and make the (untestable) assumption that the revised causal model is ‘‘closed’’ in the sense that omitted variables do not disturb the patterning of relationships among the included variables.

Figure 1. Simple Recursive Model (diagram not reproduced: X1, parental education; X2, neighborhood delinquency; X3, guilt; X4, self-esteem; X5, delinquent acts; arrows run from X1 to X2 and X3, from X3 to X4, and from X2 and X4 to X5; each Xi has its own disturbance term εi)

Assuming such theoretical closure, then, one is in a position to estimate the parameters, attach their numerical values to the diagram, and also evaluate the model in terms of its consistency with the data. In the model of Figure 1, for instance, there are no direct arrows between X2 and X3, between X4 and both X1 and X2, and between X5 and both X1 and X3. This means that with controls for all prior or intervening variables, the respective partial correlations can be predicted to be zero, apart from sampling errors. One arrives at the predictions in equation system 2.

\[
\begin{aligned}
r_{23\cdot 1} &= 0 \\
r_{14\cdot 23} &= 0 \\
r_{24\cdot 13} &= 0 \\
r_{15\cdot 234} &= 0 \\
r_{35\cdot 124} &= 0
\end{aligned}
\tag{2}
\]
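The five zero predictions of equation system 2 can be checked against simulated data, as a rough sketch. The coefficient values, the sample size, and the helper names `corr` and `partial` are assumptions made for this illustration; only the pattern of arrows follows the model of Figure 1. Higher-order partials are obtained from the standard recursion that removes one control variable at a time.

```python
import math
import random

random.seed(2)
N = 20000
# Hypothetical coefficients for the arrows drawn in Figure 1.
B21, B31, B43, B52, B54 = 0.5, 0.4, 0.6, 0.3, 0.2

rows = []
for _ in range(N):
    x1 = random.gauss(0, 1)
    x2 = B21 * x1 + random.gauss(0, 1)
    x3 = B31 * x1 + random.gauss(0, 1)
    x4 = B43 * x3 + random.gauss(0, 1)
    x5 = B52 * x2 + B54 * x4 + random.gauss(0, 1)
    rows.append((x1, x2, x3, x4, x5))
cols = list(zip(*rows))


def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


# Zero-order correlations, computed once; variables are numbered 1..5.
R = {(i, j): corr(cols[i - 1], cols[j - 1])
     for i in range(1, 6) for j in range(1, 6)}


def partial(i, j, controls):
    """Partial correlation r_ij.controls via the usual recursion formula."""
    if not controls:
        return R[(i, j)]
    k, rest = controls[0], controls[1:]
    r_ij = partial(i, j, rest)
    r_ik = partial(i, k, rest)
    r_jk = partial(j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


predictions = {
    "r23.1": partial(2, 3, (1,)),
    "r14.23": partial(1, 4, (2, 3)),
    "r24.13": partial(2, 4, (1, 3)),
    "r15.234": partial(1, 5, (2, 3, 4)),
    "r35.124": partial(3, 5, (1, 2, 4)),
}
for name, value in predictions.items():
    print(name, round(value, 3))
```

With data generated from the assumed model, each partial should differ from zero only by sampling error, whereas a zero-order correlation such as r12 stays clearly nonzero.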

 

Thus, for each omitted arrow one may write out a specific ‘‘zero’’ prediction. Where arrows have been drawn in, it may have been possible to predict the signs of direct links, and these directional predictions may also be used to evaluate the model. Notice a very important property of recursive models. In relating any pair of variables, say, X2 and X3, one expects to control for antecedent or intervening variables, but it is not appropriate to introduce as controls any variables that appear as subsequent variables in the model (e.g., X4 or X5). The simple phrase ‘‘controlling for all relevant variables’’ should therefore not be construed to mean variables that are presumed to depend on both of the variables being studied. In an experimental setup, one would presumably be unable to carry out such an absurd operation, but in statistical calculations, which involve pencil-and-paper controlling only, there is nothing to prevent one from doing so.

It is unfortunately the case that controls for dependent variables can sometimes be made inadvertently through one’s research design (Blalock 1985). For example, one may select respondents from a list that is based on a dependent variable such as committing a particular crime, entering a given hospital, living in a certain residential area, or being employed in a particular factory. Whenever such improper controls are introduced, whether recognized explicitly or not, one’s inferences regarding relationships among causally prior variables are likely to be incorrect. If, for example, X1 and X2 are totally uncorrelated, but one controls for their common effect, X3, then even though r12 = 0, it will turn out that r12.3 ≠ 0.
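A short worked example makes the last point concrete. It uses the standard formula for a first-order partial correlation; the values r13 = r23 = 0.5 are hypothetical numbers chosen only for illustration, with r12 = 0 as in the text:

\[
r_{12\cdot 3} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}
= \frac{0 - (0.5)(0.5)}{\sqrt{(0.75)(0.75)}}
= -\frac{0.25}{0.75} = -\frac{1}{3}
\]

Under these assumed values, controlling for the common effect X3 produces a negative association between two variables that are unrelated in the full population.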

Recursive models also provide justifications for common-sense rules of thumb regarding the conditions under which it is not necessary to control for prior or intervening variables. In the model of Figure 1, for example, it can be shown that although r24.13 = 0, it would be sufficient to control for either X1 or X3, rather than both, in order for the partial to disappear. Similarly, in relating X3 to X5, the partial will be reduced to zero if one controls for either X2 and X4 or X1 and X4. It is not necessary to control for all three simultaneously. More generally, a number of simplifications become possible, depending on the patterning of omitted arrows, and these simplifications can be used to justify the omission of certain variables if these cannot be measured. If, for example, one could not measure X3, one could draw in a direct arrow from X1 to X4 without altering the remainder of the model. Without such an explicit causal model in front of us, however, the omission of variables must be justified on completely ad hoc grounds. The important point is that pragmatic reasons for such omissions should not be accepted without theoretical justifications.

Figure 2. Model of Figure 1, with Path Coefficients and Standardized Variables (diagram not reproduced: the arrows carry the path coefficients p21, p31, p43, p52, and p54, and each variable has its own disturbance term, u1 through u5)

PATH ANALYSIS AND AN EXAMPLE

Sewall Wright (1934, 1960) introduced a form of causal modeling long before it became fashionable among sociologists. Wright, a population geneticist, worked in terms of standardized variables with unit variances and zero means. Expressing any given equation in terms of what he referred to as path coefficients, which in recursive modeling are equivalent to beta weights, Wright was able to derive a simple formula for decomposing the correlation between any pair of variables xi and xj. The equation for any given variable can be written as xi = pi1x1 + pi2x2 + … + pikxk + ui, where the pij represent standardized regression coefficients and where the lower-case x’s refer to the standardized variables. One may then multiply both sides of the equation by xj, the variable that is to be correlated with xi. Therefore, xixj = pi1x1xj + pi2x2xj + … + pikxkxj + uixj. Summing over all cases and dividing by the number of cases N, one has the results in equation system 3.

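\[
\begin{aligned}
r_{ij} = \frac{\sum x_i x_j}{N}
&= p_{i1}\frac{\sum x_1 x_j}{N} + p_{i2}\frac{\sum x_2 x_j}{N} + \cdots + p_{ik}\frac{\sum x_k x_j}{N} + \frac{\sum u_i x_j}{N} \\
&= p_{i1}r_{1j} + p_{i2}r_{2j} + \cdots + p_{ik}r_{kj} + 0 = \sum_k p_{ik}r_{kj}
\end{aligned}
\tag{3}
\]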

 

The expression in equation system 3 enables one to decompose or partition any total correlation into a sum of terms, each of which consists of a path coefficient multiplied by a correlation coefficient, which itself may be decomposed in a similar way. In Wright’s notation the path coefficients are written without any dots that indicate control variables but are indeed merely the (partial) regression coefficients for the standardized variables. Any given path coefficient, say p54, can be interpreted as the change that would be imparted in the dependent variable x5, in its standard deviation units, if the other variable x4 were to change by one of its standard deviation units, with the remaining explicitly included independent variables (here x1, x2, and x3) all held constant. In working with standardized variables one is able to simplify these expressions owing to the fact that rij = Σxixj/N, but one must pay the price of then having to work with standard deviation units that may vary across samples or populations. This, in turn, means that two sets of path coefficients for different samples, say men and women, cannot easily be compared since the standard deviations (say, in income earned) may be different.

Figure 3. Path Diagram for Blau-Duncan Model (diagram not reproduced: X1, father’s education; X2, father’s occupation; X3, respondent’s education; X4, first job; X5, occupation in 1962; numerical values shown in the diagram: 0.859, 0.310, 0.753, 0.394, 0.516, 0.279, 0.440, 0.115, 0.281, 0.224, 0.818)

In the case of the model of Figure 2, which is the same causal diagram as Figure 1, but with the relevant pij inserted, one may write out expressions for each of the rij as shown in equation system 4.

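\[
\begin{aligned}
r_{12} &= p_{21}r_{11} = p_{21}(1) = p_{21} \\
r_{13} &= p_{31}r_{11} = p_{31} \\
r_{23} &= p_{31}r_{12} = p_{31}p_{21} = r_{13}r_{12} \quad (\text{or } r_{23\cdot 1} = 0) \\
r_{14} &= p_{43}r_{13} = p_{43}p_{31} \\
r_{24} &= p_{43}r_{23} = p_{43}p_{31}p_{21} \\
r_{34} &= p_{43}r_{33} = p_{43} \\
r_{15} &= p_{52}r_{21} + p_{54}r_{41} = p_{52}p_{21} + p_{54}p_{43}p_{31} \\
r_{25} &= p_{52}r_{22} + p_{54}r_{42} = p_{52} + p_{54}p_{43}p_{31}p_{21} \\
r_{35} &= p_{52}r_{23} + p_{54}r_{43} = p_{52}p_{31}p_{21} + p_{54}p_{43} \\
r_{45} &= p_{52}r_{24} + p_{54}r_{44} = p_{52}p_{43}p_{31}p_{21} + p_{54}
\end{aligned}
\tag{4}
\]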
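The decomposition rule of equation system 3 can be applied mechanically, as the following sketch suggests. The numerical path coefficients are hypothetical values assigned to the arrows of Figure 2 for illustration; the dictionary layout and variable names are likewise assumptions of this sketch. Working through the variables in causal order, each implied correlation is built from correlations that have already been computed.

```python
# Hypothetical standardized path coefficients for the arrows of Figure 2,
# keyed as (dependent variable, source variable).
paths = {(2, 1): 0.5, (3, 1): 0.4, (4, 3): 0.6, (5, 2): 0.3, (5, 4): 0.2}
K = 5

# r[(i, j)] holds the implied correlation; each variable correlates 1.0 with itself.
r = {(i, i): 1.0 for i in range(1, K + 1)}
for i in range(2, K + 1):        # dependent variable, taken in causal order
    for j in range(1, i):        # any causally prior variable
        # Equation system 3: r_ij = sum over k of p_ik * r_kj, for arrows k -> i.
        value = sum(p * r[(k, j)] for (d, k), p in paths.items() if d == i)
        r[(i, j)] = r[(j, i)] = value

p21, p31 = paths[(2, 1)], paths[(3, 1)]
p43, p52, p54 = paths[(4, 3)], paths[(5, 2)], paths[(5, 4)]

print(round(r[(2, 5)], 3))   # 0.324, which equals p52 + p54*p43*p31*p21
print(round(r[(3, 5)], 3))   # 0.18, which equals p52*p31*p21 + p54*p43
```

The printed values agree with the closed-form expressions for r25 and r35 in equation system 4, which is a useful check when such expressions are written out by hand.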

In decomposing each of the total correlations, one takes the path coefficients for each of the arrows coming into the appropriate dependent variable and multiplies each of these by the total correlation between the variable at the source of the arrow and the ‘‘independent’’ variable in which one is interested. In the case of r12, this involves multiplying p21 by the correlation of x1 with itself, namely r11 = 1.0. Therefore one obtains the simple result that r12 = p21. Similar results obtain for r13 and r34. The decomposition of r23, however, results in the expression r23 = p31r12 = p31p21 = r12r13, which also of course implies that r23.1 = 0.
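The final implication can be spelled out with the standard formula for a first-order partial correlation, substituting the decomposition r23 = r12r13 (this intermediate step is added here for illustration and assumes r12 and r13 are both less than 1 in absolute value):

\[
r_{23\cdot 1} = \frac{r_{23} - r_{12}r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}}
= \frac{r_{12}r_{13} - r_{12}r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}} = 0
\]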

When one comes to the decomposition of correlations with x5, which has two direct paths into it, the expressions become more complex but also demonstrate the heuristic value of path analysis. For example, in the case of r35, this total correlation can be decomposed into two terms, one representing the indirect effects of x3 via the intervening variable x4, namely the product p54p43, and the other the spurious association produced by the common cause x1, namely the more complex product p52p31p21. In the case of the correlation between x4 and x5 one obtains a similar result except that there is a direct effect term represented by the single coefficient p54.

As a numerical substantive example consider the path model of Figure 3, which represents the basic model in Blau and Duncan’s classic study, The American Occupational Structure (1967, p. 17).

Two additional features of the Blau-Duncan model may be noted. A curved, double-headed arrow has been drawn between father’s education and father’s occupation, indicating that the causal paths between these two exogenous or independent variables have not been specified. This means that there is no p21 in the model, so that r12 cannot be decomposed. Its value of 0.516 has been inserted into the diagram, however. The implication of a failure to commit oneself on the direction of causation between these two variables is that decompositions of subsequent rij will involve expressions that are sometimes combinations of the relevant p’s and the unexplained association between father’s education and occupation. This, in turn, means that the indirect effects of one of these variables ‘‘through’’ the other cannot be assessed.

One can determine the direct effects of, say, father’s occupation on respondent’s education, or its indirect effects on occupation in 1962 through first job, but not ‘‘through’’ father’s education. If one had, instead, committed oneself to the directional flow from father’s education to father’s occupation, a not unreasonable assumption, then all indirect effects and spurious connections could be evaluated. Sometimes it is indeed necessary to make use of double-headed arrows when the direction of causation among the most causally prior variables cannot be specified, but one then gives up the ability to trace out those indirect effects or spurious associations that involve these unexplained correlations.

The second feature of the Blau-Duncan diagram worth noting involves the small, unattached arrows coming into each of the ‘‘dependent’’ variables in the model. These of course represent the disturbance terms, which in a correctly specified model are taken to be uncorrelated. But the magnitudes of these effects of outside variables are also provided in the diagram to indicate just how much variance remains unexplained by the model. Each of the numerical values of path coefficients coming in from these outside variables, when squared, turns out to be the equivalent of 1 − R², or the variances that remain unexplained by all of the included explanatory variables. Thus there is considerable unexplained variance in respondent’s education (0.738), first job (0.669), and occupation in 1962 (0.567), indicating, of course, plenty of room for other factors to operate. The challenge then becomes that of locating additional variables to improve the explanatory value of the model. This has, indeed, been an important stimulus to the development of the status attainment literature that the Blau-Duncan study subsequently spawned.
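The arithmetic behind these unexplained variances is easy to reproduce. In the sketch below, the residual path values 0.859, 0.818, and 0.753 are read from Figure 3; pairing each value with a particular variable is an inference from the unexplained variances quoted in the text, not something stated separately in the article.

```python
# Residual path coefficients from Figure 3 (pairing with variables inferred
# from the unexplained variances quoted in the text).
residual_paths = {
    "respondent's education": 0.859,
    "first job": 0.818,
    "occupation in 1962": 0.753,
}

unexplained = {name: path ** 2 for name, path in residual_paths.items()}
for name, share in unexplained.items():
    # The squared residual path is 1 - R^2; the remainder is R^2.
    print(f"{name}: unexplained = {share:.3f}, R^2 = {1 - share:.3f}")
```

Squaring each residual path gives 0.738, 0.669, and 0.567, matching the figures cited above, so the corresponding R² values are roughly 0.262, 0.331, and 0.433.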

The placement of numerical values in such path diagrams enables the reader to assess, rather easily, the relative magnitudes of the several direct effects. Thus, father’s education is inferred to have a moderately strong direct effect on respondent’s education, but none on the respondent’s occupational status. Father’s occupation is estimated to have somewhat weaker direct effects on both respondent’s education and first job but a much weaker direct effect on his later occupation. The direct effects of respondent’s education on first job are estimated to be only somewhat stronger than those on the subsequent occupation, with first job controlled. In evaluating these numerical values, however, one must keep in mind that all variables have been expressed in standard deviation units rather than some ‘‘natural’’ unit such as years of schooling. This in turn means that if variances for, say, men and women or blacks and whites are not the same, then comparisons across samples should be made in terms of unstandardized, rather than standardized, coefficients.
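A small numerical sketch shows why standardized coefficients can mislead in such comparisons. All numbers are hypothetical: two groups share the same unstandardized slope B and the same residual standard deviation, and differ only in the spread of the independent variable. The conversion used is the usual one for a single-regressor equation, p = b · sd(x) / sd(y).

```python
import math

B = 2.0          # hypothetical unstandardized slope, identical in both groups
RESID_SD = 3.0   # hypothetical residual standard deviation, identical in both groups
sd_x_by_group = {"group A": 1.0, "group B": 2.0}  # only the spread of x differs


def standardized(b, sd_x, sd_y):
    """Convert an unstandardized slope into a standardized (path) coefficient."""
    return b * sd_x / sd_y


path_by_group = {}
for name, sd_x in sd_x_by_group.items():
    # Variance of y = explained part (b * sd_x)^2 plus residual variance.
    sd_y = math.sqrt((B * sd_x) ** 2 + RESID_SD ** 2)
    path_by_group[name] = standardized(B, sd_x, sd_y)
    print(name, "unstandardized:", B, "standardized:", round(path_by_group[name], 3))
```

Although the structural relationship is identical by construction, the standardized coefficient is larger in the group with more variation in x, which is the reason for preferring unstandardized coefficients in cross-sample comparisons.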

SIMULTANEOUS EQUATION MODELS

Recursive modeling requires one to make rather strong assumptions about temporal sequences. This does not, in itself, rule out the possibility of reciprocal causation provided that lag periods can be specified. For example, the actions of party A may affect the later behaviors of party B, which in turn affect still later reactions of the first party. Ideally, if one could watch a dynamic interaction process such as that among family members, and accurately record the temporal sequences, one could specify a recursive model in which the behaviors of the same individual could be represented by distinct variables that have been temporally ordered. Indeed Strotz and Wold (1960) have cogently argued that many simultaneous equation models appearing in the econometric literature have been misspecified precisely because they do not capture such dynamic features, which in causal models should ideally involve specified lag periods. For example, prices and quantities of goods do not simply ‘‘seek equilibrium.’’ Instead, there are at least three kinds of autonomous actors (producers, customers, and retailers or wholesalers) who react to one another’s behaviors with varying lag periods.


CAUSAL INFERENCE MODELS

In many instances, however, one cannot collect the kinds of data necessary to ascertain these lag periods. Furthermore, especially in the case of aggregated data, the lag periods for different actors may not coincide, so that macro-level changes are for all practical purposes continuous rather than discrete. Population size, literacy levels, urbanization, industrialization, political alienation, and so forth are all changing at once. How can such situations be modeled and what additional complications do they introduce?

In the general case there will be k mutually interdependent variables Xi, each of which may directly affect the others. These are referred to as endogenous variables, with the entire set having the property that there is no single dependent variable that does not feed back to affect at least one of the others. Given this situation, it turns out that it is not legitimate to break the equations apart in order to estimate the parameters one equation at a time, as one does in the case of a recursive setup. Since any given variable may affect the others, its omitted causes, represented by the disturbance terms εi, will also directly or indirectly affect the remaining endogenous variables. It therefore becomes totally unreasonable to assume these disturbances to be uncorrelated with the “independent” variables in their respective equations. Thus, one of the critical assumptions required to justify the use of ordinary least squares cannot legitimately be made, meaning that a wide variety of single equation techniques discussed in the statistical literature must be modified.
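The failure of ordinary least squares under reciprocal causation can be seen in a simple simulation. This is a sketch under assumed values: two endogenous variables affect each other with hypothetical coefficients of 0.5, and the disturbances are independent with unit variance. The data are generated by solving the two equations simultaneously.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
b12, b21 = 0.5, 0.5  # hypothetical structural coefficients

e1 = rng.normal(size=n)
e2 = rng.normal(size=n)

# Structural equations: x1 = b12*x2 + e1 and x2 = b21*x1 + e2.
# Solving them jointly gives each x as a function of both disturbances.
det = 1.0 - b12 * b21
x1 = (e1 + b12 * e2) / det
x2 = (b21 * e1 + e2) / det

# Single-equation least squares of x1 on x2 (no intercept; all means are zero).
ols_b12 = float(x2 @ x1 / (x2 @ x2))

# The regressor x2 is correlated with the disturbance of the x1 equation.
corr_x2_e1 = float(np.corrcoef(x2, e1)[0, 1])

print(f"true b12 = {b12}, least-squares estimate = {ols_b12:.3f}")
print(f"correlation between x2 and e1 = {corr_x2_e1:.3f}")
```

With these values the estimate settles well above the true coefficient no matter how large the sample, because the feedback makes x2 depend on e1.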

There is an even more serious problem, however, which can be seen more readily if one writes out the set of equations, one for each of the k endogenous variables. To this set is added another set of what are called predetermined variables, Zj, that will play an essential role to be discussed below. Our equation set now becomes as shown in equation system 5.

\[
\begin{aligned}
X_1 &= \beta_{12}X_2 + \beta_{13}X_3 + \cdots + \beta_{1k}X_k + \gamma_{11}Z_1 + \gamma_{12}Z_2 + \cdots + \gamma_{1m}Z_m + \varepsilon_1 \\
X_2 &= \beta_{21}X_1 + \beta_{23}X_3 + \cdots + \beta_{2k}X_k + \gamma_{21}Z_1 + \gamma_{22}Z_2 + \cdots + \gamma_{2m}Z_m + \varepsilon_2 \\
X_3 &= \beta_{31}X_1 + \beta_{32}X_2 + \cdots + \beta_{3k}X_k + \gamma_{31}Z_1 + \gamma_{32}Z_2 + \cdots + \gamma_{3m}Z_m + \varepsilon_3 \\
&\;\;\vdots \\
X_k &= \beta_{k1}X_1 + \beta_{k2}X_2 + \cdots + \beta_{k,k-1}X_{k-1} + \gamma_{k1}Z_1 + \gamma_{k2}Z_2 + \cdots + \gamma_{km}Z_m + \varepsilon_k
\end{aligned}
\tag{5}
\]

The regression coefficients (called “structural parameters”) that connect the several endogenous variables in equation system 5 are designated as βij and are distinguished from the γij representing the direct effects of the predetermined Zj on the relevant Xi. This notational distinction is made because the two kinds of variables play different roles in the model. Although it cannot be assumed that the disturbances εi are uncorrelated with the endogenous X’s that appear on the right-hand sides of their respective equations, one may make the somewhat less restrictive assumption that these disturbances are uncorrelated with the predetermined Z’s.
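Equation system 5 can be restated compactly in matrix form, which makes it easier to see why the disturbances cannot be uncorrelated with the endogenous variables. The matrix symbols B and Γ below are introduced only for this restatement and do not appear in the original notation; the derivation assumes that I − B is invertible.

```latex
% X: k-vector of endogenous variables, Z: m-vector of predetermined variables,
% B: k x k matrix of the beta_ij with zeros on the main diagonal,
% Gamma: k x m matrix of the gamma_ij, epsilon: k-vector of disturbances.
\[
X = BX + \Gamma Z + \varepsilon
\quad\Longrightarrow\quad
(I - B)X = \Gamma Z + \varepsilon
\quad\Longrightarrow\quad
X = (I - B)^{-1}\Gamma Z + (I - B)^{-1}\varepsilon .
\]
```

In the solved form each Xi is, in general, a weighted sum of all of the disturbances, so an Xj appearing on the right-hand side of the equation for Xi will ordinarily be correlated with εi. The Z's enter only as given quantities, which is why the weaker assumption about them can still be maintained.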

Some Z’s may be truly exogenous, or distinct independent variables, that are assumed not to be affected by any of the endogenous variables in the model. Others, however, may be lagged endogenous variables, or prior levels of some of the X’s. In a sense, the defining characteristic of these predetermined variables is that they be uncorrelated with any of the omitted causes of the endogenous variables. Such an assumption may be difficult to accept in the case of lagged endogenous variables, given the likelihood of autocorrelated disturbances, but we shall not consider this complication further. The basic assumption regarding the truly exogenous variables, however, is that these are uncorrelated with all omitted causes of the X’s, though they may of course be correlated with the X’s and also possibly each other.
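One standard way of exploiting the assumption that the Z's are uncorrelated with the disturbances is two-stage least squares. The text above does not name this technique, so the following should be read as an illustrative sketch rather than as the procedure the article has in mind. It assumes a hypothetical two-equation model in which Z2 is omitted from the equation for X1, with all coefficient values chosen arbitrarily.

```python
import numpy as np

def ols(design, y):
    """Least-squares coefficients of y on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(3)
n = 100_000
b12, b21, g11, g22 = 0.5, 0.5, 1.0, 1.0  # hypothetical structural values

z1, z2 = rng.normal(size=n), rng.normal(size=n)
e1, e2 = rng.normal(size=n), rng.normal(size=n)

# Structural model: x1 = b12*x2 + g11*z1 + e1 ; x2 = b21*x1 + g22*z2 + e2.
det = 1.0 - b12 * b21
x1 = (g11 * z1 + e1 + b12 * (g22 * z2 + e2)) / det
x2 = (b21 * (g11 * z1 + e1) + g22 * z2 + e2) / det

# Stage 1: predict the endogenous regressor from the predetermined variables.
Z = np.column_stack([z1, z2])
x2_hat = Z @ ols(Z, x2)

# Stage 2: replace x2 by its prediction in the equation for x1.
tsls = ols(np.column_stack([x2_hat, z1]), x1)

# Naive single-equation least squares, for comparison.
naive = ols(np.column_stack([x2, z1]), x1)

print(f"true b12 = {b12}")
print(f"two-stage estimate = {tsls[0]:.3f}, naive estimate = {naive[0]:.3f}")
```

Intercepts are omitted because every simulated variable has mean zero. The two-stage estimate recovers the assumed coefficient only because Z2 is excluded from the first equation and is uncorrelated with e1 by construction.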

Clearly, there are more unknown parameters than was the case for the original recursive equation system (1). Turning attention back to the simple recursive system represented in equation system 1, one sees that the matrix of betas in that equation system is triangular, with all such coefficients above the main diagonal being set equal to zero on a priori grounds. That is, in equation system 1, half of the possible betas have been set equal to zero, the remainder being estimated using ordinary least squares. It turns out that in the more general equation system 5, there will be too many unknowns unless additional restrictive assumptions are made. In particular, in each of the k equations one will have to assume a priori that at least k − 1 coefficients are equal to zero or to some other known value (one that is not estimated from the data). This is why one needs the predetermined Zj and the relevant gammas. If one is willing to assume that, for any given endogenous Xi, certain direct arrows are missing, meaning that there are no direct effects coming from the relevant Xj or Z variable, then one may indeed estimate the remaining parameters. One does not have to make the very restrictive assumptions required under the recursive setup, namely that if Xj affects Xi, then the reverse cannot hold. As long as one assumes that some of the coefficients are zero, there is a chance of being able to identify or estimate the others.

Figure 4. Nonrecursive Modification of Figure 1
[Path diagram: predetermined variables Z1 (parental education) and Z2 (neighborhood delinquency); endogenous variables X1 (guilt), X2 (self-esteem), and X3 (delinquent acts); paths labeled γ11, γ32, β12, β21, β13, and β32.]

It turns out that the necessary condition for identification can be easily specified, as implied in the above discussion. For any given equation, one must leave out at least k − 1 of the remaining variables. The necessary and sufficient condition is far more complicated to state. In many instances, when the necessary condition has been met, the sufficient one will be met as well, unless some of the equations contain exactly the same sets of variables (i.e., exactly the same combination of omitted variables). But since this will not always be the case, the reader should consult textbooks in econometrics for more complete treatments.
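The necessary condition lends itself to a simple counting check. The sketch below assumes a hypothetical representation in which each equation is described by the set of variables appearing on its right-hand side; the function name and the two example models are illustrative only and are not the model of Figure 4. It tests the necessary condition alone and says nothing about sufficiency.

```python
def order_condition(equations, endogenous, predetermined):
    """Check the necessary (counting) condition for each equation.

    equations maps each endogenous variable to the set of variables
    on the right-hand side of its equation. Returns, for each equation,
    the number of omitted variables and whether that number is at
    least k - 1, where k is the number of endogenous variables.
    """
    k = len(endogenous)
    all_vars = set(endogenous) | set(predetermined)
    report = {}
    for lhs in endogenous:
        omitted = all_vars - set(equations[lhs]) - {lhs}
        report[lhs] = (len(omitted), len(omitted) >= k - 1)
    return report

endog = ["X1", "X2"]
predet = ["Z1", "Z2"]

# Hypothetical model in which each equation omits one predetermined variable.
model_ok = {"X1": {"X2", "Z1"}, "X2": {"X1", "Z2"}}

# Hypothetical variant in which the X1 equation omits nothing.
model_bad = {"X1": {"X2", "Z1", "Z2"}, "X2": {"X1", "Z2"}}

result_ok = order_condition(model_ok, endog, predet)
result_bad = order_condition(model_bad, endog, predet)
print(result_ok)
print(result_bad)
```

In the second model the equation for X1 leaves out none of the remaining variables, so it fails the condition even though the equation for X2 still passes.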

Returning to the substantive example of delinquency, as represented in Figure 1, one may revise the model somewhat by allowing for a feedback from delinquent behavior to guilt, as well as a reciprocal relationship between the two internal states, guilt and self-esteem. One may also relabel parental education as Z1 and neighborhood delinquency as Z2 because there is no feedback from any of the three endogenous variables to either of these predetermined ones. Renumbering the endogenous variables as X1, X2, and X3, one may represent the revised model as in Figure 4.

In this kind of application one may question whether a behavior can ever influence an internal state. Keeping in mind, however, that the concern is with repeated acts of delinquency, it is entirely reasonable to assume that earlier acts feed back to affect subsequent guilt levels, which in turn affect future acts of delinquency. It is precisely this very frequent type of causal process that is ignored whenever behaviors are taken, rather simply, as “dependent” variables.
