# A Running SEM and CFA Example

Jeremy Albright


## Industrialization and Democracy (Bollen, 1989)

The CFA and SEM examples are taken from data analyzed in Kenneth Bollen’s (1989) book *Structural Equations with Latent Variables*. The goal of the model is to determine how democracy and industrialization in 1960 are associated with democracy in 1965. The problem is that no single observed item captures either democracy or industrialization in full. Hence, these concepts are treated as latent (unobserved) variables, each imperfectly measured by a set of observed indicators.

The data consist of 11 variables related to democracy and industrialization for 75 countries. Specifically, the variables are the following:

• y1: Freedom of the Press (1960)
• y2: Freedom of Political Opposition (1960)
• y3: Fairness of Elections (1960)
• y4: Effectiveness of Elected Legislature (1960)
• y5: Freedom of the Press (1965)
• y6: Freedom of Political Opposition (1965)
• y7: Fairness of Elections (1965)
• y8: Effectiveness of Elected Legislature (1965)
• x1: GNP per capita (1960)
• x2: Energy Consumption per Capita (1960)
• x3: Percentage of Labor Force in Industry (1960)

The first step will be to test Bollen’s measurement model for democracy. One latent variable is from 1960, the other from 1965. The model looks like the following:

The circles represent the latent variables ($$\xi_1$$ = Democracy in 1960, $$\xi_2$$ = Democracy in 1965), and the squares represent the observed variables. The one-headed arrows indicate the implied direction of causality, in which the true level of democracy influences the observed indicators. The two-headed arrows represent correlations.

Note that the subscripts on the $$\lambda$$ are the same for both latent variables. This represents the constraint that the 1965 loadings equal the 1960 loadings. In addition, each observed variable in 1960 has an error term that is correlated with the error term of its corresponding 1965 variable, reflecting the fact that idiosyncrasies in a measure at one time point will likely be present at the later time point. Finally, the errors of $$Y_2$$ and $$Y_4$$, as well as those of $$Y_6$$ and $$Y_8$$, are allowed to be correlated.
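To make the constraints concrete, here is a minimal Python sketch of the covariance matrix that this measurement model implies, $$\Sigma = \Lambda \Phi \Lambda' + \Theta$$. All numeric values are hypothetical placeholders, not estimates from Bollen’s data. Fixing the first loading to 1 to set the scale of each latent variable is an assumption (a common identification choice), and the names `lam`, `Lambda`, `Phi`, and `Theta` are chosen here purely for illustration.

```python
import numpy as np

# Hypothetical loadings lambda_1..lambda_4 (illustration only, not estimates).
# lambda_1 is fixed to 1 to set the latent scale (an assumed identification choice).
lam = np.array([1.0, 1.2, 1.1, 1.3])

# Loading matrix: rows are y1..y8, columns are (Democracy 1960, Democracy 1965).
# Reusing `lam` in both columns encodes the equal-loadings constraint.
Lambda = np.zeros((8, 2))
Lambda[:4, 0] = lam   # y1..y4 load on Democracy 1960
Lambda[4:, 1] = lam   # y5..y8 load on Democracy 1965

# Hypothetical latent variances (diagonal) and covariance (off-diagonal).
Phi = np.array([[4.0, 3.5],
                [3.5, 4.0]])

# Error covariance matrix: hypothetical error variances on the diagonal.
Theta = np.diag(np.full(8, 2.0))

# Correlated errors (0-based indices): each 1960 item with its 1965 counterpart,
# plus y2-y4 and y6-y8.
error_pairs = [(0, 4), (1, 5), (2, 6), (3, 7), (1, 3), (5, 7)]
for i, j in error_pairs:
    Theta[i, j] = Theta[j, i] = 0.5  # hypothetical error covariance

# Model-implied covariance matrix of y1..y8.
Sigma = Lambda @ Phi @ Lambda.T + Theta

# Parameter bookkeeping under the assumptions above.
n_free = (len(lam) - 1) + 3 + 8 + len(error_pairs)  # loadings, Phi, error variances, error covariances
n_moments = 8 * 9 // 2                               # unique variances and covariances of 8 items
df = n_moments - n_free
```

Under these assumptions the model has 20 free parameters against 36 unique variances and covariances, so `df` evaluates to 16. Estimation would then search for the parameter values that bring `Sigma` as close as possible to the sample covariance matrix.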

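This measurement model is then included in a full SEM as follows:

Here, $$\eta_2$$ is the latent variable for democracy in 1965, which is modeled as a function of industrialization in 1960 ($$\xi_1$$) and democracy in 1960 ($$\eta_1$$). Note that the notation changes from the measurement model: because democracy is now an outcome, both democracy variables are written as $$\eta$$, while $$\xi_1$$ denotes industrialization. Industrialization in 1960 is also a latent concept, measured by the three observed variables x1 through x3.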
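The structural part of this model can be sketched in equation form. The coefficient symbols $$\gamma$$ and $$\beta$$ and the disturbance terms $$\zeta$$ follow conventional LISREL-style notation and are assumptions here, since the text above does not name them. The first equation, which regresses democracy in 1960 on industrialization, is likewise an assumption based on $$\eta_1$$ being written as an endogenous variable.

$$
\begin{aligned}
\eta_1 &= \gamma_{11}\,\xi_1 + \zeta_1 \\
\eta_2 &= \beta_{21}\,\eta_1 + \gamma_{21}\,\xi_1 + \zeta_2
\end{aligned}
$$

In this sketch, $$\beta_{21}$$ captures the association of democracy in 1960 with democracy in 1965, and $$\gamma_{21}$$ captures the association of industrialization in 1960 with democracy in 1965, holding democracy in 1960 constant.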