Next: References Up: MPR-online 1998Vol.3, No.2 Previous: Data Example

Discussion

It was the goal of this article to show that in CFA applications the selection of an appropriate base model is crucial for proper interpretation of results. It was shown that (1) different base models can lead to the detection of different patterns of types and antitypes, and (2) the sampling scheme that underlies a research design determines, in part, what base model can be considered. More specifically, the distinction of the multinomial and product multinomial sampling schemes was used to show that

  1. Whenever a product multinomial sampling scheme is used, the uni- and multivariate marginals that are fixed must be considered. Therefore, zero-order CFA, which compares the observed cell frequencies to an expected uniform distribution, can be employed in a product multinomial sampling design only if the design is balanced in the product multinomial variables. This applies accordingly to first and higher order CFA.
  2. When a multinomial sampling scheme is used, the effects considered in the base model depend only on the substantive hypotheses. There are no constraints that the sampling scheme poses on effects.
  3. When a mixed sampling scheme is used, the above arguments apply to each variable according to the sampling scheme under which it was observed.
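The first point can be illustrated with a minimal sketch. It assumes a two-way table whose rows belong to a variable fixed by design; the function names and the example frequencies are hypothetical and serve only as illustration. The sketch checks whether the uniform expected frequencies of zero-order CFA reproduce the fixed row totals, which is the case only for a balanced design.

```python
def zero_order_expected(table):
    """Expected frequencies under a uniform distribution (zero-order CFA)."""
    n = sum(sum(row) for row in table)
    cells = sum(len(row) for row in table)
    return [[n / cells for _ in row] for row in table]


def row_margins(table):
    """Row totals, here the margins assumed to be fixed by design."""
    return [sum(row) for row in table]


def reproduces_fixed_rows(table, tol=1e-9):
    """True if the zero-order expected table has the same row totals as the data."""
    expected = zero_order_expected(table)
    return all(
        abs(obs - exp) <= tol
        for obs, exp in zip(row_margins(table), row_margins(expected))
    )


# Hypothetical frequencies; rows are the levels of the fixed variable.
unbalanced = [[10, 20], [45, 25]]  # row totals fixed at 30 and 70
balanced = [[10, 40], [35, 15]]    # row totals fixed at 50 and 50

print(reproduces_fixed_rows(unbalanced))
print(reproduces_fixed_rows(balanced))
```

In the unbalanced table the uniform base model implies row totals of 50 and 50 and thus contradicts the design totals of 30 and 70, so deviations from this base model would partly reflect the design rather than the data.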

We showed that PCFA can be applied in particular to data that were collected under sampling schemes where the predictors are fixed. Under the assumption that the predictor effects can manifest in (1) criteria main effects, (2) interactions among the criteria, and (3) predictor-criterion interactions, a problem was solved that was typical of the often unreflected standard application of CFA models (Krauth, 1996), that is, the problem that the base models of ISA and PCFA are the same although interpretation is very different at the substantive level. The present approach leads to base models that are different for ISA and PCFA. As was shown using the suicide data example, there is only one instance where ISA and PCFA base models can be the same.

More specifically, the distinction between PCFA and ISA disappears only if a base model is saturated in both the predictors and the criteria. Base models that (1) meet the three criteria put forth in Section 2 and (2) are more parsimonious on the criterion side are PCFA models and thus differ from ISA models. The distinction between base models for ISA and PCFA has important consequences. Two consequences will be discussed here. First, researchers now can statistically test assumptions concerning the causal or predictive relationships between variable groups. The types and antitypes identified by PCFA can dramatically differ depending on whether the standard ISA model or some custom PCFA model was specified. This was illustrated in Section 3 of this article.

It may be important to realize that, from a log-linear modeling perspective, it is trivial to note that residuals are confounded with models. In the context of CFA, however, the selection of base models reflects two important considerations, both of which are crucial for interpretability and meaningfulness of identified types and antitypes. The first consideration is the aim of the study. Depending on aim, researchers select a CFA model, for example, PCFA instead of ISA. The second consideration concerns the status of variables as fixed or random, as predictors or criteria etc. The present article presented rules that can be used to specify a particular CFA base model. Thus, emerging types and antitypes are not just confounded with a CFA model, but with a model that reflects some scientific theory and thinking. From this perspective, the present article removes some of the arbitrariness from the process of selecting a CFA base model.

A second consequence is that researchers have to match PCFA base model and substantive hypotheses. It is not appropriate to simply use the base model that is saturated on both the predictor and the criterion sides. One consequence of selecting an unnecessarily complex model on the criterion side is that types and antitypes can remain undetected and thus, justice is not done to the predictive or causal characteristics of the variables on the predictor side.

The flexibility of base models for PCFA discussed here applies also to other CFA models that distinguish groups of variables. One such model is two- or more-sample CFA. This method compares two or more pre-existing groups of cases using a number of discriminant variables. Differences manifest in the form of discriminant types (and antitypes, in comparisons of three or more groups). In many applications, the pre-existing groups are fixed in number, that is, by design. Therefore, discriminant CFA can also fruitfully be viewed as an approach for which mixed-sampling base models are most appropriate. That is, the uni- and multivariate marginals of the grouping variables must be reproduced by design. In contrast, the uni- and multivariate marginals of the discriminant variables must be reproduced only if required by some substantive hypothesis.

It is important to realize that the first criterion of selecting base models is fulfilled in mixed-sampling PCFA also. The first criterion states that CFA base models must leave only one option for deviations open. In mixed-sampling PCFA this option involves effects of the dependent variables. Consider the case where only predictor effects are part of the base model (cf. von Eye, 1985). This base model can be contradicted and thus lead to types and antitypes only if there are effects on the side of the dependent variables. Such effects include main effects, interactions among the dependent variables, and interactions among the dependent and the independent variables. This applies accordingly to fixed-effect PCFA.

The base models that can be considered for PCFA range from a no-effect base model on the criterion side, also called the zero-order CFA base model, to a saturated model on the criterion side (see Table 3). These two base models can be viewed as the poles of a continuum of complexity. The base model that is saturated in the criteria is the least parsimonious and requires special justification. The same applies to the most parsimonious model, that is, the model of no effects, or zero-order CFA, on the criterion side (see Equation 5). A justification must (1) be grounded in substantive considerations and (2) conform to the three criteria listed in Section 2 of this article.
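The two poles of this continuum can be sketched numerically. The following is a minimal illustration, not the article's own computation. It assumes a single fixed predictor (rows) and criterion patterns flattened into columns, and it assumes that both base models reproduce the predictor totals and contain no predictor-criterion interaction. All function names and frequencies are hypothetical.

```python
from math import sqrt


def expected_criteria_saturated(table):
    """Base model saturated on the criterion side: e = row total * column total / N."""
    n = sum(map(sum, table))
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_tot] for r in row_tot]


def expected_criteria_zero_order(table):
    """Base model with no criterion effects: each row total spread evenly over its cells."""
    k = len(table[0])
    return [[sum(row) / k for _ in row] for row in table]


def standardized_residuals(table, expected):
    """Cell-wise (observed - expected) / sqrt(expected)."""
    return [
        [(o - e) / sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]


# Hypothetical data: 2 predictor levels (rows) by 4 criterion patterns (columns).
observed = [[30, 10, 10, 10], [10, 10, 10, 10]]

res_saturated = standardized_residuals(observed, expected_criteria_saturated(observed))
res_zero_order = standardized_residuals(observed, expected_criteria_zero_order(observed))

print(round(res_saturated[0][0], 3), round(res_zero_order[0][0], 3))
```

For the first cell, the deviation from the criterion-saturated base model is smaller than the deviation from the zero-order base model, because the former already absorbs the criterion main effects and interactions. This is one way to see why an unnecessarily complex criterion side can leave types and antitypes undetected.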

Future research needs to explore the consequences of considering sampling schemes in greater (1) depth and (2) breadth. First, one can ask whether sampling schemes other than the ones considered here also have consequences for the specification of CFA base models. Examples of such sampling schemes include but are not limited to hypergeometric sampling, stratified sampling, cluster sampling, and complex sampling, which includes both stratified and cluster sampling. Second, one can ask whether the existing classification of CFA base models into global and regional models can be fruitfully expanded to also accommodate the sampling schemes and their consequences. In addition, one needs to determine the consequences of the present discussion for the recently introduced Bayesian CFA models (Wood et al., 1994; Gutiérrez-Peña & von Eye, in preparation).

From a more general research strategy perspective, this new criterion of considering sampling schemes (cf. von Eye & Schuster, in preparation) sheds new light on the characteristics of CFA as an exploratory approach, and on exploratory research in general. CFA, while in principle usable in explanatory research, is typically applied in exploratory contexts. Most researchers consider CFA a method that is largely free of assumptions that need to be made for proper application. However, even exploratory methods require specific conditions to be met. The present paper suggests that exploratory application of CFA cannot regress to blind application of base models that are unrelated to the substantive conceptions of the status of variables as predictors and criteria. The sampling and design characteristics of the variables must be taken into account. The data example in Section 4 illustrates how far results can be from being meaningfully interpretable if these data characteristics are ignored.

In the development of CFA as a statistical method, "borrowing from log-linear modeling" was routine. This practice went so far that defensively formulated articles were deemed in order, articles that defended CFA as a method that allows one to answer questions that cannot be answered using log-linear modeling (see, for example, Lehmacher, 1984, or the debate between Langeheine, 1980, and Krauth, 1980). Recently, the picture has changed. CFA has become more and more independent of log-linear modeling, and variants of CFA have been proposed that are quite different from standard log-linear residual analysis (Gutiérrez-Peña & von Eye, in preparation) or use fewer and fewer elements of log-linear modeling (von Eye, Spiel, & Rovine, 1995).

The present article presents a third element in this development. This article introduces concepts that are of relevance for the selection of base models in CFA. These concepts are well-known in the log-linear literature (Christensen, 1997). However, these concepts have, thus far, been chiefly discussed in regard to their implications for the estimation of parameters and odds ratios. After stating that neither the estimation of parameters and odds ratios nor their significance tests are affected, most authors move on to other topics. We believe, however, that the results presented in this article have implications for model selection in log-linear modeling as well. Specifically, when models are selected in which predictor-criteria relationships are modeled and/or some of the margins are fixed, the selection of the proper log-linear model is of colossal importance for the interpretation of results. While this is a topic for another discussion, it can be stated at this point that in log-linear modeling as well as in CFA, the selection of models is restricted depending on the nature of variables.

The present article discussed the selection of base models from the perspective of status of variables in PCFA. The main criteria for model specification were the status of variables as fixed versus random and as predictors versus criteria. The result of the application of the rules that were formulated was a base model that then was subjected to standard CFA routines. Yet, there are more criteria that can be used to select a base model. For instance, one can ask whether for the random variables on the criterion side, more parsimonious models than the saturated one can be considered, and how to identify these models. Table 3 indicates already that more parsimonious models may be possible. This applies accordingly when the predictors are random and the criteria are fixed, as is the case in Discriminant CFA (DCFA; see von Eye, Schuster, & Gutiérrez-Peña, in preparation). One can try to fit log-linear models to either only the predictors (PCFA) or only the criteria (DCFA) and thus arrive at a more parsimonious base model. The gain of this approach will be twofold. First, there will be more statistical power. Thus, it is more likely that types and antitypes will be detected. Second, and for the interpretation of types and antitypes equally important, these parsimonious models will show what main effects and interactions do versus do not exist in the criteria when the table is collapsed with respect to the predictors (criteria) thus creating a natural basis for interpretation. Future work will have to develop this argument in more detail (Schuster & von Eye, in preparation).
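The idea of collapsing the table over the predictors and examining effects among the criteria can be sketched as follows. This is a minimal illustration under stated assumptions: one predictor and two criteria, hypothetical frequencies, and the independence model of the two criteria as the simplest log-linear candidate for a more parsimonious criterion side. The function names are hypothetical.

```python
def collapse_over_predictor(table):
    """Sum a [predictor][c1][c2] table over the predictor axis."""
    n_c1 = len(table[0])
    n_c2 = len(table[0][0])
    return [
        [sum(layer[i][j] for layer in table) for j in range(n_c2)]
        for i in range(n_c1)
    ]


def pearson_x2_independence(marginal):
    """Pearson X^2 of a two-way table against the independence model."""
    n = sum(map(sum, marginal))
    rows = [sum(row) for row in marginal]
    cols = [sum(col) for col in zip(*marginal)]
    x2 = 0.0
    for i, row in enumerate(marginal):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / n
            x2 += (obs - exp) ** 2 / exp
    return x2


# Hypothetical data: 2 predictor levels, each a 2 x 2 table of the criteria.
table = [
    [[30, 10], [10, 10]],
    [[10, 10], [10, 10]],
]

criteria_table = collapse_over_predictor(table)
print(criteria_table)
print(round(pearson_x2_independence(criteria_table), 4))
```

A small X^2 for the collapsed table would suggest that the criterion interaction could be dropped from the base model. Whether the resulting model is admissible still depends on the substantive hypotheses and on the three criteria for base models.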

The present article focused on the model of Prediction CFA. PCFA is only one of a number of regional CFA models. These are models where variables can have a different status such as, for instance, predictors and criteria. Other examples of regional CFA models include Discrimination CFA where variables can be fixed on the predictor side. The consequences that different sampling schemes have for Discrimination CFA are not exactly the same as the consequences for PCFA. Therefore, a parallel paper is being prepared where sampling schemes are discussed for Discrimination CFA (von Eye, Schuster, & Gutiérrez-Peña, in preparation). Later works will have to address the issue of sampling schemes and their consequences for all of CFA.



Methods of Psychological Research 1998 Vol.3 No.2
© 1999 Pabst Science Publishers