In this design, stimuli and treatment are completely crossed, with
subjects nested under treatment levels. In other words, the design is
within stimuli and between subjects. The linear model for a single score
of subject m on stimulus j in treatment condition i
is given by Equation (2):
![]()
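A model with this structure can be sketched as follows. The notation is an assumption chosen for illustration and need not match the symbols of Equation (1): $\alpha_i$ stands for the treatment effect, $\beta_j$ for the stimulus effect, $\pi_{m(i)}$ for the effect of subject m nested in treatment condition i, and $\varepsilon_{jm(i)}$ for the residual, in which the stimulus-by-subject interaction and error are confounded because there is only one score per cell.

```latex
X_{ijm} = \mu + \alpha_i + \beta_j + \pi_{m(i)} + (\alpha\beta)_{ij} + \varepsilon_{jm(i)},
\qquad i = 1,\dots,p;\quad j = 1,\dots,q;\quad m = 1,\dots,n
```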
In contrast to Equation (1), subjects instead of stimuli are nested
under the treatment conditions. The resulting variance components of
the models with stimuli treated as fixed versus random effects are
presented in Table 3. The main difference between the random- and the
fixed-effects model lies in the variance component of the
treatment-by-stimulus interaction. This interaction is
included in the expected mean square of the treatment effect in the random-
but not in the fixed-effects model. Consequently, the appropriate
quasi-F ratio has to include the respective variance component
in the error term
(e.g., Hopkins, 1984). In contrast, the fixed-effects model mathematically
resembles a one-way ANOVA on the scores aggregated across stimuli,
thus ignoring differential treatment influences on the stimuli.
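For orientation, the expected mean squares commonly derived for this layout are sketched below. They rest on assumptions that are not taken from Table 3: a balanced design, fixed treatment effects, random subjects, random stimuli, and the restricted mixed-model parameterization. The symbols are illustrative: $\sigma^2_{res}$ is the residual variance (stimulus-by-subject interaction confounded with error), $\theta^2_T$ the fixed treatment effect, and the remaining subscripts follow the source labels T, St, and Su/T.

```latex
\begin{aligned}
E(MS_{T})               &= \sigma^2_{res} + n\,\sigma^2_{T \times St} + q\,\sigma^2_{Su/T} + nq\,\theta^2_{T} \\
E(MS_{St})              &= \sigma^2_{res} + np\,\sigma^2_{St} \\
E(MS_{Su/T})            &= \sigma^2_{res} + q\,\sigma^2_{Su/T} \\
E(MS_{T \times St})     &= \sigma^2_{res} + n\,\sigma^2_{T \times St} \\
E(MS_{St \times Su/T})  &= \sigma^2_{res} \\[6pt]
F' &= \frac{MS_{T} + MS_{St \times Su/T}}{MS_{Su/T} + MS_{T \times St}}
\end{aligned}
```

Under the null hypothesis $\theta^2_T = 0$, the numerator and the denominator of $F'$ have the same expectation, which is what qualifies it as a quasi-F ratio. With stimuli treated as fixed, the interaction term no longer enters $E(MS_T)$, so that $MS_T / MS_{Su/T}$ serves as the usual F ratio.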
| Source | df | E(MS) |
|---|---|---|
| T | p-1 | |
| St | q-1 | |
| Su/T | p(n-1) | |
| T × St | (p-1)(q-1) | |
| St × Su/T | p(q-1)(n-1) | |
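The sources and degrees of freedom listed in the table can be turned into a small computation. The following is a minimal sketch, not the authors' procedure: it assumes a balanced design, uses the quasi-F form F' = (MS_T + MS_StxSu/T) / (MS_Su/T + MS_TxSt) with Satterthwaite-type approximate degrees of freedom, and runs on simulated scores. The function name `quasi_f` and the data layout `x[i][m][j]` are hypothetical.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def quasi_f(x):
    """x[i][m][j]: score of subject m (nested in treatment i) on stimulus j.

    Assumes a balanced design: n subjects per treatment, q stimuli for all.
    """
    p, n, q = len(x), len(x[0]), len(x[0][0])
    grand = mean(x[i][m][j] for i in range(p) for m in range(n) for j in range(q))
    t = [mean(x[i][m][j] for m in range(n) for j in range(q)) for i in range(p)]
    st = [mean(x[i][m][j] for i in range(p) for m in range(n)) for j in range(q)]
    su = [[mean(x[i][m]) for m in range(n)] for i in range(p)]
    cell = [[mean(x[i][m][j] for m in range(n)) for j in range(q)] for i in range(p)]

    # Sums of squares for the five sources of the design.
    ss = {
        "T": n * q * sum((t[i] - grand) ** 2 for i in range(p)),
        "St": n * p * sum((st[j] - grand) ** 2 for j in range(q)),
        "Su/T": q * sum((su[i][m] - t[i]) ** 2
                        for i in range(p) for m in range(n)),
        "T x St": n * sum((cell[i][j] - t[i] - st[j] + grand) ** 2
                          for i in range(p) for j in range(q)),
        "St x Su/T": sum((x[i][m][j] - su[i][m] - cell[i][j] + t[i]) ** 2
                         for i in range(p) for m in range(n) for j in range(q)),
    }
    df = {
        "T": p - 1,
        "St": q - 1,
        "Su/T": p * (n - 1),
        "T x St": (p - 1) * (q - 1),
        "St x Su/T": p * (q - 1) * (n - 1),
    }
    ms = {source: ss[source] / df[source] for source in ss}

    # Quasi-F: both composite terms share the same expectation under H0.
    num = ms["T"] + ms["St x Su/T"]
    den = ms["Su/T"] + ms["T x St"]
    # Satterthwaite approximation for the df of a sum of mean squares.
    df_num = num ** 2 / (ms["T"] ** 2 / df["T"]
                         + ms["St x Su/T"] ** 2 / df["St x Su/T"])
    df_den = den ** 2 / (ms["Su/T"] ** 2 / df["Su/T"]
                         + ms["T x St"] ** 2 / df["T x St"])
    return {"ss": ss, "df": df, "ms": ms,
            "F_quasi": num / den, "df_num": df_num, "df_den": df_den}


# Simulated example: p = 2 treatments, n = 5 subjects each, q = 4 stimuli.
rng = random.Random(0)
p, n, q = 2, 5, 4
data = [[[10 + 2 * i + j + rng.gauss(0, 1) for j in range(q)]
         for m in range(n)] for i in range(p)]
result = quasi_f(data)
print(round(result["F_quasi"], 3), round(result["df_num"], 2), round(result["df_den"], 2))
```

The approximate degrees of freedom are generally not integers; the resulting F' would be referred to an F distribution with `df_num` and `df_den` degrees of freedom.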
The present design closely resembles the repeated measurements design
with subjects as random effects. In this previously discussed design, the
treatment-by-subjects interaction variance component serves as an
estimate of the amount of chance fluctuation in the reactions of the
subjects, together with the true or manifest treatment-by-subjects
interaction caused by "real" differential effectiveness of the treatment.
With respect to stimuli, in contrast, this conclusion is not valid.
Stimuli - as opposed to subjects, which are essentially "open systems" - are
physically identical at different points of measurement. As a consequence,
it is not possible to have a reasonable notion of chance fluctuations with
respect to stimuli in the same way as it is for subjects. Stated differently:
The treatment-by-stimulus interaction is completely manifest.
Consequently, the treatment-by-stimulus interaction variance
is an exclusive indicator of whether the treatment
influences all stimuli in the same way and to the same extent. As a result,
this measure has no implications for the question of internal validity. To
answer the question of whether an observed effect can be causally related to
the treatment variation at all or if it can also be explained by chance, it
is irrelevant whether this effect is identical across all stimuli. The
latter is a question of external validity or, in analogy to the concept of
population validity (Hager & Westermann, 1983), of stimulus validity.
To achieve a high level of stimulus validity, one first has to
ensure that the stimuli investigated are actually a sample from the
population of stimuli for which the tested hypothesis claims validity.
This assumption is trivially met if the theory tested does not restrict
the population of stimuli to some subpopulation. The second aspect of
stimulus validity is the question of whether the treatment causally influences
all stimuli in the same direction, that is, whether disordinal
interactions
exist between treatment and (subgroups of) stimuli. In analogy to the notion
of "aptitude-treatment interaction", one could speak of
stimulus-treatment interactions.
Only with respect to this question does the treatment-by-stimulus
interaction variance become important.
Therefore, it should at least be reported descriptively.
Additionally,
one should investigate whether a partitioning of the stimuli -
according to some control variables - is
possible and accounts for relevant proportions of
this variance.
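A descriptive report of this kind could look like the following minimal sketch. It is an illustration under stated assumptions, not a procedure prescribed by the text: it works on the table of treatment-by-stimulus cell means of a balanced design, flags sign reversals of a simple two-condition contrast as candidates for disordinal interactions, and computes the share of the treatment-by-stimulus sum of squares carried by a partition of the stimuli. The function names, the example cell means, and the group labels are hypothetical.

```python
def interaction_effects(cell):
    """cell[i][j]: mean score under treatment i for stimulus j (balanced design)."""
    p, q = len(cell), len(cell[0])
    grand = sum(map(sum, cell)) / (p * q)
    t = [sum(row) / q for row in cell]
    st = [sum(cell[i][j] for i in range(p)) / p for j in range(q)]
    return [[cell[i][j] - t[i] - st[j] + grand for j in range(q)]
            for i in range(p)]


def sign_reversal(cell, a=0, b=1):
    """Per-stimulus difference between conditions b and a, and whether signs differ."""
    diffs = [cell[b][j] - cell[a][j] for j in range(len(cell[0]))]
    return diffs, min(diffs) < 0 < max(diffs)


def partition_share(cell, groups):
    """Share of the T x St sum of squares carried by a partition of the stimuli.

    groups[j] is the (hypothetical) control-variable label of stimulus j.
    """
    effects = interaction_effects(cell)
    total = sum(v ** 2 for row in effects for v in row)
    between = 0.0
    for row in effects:
        for label in set(groups):
            members = [v for v, g in zip(row, groups) if g == label]
            between += len(members) * (sum(members) / len(members)) ** 2
    return between / total if total > 0 else 0.0


# Hypothetical cell means: 2 treatment conditions x 4 stimuli.
cell_means = [[4.0, 5.0, 6.0, 7.0],
              [6.0, 7.0, 5.0, 6.0]]
diffs, reversed_sign = sign_reversal(cell_means)
share_ab = partition_share(cell_means, ["a", "a", "b", "b"])
share_mixed = partition_share(cell_means, ["a", "b", "a", "b"])
print(diffs, reversed_sign, share_ab, share_mixed)
```

A sign reversal in observed means is only a descriptive hint; whether it reflects a disordinal interaction in the population is a separate inferential question.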
Moreover, it might be helpful to report and compare different
measures of effect size on the basis of intraclass correlations (also
denoted as "generalizability" coefficients, e.g., Cronbach, Gleser, Nanda,
& Rajaratnam, 1972). In effect, what is considered here is whether the
treatment has the same effect on all stimuli, that is, whether an
aggregation across different stimuli leads to valid conclusions. If
disordinal stimulus-treatment interactions exist, aggregation across
subgroups of stimuli is not a valid approach (see Iseler, 1996a, 1996b, for
a discussion of the deductive relation between single-case and statistical
aggregate hypotheses and the methodological and statistical implications).
In order to allow for a more differentiated terminology concerning
questions of validity in experimental research, this second aspect of
stimulus validity - or population validity for that matter - might be
referred to as aggregation validity.
This kind of validity, however, is
entirely different from the question of "pure" internal validity, that is,
whether the treatment has causal effects at all.