Let the random variable $X_{ij}$, taking values $x_{ij}$, denote the response of person $i$ to item $j$. To make things concrete, assume $x_{ij} = 1$ if the correct answer is given and $x_{ij} = 0$ if the wrong answer is given. Furthermore, let $\theta_i$ denote the latent trait, and $\beta_{jc}$ denote the item difficulty (or easiness) for members of latent class $c$.
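We assume that the probability of a correct response on item $j$ by a person $i$ who is a member of latent class $c$ follows the Rasch model:

$$P(X_{ij} = 1 \mid \theta_i, c) = \frac{\exp(\theta_i - \beta_{jc})}{1 + \exp(\theta_i - \beta_{jc})} \qquad (1)$$

The probability of the response pattern $\mathbf{x}_i = (x_{i1}, \ldots, x_{ik})$ of person $i$ on $k$ items, assuming local stochastic independence, is

$$P(\mathbf{X}_i = \mathbf{x}_i \mid \theta_i, c) = \prod_{j=1}^{k} \frac{\exp\{x_{ij}(\theta_i - \beta_{jc})\}}{1 + \exp(\theta_i - \beta_{jc})} = \frac{\exp(t_i \theta_i - \sum_j x_{ij} \beta_{jc})}{\prod_j \{1 + \exp(\theta_i - \beta_{jc})\}} \qquad (2)$$

where $t_i = \sum_j x_{ij}$ denotes the number of correct responses, or sum score. From the above formula we see that $t_i$ is sufficient for $\theta_i$ within a latent class. In other words, the (nuisance) parameter $\theta_i$ can be eliminated by conditioning on $t_i$ in latent class $c$.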
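The sufficiency argument can be checked numerically. The following is a minimal sketch, assuming the difficulty parameterization $\exp(\theta - \beta)$ for the Rasch model; the function names and the three difficulty values are hypothetical and chosen only for illustration. It shows that the ratio of the probabilities of two patterns with the same sum score does not depend on the latent trait.

```python
import math

def rasch_prob(theta, beta):
    """P(X = 1) under the Rasch model with trait theta and difficulty beta."""
    return math.exp(theta - beta) / (1.0 + math.exp(theta - beta))

def pattern_prob(x, theta, betas):
    """Probability of response pattern x, assuming local independence."""
    p = 1.0
    for x_j, b_j in zip(x, betas):
        p_j = rasch_prob(theta, b_j)
        p *= p_j if x_j == 1 else 1.0 - p_j
    return p

# Hypothetical item difficulties within one latent class (illustration only).
betas = [-1.0, 0.0, 1.5]
# Two different response patterns with the same sum score (t = 2).
x_a, x_b = (1, 1, 0), (0, 1, 1)

# Ratio of the two pattern probabilities at two very different trait values.
ratio_low = pattern_prob(x_a, -2.0, betas) / pattern_prob(x_b, -2.0, betas)
ratio_high = pattern_prob(x_a, 3.0, betas) / pattern_prob(x_b, 3.0, betas)
```

Both ratios reduce to $\exp(\beta_3 - \beta_1)$: the trait enters every pattern with the same sum score through the same factor, so it cancels.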
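More explicitly, let $\Omega_t$ denote the set of response patterns with sum score $t$. The probability of a response pattern in this set for person $i$ who belongs to class $c$ is

$$P(T_i = t \mid \theta_i, c) = \sum_{\mathbf{x} \in \Omega_t} P(\mathbf{x} \mid \theta_i, c) = \frac{\exp(t \theta_i)\, \gamma_t(\boldsymbol{\beta}_c)}{\prod_j \{1 + \exp(\theta_i - \beta_{jc})\}}$$

where $\gamma_t(\boldsymbol{\beta}_c) = \sum_{\mathbf{x} \in \Omega_t} \exp(-\sum_j x_j \beta_{jc})$ denotes the symmetric basis function, with the class specific vector of item difficulty parameters ($\boldsymbol{\beta}_c = (\beta_{1c}, \ldots, \beta_{kc})$) as argument. Now by conditioning on $t$ and $c$ we get

$$P(\mathbf{x} \mid t, c) = \frac{\exp(-\sum_j x_j \beta_{jc})}{\gamma_t(\boldsymbol{\beta}_c)}$$

Note that this expression is independent of $\theta_i$, and therefore of the person index. Thus a more general notation can be adopted where the person index $i$ is replaced by an index denoting a response pattern, say $\mathbf{x}$. Furthermore, if the latent class is dichotomous, taking values $c \in \{0, 1\}$, then $\beta_{jc}$ can be reparameterized as $\beta_j + c\,\delta_j$. Then, with $P(t, c)$ the joint probability of sum score $t$ and class $c$, the logarithm of the marginal probability $P(\mathbf{x}, c) = P(\mathbf{x} \mid t, c)\, P(t, c)$ is

$$\log P(\mathbf{x}, c) = \log \frac{P(t, c)}{\gamma_t(\boldsymbol{\beta}_c)} - \sum_j x_j \beta_j - c \sum_j x_j \delta_j \qquad (3)$$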
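The symmetric basis functions and the conditional pattern probability can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes the functions take $\exp(-\beta_{jc})$ as item terms, uses the standard summation recursion over items, and the names `symmetric_functions` and `conditional_prob` as well as the difficulty values are hypothetical.

```python
import math
from itertools import product

def symmetric_functions(betas):
    """Return [gamma_0, ..., gamma_k] for item terms eps_j = exp(-beta_j).

    Items are added one at a time: gamma_t(new) = gamma_t + eps * gamma_{t-1}.
    """
    gammas = [1.0]
    for b in betas:
        eps = math.exp(-b)
        new = gammas + [0.0]
        for t in range(len(gammas), 0, -1):
            new[t] += eps * gammas[t - 1]
        gammas = new
    return gammas

def conditional_prob(x, betas):
    """P(x | t, c): probability of pattern x given its sum score t."""
    t = sum(x)
    numerator = math.exp(-sum(x_j * b_j for x_j, b_j in zip(x, betas)))
    return numerator / symmetric_functions(betas)[t]

# Hypothetical difficulties for one latent class (illustration only).
betas = [-1.0, 0.0, 1.5]
# All patterns with sum score 2; their conditional probabilities sum to one.
patterns_t2 = [x for x in product((0, 1), repeat=3) if sum(x) == 2]
total = sum(conditional_prob(x, betas) for x in patterns_t2)
```

No trait value appears anywhere in `conditional_prob`, which is the point of conditioning on the sum score.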

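The response pattern frequencies have expected values given by the model

$$\log m_{\mathbf{x}c} = u_{tc} - \sum_j x_j \beta_j - c \sum_j x_j \delta_j \qquad (4)$$

where $m_{\mathbf{x}c}$ denotes the expected frequency of response pattern $\mathbf{x}$ in latent class $c$, and $u_{tc}$ collects the terms that depend only on the sum score $t$ and the latent class $c$. The main parameters $\beta_j$ are the item difficulty parameters, and the interaction parameters $\delta_j$ are the differences in item difficulties between the latent classes.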
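A small worked case may help to read the log-linear model. The sketch below assumes that model (4) has the form $\log m_{\mathbf{x}c} = u_{tc} - \sum_j x_j \beta_j - c \sum_j x_j \delta_j$ with classes coded $c \in \{0, 1\}$; this notation is an assumption made for illustration. For two items, the two patterns with sum score 1 give:

```latex
% Two-item case (k = 2), patterns (1,0) and (0,1), both with sum score t = 1.
% Assumed form of model (4):
%   \log m_{\mathbf{x}c} = u_{tc} - \sum_j x_j \beta_j - c \sum_j x_j \delta_j
\begin{align*}
\log m_{(1,0)c} &= u_{1c} - \beta_1 - c\,\delta_1, \\
\log m_{(0,1)c} &= u_{1c} - \beta_2 - c\,\delta_2, \\
\log \frac{m_{(1,0)c}}{m_{(0,1)c}} &= (\beta_2 - \beta_1) + c\,(\delta_2 - \delta_1).
\end{align*}
```

Within a sum score by latent class cell the term $u_{1c}$ cancels, so the odds between the two patterns depend only on the item parameters, and the two classes differ in these odds exactly when $\delta_2 \neq \delta_1$.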
We cannot fit a quasi-independence model directly (the items are independent within the sum score $\times$ latent class cells), because class membership is not observed. Note that if $c$ were observed we would simply have a Rasch model with an observed grouping variable. Unobserved variables can be handled with the
EM algorithm. The idea is to make an initial guess of the complete unobserved table (marginals), estimate the parameters from this table, and then use these parameters to construct a new complete table (the expected table given the current parameter estimates and the observed incomplete table). This procedure is repeated until the differences between successive iterations become sufficiently small. The algorithm thus splits the observed table into the unobserved subtables that are most likely given the model. How well the algorithm can disentangle the observed table into meaningful subtables clearly depends on how much the probability structures of these subtables differ. Experimenting with these models reveals that if the differences in probability structure become too small, the algorithm splits up the table merely to accommodate a few outliers. Typically, in these cases, solutions are obtained with very small class sizes and extreme parameter estimates within these classes. These solutions are no more than capitalization on chance.
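The alternation between completing the table and re-estimating the parameters can be sketched as follows. Note the substitution: to keep the example short, this is EM for a plain two-class latent class model with locally independent items, a simpler stand-in for the mixed Rasch model, so the M-step has a closed form and involves no conditional likelihood. The function name `em_two_class`, the starting values, and the frequency table are all hypothetical.

```python
import math

def em_two_class(counts, n_items, n_iter=200, tol=1e-8):
    """EM for a two-class model with locally independent items (stand-in).

    counts maps each response pattern (tuple of 0/1) to its observed frequency.
    Returns class sizes, item probabilities per class, the completed table,
    and the log-likelihood after each E-step.
    """
    patterns = list(counts)
    total = sum(counts.values())
    # Initial guess: equal class sizes, slightly different item probabilities.
    sizes = [0.5, 0.5]
    probs = [[0.4] * n_items, [0.6] * n_items]
    trace = []
    for _ in range(n_iter):
        # E-step: split each observed count over the two latent classes.
        complete = {}
        loglik = 0.0
        for x in patterns:
            joint = []
            for c in (0, 1):
                p = sizes[c]
                for x_j, p_j in zip(x, probs[c]):
                    p *= p_j if x_j else 1.0 - p_j
                joint.append(p)
            marginal = joint[0] + joint[1]
            loglik += counts[x] * math.log(marginal)
            complete[x] = [counts[x] * joint[c] / marginal for c in (0, 1)]
        trace.append(loglik)
        # M-step: re-estimate the parameters from the completed table.
        for c in (0, 1):
            n_c = sum(complete[x][c] for x in patterns)
            sizes[c] = n_c / total
            probs[c] = [sum(complete[x][c] * x[j] for x in patterns) / n_c
                        for j in range(n_items)]
        # Stop when the log-likelihood no longer improves noticeably.
        if len(trace) > 1 and trace[-1] - trace[-2] < tol:
            break
    return sizes, probs, complete, trace

# Hypothetical observed frequencies for three items (illustration only).
counts = {(0, 0, 0): 30, (1, 0, 0): 12, (0, 1, 0): 10, (0, 0, 1): 8,
          (1, 1, 0): 9, (1, 0, 1): 11, (0, 1, 1): 10, (1, 1, 1): 35}
sizes, probs, complete, trace = em_two_class(counts, n_items=3)
```

Each row of `complete` adds up to the observed count of that pattern, which is the sense in which the observed table is split into subtables. Inspecting `sizes` after convergence is a simple way to spot the degenerate solutions described above, where one class becomes very small.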
Methods of Psychological Research 1999,
Vol.4, No.3
© 2000 Pabst Science Publishers