
The Loglinear Mixed Rasch model

Let the random variable $X_{ij}$, taking values in $\{0,1\}$, denote the response of person $i$ to item $j$. To make things concrete, assume $X_{ij}=1$ if the correct answer is given and $X_{ij}=0$ if the wrong answer is given. Furthermore, let $\theta_{im}$ denote the latent trait of person $i$, and $\delta_{jm}$ the difficulty (or easiness) of item $j$, for members of latent class $m$. We assume that the probability of a correct response on item $j$ by a person $i$ who is a member of latent class $m$ follows the Rasch model:

(1) \begin{eqnarray}\html{eqn0}
	P_{ij\vert m} \equiv P \left( X_{ij} = 1 \; \vert \; \theta_{im}, \delta_{jm} \right) =
	\frac{ e^{ \theta_{im} + \delta _{jm} } }
	{ 1 + e^{ \theta_{im} + \delta_{jm} } }
	\end{eqnarray}
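
The Rasch response probability can be evaluated numerically as in the following minimal sketch. The function name `rasch_prob` is hypothetical and not part of the original text; the code only assumes the parameterization used here, in which $\theta_{im}$ and $\delta_{jm}$ enter as a sum, so that a larger $\delta_{jm}$ makes the item easier.

```python
import math


def rasch_prob(theta, delta):
    """Probability of a correct response for trait `theta` and item parameter `delta`.

    Uses the parameterization exp(theta + delta) / (1 + exp(theta + delta)),
    so `delta` behaves as an easiness parameter.
    """
    z = theta + delta
    # Evaluate the logistic function in a form that avoids overflow
    # for large positive or large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

For instance, a person whose trait value exactly offsets the item parameter (`theta = 1.5`, `delta = -1.5`) has a probability of one half of answering correctly.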


The probability of the response pattern of person $i$, assuming local stochastic independence, is

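(2) \begin{eqnarray}\html{eqn1}
P_{i\vert m} &=& \prod_{j=1}^{n} P_{ij\vert m}^{x_{ij}} \left( 1 - P_{ij\vert m} \right)^{1 - x_{ij}} =
\frac{ e^{ t_{i} \theta_{im} + \sum_{j} x_{ij} \delta_{jm} } }
{ \prod_{j} \left( 1 + e^{ \theta_{im} + \delta_{jm} } \right) }
\end{eqnarray}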


where $t_{i} = \sum_{j} x_{ij}$ denotes the number of correct responses, or sum score. From the above formula we see that $t_{i}$ is sufficient for $\theta_{im}$ within a latent class. In other words, the (nuisance) parameter $\theta_{im}$ can be eliminated by conditioning on $t_{i}$ in latent class $m$. More explicitly, let $S_{t} = \{ \boldsymbol{x} : \sum_{j} x_{j} = t \}$ denote the set of response patterns with sum score $t$. The probability that person $i$, who belongs to class $m$, produces a response pattern in this set is

\begin{eqnarray}\html{eqn2}
P_{t\vert m} &=& \sum_{\boldsymbol{x} \in S_{t}} P_{i\vert m} =
\frac{ e^{ t \theta_{im} } }
{ \prod_{j} \left( 1 + e^{ \theta_{im} + \delta_{jm} } \right) }
\gamma_{t}\left( \boldsymbol{ \delta }_{m} \right)
\end{eqnarray}



where $\gamma_{t} \left( \boldsymbol{ \delta }_{m} \right) = \sum_{\boldsymbol{x} \in S_{t}} e^{ \sum_{j} x_{j} \delta_{jm} }$ denotes the elementary symmetric function (symmetric basis function) of order $t$, with the class-specific vector of item difficulty parameters ($\boldsymbol{\delta}_{m} = (\delta_{1m},\cdots,\delta_{nm})$) as argument. Now, by conditioning on $t$ and $m$, we get

\begin{eqnarray}\html{eqn4}
P_{i\vert tm} &=& \frac{P_{itm}}{P_{tm}} = \frac{P_{i\vert m}}{P_{t\vert m}} =
\frac{ e^{ \sum_{j} x_{ij} \delta_{jm} } }
{ \gamma_{t}\left( \boldsymbol{ \delta }_{m} \right) }
\end{eqnarray}



Note that this expression is independent of $\theta_{im}$. Thus a more general notation can be adopted in which the person index $i$ is replaced by an index denoting a response pattern, say $\nu$. Furthermore, if the latent class is dichotomous, taking values in $\{0,1\}$, then $\delta_{jm}$ can be reparameterized as $\delta_{j} + m \Delta_{j}$. Then the logarithm of the marginal probability is

(3) \begin{eqnarray}\html{eqn5}
\log P_{\nu m} &=& \log P_{\nu\vert tm} P_{tm} \nonumber \\
&=& \log P_{tm} - \log \gamma_{t}\left( \boldsymbol{ \delta }_{m} \right) + \sum_{j} \left( x_{\nu j} \delta_{j} + x_{\nu j} m \Delta_{j} \right)
\end{eqnarray}
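
The claim that conditioning on the sum score removes the latent trait can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the elementary symmetric functions are computed with the standard one-item-at-a-time recursion, which is an assumption about how one might compute them and not something the text prescribes. It compares the closed-form conditional probability of a pattern with a brute-force conditional probability computed for an arbitrary trait value.

```python
import math
from itertools import product


def elementary_symmetric(deltas):
    """Return [gamma_0, ..., gamma_n], where gamma_t is the sum of
    exp(sum_j x_j * delta_j) over all 0/1 patterns x with sum score t."""
    gamma = [1.0] + [0.0] * len(deltas)
    for j, d in enumerate(deltas, start=1):
        eps = math.exp(d)
        # Update from high to low order so each item enters once.
        for t in range(j, 0, -1):
            gamma[t] += eps * gamma[t - 1]
    return gamma


def conditional_pattern_prob(x, deltas):
    """Closed form: exp(sum_j x_j * delta_j) / gamma_t, with t the sum score of x."""
    gamma = elementary_symmetric(deltas)
    kernel = math.exp(sum(x_j * d for x_j, d in zip(x, deltas)))
    return kernel / gamma[sum(x)]


def pattern_prob(x, theta, deltas):
    """Unconditional pattern probability under local independence."""
    num = math.exp(sum(x_j * (theta + d) for x_j, d in zip(x, deltas)))
    den = 1.0
    for d in deltas:
        den *= 1.0 + math.exp(theta + d)
    return num / den


def conditional_via_theta(x, theta, deltas):
    """Brute force: P(pattern) / P(sum score), evaluated at a given theta."""
    t = sum(x)
    same_score = [y for y in product((0, 1), repeat=len(deltas)) if sum(y) == t]
    return pattern_prob(x, theta, deltas) / sum(
        pattern_prob(y, theta, deltas) for y in same_score
    )
```

Calling `conditional_via_theta` with different values of `theta` for the same pattern and item parameters should agree with `conditional_pattern_prob` up to floating-point error, which is the sufficiency property in numerical form. The brute-force enumeration is exponential in the number of items and is meant only as a check on small examples.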


The response pattern frequencies have expected values given by the model

(4) \begin{eqnarray}\html{eqn6}
\log f_{\nu m} =\mu + \mu_{t}^{T} + \mu_{m}^{M} + \mu_{tm}^{TM} + \sum_{j=1}^{n} \left( \mu_{x_j}^{X_j} + \mu_{x_{j}m}^{X_{j}M} \right)
\end{eqnarray}


where $f_{\nu m}$ denotes the expected frequency of response pattern $\nu$ in latent class $m$. The main-effect parameters $\mu_{x_j}^{X_j}$ are the item difficulty parameters, and the interaction parameters $\mu_{x_{j}m}^{X_{j}M}$ are the differences in item difficulties between the latent classes. We cannot fit this quasi-independence model (the items are independent within the sum score $\times $ latent class cells) directly, because class membership is not observed. Note that if $m$ were observed, we would simply have a Rasch model with an observed grouping variable.

Unobserved variables can be handled with the EM algorithm. The idea is to make an initial guess of the complete, unobserved table (marginals) and to estimate the parameters from this table; with these parameters we then construct a new complete table (the expected table given the current parameter estimates and the observed incomplete table). We repeat this procedure until the changes between iterations become sufficiently small. The algorithm thus splits the observed table into unobserved subtables that are most likely given the model.

How well the algorithm can disentangle the observed table into meaningful subtables clearly depends on how much the probability structures of these subtables differ. Experimenting with these models reveals that if the differences in probability structure become too small, the algorithm splits the table merely to accommodate a few outliers. Typically, in these cases, solutions are obtained with very small class sizes and extreme parameter estimates within these classes. Such solutions are no more than capitalization on chance.
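
The step of the EM algorithm that rebuilds the complete table from the observed one can be sketched as follows. This is a minimal illustration under stated assumptions and not the estimation procedure used by the authors: it covers only the expectation step for a dichotomous latent class, takes the current parameter values as given, and omits the refitting of the loglinear model. The names `e_step`, `observed`, `delta`, `Delta` and `p_tm` are hypothetical; `p_tm[m][t]` is assumed to hold the current joint probability of sum score $t$ and class $m$, and must be positive for every observed score in at least one class.

```python
import math


def elementary_symmetric(deltas):
    """Return [gamma_0, ..., gamma_n] for the given item parameters."""
    gamma = [1.0] + [0.0] * len(deltas)
    for j, d in enumerate(deltas, start=1):
        eps = math.exp(d)
        for t in range(j, 0, -1):
            gamma[t] += eps * gamma[t - 1]
    return gamma


def e_step(observed, delta, Delta, p_tm):
    """Split observed pattern counts into expected counts per latent class.

    observed: dict mapping a 0/1 response pattern (tuple) to its count
    delta, Delta: item parameters, with class-m parameters delta_j + m * Delta_j
    p_tm: p_tm[m][t] = current joint probability of sum score t and class m
    Returns a dict mapping each pattern to (expected count in class 0,
    expected count in class 1).
    """
    class_deltas = [[d + m * D for d, D in zip(delta, Delta)] for m in (0, 1)]
    gammas = [elementary_symmetric(cd) for cd in class_deltas]
    complete = {}
    for x, n in observed.items():
        t = sum(x)
        joint = []
        for m in (0, 1):
            log_kernel = sum(x_j * d for x_j, d in zip(x, class_deltas[m]))
            # Joint probability of pattern and class: P(t, m) * P(pattern | t, m).
            joint.append(p_tm[m][t] * math.exp(log_kernel) / gammas[m][t])
        total = joint[0] + joint[1]
        # Allocate the observed count in proportion to the class posteriors.
        complete[x] = tuple(n * p / total for p in joint)
    return complete
```

When the two classes have identical parameters, every observed count is simply divided according to the class proportions, which illustrates why classes with nearly identical probability structure give the algorithm little information to separate them.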

Methods of Psychological Research 1999, Vol.4, No.3
© 2000 Pabst Science Publishers