... strict1
See e.g. Westermann & Hager (1986) or Erdfelder & Bredenkamp (1994) for the concept of strict and fair hypothesis testing. Although these authors' reference to numerical 'probabilities' of confirmation or refutation of a hypothesis may be considered problematic, the basic ideas remain acceptable: A test is strict if the prediction is daring (i.e., its fulfillment is unlikely unless the tested hypothesis is valid), and it is unfair if the judgment of confirmation is made to depend on properties that may well fail to occur in situations where the hypothesis under study is valid.
... condition?2
E.g., a problem that has been solved under a condition a is no longer a problem for the subject, and it makes no sense to observe a solution time of the same subject under another condition b that is supposed to evoke longer solution times.
... covariates.3
As in every statistical prediction, we have to add tacitly 'notwithstanding the well-known risks of erroneous decisions resulting from random sampling errors, which are handled by the usual methodology of inferential statistics'. Note that a 'representative random sample' (i.e., a sampling distribution with equal selection probabilities for all elements of a population) is required only if a sample is used to get information about the distribution of a variable in a population. But if an aggregate hypothesis is only a means of testing a hypothesis referring to all individuals (of a domain D), then this role can also be taken by a hypothesis referring to the process of selecting a unit and observing it. In this situation, the data of a sample can be regarded as results of repeated realizations of this process, which are independent up to effects of sampling without replacement (a limitation shared with 'representative' random samples). See Iseler (1996a [12]) for a more detailed discussion of these issues.
... probabilities.4
In the notation of Steyer et al. (1995 [23], 1996 [24]), we may define a random variable $Z_x$ for every $x \in R$ with the following interpretation: If $Y \le x$ holds for the random variable Y (the dependent variable of the experiment), then $Z_x=1$, and otherwise $Z_x=0$. In other words, $Z_x$ is an indicator variable for the event $Y \le x$. Then our probabilities $f_u(c,x)$ and $y(c,x)$ can be identified with conditional expectations of $Z_x$ by the equations $f_u(c,x)={\cal E}(Z_x \vert \,{U=u}, \,{X=c})$ and $y(c,x)={\cal E}(Z_x \vert X=c)$, where U and X are random variables with values in D and C, respectively, indicating the selected unit and the experimental condition. Special problems resulting from zero probabilities of the conditioning events will be discussed in the sequel (see Footnote 8).
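The identification of cumulative probabilities with expectations of an indicator variable can be illustrated by a minimal sketch. It assumes a finite domain with two units, a single fixed condition c, and independence of U and X, so that the aggregate value is the selection-weighted average of the unit-level values. The observations, the selection probabilities, and the names `observations`, `selection`, `f_u` and `y_agg` are hypothetical and do not come from the article.

```python
from fractions import Fraction

# Hypothetical observations of Y under one fixed condition c (illustrative only).
observations = {
    "u1": [1, 2, 2, 5],
    "u2": [3, 4, 6, 7],
}
# Hypothetical selection probabilities P(U = u), assumed independent of X.
selection = {"u1": Fraction(1, 4), "u2": Fraction(3, 4)}


def indicator(y, x):
    """Z_x: 1 if Y <= x, otherwise 0."""
    return 1 if y <= x else 0


def f_u(unit, x):
    """Empirical counterpart of E(Z_x | U=u, X=c): mean of Z_x within one unit."""
    ys = observations[unit]
    return Fraction(sum(indicator(y, x) for y in ys), len(ys))


def y_agg(x):
    """Counterpart of E(Z_x | X=c): selection-weighted mixture of the unit values."""
    return sum(selection[u] * f_u(u, x) for u in observations)
```

With these hypothetical numbers, `f_u("u1", 2)` is 3/4, `f_u("u2", 2)` is 0, and `y_agg(2)` is 3/16.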
...),5
In the notation of Footnote 4, these expectations can be defined by $\mu{}_{uc} := {\cal E}(Y \vert \,U=u,\, X=c)$ and $\mu{}_c := {\cal E}(Y \vert X=c)$. The existence and finiteness of these expectations are assumed.
... [18]).6
See Steyer et al. (1995 [23], 1996 [24]) for a listing of further references for this issue.
... expectation.7
The designation of the difference $\mu{}_b - \mu{}_a$ as an average causal effect could be a matter of debate in situations where the treatment variable is confounded with an organismic variable. But this issue can be left open for the present article, since an assumption of independence excluding such confounding will be introduced immediately.
....8
See Theorem 1 in Steyer et al. (1996) [24], and note that the expectation of a random variable is greater than 0 if the value of the random variable is almost surely greater than 0. The authors avoid problems with zero probabilities of conditioning events by the assumption of non-zero probabilities for these events. But the proof of the result $\mu{}_b - \mu{}_a > 0$ can be transferred easily to the following reformulated hypothesis: There is a version of the conditional expectation ${\cal E}(Y \vert U, X)$ such that the inequality $\mu{}_{ub} - \mu{}_{ua} > 0$ holds for every $u \in D$. In a situation of this kind, there may be other versions with $\mu{}_{ub} - \mu{}_{ua} \le 0$ for some $u \in D$, if the event $(U=u) \wedge (X=c)$ has a zero probability for some $u \in D$ and $c \in C$; but well-known results of probability theory referring to conditional expectations (see e.g. Bauer, 1991, p. 119 [5]) can be applied to the present situation to obtain the following result: If there is a version of the conditional expectation ${\cal E}(Y \vert U, X)$ such that the inequality ${\cal E}(Y \vert U=u, X=a) < {\cal E}(Y \vert U=u, X=b)$ holds for every $u \in D$, then every other version will be $\pi$-almost-surely identical with that version. In other words, the inequality $\mu{}_{ub} - \mu{}_{ua} >_{\pi-a.s.} 0$ will hold for every version, and this is sufficient to derive the result $\mu{}_b - \mu{}_a > 0$ (i.e., ${\cal E}(Y \vert X=b) - {\cal E}(Y \vert X=a) > 0$) under the assumption of independence of U and X, if the probabilities of the events X=a and X=b are non-zero.
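For orientation, the step from the individual inequalities to the aggregate inequality can be sketched for the simplest case. The sketch assumes a countable domain D with $P(U=u)>0$ for every $u \in D$, which is a simplification introduced here and not the general setting of the footnote; the second equality uses the independence of U and X.

```latex
\begin{align*}
\mu_b - \mu_a
  &= \sum_{u \in D} P(U=u \mid X=b)\,\mu_{ub}
   - \sum_{u \in D} P(U=u \mid X=a)\,\mu_{ua} \\
  &= \sum_{u \in D} P(U=u)\,\bigl(\mu_{ub} - \mu_{ua}\bigr)
   \;>\; 0 .
\end{align*}
```

The sum is positive because every term is a positive probability multiplied by a positive difference.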
... realized.9
Of course, c is not a third condition in addition to a and b; it is a placeholder that can stand for either a or b.
....10
A $\sigma$-algebra ${\cal A}_D$ in D is suitable for an application of Eq. (2) if the mapping $u \mapsto f_u(c,x)$ is ${\cal A}_D$-${\cal B}$-measurable for every $c \in C$ and $x \in R$ (${\cal B}$ being the $\sigma$-algebra of Borel sets in R).
... way.11
It should be noted that the relation between probabilities characterizing individuals and aggregates, which is formalized by Eqs. (1) and (2), follows immediately from basic assumptions of the theories of conditional expectations and of mixture distributions. However, there are also other models of aggregation, e.g. those underlying the technique of 'Vincentizing' proposed by Ratcliff (1979) [19] or Thomas and Ross (1980) [25], and these assumptions would lead to conclusions differing from those of the subsequent sections. It would go beyond the scope of this article to discuss whether there are situations where aggregation is modelled more adequately by these assumptions than by our Eqs. (1) and (2). So these equations have to be regarded formally as axioms; i.e., the conclusions to be drawn in subsequent sections are valid in situations where the aggregation is modelled adequately by these equations.
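That the two models of aggregation can lead to different aggregate distributions may be seen from a small sketch. It assumes that Vincentizing, in its simplest form, averages the ordered observations of the units rank by rank (units with equal numbers of observations), whereas the mixture approach averages the unit-level cumulative probabilities. The data and the function names are hypothetical and serve only as an illustration.

```python
import statistics

# Hypothetical solution times of two units under one condition (illustrative only).
unit_i = [1.0, 2.0, 3.0]
unit_j = [7.0, 8.0, 9.0]


def mixture_cdf(x, samples, weights):
    """Mixture aggregation: weighted average of the unit-level empirical CDFs at x."""
    return sum(
        w * sum(1 for y in ys if y <= x) / len(ys)
        for ys, w in zip(samples, weights)
    )


def vincentized_values(samples):
    """Vincentizing (simplified sketch): average the sorted values rank by rank.

    Assumes that all units contribute the same number of observations.
    """
    ordered = [sorted(ys) for ys in samples]
    return [statistics.fmean(rank) for rank in zip(*ordered)]
```

For these data, `vincentized_values([unit_i, unit_j])` places all aggregate values between 4 and 6, whereas the mixture with equal weights has cumulative probability 0.5 at both 3 and 6, i.e. no probability mass in that range.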
... property.12
The concept of stability under aggregation takes up a topic studied some decades ago by Sidman (1952) [], Bakan (1954 [2], 1955 [3]) and Estes (1956) [8]. But the pioneering results of the last of these authors cannot be applied to our problem, since they are based on the assumption that the functions under discussion belong to a parametric family of functions differing only in a finite number of parameters. E.g., the general aggregation stability of positive differences in expectations cannot be covered by this approach, since the maps $f_u(c,x)$ with this property differ in an infinite number of degrees of freedom if the set of possible values of the dependent variable is infinite. See Iseler (1996b) [13] for a more general class of properties which are stable under aggregation but cannot be covered by the results of Estes.
...$y_\pi$.13
Certainly, the set A must also be an element of a suitable $\sigma$-algebra in D underlying the selection distribution $\pi$. (Otherwise, there would be no selection probability $\pi(A)$.) In the language of probability spaces, aggregation stability can be defined as follows: A property H is stable under aggregation iff property H follows for the map $y_\pi$ of every selection distribution $\pi$ where the map $f_U$ has this property $\pi$-almost-surely for every random variable U with distribution $\pi$.
... interval14
Notational convention: If x' and x'' are real numbers, then the open interval ]x',x''[ is the set of all real numbers greater than x' and smaller than x''.
...$\mu{}_{\pi a} < \mu{}_{\pi b}$,15
Note for the conditional expectations approach that the independence of the random variables U and X is guaranteed by the fact that the expectations $\mu{}_{\pi a}$ and $\mu{}_{\pi b}$ are based on the same selection distribution $\pi$.
... median16
Note that there is no commonly accepted definition of the median of a random variable X with $P(X<x')<0.5<P(X \le x')$ or $P(X<x')=0.5=P(X \le x'')$ for $x' < x''$.
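The second of these cases can be made concrete by a small sketch, which lists all support points m of a discrete distribution satisfying $P(X<m) \le 0.5 \le P(X \le m)$. This criterion is one common convention and is used here only for illustration; the function name `median_candidates` and the example distribution are hypothetical.

```python
from fractions import Fraction


def median_candidates(dist):
    """Return all support points m with P(X < m) <= 1/2 <= P(X <= m).

    `dist` maps the values of a discrete random variable to their probabilities.
    """
    half = Fraction(1, 2)
    candidates = []
    cumulative = Fraction(0)
    for value in sorted(dist):
        below = cumulative            # P(X < value)
        cumulative += dist[value]     # P(X <= value)
        if below <= half <= cumulative:
            candidates.append(value)
    return candidates


# Hypothetical example: a fair coin with outcomes 0 and 1.
fair_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
```

For the fair coin, both support points qualify, and so does every number between them, since $P(X<x')=0.5=P(X \le x'')$ holds for all $0 < x' < x'' < 1$.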
... effect17
See Section 3 for a generalization of the concept of individual causal effects.
... distributions.18
See Footnote 16 for an explication of these problems.
....19
The mixture distribution approach would start with an interpretation of $f_u(c,x)$ as the probability of getting a ticket with a number up to x in the process of drawing a ticket with the fixed colour c from the fixed urn u.
... median.20
To prevent problems of meaningfulness (in the sense of measurement theory), it should be pointed out that speaking of 'differences' in properties like variabilities or medians is not to be understood as a reference to algebraic differences, but only to the fact that these properties may differ (in a specific direction) under the conditions a and b.
... order'21
More precisely, the well-established concept of 'stochastic order' (see e.g. Lehmann, 1955 [14]) would only imply $f_u(a,x) \ge f_u(b,x)$ for every $x \in R$. Following the usual terminology for orders, the specification 'strict' excludes cases with $f_u(a,x) = f_u(b,x)$ for all $x \in R$.
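The distinction between the weak and the strict order can be sketched as a check on a finite grid of x values. A grid can only approximate the requirement 'for every $x \in R$', and the uniform distributions as well as all function names below are hypothetical illustrations.

```python
def uniform_cdf(low, high):
    """Cumulative distribution function of a uniform distribution on [low, high]."""
    def cdf(x):
        return min(max((x - low) / (high - low), 0.0), 1.0)
    return cdf


def is_stochastically_ordered(cdf_a, cdf_b, grid):
    """Weak order: cdf_a(x) >= cdf_b(x) at every grid point."""
    return all(cdf_a(x) >= cdf_b(x) for x in grid)


def is_strictly_ordered(cdf_a, cdf_b, grid):
    """Strict order: weak order, and the two functions differ somewhere on the grid."""
    return is_stochastically_ordered(cdf_a, cdf_b, grid) and any(
        cdf_a(x) > cdf_b(x) for x in grid
    )


# Hypothetical example: values under b tend to be larger than under a.
cdf_a = uniform_cdf(0.0, 1.0)
cdf_b = uniform_cdf(1.0, 2.0)
grid = [i / 4 for i in range(9)]  # 0.0, 0.25, ..., 2.0
```

Here `is_strictly_ordered(cdf_a, cdf_b, grid)` holds, whereas comparing `cdf_a` with itself satisfies only the weak order.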
... variable22
Recall that $f_u(c,x)$ is a cumulative probability.
...=0.5.23
A situation of this kind would consist of curves like those for $y_\pi(a,x)$ and $y_\pi(b,x)$ in Fig. 1 with interchanged roles of a and b. Note that in this figure the neighbourhood of p=0.5, where the pth-order quantile is greater under condition a, can be made arbitrarily small if all parameters $\theta_{uc}$ and $\eta_{uc}$ underlying the maps $f_i$ and $f_j$ (according to Eq. (10) in the appendix) are multiplied by a constant greater than 1. On the other hand, Eqs. (8) and (9) imply that the medians are not changed by this operation.
... 2.2.24
Note that this problem would be unsolvable for inconsistent median values, e.g. $\mu{}^*_{\pi a} < \mu{}^*_{ia} < \mu{}^*_{ja}$. On the other hand, the approach to be presented subsequently can also be used to construct other (and more extreme) examples of the median paradox.
...$Y=30 \cdot{}\psi(Y')$.25
In fact, the paradox was constructed first for the random variable Y', and then the transformation was applied to enable a graphical representation covering the entire part of the ogives with probabilities greater than 0 and less than 1.
... medians,26
An application of Eqs. (8) and (9) for $u \in \{i,j\}$ and $c \in \{a,b\}$ leads to 6 equations for 8 parameters. A system of equations with a unique solution results from the additional (arbitrary) specifications $\theta_{ia}=\theta_{jb}$ and $\psi^{-1}(f_i(a,x_i))=1$, where $x_i$ is the abscissa of the intersection point of the ogives for unit i. (More precisely, these specifications are arbitrary from the perspective of obtaining the given medians; but of course the first one is introduced to obtain the neat radial symmetry of Fig. 1. The reader may verify that it implies $\theta_{ib}=\theta_{ja}$ for the given symmetric arrangement of medians.) Defining $x'_i := h(x_i)$ and using Eq. (7), we can introduce $x'_i$ as a ninth unknown number (in addition to the 8 parameters) and rewrite the second specification as $\theta_{ia} \cdot{}x'_i+ \eta_{ia} = \theta_{ib} \cdot{}x'_i+\eta_{ib} = 1.$ It can be left to the reader to derive Eq. (10) from the resulting system of 9 independent equations.
Methods of Psychological Research 1997 Vol.1 No.4
© 1997 Pabst Science Publishers