
Some testing strategies aiming at monotonic trends

A variety of testing strategies come into question when considering statistical hypotheses about monotonic trends. The most widespread seems to be a global F test of the analysis of variance, followed by a differential, data-based interpretation. As has been argued above, this procedure is problematic: the differential interpretation is not in accordance with the acceptance of the alternative hypothesis $H_1$ of the global F test, and an uncontrollable inflation of statistical error probabilities will occur. Thus the probability of a wrong decision in favor of strict monotonicity can exceed the pre-chosen $\alpha$ by a substantial amount. Furthermore, this manner of testing a particular qualitative trend hypothesis fulfills neither the criterion of appropriateness (directional differences have been predicted, but the test refers to non-directional differences) nor the criterion of exhaustiveness (directional differences between all or at least some means have been predicted, but the F test can also turn out significant if a large difference is associated with an inversion of rank order).
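The exhaustiveness problem can be illustrated with a small numerical sketch. The data below are hypothetical and chosen only for illustration, and the critical value is a rounded table entry; the point is merely that the global F statistic is large although the first two means are in the wrong order.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical data: the predicted order is mean_1 < mean_2 < mean_3,
# but the first two groups are inverted (3 > 2).
groups = [[1, 2, 3, 4, 5], [0, 1, 2, 3, 4], [11, 12, 13, 14, 15]]
group_means = [mean(g) for g in groups]
f_value, df_b, df_w = one_way_f(groups)

F_CRIT_05 = 3.89  # tabulated F(2, 12) at alpha = .05, rounded
print("means:", group_means)
print("F =", f_value, "significant:", f_value > F_CRIT_05)
```

The significant F says only that the means are not all equal; reading a monotonic trend into it would be a data-based addition, not a test-based decision.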

Another procedure, which is sometimes recommended, is to perform an F test for the hypothesis about the quantitative trend that is formulated to match the qualitative trend of interest: in place of the hypothesis of a strictly monotonic trend, the test refers to the respective linear component obtained through orthogonal polynomials (see, e.g., Levin & Marascuilo, 1972, pp. 372-373 [40]). However, as has been stated above, the test can turn out statistically significant even if one or more rank inversions occur, and it may remain insignificant even if the rank order of the means meets the prediction but the differences among the means are inhomogeneous. The latter case is in accordance with the (qualitative) trend hypothesis of interest; the former is not. Thus, the interpretation of the results is ambiguous with respect to the hypothesis of strict monotonicity. But what would the consequences be if Testing Strategy TS 1, outlined above, were applied?
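The first of these two cases can be made concrete with a minimal sketch. It assumes J = 4 equally spaced levels, equal group sizes, hypothetical data, and the usual linear orthogonal polynomial coefficients (-3, -1, 1, 3); the contrast is tested here as a t statistic, whose square corresponds to the F test of the linear component.

```python
from math import sqrt
from statistics import mean

def contrast_t(groups, coefficients):
    """t statistic of a contrast, using the pooled within-group variance."""
    means = [mean(g) for g in groups]
    df_within = sum(len(g) for g in groups) - len(groups)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g) / df_within
    estimate = sum(c * m for c, m in zip(coefficients, means))
    std_error = sqrt(ms_within * sum(c ** 2 / len(g)
                                     for c, g in zip(coefficients, groups)))
    return estimate / std_error, df_within

LINEAR = [-3, -1, 1, 3]  # linear trend coefficients for J = 4

# Hypothetical data with means 3, 2, 8, 9: groups 1 and 2 are inverted.
inverted = [[1, 2, 3, 4, 5], [0, 1, 2, 3, 4],
            [6, 7, 8, 9, 10], [7, 8, 9, 10, 11]]
t_value, df = contrast_t(inverted, LINEAR)

T_CRIT_05 = 1.746  # tabulated one-sided t(16) at alpha = .05
print("t =", round(t_value, 3), "df =", df, "significant:", t_value > T_CRIT_05)
```

The linear component is clearly significant here although the means are not in the predicted rank order.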

Applying Testing Strategy TS 1, the SH-mon is decomposed into the testable conjunction of hypotheses 'tex2html_wrap_inline1771' from Expression (2). If both hypotheses are accepted, the trend is strictly monotonic by implication, although it cannot be inferred that every distance is large enough to reach statistical significance when tested separately, which was chosen as an additional criterion above. If, on the other hand, the two tests lead to accepting one of the conjunctions of hypotheses 'tex2html_wrap_inline1773' or 'tex2html_wrap_inline1775', it can be concluded that there is no monotonic trend. Yet if the pattern of decisions is 'tex2html_wrap_inline1777', the 'presence' or 'absence' of a strictly monotonic trend cannot be inferred unambiguously on the basis of the tests, since deviations from linearity (tex2html_wrap_inline1779) can be caused either by unequal distances among increasing or decreasing ranks or by rank inversions across the J means. Unequal distances are compatible with strict monotonicity, whereas inverted ranks are not. Additionally, it again remains unclear whether the demand is fulfilled that the adjacent means of each pair differ significantly. Overall, the interpretation of the outcomes of the respective tests is ambiguous with respect to strict monotonicity. Thus, Smith and Macdonald (1983, p. 3) [61] conclude with respect to the procedure just outlined that these tests may be 'optimal,' 'when the true state of the world is a linear trend. When the intervals between successive ... (tex2html_wrap_inline1735) are not equal or are not known (and this is very commonly the case in psychology) the linear trend procedure is suspect and alternatives need to be examined.'
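Why deviations from linearity are uninformative about rank order can be seen from a small sketch. It assumes J = 4, equal group sizes n, and hypothetical sets of means; the between-groups sum of squares is split into its linear part and the remainder (the deviations from linearity).

```python
def trend_partition(means, n):
    """Split SS_between (equal n, J = 4) into a linear part and the remainder."""
    coefficients = [-3, -1, 1, 3]  # linear orthogonal polynomial, J = 4
    grand_mean = sum(means) / len(means)
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    psi = sum(c * m for c, m in zip(coefficients, means))
    ss_linear = n * psi ** 2 / sum(c ** 2 for c in coefficients)
    return ss_linear, ss_between - ss_linear

# Hypothetical patterns of means.
cases = {
    "equal steps": [1, 2, 3, 4],
    "unequal steps, still monotonic": [1, 2, 3, 10],
    "rank inversion": [1, 3, 2, 4],
}
results = {label: trend_partition(means, n=5) for label, means in cases.items()}
for label, (ss_lin, ss_dev) in results.items():
    print(f"{label}: SS_linear = {ss_lin:.1f}, SS_deviation = {ss_dev:.1f}")
```

Both the monotonic pattern with unequal steps and the pattern with a rank inversion produce a non-zero deviation component, so a significant test of that component cannot distinguish between them.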

The method of orthogonal contrasts, whether used following a significant F test or instead of it, is covered in all textbooks and is in frequent use. Therefore, the question arises whether a 'satisfactory' testing strategy can be devised for hypotheses about orthogonal contrasts that enables a test-based decision about a strictly monotonic trend. Without going into the details, it can be stated that a strict rank order across J means cannot be established without supplementing test-based propositions to a large degree with data-based ones (see Hager, 1992, pp. 365-368 [22], for the details). For this reason, further alternatives to the quantitative trend tests described up to this point are called for. Another procedure consists of applying modified (quantitative) trend tests. The modification mainly concerns the choice of a set of 'optimum' contrast coefficients according to the proposals made by Abelson and Tukey (1963 [1]; for an application see Bortz, 1993, pp. 259-260 [5]). For comprehensive and comparative surveys of these and further tests see Berenson (1982) [3] and Smith and Macdonald (1983) [61]. According to Barlow, Bartholomew, Bremner and Brunk (1972, p. 118, p. 194) [2], Berenson (1982, p. 270) [3], and Le (1987, p. 173) [38], the hypotheses ($H_0$ and $H_1$) tested against each other by these and related tests are:
The alternative $H_1$ refers to a weak ordering of parameters, that is, to a weakly monotonic trend. But because of the way in which the respective test statistics are defined, the alternative hypothesis as given in (12) may not have been stated appropriately and should be replaced by the following one:
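In a conventional notation, which is an assumption here (the symbols $\mu_j$ for the J population means and the labels $H_1$ and $H_1'$ are chosen for illustration and are not taken from the cited sources), the pair of hypotheses referred to as (12) and a replacement alternative of the kind just described can be sketched as:

```latex
\begin{align*}
H_0  &: \mu_1 = \mu_2 = \dots = \mu_J \\
H_1  &: \mu_1 \le \mu_2 \le \dots \le \mu_J
        \quad \text{with at least one strict inequality} \\
H_1' &: \mu_j \le \mu_{j'} \ \text{for all } j < j',
        \quad \text{with } \mu_j < \mu_{j'} \text{ for at least one pair } j < j'
\end{align*}
```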
This formulation of the hypothesis more adequately takes into account the fact that the test results may turn out significant when the rank order of only a single one of all tex2html_wrap_inline1753 pairs of means agrees with the expectation (j < j'). The alternative hypothesis in expression (12), on the other hand, may suggest that this rank order always concerns adjacent means (j = j' - 1). Furthermore, it appears that the tests mentioned above do not even test against the alternative of a weakly monotonic trend, but rather against a more general (and less specific) class of alternatives that allow for rank inversions, which calls into question the usefulness of speaking of 'monotonic' relations (see Berenson, 1982 [3]). This has been shown in some detail for the non-parametric trend test devised by Jonckheere (1954) [32] by Hager (1995, pp. 171-174) [24], and for the test proposed by Johnson and Mehrotra (1971) [31] by Bortz (1993, p. 260) [5]. These tests, therefore, reflect a rather weak (implicit) decision rule which, among other things, allows for rank inversions. This fact causes many researchers to add data-based decisions to their test results.
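Two of the procedures named above can be sketched in a few lines to show how they respond to a rank inversion. The sketch rests on assumptions: the data are hypothetical, the coefficient formula is the maximin solution usually attributed to Abelson and Tukey (1963), and the Jonckheere statistic uses the large-sample normal approximation without a correction for ties (the data contain none). It is not a reproduction of the analyses in the cited sources.

```python
from math import sqrt
from statistics import mean

def abelson_tukey_coefficients(j_levels):
    """Maximin contrast coefficients for a predicted increasing order."""
    def term(i):
        return sqrt(i * (1 - i / j_levels))
    return [term(i - 1) - term(i) for i in range(1, j_levels + 1)]

def jonckheere(groups):
    """Jonckheere statistic and its normal approximation (no tie correction)."""
    statistic = 0.0
    for a in range(len(groups)):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    # Count pairs in the predicted order; ties count one half.
                    statistic += 1.0 if x < y else 0.5 if x == y else 0.0
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    expected = (n ** 2 - sum(s ** 2 for s in sizes)) / 4
    variance = (n ** 2 * (2 * n + 3)
                - sum(s ** 2 * (2 * s + 3) for s in sizes)) / 72
    return statistic, (statistic - expected) / sqrt(variance)

# Hypothetical data with means 3.1, 2, 8, 9.5: groups 1 and 2 are inverted.
groups = [[1.1, 2.1, 3.1, 4.1, 5.1], [0, 1, 2, 3, 4],
          [6, 7, 8, 9, 10], [7.5, 8.5, 9.5, 10.5, 11.5]]

coefficients = abelson_tukey_coefficients(len(groups))
psi = sum(c * mean(g) for c, g in zip(coefficients, groups))
statistic, z_value = jonckheere(groups)

print("coefficients:", [round(c, 3) for c in coefficients])
print("contrast estimate:", round(psi, 3))
print("Jonckheere statistic:", statistic, "z =", round(z_value, 3))
```

With these data the contrast estimate is clearly positive and the z value exceeds the one-sided 5% normal critical value of 1.645, although the first two means are inverted.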

The numerous procedures of ordering and selection (see above) and several multi-stage procedures combining the parametric F test with a rank correlation (Chassan, 1960 [9]; Green & Nimmo-Smith, 1982 [20]; Macdonald & Smith, 1983 [42]) seem to test analogous statistical hypotheses. At any rate, neither kind of procedure addresses the statistical hypothesis of a strictly monotonic trend, as Macdonald and Smith (1983, p. 25) [42] have pointed out, which once again raises the question of further alternative procedures.

Only one of presumably various possibilities is described here. It interprets the testing of hypotheses about qualitative trends as a problem of testing hypotheses by means of planned a priori or focussed contrasts. This is the manner usually advocated for examining hypotheses formulated in advance (see, among others, Kirk, 1982 [36]; Marascuilo & Levin, 1983, p. 337 [43]; Thompson, 1994 [62]), but it is rarely applied in psychological research practice. In my own experience, one of the reasons for this may be that many reviewers demand overall tests, which may then be followed by a multiple comparison procedure. But Winer et al. (1991, p. 146) [68] clearly state: 'A procedure which is appropriate for a series of planned ... [contrasts] is simply to carry out a series of t tests, where t is appropriately defined for the experimental design used' ['comparisons' replaced by 'contrasts']. Nevertheless, this method seems to be suspect to many researchers, who rely on overall tests even if particular contrast hypotheses can be formulated in advance. Maybe a series of t tests is too simple a procedure to be 'scientific', even if there is good reason to perform them?
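A minimal sketch of what such a series of planned t tests could look like for a predicted increasing order is given below. It is an illustration under assumptions, not the strategy developed in the text: the data are hypothetical, each of the J - 1 adjacent pairs is tested one-sidedly with the pooled within-group variance, the critical value is a table entry supplied by the user, and the cumulation of error probabilities across the tests is deliberately left unhandled.

```python
from math import sqrt
from statistics import mean

def adjacent_t_values(groups):
    """One t value per adjacent pair (group j+1 minus group j), pooled error."""
    means = [mean(g) for g in groups]
    df_within = sum(len(g) for g in groups) - len(groups)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g) / df_within
    t_values = []
    for j in range(len(groups) - 1):
        std_error = sqrt(ms_within * (1 / len(groups[j]) + 1 / len(groups[j + 1])))
        t_values.append((means[j + 1] - means[j]) / std_error)
    return t_values, df_within

def strictly_increasing(groups, t_critical):
    """Decide for strict monotonicity only if every adjacent test is significant."""
    t_values, _ = adjacent_t_values(groups)
    return all(t > t_critical for t in t_values)

T_CRIT = 1.746  # tabulated one-sided t(16) at alpha = .05, unadjusted

# Hypothetical data: one ordered pattern, one with groups 1 and 2 inverted.
ordered = [[1, 2, 3, 4, 5], [4, 5, 6, 7, 8],
           [7, 8, 9, 10, 11], [10, 11, 12, 13, 14]]
inverted = [[1, 2, 3, 4, 5], [0, 1, 2, 3, 4],
            [6, 7, 8, 9, 10], [7, 8, 9, 10, 11]]

print("ordered:", strictly_increasing(ordered, T_CRIT))
print("inverted:", strictly_increasing(inverted, T_CRIT))
```

Because the decision is a conjunction over all adjacent pairs, a single inverted or non-significant pair is enough to withhold the decision for strict monotonicity, which is exactly what the global and linear-trend tests discussed above cannot guarantee.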


Methods of Psychological Research 1996, Vol.1, No.4
© 1997 Pabst Science Publishers