Marginal Cumulative Logistic Model of General Order for Multi-way Contingency Tables

For multi-way contingency tables, Bhapkar and Darroch (1990) considered the marginal symmetry model of order $h$. The present paper proposes a marginal cumulative logistic model of order $h$. When $h = 1$, this model reduces to the marginal logistic model (Agresti 2013). The paper also gives a theorem that the marginal symmetry model of order $h$ holds if and only if (i) the marginal cumulative logistic model of order $h$, (ii) the marginal moment equality model of order $h$, and (iii) the marginal symmetry model of order $h - 1$ all hold. The special case of this theorem with $h = 1$ is identical to the result of Tahata, Katakura, and Tomizawa (2007).

For the multi-way table with ordinal categories, several studies have considered marginal cumulative probabilities in order to discuss the inhomogeneity of the first-order marginal distributions. Let $F_i^{(k)}$ denote the first-order marginal cumulative probability and let $L_i^{(k)}$ denote the first-order marginal cumulative logit of $X_k$ for $i = 1, \ldots, r - 1$; $k = 1, \ldots, T$; namely,

$$F_i^{(k)} = \Pr(X_k \le i), \qquad L_i^{(k)} = \log \frac{F_i^{(k)}}{1 - F_i^{(k)}}.$$

The marginal logistic ($\mathrm{ML}^T$) model is defined by

$$L_i^{(k)} = L_i^{(1)} + \Delta_k \quad (i = 1, \ldots, r - 1;\ k = 1, \ldots, T),$$

where $\Delta_1 = 0$ (Agresti 2013, p.442). A special case of this model obtained by putting $\Delta_k = 0$ for all $k$ is the $\mathrm{M}^T_1$ model. For instance, when $T = 2$, see McCullagh (1977). Consider the marginal mean equality ($\mathrm{ME}^T$) model defined by $E(X_1) = \cdots = E(X_T)$.

Agresti (2013, p.440) discussed the decomposition of a model. That is, in general, suppose that model $H_3$ holds if and only if both models $H_1$ and $H_2$ hold. Then, assuming that model $H_1$ holds, the hypothesis that model $H_3$ holds is equivalent to the hypothesis that model $H_2$ holds. Such a decomposition is useful for identifying the reason for a poor fit when model $H_3$ does not fit the data well. Tahata et al. (2007) noted that for an $r^T$ table, the $\mathrm{M}^T_1$ model holds if and only if both the $\mathrm{ML}^T$ and $\mathrm{ME}^T$ models hold. For $T = 2$, see Miyamoto, Niibe, and Tomizawa (2005).

For order $h$ with $1 \le h < T$, denote the $h$th-order marginal cumulative probability $\Pr(X_{s_1} \le i_1, \ldots, X_{s_h} \le i_h)$ by $F^{\boldsymbol{s}_h}_{\boldsymbol{i}}$, where $\boldsymbol{s}_h = (s_1, \ldots, s_h)$ and $\boldsymbol{i} = (i_1, \ldots, i_h)$ with $1 \le s_1 < \cdots < s_h \le T$ and $i_k = 1, \ldots, r$ ($k = 1, \ldots, h$). Note that when some $i_k$ equals $r$, $F^{\boldsymbol{s}_h}_{\boldsymbol{i}}$ reduces to a marginal cumulative probability of lower order. For example, when $i_h = r$,

$$\Pr(X_{s_1} \le i_1, \ldots, X_{s_{h-1}} \le i_{h-1}, X_{s_h} \le r) = F^{\boldsymbol{s}_{h-1}}_{(i_1, \ldots, i_{h-1})}.$$

Then, the $\mathrm{M}^T_h$ model may be expressed as

$$F^{\boldsymbol{s}_h}_{\boldsymbol{i}} = F^{\boldsymbol{t}_h}_{\boldsymbol{j}}$$

for any permutation $\boldsymbol{j} = (j_1, \ldots, j_h)$ of $\boldsymbol{i} = (i_1, \ldots, i_h)$, where $i_k = 1, \ldots, r$ ($k = 1, \ldots, h$), and for any $\boldsymbol{s}_h = (s_1, \ldots, s_h)$ and $\boldsymbol{t}_h = (t_1, \ldots, t_h)$.
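As a minimal illustration of the first-order quantities above, the following Python sketch computes $F_i^{(k)}$, $L_i^{(k)}$ and $E(X_k)$ from a table of cell probabilities. The function names and the $3 \times 3$ table are hypothetical and are not taken from the paper. The logit shift is computed as $L_i^{(2)} - L_i^{(1)}$, which assumes that the marginal logistic model is read as a constant shift $\Delta_k$ of the cumulative logits relative to $X_1$.

```python
from math import log

def cumulative_prob(p, k, i):
    """First-order marginal cumulative probability F_i^(k) = Pr(X_k <= i)."""
    return sum(prob for cell, prob in p.items() if cell[k - 1] <= i)

def cumulative_logit(p, k, i):
    """First-order marginal cumulative logit L_i^(k) = log[F / (1 - F)]."""
    f = cumulative_prob(p, k, i)
    return log(f / (1.0 - f))

def marginal_mean(p, k):
    """E(X_k) with the categories scored 1, ..., r."""
    return sum(prob * cell[k - 1] for cell, prob in p.items())

# Hypothetical counts for a 3 x 3 table (r = 3, T = 2); not data from the paper.
counts = {
    (1, 1): 20, (1, 2): 10, (1, 3): 5,
    (2, 1): 5,  (2, 2): 20, (2, 3): 10,
    (3, 1): 5,  (3, 2): 5,  (3, 3): 20,
}
total = sum(counts.values())
p = {cell: n / total for cell, n in counts.items()}

r = 3
# Differences L_i^(2) - L_i^(1) for i = 1, ..., r - 1.
# Under a pure logit shift these differences do not depend on i.
logit_shifts = [cumulative_logit(p, 2, i) - cumulative_logit(p, 1, i)
                for i in range(1, r)]
means = [marginal_mean(p, k) for k in (1, 2)]
print(logit_shifts, means)
```

In this hypothetical table the two logit differences coincide while the two marginal means differ, so a constant logit shift is compatible with the table but marginal mean equality is not.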
Since $F^{\boldsymbol{s}_h}_{\boldsymbol{i}}$ reduces to a marginal cumulative probability of lower order when some $i_k$ equals $r$, the $\mathrm{M}^T_h$ model may also be expressed as

$$F^{\boldsymbol{s}_l}_{\boldsymbol{i}} = F^{\boldsymbol{t}_l}_{\boldsymbol{j}} \quad (l = 1, \ldots, h)$$

for any permutation $\boldsymbol{j} = (j_1, \ldots, j_l)$ of $\boldsymbol{i} = (i_1, \ldots, i_l)$, where $i_k = 1, \ldots, r - 1$ ($k = 1, \ldots, l$), and for any $\boldsymbol{s}_l = (s_1, \ldots, s_l)$ and $\boldsymbol{t}_l = (t_1, \ldots, t_l)$. In order to emphasize, hereafter we refer to the r − 1 as r − 1 in this paper. Note that if the $\mathrm{M}^T_h$ model holds, then the $\mathrm{M}^T_{h-1}$ model holds, but the converse does not always hold. Hence we are interested in the additional model that is needed for the $\mathrm{M}^T_h$ model to hold when the $\mathrm{M}^T_{h-1}$ model holds. The $\mathrm{ML}^T$ model focuses on the first-order ($h = 1$) marginal distributions, and describes the inhomogeneity structure based on the logits of $F_i^{(k)}$.

In this paper, we propose a marginal cumulative logistic model of general order, and give a decomposition of the $\mathrm{M}^T_h$ model by using the proposed model. Section 2 proposes the $h$th-order marginal cumulative logistic model. Section 3 gives the decompositions of the $\mathrm{M}^T_h$ model. Section 4 presents the goodness-of-fit test. Section 5 shows some examples. Finally, Section 6 provides concluding remarks.
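The remark above, that marginal symmetry of order $h$ implies that of order $h - 1$ but not conversely, can be checked numerically on small tables. The sketch below is only an illustration under the assumption that the $\mathrm{M}^T_h$ model is read as equality of the $h$th-order marginal cumulative probabilities across all variable subsets and all permutations of the cut points. The function names and both $2 \times 2 \times 2$ tables are hypothetical.

```python
from itertools import combinations, combinations_with_replacement, permutations

def cumulative_prob(p, s, i):
    """Pr(X_{s_1} <= i_1, ..., X_{s_h} <= i_h) for 1-based variable indices s."""
    return sum(prob for cell, prob in p.items()
               if all(cell[sk - 1] <= ik for sk, ik in zip(s, i)))

def holds_marginal_symmetry(p, r, T, h, tol=1e-9):
    """Check numerically whether order-h marginal symmetry holds for p."""
    subsets = list(combinations(range(1, T + 1), h))
    # Cut points run up to r, so lower-order margins are covered as well.
    for i in combinations_with_replacement(range(1, r + 1), h):
        values = [cumulative_prob(p, s, j)
                  for s in subsets
                  for j in set(permutations(i))]
        if max(values) - min(values) > tol:
            return False
    return True

def normalise(counts):
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

# Hypothetical table A: counts depend only on how many variables equal 1.
table_a = normalise({
    (1, 1, 1): 4, (1, 1, 2): 2, (1, 2, 1): 2, (2, 1, 1): 2,
    (1, 2, 2): 2, (2, 1, 2): 2, (2, 2, 1): 2, (2, 2, 2): 4,
})
# Hypothetical table B: X_1 always equals X_2, all one-way margins are equal.
table_b = normalise({
    (1, 1, 1): 5, (1, 1, 2): 5, (2, 2, 1): 5, (2, 2, 2): 5,
})

a_order2 = holds_marginal_symmetry(table_a, r=2, T=3, h=2)
a_order1 = holds_marginal_symmetry(table_a, r=2, T=3, h=1)
b_order1 = holds_marginal_symmetry(table_b, r=2, T=3, h=1)
b_order2 = holds_marginal_symmetry(table_b, r=2, T=3, h=2)
print(a_order2, a_order1, b_order1, b_order2)
```

Table A satisfies the check at both orders, whereas table B satisfies it at order 1 only, which is the kind of case that motivates asking what must be added to the order $h - 1$ model.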

Models
For a fixed $h$ ($1 \le h < T$), consider a model defined by for any permutation $\boldsymbol{j} = (j_1, \ldots, j_h)$ of $\boldsymbol{i} = (i_1, \ldots, i_h)$ and $\boldsymbol{l}_h = (1, \ldots, h)$, where $i_k = 1, \ldots, r - 1$ ($k = 1, \ldots, h$), and for any $\boldsymbol{s}_h = (s_1, \ldots, s_h)$ with $1 \le s_1 < \cdots < s_h \le T$, where $\Delta_{\boldsymbol{l}_h} = 0$. We shall refer to this model as the $h$th-order marginal cumulative logistic ($\mathrm{ML}^T_h$) model. the $\mathrm{ML}^T_h$ model can be expressed as a logistic function,

Next, consider a model defined by

$$E(X_{s_1} \cdots X_{s_h}) = E(X_1 \cdots X_h)$$

for any $\boldsymbol{s}_h = (s_1, \ldots, s_h)$ with $1 \le s_1 < \cdots < s_h \le T$. We shall refer to this model as the $h$th-order marginal moment equality ($\mathrm{ME}^T_h$) model. When $h = 1$, the $\mathrm{ME}^T_h$ model is identical to the $\mathrm{ME}^T$ model.

Tahata et al. (2007) showed the decomposition of the $\mathrm{M}^T_1$ model. We shall consider the decomposition of the $\mathrm{M}^T_h$ model for an $r^T$ table. Let $X_k^* = r + 1 - X_k$ for $k = 1, \ldots, T$. First, we obtain the following lemma.
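To make the moment equality structure concrete, the following Python sketch computes the $h$th-order product moments for every subset of $h$ variables. It assumes that the $\mathrm{ME}^T_h$ model compares the product moments $E(X_{s_1} \cdots X_{s_h})$, with categories scored $1, \ldots, r$. The function name and the $2 \times 2 \times 2$ counts are hypothetical.

```python
from itertools import combinations
from math import prod

def product_moment(p, s):
    """E(X_{s_1} * ... * X_{s_h}) for 1-based variable indices s."""
    return sum(prob * prod(cell[sk - 1] for sk in s)
               for cell, prob in p.items())

# Hypothetical counts for a 2 x 2 x 2 table (r = 2, T = 3); not data from the paper.
counts = {
    (1, 1, 1): 5, (1, 1, 2): 1, (1, 2, 1): 2, (1, 2, 2): 2,
    (2, 1, 1): 3, (2, 1, 2): 1, (2, 2, 1): 2, (2, 2, 2): 4,
}
total = sum(counts.values())
p = {cell: n / total for cell, n in counts.items()}

T, h = 3, 2
moments = {s: product_moment(p, s)
           for s in combinations(range(1, T + 1), h)}
# A spread of zero would mean that all second-order product moments agree.
spread = max(moments.values()) - min(moments.values())
print(moments, spread)
```

For these counts the three second-order moments differ, so this hypothetical table would not satisfy second-order moment equality under the stated reading.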

Decompositions of the marginal symmetry model
Proof.
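for any $1 \le s_1 < \cdots < s_h \le T$. Since the $\mathrm{M}^T_{h-1}$ model holds, all the $\mathrm{ME}^T_k$ models hold for $k = 1, \ldots, h - 1$. Thus when the $\mathrm{M}^T_{h-1}$ model holds, the $\mathrm{ME}^T_h$ model is identical to the equation,

From Lemma 1, we obtain the following theorem.

Theorem 1. The $\mathrm{M}^T_h$ model holds if and only if the $\mathrm{ML}^T_h$, $\mathrm{ME}^T_h$, and $\mathrm{M}^T_{h-1}$ models all hold.

We note that Theorem 1 is a generalization of the result given by Tahata et al. (2007). Also, we obtain the following corollary from Theorem 1.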

Goodness-of-fit test
Let $n_{i_1 \ldots i_T}$ denote the observed frequency in the $(i_1, \ldots, i_T)$th cell of the $r^T$ table. Assume that the cell frequencies of the $r^T$ table follow a multinomial distribution. The maximum likelihood estimates (MLEs) of the expected frequencies under each model can be obtained by applying the Newton-Raphson method to the log-likelihood equations. Each model can be tested for goodness-of-fit using, for example, the likelihood ratio chi-squared statistic (denoted by $G^2$) with the corresponding degrees of freedom (df). The test statistic $G^2$ for model $H$ is given by

$$G^2(H) = 2 \sum_{i_1 = 1}^{r} \cdots \sum_{i_T = 1}^{r} n_{i_1 \ldots i_T} \log \frac{n_{i_1 \ldots i_T}}{\hat{m}_{i_1 \ldots i_T}},$$

where $\hat{m}_{i_1 \ldots i_T}$ is the MLE of the expected frequency $m_{i_1 \ldots i_T}$ under model $H$. Table 1 lists the df for each model. We note that the number of df for the $\mathrm{M}^T_h$ model is equal to the sum of those for the decomposed models.
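The statistic $G^2$ is straightforward to compute once fitted frequencies are available. The sketch below is an illustration only: it uses a hypothetical $2 \times 2$ table ($r = 2$, $T = 2$), for which marginal homogeneity coincides with symmetry and the fitted values have the closed form $(n_{ij} + n_{ji})/2$, so no Newton-Raphson iteration is needed. The function name and the counts are assumptions for this example.

```python
from math import log

def likelihood_ratio_statistic(observed, fitted):
    """G^2 = 2 * sum of n * log(n / m_hat); cells with n = 0 contribute zero."""
    return 2.0 * sum(
        n * log(n / fitted[cell]) for cell, n in observed.items() if n > 0
    )

# Hypothetical 2 x 2 counts; not data from the paper.
observed = {(1, 1): 10, (1, 2): 6, (2, 1): 2, (2, 2): 12}

# Closed-form fitted values under symmetry for a 2 x 2 table.
fitted = {(i, j): (observed[(i, j)] + observed[(j, i)]) / 2.0
          for (i, j) in observed}

g2 = likelihood_ratio_statistic(observed, fitted)
df = 1  # one constraint: the two off-diagonal expected frequencies are equal
print(round(g2, 3), df)
```

For larger tables and for the models considered in this paper, the fitted values would come from an iterative fit rather than from a closed form.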

[Table 1: Models and their degrees of freedom]

Akaike's (1974) information criterion (AIC) is used to choose the preferable model among different models, including non-nested models. For details, see Konishi and Kitagawa (2008). Since only the difference between AIC values is required when two models are compared, a common constant of AIC can be ignored. We may therefore use a modified AIC defined by

$$\mathrm{AIC}^+ = G^2 - 2(\text{number of df}).$$
Thus, for the data, the model with the minimum AIC + is the preferable model. This criterion will be used in the next section.
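The selection rule can be sketched in a few lines of Python. The model labels and the $G^2$ and df values below are hypothetical placeholders, not results from the paper.

```python
def aic_plus(g2, df):
    """Modified AIC: G^2 minus twice the number of degrees of freedom."""
    return g2 - 2 * df

# Hypothetical (G^2, df) pairs for three candidate models.
candidates = {
    "model A": (2.5, 2),
    "model B": (4.1, 4),
    "model C": (9.8, 6),
}

scores = {name: aic_plus(g2, df) for name, (g2, df) in candidates.items()}
# The preferable model is the one with the smallest modified AIC.
preferred = min(scores, key=scores.get)
print(scores, preferred)
```

Note that a model with a larger $G^2$ can still be preferred when it has more degrees of freedom, as model B is here.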

Examples
Consider the data in Tables 2 and 3.

Table 4 gives the values of $G^2$ and $\mathrm{AIC}^+$ for the models applied to the data in Table 2, and shows that all models fit the data well, since each model is accepted at the 0.05 significance level. Since these models include non-nested models, we use $\mathrm{AIC}^+$ to choose the preferable model. Since the $\mathrm{ML}^3_2$ model has the minimum $\mathrm{AIC}^+$ value, it is the preferable model among the models. Thus it is inferred that there is a symmetry structure, but not homogeneity, for the second-order marginal distributions. With regard to the inhomogeneity structure, the MLEs of the parameters $\exp(\Delta_{(1,3)})$ and $\exp(\Delta_{(2,3)})$ under the $\mathrm{ML}^3_2$ model are 0.98 and 1.05, respectively. For example, under the $\mathrm{ML}^3_2$ model, the odds that the opinions on education and the environment are both 'too little' instead of neither being 'too little' are estimated to be 0.98 times the odds that the opinions on education and assistance to the poor are both 'too little' instead of neither being 'too little'. Similar results can be obtained for the other categories, although the details are omitted. This indicates that there is a location shift on a logistic scale between the marginal distribution for the opinions on education and the environment and that for the opinions on education and assistance to the poor. The case of the environment and assistance to the poor can be interpreted in the same way.

Table 5 gives the values of $G^2$ and $\mathrm{AIC}^+$ for the models applied to the data in Table 3. These show that the $\mathrm{ML}^3_1$ model fits the data well, although the other models fit the data poorly. From Corollary 1, we see that the poor fit of the $\mathrm{M}^3_2$ model is due to the lack of fit of the $\mathrm{ME}^3_2$, $\mathrm{ME}^3_1$, and $\mathrm{ML}^3_2$ models rather than the $\mathrm{ML}^3_1$ model.
Therefore, it is inferred that the poor fit of the $\mathrm{M}^3_2$ model is caused by the lack of structure of (i) the equality of the second-order moments of $(X_1, X_2)$, $(X_1, X_3)$ and $(X_2, X_3)$, (ii) the equality of the means of $X_1$, $X_2$ and $X_3$, and (iii) the $\mathrm{ML}^3_2$ model. Under the $\mathrm{ML}^3_1$ model, the MLEs of the parameters $\exp(\Delta_{(2)})$ and $\exp(\Delta_{(3)})$ are 1.55 and 1.17, respectively. Thus, under the $\mathrm{ML}^3_1$ model, the odds that the opinion is 'too little' instead of not 'too little' are estimated to be 1.55 times as high for education as for the environment. In a similar manner, the odds for the opinion on education are estimated to be 1.55 times as high as those for the environment in either case. Furthermore, we can interpret this as a location shift on a logistic scale between the marginal distribution for the opinions on education and that for the opinions on the environment. The case of education and assistance to the poor can be interpreted in the same way.

Concluding remarks
In this paper, we have (i) proposed the $\mathrm{ML}^T_h$ model and (ii) given the decomposition of the $\mathrm{M}^T_h$ model. The $\mathrm{ML}^T_h$ model is an extension of the $\mathrm{ML}^T$ model discussed by Agresti (2013, p.442), and the decomposition using the $\mathrm{ML}^T_h$ model is a generalization of the result given by Tahata et al. (2007). The decomposition of the $\mathrm{M}^T_h$ model should be useful for exploring the reason for a poor fit when the $\mathrm{M}^T_h$ model does not fit the data. Moreover, Theorem 1 leads to Corollary 1, which decomposes the $\mathrm{M}^T_h$ model into more models. Decomposing the $\mathrm{M}^T_h$ model into more (three or four) models rather than into two models is useful for exploring the reason for the poor fit in more detail when the $\mathrm{M}^T_h$ model does not fit well. In practice, Corollary 1 reveals the origin of the poor fit of the $\mathrm{M}^3_2$ model (see Section 5.2).