On the Measurement of Model Fit for Sparse Categorical Data
2012 (English). Doctoral thesis, comprehensive summary (Other academic)
This thesis consists of four papers that deal with several aspects of the measurement of model fit for categorical data. In all papers, special attention is paid to situations with sparse data.
The first paper concerns the computational burden of calculating Pearson's goodness-of-fit statistic for situations where many response patterns have observed frequencies that equal zero. A simple solution is presented that allows for the computation of the total value of Pearson's goodness-of-fit statistic when the expected frequencies of response patterns with observed frequencies of zero are unknown.
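The abstract does not spell out the solution, so the following is only a minimal sketch of one well-known identity that achieves this kind of shortcut, not necessarily the method of the paper. It assumes that the expected frequencies over all response patterns sum to the sample size N. Each pattern with an observed frequency of zero then contributes exactly its expected frequency to the statistic, so the total contribution of all unobserved patterns is N minus the sum of the expected frequencies of the observed patterns. The function name and the dictionary-based interface are hypothetical.

```python
def pearson_x2_sparse(observed, expected_for_observed):
    """Pearson's X^2 using only the response patterns that were observed.

    observed: dict mapping pattern -> observed count (> 0)
    expected_for_observed: dict mapping the same patterns -> expected frequency

    Assumption (not taken from the thesis): the expected frequencies of
    ALL patterns, observed or not, sum to the sample size N.
    """
    n = sum(observed.values())

    # Contribution of the observed patterns: sum of (O - E)^2 / E.
    x2_observed = sum(
        (count - expected_for_observed[pattern]) ** 2
        / expected_for_observed[pattern]
        for pattern, count in observed.items()
    )

    # An unobserved pattern contributes (0 - E)^2 / E = E, so all
    # unobserved patterns together contribute N minus the expected
    # mass already covered by the observed patterns.
    x2_unobserved = n - sum(expected_for_observed[p] for p in observed)

    return x2_observed + x2_unobserved


# Four patterns with probabilities 0.4, 0.3, 0.2, 0.1 and N = 10;
# only the first two patterns were observed.
x2 = pearson_x2_sparse({"a": 6, "b": 4}, {"a": 4.0, "b": 3.0})
```

In this example the full sum over all four patterns is 1 + 1/3 + 2 + 1, which is the value the shortcut returns without ever evaluating the two unobserved patterns individually.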
In the second paper, a new fit statistic is presented that is a modification of Pearson's statistic but is not adversely affected by response patterns with very small expected frequencies. It is shown that the new statistic is asymptotically equivalent to Pearson's goodness-of-fit statistic and hence asymptotically chi-square distributed.
In the third paper, comprehensive simulation studies compare seven asymptotically equivalent fit statistics, including the new statistic. The situations considered cover both multinomial sampling and factor analysis. Goodness-of-fit tests are conducted using both the asymptotic and the bootstrap approach, under the null hypothesis and when the data contain a certain degree of misfit. The results indicate that recommendations on which fit statistic to use can depend on the situation investigated and on the purpose of the model test. Power varies substantially across the fit statistics and with the cause of the model misfit. The findings further indicate that the new statistic proposed in this thesis gives rather stable results and, compared with the other fit statistics, shows no disadvantageous characteristics.
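To illustrate what a bootstrap goodness-of-fit test can look like in the simplest multinomial setting, here is a hedged sketch. It assumes a fully specified null hypothesis (known cell probabilities, all strictly positive) and uses Pearson's statistic; the thesis studies seven statistics and also factor models with estimated parameters, none of which is reproduced here. The function names, the number of replicates, and the add-one p-value correction are illustrative choices rather than details from the thesis.

```python
import random


def pearson_x2(counts, probs, n):
    """Pearson's X^2 for cell counts against fully specified probabilities."""
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def bootstrap_p_value(counts, probs, n_boot=2000, seed=0):
    """Parametric bootstrap p-value for a simple multinomial null.

    Assumptions (illustrative only): probs is fully specified under the
    null and every entry is strictly positive.
    """
    rng = random.Random(seed)
    n = sum(counts)
    cells = range(len(probs))
    observed_stat = pearson_x2(counts, probs, n)

    exceed = 0
    for _ in range(n_boot):
        # Draw one multinomial sample of size n under the null.
        draws = rng.choices(cells, weights=probs, k=n)
        simulated = [0] * len(probs)
        for cell in draws:
            simulated[cell] += 1
        if pearson_x2(simulated, probs, n) >= observed_stat:
            exceed += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


p_fit = bootstrap_p_value([5, 5, 5, 5], [0.25] * 4, n_boot=200)
p_misfit = bootstrap_p_value([20, 0, 0, 0], [0.25] * 4, n_boot=200)
```

The bootstrap reference distribution replaces the asymptotic chi-square distribution, which is the reason it is of interest for sparse tables where the asymptotic approximation may be poor.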
Finally, the fourth paper draws attention to the potential necessity of determining goodness-of-fit by two-sided model testing. A simulation study investigates the differences between the one-sided and the two-sided approach to model testing. Situations are identified in which two-sided model testing has advantages over the one-sided approach.
Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2012. 22 p.
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Social Sciences, ISSN 1652-9030 ; 79
goodness-of-fit, sparseness, model fit, categorical data, fit statistic, sparse contingency table
Probability Theory and Statistics
Research subject Statistics
Identifiers
URN: urn:nbn:se:uu:diva-173768
ISBN: 978-91-554-8394-4
OAI: oai:DiVA.org:uu-173768
DiVA: diva2:525091
2012-06-14, Hörsal 2, Ekonomikum, Kyrkogårdsgatan 10, Uppsala, 10:15 (English)
Shukur, Ghazi, professor
Sörbom, Dag, docent
List of papers