A test for homogeneity of g ≥ 2 covariance matrices is presented when the dimension, p, may exceed the sample sizes, n_i, i = 1, ..., g, and the populations may not be normal. Under some mild assumptions on the covariance matrices, the asymptotic distribution of the test statistic is shown to be normal as n_i, p → ∞. Under the null hypothesis, the test is extended to assess whether the common covariance matrix has a specified structure, including sphericity. The theory of U-statistics is employed in constructing the tests and deriving their limits. Simulations are used to show the accuracy of the tests.
For a random sample of n iid p-dimensional vectors, each partitioned into b sub-vectors of dimensions pi, i=1,…,b, tests for zero correlation of sub-vectors are presented when pi ≫ n and the distribution need not be normal. The test statistics are composed of U-statistics based estimators of the Frobenius norm measuring the distance between the null and alternative hypotheses. Asymptotic distributions of the tests are provided for n,pi → ∞, with their finite-sample performance demonstrated through simulations. Some related tests are discussed. A real data application is also given.
Ahmad et al. (in press) presented test statistics for sphericity and identity of the covariance matrix of a multivariate normal distribution when the dimension, p, exceeds the sample size, n. In this note, we show that their statistics are robust to the normality assumption, when normality is replaced with certain mild assumptions on the traces of the covariance matrix. Under such assumptions, the test statistics are shown to follow the same asymptotic normal distribution as under normality for large p, also when p ≫ n. The asymptotic normality is proved using the theory of U-statistics, and is based on very general conditions, particularly avoiding any relationship between n and p.
The Gamma Regression Model (GRM) has a variety of applications in the medical sciences and other disciplines. The results of the GRM may be misleading in the presence of multicollinearity. In this article, a new biased estimator, a James–Stein estimator, is proposed to reduce the impact of correlated regressors in the GRM. The mean squared error (MSE) properties of the proposed estimator are derived and compared with those of existing estimators. We conducted a simulation study and employed the MSE and bias evaluation criteria to judge the proposed estimator's performance. Finally, two medical datasets are considered to show the benefit of the proposed estimator over existing estimators.
This article presents the techniques of likelihood prediction for generalized linear mixed models. Methods of likelihood prediction are explained through a series of examples, from a classical one to more complicated ones. The examples show, in simple cases, that likelihood prediction (LP) coincides with already known best frequentist practice such as the best linear unbiased predictor. This article outlines a way to deal with covariate uncertainty while producing predictive inference. Using a Poisson errors-in-variables generalized linear model, it is shown that in certain cases LP produces better results than already known methods.
In this paper we present several goodness-of-fit tests for the centralized Wishart process, a popular matrix-variate time series model used to capture the stochastic properties of realized covariance matrices. The new test procedures are based on the extended Bartlett decomposition derived from the properties of the Wishart distribution and allow one to obtain sets of independent, standard normally distributed random variables under the null hypothesis. Several tests for normality and independence are then applied to these variables in order to support or reject the underlying assumption of a centralized Wishart process. To investigate the influence of estimated parameters on the suggested testing procedures in the finite-sample case, a simulation study is conducted. Finally, the new test methods are applied to real data consisting of realized covariance matrices computed for the returns on six assets traded on the New York Stock Exchange.
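The Bartlett decomposition underlying such tests can be illustrated in the simplest iid Wishart case: the lower-triangular Cholesky factor of a Wishart matrix with identity scale has independent entries, with standard normal off-diagonals and chi-squared diagonals. The following is a minimal sketch of that property only, not the paper's extended decomposition for the estimated-parameter process:

```python
import numpy as np

# If S ~ W_p(nu, I_p), the Cholesky factor T of S has independent entries:
# T[i, i]^2 ~ chi2(nu - i) (0-indexed) and T[i, j] ~ N(0, 1) for i > j.
# We check the off-diagonal claim empirically for one entry.
rng = np.random.default_rng(0)
p, nu, reps = 3, 10, 2000

offdiag = []
for _ in range(reps):
    X = rng.standard_normal((nu, p))
    S = X.T @ X                      # one draw from Wishart(nu, I_p)
    T = np.linalg.cholesky(S)        # lower-triangular factor
    offdiag.append(T[1, 0])          # should be a standard normal draw

offdiag = np.asarray(offdiag)
# the sample mean should be near 0 and the sample variance near 1
```

Under the null of a (centralized) Wishart model, such standardized entries can then be fed into standard normality and independence tests, which is the strategy the paper builds on.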
This paper describes testing for periodicity in the presence of FD processes. We propose two approaches for testing the periodicity based on Fisher's test. The first one is performed using the periodogram, which has been divided into different parts. The second one is based on the discrete wavelet transform. Properties of the tests are illustrated by means of Monte Carlo simulations.
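As an illustration of the baseline procedure both approaches build on, Fisher's test compares the largest periodogram ordinate to the sum of all ordinates; a large ratio signals a hidden periodicity. The sketch below is the plain white-noise version of the test, not the paper's divided-periodogram or wavelet variants:

```python
import numpy as np
from math import comb, floor

def fishers_g(x):
    """Fisher's g statistic: largest periodogram ordinate over the sum."""
    n = len(x)
    fx = np.fft.rfft(x - np.mean(x))[1:]   # drop the zero frequency
    if n % 2 == 0:
        fx = fx[:-1]                       # drop the Nyquist ordinate
    I = np.abs(fx) ** 2 / n                # periodogram ordinates
    return I.max() / I.sum(), len(I)

def fishers_pvalue(g, m):
    """Exact null p-value P(G > g) for m ordinates of Gaussian white noise."""
    kmax = floor(1 / g)
    s = sum((-1) ** (k - 1) * comb(m, k) * (1 - k * g) ** (m - 1)
            for k in range(1, kmax + 1))
    return min(max(s, 0.0), 1.0)

rng = np.random.default_rng(1)
t = np.arange(200)
x = np.sin(2 * np.pi * t / 10) + 0.5 * rng.standard_normal(200)
g, m = fishers_g(x)
pval = fishers_pvalue(g, m)
# a strong periodic component yields a large g and a very small p-value
```

Dividing the periodogram into parts, as the first proposed approach does, amounts to applying this statistic to subsets of the ordinates.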
In this article, we propose some diagnostic techniques for the inverse Gaussian regression model (IGRM), which is appropriate for modeling a positively skewed continuous response variable. Moreover, two new influence diagnostics are proposed for the IGRM, namely the covariance ratio (CVR) and Welsch's distance (WD). The proposed methods are compared with existing approaches through Monte Carlo simulation under different factors. In addition, the benefit of the proposed methods is assessed using a real application. Based on the simulation and empirical results, we observe that the proposed methods perform better than the existing methods in detecting influential observations.
The Poisson distribution is here used to illustrate Bayesian inference concepts with the ultimate goal to construct credible intervals for a mean. The evaluation of the resulting intervals is in terms of mismatched priors and posteriors. The discussion is in the form of an imaginary dialog between a teacher and a student, who have met earlier, discussing and evaluating the Wald and score confidence intervals, as well as confidence intervals based on transformation and bootstrap techniques. From the perspective of the student the learning process is akin to a real research situation. The student is supposed to have studied mathematical statistics for at least two semesters.
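The conjugate setup that such a dialog typically revolves around can be sketched directly: with iid Poisson(λ) data and a Gamma(a, b) prior (shape a, rate b), the posterior is Gamma(a + Σx, b + n), and an equal-tailed credible interval follows from its quantiles. The prior below is an illustrative choice, and the quantiles are approximated by Monte Carlo so the sketch needs only NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(lam=4.0, size=30)       # simulated data, true mean 4.0

# Gamma(a, b) prior with shape a and rate b; a = b = 1 is an assumption
# made for illustration, not a recommendation from the article.
a, b = 1.0, 1.0
shape, rate = a + x.sum(), b + len(x)   # conjugate Gamma posterior

# Equal-tailed 95% credible interval via posterior simulation
draws = rng.gamma(shape, 1.0 / rate, size=100_000)
lo, hi = np.percentile(draws, [2.5, 97.5])
# the interval should be a short range around the true mean 4.0
```

Evaluating how such intervals behave when the prior is mismatched with the data-generating mechanism is the exercise the article walks the student through.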
In many applications of linear regression models, randomness due to model selection is commonly ignored in post-model selection inference. In order to account for the model selection uncertainty, least-squares frequentist model averaging has been proposed recently. We show that the confidence interval from model averaging is asymptotically equivalent to the confidence interval from the full model. The finite-sample confidence intervals based on approximations to the asymptotic distributions are also equivalent if the parameter of interest is a linear function of the regression coefficients. Furthermore, we demonstrate that this equivalence also holds for prediction intervals constructed in the same fashion.
Linear mixed models are standard models to analyze repeated measures or longitudinal data under the assumption of normality for the random components in the model. Although mixed models are often used in both frequentist and Bayesian inference, their evaluation from a robustness perspective has not received as much attention in Bayesian inference as in frequentist inference. The aim of this study is to evaluate Bayesian tests in mixed models for their robustness to normality. We use a general class of exponential power distributions, EPD, and particularly focus on testing fixed effects in longitudinal models. The EPD class contains both light- and heavy-tailed distributions, with normality as a special case. Further, we consider a new paradigm of Bayesian testing decision theory in which the hypotheses are formulated as a mixture model, with subsequent testing based on the posterior distribution of the mixture weights. It is shown that the EPD class provides a flexible alternative to the normality assumption, particularly in the presence of outliers. Real data applications are also demonstrated.
A new method for constructing absolutely continuous two-dimensional copulas by differential equations is presented. The copulas are symmetric with respect to reflection in the opposite diagonal. The support of the copula density may be prescribed to arbitrary opposite symmetric hypographs of invertible functions, containing the diagonal. The method is applied to toxicological probit modeling, where new compatibility conditions for the probit parameters are derived.
In this article we suggest a new multivariate autoregressive process for modeling time-dependent extreme value distributed observations. The idea behind the approach is to transform the original observations into latent variables that are univariate normally distributed. Then the vector autoregressive DCC model is fitted to the multivariate latent process. The distributional properties of the suggested model are extensively studied. The process parameters are estimated by applying a two-stage estimation procedure. We derive a prediction interval for future values of the suggested process. The results are applied in an empirical study by modeling the behavior of extreme daily stock prices.
This article proposes a simplification of the model for dependent binary variables presented in Cox and Snell (1989). The new model, referred to as the simplified Cox model, is developed for identically distributed and dependent binary variables. Properties of the model are presented, including expressions for the log-likelihood function and the Fisher information. Under mutual independence, a general expression for the restrictions on the parameters is derived. The simplified Cox model is illustrated using a data set from a clinical trial.
This paper introduces a new approach to incorporating time-dependent overdispersion for Poisson-related regression models. To handle the added flexibility in conditional heteroskedasticity in time series count data models, some well-known estimators are adapted, and a generalized method of moments (GMM) estimator is suggested. The estimators are applied to a time series of self-feeding activity in Arctic charr. There is strong support for both a dynamic conditional mean function and a dynamic model for the overdispersion.
Lagrange multiplier (LM) test statistics are derived for testing a linear moving average model against an asymmetric moving average model, and an LM-type test is derived against an additive smooth transition moving average model. The latter model is introduced in the paper. The small-sample performance of the proposed tests is evaluated in a Monte Carlo study and compared to Wald and likelihood ratio statistics. The size properties of the Lagrange multiplier tests are better than those of the other tests.
The count data model studied in the paper extends the Poisson model by allowing for overdispersion and serial correlation. Alternative approaches to estimating the nuisance parameters, required for the correction of the Poisson maximum likelihood covariance matrix estimator and for a quasi-likelihood estimator, are studied. The estimators are evaluated by finite-sample Monte Carlo experimentation. It is found that the Poisson maximum likelihood estimator with corrected covariance matrix estimators provides reliable inferences for longer time series. Overdispersion test statistics are well-behaved, while conventional portmanteau statistics for white noise have sizes that are too large. Two empirical illustrations are included.
The celebrated Black–Scholes model made the assumption of constant volatility, but empirical studies on implied volatility and asset dynamics motivated the use of stochastic volatilities. Christoffersen showed in 2009 that multi-factor stochastic volatility models capture asset dynamics more realistically. Fouque used such models in 2012 to price European options. In 2013, Chiarella and Ziveyi took up Christoffersen's ideas and introduced asset dynamics in which two volatilities of the Heston type act separately and independently on the asset price; using a Fourier transform for the asset price process and a double Laplace transform for the two volatility processes, they solved a pricing problem for American options. This paper considers the Chiarella and Ziveyi model and parameterizes it so that the volatilities revert to their long-run means at rates that mimic fast (for example, daily) and slow (for example, seasonal) random effects. Applying the asymptotic expansion method presented by Fouque in 2012, we give an extensive and detailed derivation of approximate prices for European options. We also present numerical studies on the behavior and accuracy of our first- and second-order asymptotic expansion formulas.
We present a measurement error model for multivariate replicated data and focus on improved likelihood ratio tests for the parameters of interest. By assuming that the random terms follow scale mixtures of normal distributions, the model provides robust inference and can accommodate both error-prone and error-free covariates. We derive modified versions of the original likelihood ratio statistics that achieve better asymptotic properties with a high degree of accuracy. Simulation studies are conducted to display the finite-sample behavior compared to the unmodified counterparts. The practical utility is illustrated through a root decomposition dataset.
We consider a stochastic process, the homogeneous spatial immigration-death (HSID) process, a spatial birth-death process whose building blocks are (i) an immigration-death (ID) process (a continuous-time Markov chain) and (ii) a probability distribution assigning iid spatial locations to all events. For the ID process, we derive the likelihood function, reduce the likelihood estimation problem to one dimension, and prove consistency and asymptotic normality for the maximum likelihood estimators (MLEs) under a discrete sampling scheme. We additionally prove consistency for the MLEs of HSID processes. In connection with the growth-interaction process, which has an HSID process as its basis, we also fit HSID processes to Scots pine data.
This paper derives first-order sampling moments of individual Mahalanobis distances (MD) in cases where the dimension p of the variable is proportional to the sample size n. Asymptotic expected values as n, p → ∞ are derived under the assumption p/n → c, 0 ⩽ c < 1. It is shown that some types of standard estimators remain unbiased in this case, while others are asymptotically biased, a property that appears to have gone unnoticed in the literature. Second-order moments are also supplied to give additional insight into the matter.
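The bias phenomenon can be reproduced in a few lines. For a fresh observation independent of the sample, the squared MD based on the estimated mean and covariance has, under normality, expectation p(n+1)(n−1)/(n(n−p−2)), which stays well above p when p/n is not small. A simulation sketch with standard normal data (an illustrative choice, not the paper's general setting):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 100, 50, 200          # p/n = 0.5, well inside 0 <= c < 1

ratios = []
for _ in range(reps):
    train = rng.standard_normal((n, p))   # true mu = 0, Sigma = I
    test = rng.standard_normal(p)         # fresh observation
    mu_hat = train.mean(axis=0)
    S = np.cov(train, rowvar=False)       # sample covariance, divisor n - 1
    d = test - mu_hat
    d2 = d @ np.linalg.solve(S, d)        # squared Mahalanobis distance
    ratios.append(d2 / p)                 # with known parameters, mean would be 1

avg_ratio = float(np.mean(ratios))
# under normality, E[d2 / p] = (n + 1)(n - 1) / (n (n - p - 2)) ≈ 2.08 here
```

The ratio hovering near 2 rather than 1 is exactly the kind of asymptotic bias the paper quantifies.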
Sufficient conditions for invertibility of non-linear time series models are available in the literature only for a few special cases. In this paper a practical and general method for checking invertibility is presented. Briefly stated, it consists of feeding independent and identically distributed innovations into the non-linear model and then observing whether the model blows up or not. Using this idea invertibility conditions are derived for several recently proposed non-linear moving average models. Finally, the method is applied to a number of bilinear models fitted to economic time series.
This paper explores multivariate data using principal component analysis (PCA). Traditionally, two different approaches to PCA have been considered, an algebraic descriptive one and a probabilistic one. Here, a third type of PCA approach, lying somewhere between the two traditional approaches, called the fixed effects PCA model, is considered. This model includes mainly geometrical, rather than probabilistic assumptions, such as the optimal choice of dimensionality and metric. The model is designed to account for any possible prior information about the noise in the data to yield better estimates. Parameters are estimated by minimizing a least-squares criterion with respect to a specified metric. A suggestion of how the fixed effects PCA estimates can be improved in a common principal component (CPC) environment is made. If the CPC assumption is fulfilled, then the fixed effects PCA model can consider more information by incorporating common principal component analysis (CPCA) theory into the estimation procedure.
A classical result in risk theory is the Cramér-Lundberg approximation which says that under some general conditions the exponentially normalized ruin probability converges. In this article, we state an explicit rate of convergence for the Cramér-Lundberg approximation for ruin probabilities in the case where claims are bounded, which is realistic for, e.g., reinsurance models. The method, used to get the corresponding results, is based on renewal and coupling arguments.
In Politis and Romano (Politis, D.N.; Romano, J.P. Nonparametric Resampling for Homogeneous Strong Mixing Random Fields. Journal of Multivariate Analysis 1993, 47, 301–328), different block resampling estimators of the variance of general linear statistics, e.g., a sample mean, were proposed under the assumption of stationarity. In the present paper, such estimators of the variance of sample means, computed from nonstationary spatially indexed data {X(s) : s ∈ D}, where D is a finite subset of the integer lattice Z^d, are studied. Consistency of the variance estimators is shown for the following kind of data: observations taken at different lattice points are allowed to come from different distributions, and the dependence structure is allowed to differ over the lattice. We assume that all observed values are from distributions with the same expected value, or with expected values that decompose additively into directional components. Furthermore, it is assumed that observations separated by a certain distance are independent.
The literature has recently devoted close attention to error-prone variables. Nevertheless, only a small number of studies have considered measurement error in spatial econometric models. With the rise of spatial data analysis, the presence of measurement error in spatial econometric models needs to be considered, as the interplay between spatial correlation and measurement error influences parameter estimation. Therefore, in this study, the impact of classical measurement error on parameter estimation in the spatial lag model is examined theoretically for both response and explanatory variables. Then, using simulation studies, finite-sample properties are investigated for various situations. The major findings indicate that an error-prone response variable has an opposing bias effect on the parameter estimates, while error-prone explanatory variables have a considerable effect on the bias of the parameter estimates. As a result, unbiased estimates can be obtained only in certain circumstances.
The likelihood function of ARMA processes with some fixed parameters is considered. An expression for it and a sufficient statistic are obtained. It is shown how the proposed approach can be used to obtain closed-form expressions for the likelihood functions of some ARMA models. Applications to parameter estimation, hypothesis testing, speech processing, and related problems are discussed.
Panel data models with factor structures in both the errors and the regressors have received considerable attention recently. In these models, the errors and the regressors are correlated and the standard estimators are inconsistent. This paper shows that, for such models, a modified first-difference estimator (in which the time and cross-sectional dimensions are interchanged) is consistent as the cross-sectional dimension grows while the time dimension remains small. Although the estimator has a nonstandard asymptotic distribution, t and F tests have standard asymptotic distributions under the null hypothesis.
c-optimal designs for estimating the model parameters of the quadratic logistic regression model are considered. The designs are constructed via the canonical design space. It is shown that the number of design points varies between 1 and 4 depending on the parameter being estimated. Furthermore, formulae for finding the design points along with the corresponding design weights are derived.
In this paper, new filters for removing an unspecified form of heteroscedasticity are proposed. The filters build on the assumption that the variance of a pre-whitened time series can be viewed as a latent stochastic process in its own right. This makes the filters flexible and useful in many situations. A simulation study shows that removing heteroscedasticity before fitting a model leads to efficiency gains and bias reductions when estimating the parameters of ARMA models. A real data study shows that pre-filtering can increase the forecasting precision of quarterly US GDP growth.
The response of a treatment (direct effect) applied to a given unit may be affected by the treatments applied to its neighboring units (neighbor effects). Neighbor designs are considered robust to neighbor effects. Minimal neighbor designs are economical and are therefore preferred by experimenters. The method of cyclic shifts (Rule I) provides minimal neighbor designs for odd v (the number of treatments). The method of cyclic shifts (Rule II) provides minimal circular quasi-Rees neighbor designs for even v, which are considered a good alternative to minimal neighbor designs. In this article, for every case of even v, minimal circular quasi-Rees neighbor designs are constructed in such a way that they can also be converted directly into minimal circular balanced and strongly balanced neighbor designs.
Ridge estimators are usually examined through Monte Carlo simulations since their properties are difficult to obtain analytically. In this paper we argue that a simulation design commonly used in the literature will give biased results of Monte Carlo simulations in favour of ridge regression over ordinary least square (OLS) estimators. Specifically, it is argued that the properties of ridge estimators that are functions of p distinct regressor eigenvalues should not be evaluated through Monte Carlo designs using only two distinct eigenvalues.
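The design issue can be made concrete: the popular equicorrelation design yields a regressor correlation matrix with only two distinct eigenvalues, 1 + (p−1)ρ once and 1 − ρ with multiplicity p − 1, whereas a full spectrum of p distinct eigenvalues can be prescribed directly. The sketch below contrasts the two; the prescribed spectrum and the random rotation are illustrative choices, not the paper's specific design:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4

# Common Monte Carlo design: equicorrelated regressors with rho = 0.9.
rho = 0.9
R = np.full((p, p), rho) + (1 - rho) * np.eye(p)   # unit diagonal
eig_equi = np.sort(np.linalg.eigvalsh(R))
# only two distinct values: 1 - rho (p - 1 times) and 1 + (p - 1) rho (once)

# Alternative: prescribe p distinct eigenvalues and rotate by a random
# orthogonal matrix, giving full control over the regressor spectrum.
spectrum = np.array([0.1, 0.5, 1.0, 2.4])          # sums to p, all distinct
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))   # random orthogonal matrix
Sigma = Q @ np.diag(spectrum) @ Q.T
eig_full = np.sort(np.linalg.eigvalsh(Sigma))
```

Since ridge estimator properties are functions of all p eigenvalues, a design that collapses the spectrum to two values cannot probe them fully, which is the paper's point.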
Statistical analysis frequently involves the problem of assessing distributional properties. This article concerns the problem of testing for skewness of random variables. It is argued that the classical skewness test is not very useful for this purpose, and another approach is suggested that is easy to implement and is also robust to heteroscedasticity. The size, power, and robustness properties of the proposed test are evaluated and compared to those of the classical skewness test by means of Monte Carlo simulations.
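For reference, the classical test the article argues against standardizes the sample skewness coefficient by its null standard error, sqrt(6/n), giving an asymptotically standard normal statistic under iid normality. A minimal sketch of that baseline (the article's heteroscedasticity-robust alternative is not reproduced here):

```python
import numpy as np

def skewness_stat(x):
    """Classical skewness test: sqrt(n/6) * g1 is asymptotically N(0, 1)
    under iid normality (the homoskedastic baseline case)."""
    n = len(x)
    xc = x - x.mean()
    g1 = np.mean(xc ** 3) / np.mean(xc ** 2) ** 1.5   # sample skewness
    return np.sqrt(n / 6) * g1

rng = np.random.default_rng(0)
z_norm = skewness_stat(rng.standard_normal(2000))      # symmetric: |z| small
z_exp = skewness_stat(rng.exponential(size=2000))      # right-skewed: z large
```

It is this statistic's reliance on the iid homoskedastic null that makes it fragile under heteroscedasticity, motivating the proposed alternative.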
We suggest a shrinkage-based technique for estimating the covariance matrix in the high-dimensional normal model with missing data. Our approach is based on the monotone missingness assumption, with the missing-value patterns occurring completely at random. Our asymptotic framework allows the dimensionality p to grow to infinity together with the sample size, N, and extends the methodology of Ledoit and Wolf (2004) to the case of two-step monotone missing data. Two new shrinkage-type estimators are derived and their dominance properties over the Ledoit and Wolf (2004) estimator are shown under expected quadratic loss. We perform a simulation study and conclude that the proposed estimators are successful for a range of missing data scenarios.
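The complete-data Ledoit and Wolf (2004) estimator that the proposal extends can be sketched in a few lines: shrink the sample covariance toward a scaled identity with a data-driven intensity. This sketch handles complete data only, not the paper's two-step monotone missing scheme:

```python
import numpy as np

def lw_shrink(X):
    """Ledoit-Wolf (2004)-style shrinkage of the sample covariance toward
    a scaled identity; a minimal complete-data sketch."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                            # sample covariance
    mu = np.trace(S) / p                         # scale of the identity target
    d2 = np.linalg.norm(S - mu * np.eye(p), 'fro') ** 2 / p
    b2_bar = sum(np.linalg.norm(np.outer(x, x) - S, 'fro') ** 2
                 for x in Xc) / (n ** 2 * p)     # estimation-error estimate
    b2 = min(b2_bar, d2)
    rho = b2 / d2                                # shrinkage intensity in (0, 1]
    return (1 - rho) * S + rho * mu * np.eye(p), rho

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))                # p > n: S itself is singular
Sigma_hat, rho = lw_shrink(X)
# the shrunk estimator is positive definite even though S is singular
```

The paper's contribution is, roughly, to construct estimators of this type when part of each observation vector may be missing under a two-step monotone pattern.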
Surveillance to detect changes of spatial patterns is of interest in many areas such as environmental control and regional analysis. Here the interaction parameter of the Ising model is considered. A minimal sufficient statistic and its asymptotic distribution are used. It is demonstrated that the convergence to the normal distribution is rapid. The main result is that, when the lattice is large, all approximations are better in several respects. It is shown that, for large lattice sizes, earlier results on surveillance of a normally distributed random variable can be used in the cases of most interest. The expected delay of an alarm at a fixed false alarm probability is examined for some examples. Copyright © 1999 by Marcel Dekker, Inc.
This paper derives a Lagrange multiplier test for normality in censored regressions. The test is derived against the generalized log-gamma distribution, of which the normal is a special case. The resulting test statistic coincides to some extent with previously suggested score and conditional moment tests. The variance is estimated using the matrix of second-order derivatives in order to obtain an easy-to-use test statistic. The small-sample performance of the test is studied and compared to other tests by Monte Carlo experiments.
Many economic theories suggest that the relation between two variables y and x follows a function forming a convex or concave curve. In the classical linear model (CLM) framework, this function is usually modeled using a quadratic regression model, with the interest being to find the extremum value, or turning point, of this function. In the CLM framework, this point is estimated from the ratio of ordinary least squares (OLS) estimators of coefficients in the quadratic regression model. We derive an analytical formula for the expected value of this estimator, from which formulas for its variance and bias follow easily. It is shown that the estimator is biased without the assumption of normality of the error term, while under a strict normality assumption the bias vanishes. A simulation study of the performance of this estimator for small samples shows that the bias decreases as the sample size increases.
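The estimator in question is the ratio −β̂₁/(2β̂₂) of OLS coefficients from the quadratic fit y = β₀ + β₁x + β₂x² + ε. A small simulation sketch (with illustrative true parameters) shows the Monte Carlo mean settling near the true turning point for moderate n:

```python
import numpy as np

rng = np.random.default_rng(0)
b0, b1, b2 = 1.0, 2.0, -0.5            # illustrative true coefficients
true_tp = -b1 / (2 * b2)               # true turning point: 2.0

def tp_estimate(n):
    """Fit the quadratic by OLS and return the turning point estimate."""
    x = rng.uniform(0, 4, n)
    y = b0 + b1 * x + b2 * x ** 2 + rng.standard_normal(n)
    X = np.column_stack([np.ones(n), x, x ** 2])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return -beta[1] / (2 * beta[2])    # ratio of OLS coefficients

est = float(np.mean([tp_estimate(200) for _ in range(500)]))
# the Monte Carlo mean should sit close to the true turning point
```

Being a ratio of estimators, the statistic is biased in small samples even though each coefficient estimate is unbiased, which is the effect the analytical formula quantifies.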
This article analyzes the effects of multicollinearity on the maximum likelihood (ML) estimator for the Tobit regression model. Furthermore, a ridge regression (RR) estimator is proposed, since the mean squared error (MSE) of the ML estimator becomes inflated when the regressors are collinear. To investigate the performance of the traditional ML and RR approaches, we use Monte Carlo simulations where the MSE is used as the performance criterion. The simulation results indicate that the RR approach should always be preferred to the ML estimation method.