Algorithms for missing data replacement in time series analysis
Three simulation studies were conducted to compare the accuracy of two algorithms for estimating missing observations in time series data. Each study was designed to test the algorithms under conditions that are likely to occur in applied behavioral research: (1) Study 1 examined the effects of model misspecification on the accuracy of estimation; (2) Study 2 examined the effects of systematically missing data (versus randomly missing data) on estimation accuracy; and (3) Study 3 explored the accuracy of the algorithms under conditions of nonnormality in the data series. The two algorithms, the EM (Expectation-Maximization) Algorithm and the Jones (1980) Maximum Likelihood (ML) Algorithm, were compared using simulated time series with positive and negative autocorrelation, four different underlying ARIMA models, normal and lognormal distributions, and 0, 20, or 40% of data eliminated from the series. Major findings are: (1) the EM Algorithm (as currently implemented in the EMCOV2.3 program by Graham, 1995) is inaccurate for estimating time series data under virtually all conditions tested; (2) model misspecification caused only minimal problems for ML estimation; (3) systematic missing data patterns led to slightly inaccurate autocorrelation estimates, but other time series parameters could be estimated accurately when ML was used to replace missing data; and (4) neither EM nor ML performed worse under conditions of nonnormality than under conditions of normally distributed data. Findings from research on time series should also generalize to other statistical models that involve repeated measures on one or a group of individuals or units. Recommendations are made for future research in this area.
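The kind of simulated condition described above can be illustrated with a minimal sketch. It is not the dissertation's actual procedure: the AR(1) model, the series length, the coefficient values, and the particular systematic deletion pattern are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def simulate_ar1(n, phi, rng):
    """Simulate an AR(1) series x[t] = phi * x[t-1] + e[t] with normal errors.

    A positive phi gives positive autocorrelation, a negative phi gives
    negative autocorrelation. (AR(1) is an assumed stand-in for the
    unspecified ARIMA models.)
    """
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

def drop_random(x, frac, rng):
    """Set a randomly chosen fraction of observations to NaN."""
    y = x.copy()
    k = int(round(frac * len(x)))
    idx = rng.choice(len(x), size=k, replace=False)
    y[idx] = np.nan
    return y

def drop_systematic(x, frac):
    """Set a fixed repeating pattern of observations to NaN.

    Assumed pattern: within every block of five time points, the first
    round(frac * 5) points are removed (one of five for 20%, two of five
    for 40%).
    """
    y = x.copy()
    per_block = int(round(frac * 5))
    mask = (np.arange(len(x)) % 5) < per_block
    y[mask] = np.nan
    return y

rng = np.random.default_rng(0)
series = simulate_ar1(100, phi=0.6, rng=rng)       # positive autocorrelation
random_20 = drop_random(series, 0.20, rng)         # 20% missing at random
systematic_40 = drop_systematic(series, 0.40)      # 40% missing systematically
```

Each incomplete series would then be passed to the estimation algorithm under study, and the recovered parameters compared with those of the complete series.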
Statistics|Health Sciences, Public Health|Psychology, Experimental|Psychology, Psychometrics
Suzanne Marie Colby,
"Algorithms for missing data replacement in time series analysis"
Dissertations and Master's Theses (Campus Access).