An Application of Dynamic Factor Analysis: An Examination of Smoking Habit Strength Measures

Across smoking cessation studies, a variety of self-report and physiological measures have been used as outcome measures attempting to operationalize the degree of "habit strength" people experience. However, the performance of these different measures has not been adequately assessed longitudinally (Velicer, Rossi, Prochaska, & DiClemente, 1996). Employing time-series data to understand underlying physiological and/or psychological processes is a useful way to study constructs which may fluctuate daily. Although it has been applied in an extremely limited number of settings, dynamic factor analysis is one statistical method which may help to evaluate habit strength measures over time. This study has four main goals: 1) to examine three measures of smoking habit strength longitudinally in order to assess the comparative reliability and stability of the measures; 2) to test the hypothesis that across time, smoking habit strength can best be described by a multiple regulation model, as was shown with this same data set using traditional time-series analyses (Velicer, Redding, Richmond, Greeley, & Swift, 1992); 3) to employ an innovative statistical procedure, dynamic factor analysis, and critically evaluate the difficulty in employing the procedure; and 4) to compare the results of two alternative dynamic factor solutions, one provided by LISREL and one provided by SAS macros (Wood & Brown, 1994). The three primary habit strength measures investigated are two biochemical measures, salivary cotinine and carbon monoxide level, and one self-report measure, number of cigarettes smoked. Two additional measures were included to provide divergent validity. Dynamic factor analysis was not deemed especially difficult to employ, especially with the aid of the Wood & Brown SAS macros. Both LISREL and the SAS macros had


LIST OF TABLES
Table 3: Possible factor structure representing a nicotine fixed-effect model 44
Table 4: Descriptive statistics for 5 variables on 10 subjects for raw data 45
Table 5: Descriptive statistics for 5 variables on 10 subjects for log-transformed data 47
Table 6: Correlations for subject RTS across 5 variables and 5 lags 49
Table 7: Averaged correlations across 5 variables and 5 lags 50
Table 8: Frequencies of proper, improper, and non-converging solutions 51
Table 9: Comparison of SAS and LISREL dynamic factor model proper solutions 52
Table 10: Summary of dynamic factor models, skewness, kurtosis, and ARIMA 53
Table 11: Dynamic factor analysis factor loadings for subject JBW 54

INTRODUCTION
Traditional research in the behavioral sciences has focused on between-subjects designs, in which groups of subjects within a sample are examined for group differences.
Although between-subjects designs are more familiar, making them simpler to understand, easier to design, and easier to analyze for most researchers, they may not be able to directly address important research questions that a within-subject design could.
Criticism of within-subject designs usually begins with the issue of generalizability.
Generalization from one subject to an entire population is not possible. However, if models of behavior established within one subject are then replicated across other subjects, the implications of the research will be meaningful.
In a within-subject design, the observations have a temporal order. Repeated observation across many occasions may reveal important patterns that were previously hidden. Data analysis techniques such as time-series analysis may be able to determine these patterns. The ultimate goal of time series analysis is to explain or predict patterns of change by studying a variable's history within a subject. Multivariate time series models can take multiple variables into account, examining the interrelationships among them. An alternative approach involves identifying underlying latent factors which the multiple variables may be measuring.
Dynamic factor analysis is one method that employs the latent variable approach.
The concept of a factor, construct, or latent variable is one of the most important concepts that psychology has provided to scientific investigation. It permits the organization of a vast array of data needed for statistical control and promotes using a conceptual model to guide the analysis and interpretation of research. The concept of temporal ordering, patterns of change over time, and causality determined by temporal ordering is likewise a major concept for science. Multivariate time series analysis combines these two powerful concepts much as structural equation modeling combined latent variables and statistical control. It is likely to emerge as a major procedure when technology is able to provide appropriate data.

Manifest and Latent Variables
The concept of a latent variable to organize a set of observed or manifest variables is critical to many statistical methods including factor analysis and structural equation modeling. A manifest variable is directly measured on a quantitative scale. In this study, the five variables of interest were directly measured through calibrated instruments, and therefore are manifest variables. Any subset of manifest variables can be thought of as manifestations of an abstract underlying dimension: a latent factor. Instead of understanding the manifest variables separately, a latent factor simplifies matters; ideally, it contains virtually all of the information inherent in the original manifest variables.

Understanding Dynamic Factor Analysis
Although it is a new technique with few applications, the origins of dynamic factor analysis began decades ago. P-technique factor analysis, introduced by Cattell (1952), showed that multiple measures collected across multiple occasions might reveal the latent structure of the data across time. However, P-technique was criticized, in part for its failure to examine the lagged covariance structure (Anderson, 1963). A generalization of the P-technique factor model which incorporated this lagged covariance structure was coined dynamic factor analysis (Brillinger, 1975; Molenaar, 1985; Priestley, Subba Rao, & Tong, 1973). This method examines the covariance of each variable at time t with every variable at higher-order lags (times t-1, t-2, ..., t-a). The covariances of the variables are decomposed into the latent factors. This is performed for each individual, so generalization of specific patterns of behavior and/or cognitive processes may be possible. Dynamic factor analysis is simply a factor analysis employing time series data in which the unit of analysis is the individual.
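Since the method turns on these lagged covariances, it may help to see how they are computed. The sketch below is illustrative only; the simulated data and variable names are assumptions, not drawn from this study. It computes the lag-k cross-covariance matrices that a dynamic factor analysis decomposes:

```python
import numpy as np

def lagged_cov(X, lag):
    """Cross-covariance of the p variables at time t with themselves
    at time t - lag, for a T x p time-series matrix X."""
    Xc = X - X.mean(axis=0)            # center each variable
    T = Xc.shape[0]
    # pair rows t = lag..T-1 with rows t - lag
    return Xc[lag:].T @ Xc[:T - lag] / (T - lag)

# illustrative data: 120 occasions, 3 variables (hypothetical)
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 3)).cumsum(axis=0) * 0.1 + rng.standard_normal((120, 3))

C0 = lagged_cov(X, 0)   # ordinary covariance matrix (lag zero)
C1 = lagged_cov(X, 1)   # lag-1 cross-covariances
```

A lag-a analysis would compute a + 1 such p x p matrices, one per lag, before any factors are extracted.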
To better understand dynamic factor analysis, it may be helpful to think of the analysis as an extension of an exploratory factor analysis. In a factor analysis, loadings are obtained for p variables on each of m latent constructs. The relationships between manifest and latent variables are assumed to be simultaneous; that is, there is assumed to be no time difference between the measurement of the manifest variables and the associated latent variables. A dynamic factor analysis solution of lag 1 will yield these "simultaneous" loadings (at time t) in addition to loadings for the same set of p variables at a previous time (t-1). A lag a dynamic factor solution will include loadings for p variables at time t, t-1, ..., t-a. The total number of loadings in the solution with p variables, m latent constructs, and a lags will be equal to pm(a+1). Figure 1 shows a traditional cross-sectional factor analysis with six variables and two factors. In this figure, the two latent factors are enclosed in circles, manifest (measured) variables, labeled Vk, are enclosed in squares, and error terms, labeled ek, are not enclosed.
Figure 2 displays a one-factor dynamic factor model with three variables and two lags. As in Figure 1, Vk represents manifest variables, and ek represents error terms.
Notice that in the dynamic factor analysis solution, error terms are allowed to be correlated. Figure 3 displays a two-factor dynamic factor model with five variables and one lag.
There are three manifest variables measuring the first factor at each time point and two manifest variables measuring the second factor at each time point. The same notation is used as in Figures 1 and 2.

The Uses of Factor Analysis
Factor analysis has many uses, all of which apply to dynamic factor analysis. The first use is simply to identify underlying latent factors. In a factor analysis, the covariances of a set of variables are examined to search for underlying factors that accurately represent subsets of those variables. These underlying factors, often called latent factors, may provide insight into the subject being studied. As mentioned previously, the set of latent factors should represent a large proportion of the information contained in the manifest variables which make up those latent factors. In this way, latent factors are a more parsimonious representation of data than manifest variables.
Factor analysis can also be used to screen variables for the inclusion in subsequent statistical investigations, such as regression or discriminant analyses. Since factor analysis identifies groups of variables that are highly correlated with one another, a single variable can be chosen from each factor for inclusion among a set of predictor variables, thereby avoiding collinearity. In addition, factor scores can be used in subsequent data analysis techniques such as multiple regression or discriminant function analysis. A factor score is simply a weighted combination of the manifest variables.
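The idea of a factor score as a weighted combination can be made concrete with a minimal sketch. The weights and standardized scores below are hypothetical, not estimated from any real solution:

```python
import numpy as np

# standardized scores on p = 3 manifest variables, one row per occasion
Z = np.array([[ 0.5,  1.2, -0.3],
              [-1.0,  0.1,  0.8],
              [ 0.2, -0.4,  1.5]])

# hypothetical factor-score weights for a single factor
w = np.array([0.45, 0.40, 0.15])

# each occasion's factor score is the weighted sum of its manifest scores
factor_scores = Z @ w
```

These scores could then enter a subsequent regression or discriminant analysis in place of the original variables.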
A third use of factor analysis is simply as a data summary technique. As an exploratory procedure, a researcher can extract as few or as many factors as desired, examining the fit of each model. A small number of factors, perhaps only one or two, may account for a large bulk of the variance contained in the entire set of variables. This procedure can be an important precursor to future analyses.
Overall, factor analysis is an extremely powerful and versatile technique. It can both summarize data and identify relationships among variables, two of the basic functions of statistical analyses. It can also serve an inferential role in generalizing to larger populations of subjects. Dynamic factor analysis shares all of these characteristics with traditional factor analysis.

Additional Uses of Dynamic Factor Analysis
Because dynamic factor analysis is concerned with the temporal ordering of the data, it has benefits in addition to those previously outlined for traditional factor analysis.
Dynamic factor models can also add information to traditional time-series methods. One obvious use of dynamic factor modeling is examining the latent structure of variables across time. As with cross-sectional factor analysis, one parameter which may be of interest is the factor loadings of each variable at each lag. Factor loadings represent the degree to which each of the variables correlates with each of the factors at a particular lag.
Inspection of the factor loadings can reveal the extent to which each of the variables contribute to the meaning of the factors. Those variables with high loadings will be the ones that provide the meaning and interpretation for the factor.
Although most dynamic factor models in the literature thus far have employed stationary models displaying no trend across time, nonstationary models can also be evaluated (Molenaar, De Gooijer, & Schmitz, 1992). Potentially, this could be a valuable method in the social sciences as well as in the areas of economics and business management, where accurate prediction is of extreme importance.

Availability of Time-Series Data
Despite its usefulness, dynamic factor analysis models have been used infrequently since being introduced (Hershberger, Corneal, & Molenaar, 1994). One reason for this is the relative unavailability of time-series data. While economists and business researchers frequently utilize time-series data for prediction, social scientists rarely obtain single-subject multivariate repeated measures (SSMRM) data, with the exception being in the area of psychophysiology (Wood & Brown, 1994). Expert system technology (Velicer, Prochaska, Bellis, et al., 1993) can also facilitate the collection of SSMRM data. Using expert system technology, collecting SSMRM data can be cost-effective and appropriate for school, home, business, and internet settings. The World Wide Web (WWW) is another arena in which SSMRM data collection could be made simple and extremely cost-effective. "Push technology" and the WWW already allow computer users to automatically be fed news and conferencing updates as they occur. Integrating data collection into this technology is probably not too far in the future. Two-way paging systems are another example of how data collection could be made simple, inexpensive, and reliable. With additional technological advances, it is not unreasonable to think that SSMRM data could become common in many research fields.
Using SSMRM data to study multiple measured variables and single or multiple latent factors can greatly contribute to developing a theory about the behavior of those variables and constructs over time. Studying variables and constructs longitudinally for a single person can be both an appropriate and cost-effective preliminary step before testing a hypothesis on data gathered across many individuals (Wood & Brown, 1994). Dynamic factor modeling provides a framework for examining SSMRM data in this way.

Difficulty In Utilizing Dynamic Factor Analysis
Aside from the lack of SSMRM data, there are several reasons why dynamic factor analysis has not been embraced along with other social science methodologies. One difficulty is model comparison, because models with few lags or factors are not "nested" within more complex models. It is also still uncertain which model should be used as a null or baseline model. As with factor analysis, a dynamic factor analysis solution can be rotated for better interpretability. For most researchers, the unsolved problems with dynamic factor analysis represent serious barriers to using the method.

Evaluating Dynamic Factor Models
Despite the limitations and uncertainty surrounding dynamic factor analysis, utilizing this method is possible. In traditional factor analysis, a correlation or covariance matrix is sufficient to perform the analysis. However, in order to proceed with dynamic factor analysis, the covariance or correlation matrix must be transformed into a block Toeplitz matrix. A block Toeplitz matrix has the property that all of the submatrices along each block diagonal are the same.
A data set with n lags and p variables produces a covariance or correlation matrix of size (n*p) x p or p x (n*p). The resulting Toeplitz-transformed matrix will be of the form (n*p) x (n*p). In this study, n is the number of lags being examined (including the zero-lag) while p is the number of variables. For a five-variable, two-lag model, p = 5 and n = (2+1) = 3. In this example, the resultant Toeplitz-transformed covariance matrix will be (3*5) x (3*5) = 15 x 15. This transformed matrix can then be used by the LISREL program or the SAS macros to execute the dynamic factor analyses. For a three-variable, two-lag model, three 3 x 3 matrices are originally computed, one for each lag. After the transformation there is just one matrix, a 9 x 9 matrix. Notice that this 9 x 9 matrix essentially has blocks of correlations which are repeated. This transformation, or reconfiguring, can be done through the SAS function TOEPLITZ, or through a SAS macro provided by Wood & Brown (1994).
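One way to see the transformation is to assemble the block Toeplitz matrix directly. The sketch below is a hypothetical stand-in for the SAS TOEPLITZ step, using made-up correlation blocks for a three-variable model with lags 0 through 2, which yields a 9 x 9 matrix:

```python
import numpy as np

def block_toeplitz(lag_mats):
    """Assemble a block Toeplitz matrix from the lagged matrices
    C_0, C_1, ..., C_{n-1}: block (i, j) is C_{i-j}, with the
    transpose C_k' used for negative differences."""
    n = len(lag_mats)
    p = lag_mats[0].shape[0]
    out = np.zeros((n * p, n * p))
    for i in range(n):
        for j in range(n):
            k = i - j
            block = lag_mats[k] if k >= 0 else lag_mats[-k].T
            out[i*p:(i+1)*p, j*p:(j+1)*p] = block
    return out

# hypothetical lag-0, lag-1, and lag-2 correlation blocks (3 variables)
C0 = np.eye(3)
C1 = np.full((3, 3), 0.4)
C2 = np.full((3, 3), 0.2)

BT = block_toeplitz([C0, C1, C2])   # 9 x 9 block Toeplitz matrix
```

The repeated blocks along each diagonal are exactly the "blocks of correlations which are repeated" noted in the text.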
Evaluating different dynamic factor models can be accomplished through additional SAS macros provided by Wood & Brown. These macros provide at least 15 different fit indices including an incremental fit index comparing a zero-lag, one factor model to more complex models. The output also contains standardized and unstandardized factor loadings presented in much the same way that LISREL presents these parameters.
However, documentation for these macros is lacking. As a result, using them may be difficult. Also, no evidence exists that parameter estimates produced by these macros are accurate.
Dynamic factor models can also be evaluated with structural equation modeling programs such as LISREL (Joreskog & Sorbom, 1989) or EQS (Bentler, 1995). Again, it is important to first obtain a block Toeplitz matrix which represents the simultaneous structural equation system. Hershberger et al. (1994) provides a detailed example of dynamic factor models tested through the use of LISREL.
In LISREL or other structural equation modeling programs such as EQS, the lambda matrix must be given special consideration. If one wishes to specify two factors with one lag, a total of four factors should be specified in the lambda matrix: two factors are associated with the zero-lag and two factors are associated with the first lag. For a model with three factors and four lags, 15 factors should be specified in the lambda matrix. In addition, a certain number of factor loadings within lambda must be fixed to zero. This is an issue that must be dealt with when evaluating any structural model.
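As a sketch of the bookkeeping involved, the following builds a 0/1 map of which lambda loadings are free, with the remainder fixed to zero. This block-diagonal pattern is only one illustrative specification, not the only admissible one; the function and its arguments are assumptions for demonstration:

```python
import numpy as np

def lambda_pattern(p, m, a):
    """0/1 free-loading pattern for lambda: rows are the p variables
    stacked at lags 0..a, columns are the m*(a+1) factors; in this
    simple specification, the variable block at lag k is free only
    on the m factors associated with lag k."""
    n = a + 1
    pat = np.zeros((p * n, m * n), dtype=int)
    for k in range(n):
        pat[k*p:(k+1)*p, k*m:(k+1)*m] = 1
    return pat

# five variables, two factors, one lag -> 10 rows, 2*(1+1) = 4 factor columns
pat = lambda_pattern(p=5, m=2, a=1)
```

Counting the ones in such a pattern shows how quickly the number of free loadings grows as lags are added.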
The specification of the Theta (Θ) matrix also needs to be considered. For a five-variable model with one lag, the Theta matrix will be of size 10 x 10. The error variances of the zero-lag variables are located on the main diagonal of Theta. The first five diagonal elements of Theta should be equated to the second five main diagonal elements.

Interpreting Dynamic Factor Models
Dynamic factor analysis is a very powerful technique that is in its developmental infancy. The interpretation of dynamic factor models is one difficulty that researchers face when utilizing this method. Interpreting factor structures and best-fitting models can be a subjective process, especially with very complex models. Parsimony is also an important consideration that is often lost when extremely complex models are tested. In addition, the generalizability of the results should also be considered.
Interpreting the results of a dynamic factor analysis can be challenging, especially with models involving many variables, more than two factors, and multiple lags. In one study which involved 28 variables and five factors across lags zero and one (Hershberger et al., 1994), the authors offered an explanation for the differing factor structure across lags.

Parsimony is also an important issue when deciding upon different dynamic factor models. In most cases, adding additional lags to a model will increase the standard and relative goodness-of-fit indices. To the inexperienced user, this improvement may appear meaningful even when it is of no practical or theoretical importance. Adding additional lags to a model will also affect the parameter estimates at lower-order lags.
When stating conclusions based on dynamic factor analysis results, the generalizability of the conclusions should be considered. As previously stated, separate dynamic factor analyses are performed on each individual. In a study in which the same analysis is performed on more than one individual, explaining contrasting results can be extremely difficult. Contrasting results across subjects can sometimes be explained by characteristics of the individual, such as gender or medical conditions. However, if similar results are not found in an overwhelming number of subjects, and no explanation can be derived from individual characteristics, the results of the study are probably inconclusive. On the other hand, if similar results are found in multiple subjects, generalizability can only be strengthened.

Measures of Habit Strength
As with any factor analysis model, the choice of variables is critical to reaching an interpretable solution. Since the application in this study is smoking habit strength, the most important habit strength variables should be quantified and available for analysis.
In this study, three primary measures are employed as indices of smoking habit strength.
Although there are alternative measures, the three employed here are the most widely used outcome measures in smoking studies.
These three measures include two biochemical and one self-report measure.
Biochemical measures are defined as those measures that are arrived at from a physiological assessment of the subject based on an objective biochemical or mechanical technique. Self-report measures are those which rely upon subjects' personal assessments and reports of their own perceptions or behaviors. The most common behavioral measure, the number of cigarettes smoked per day, was employed. The most common biochemical measure, salivary cotinine, was also measured along with carbon monoxide level. Two additional measures, body temperature and skin temperature, were also measured. It was hoped that these two "marker" variables would provide divergent validity in the factor structure.
In many studies, self-report is used to measure habit strength followed by a biochemical validation. However, using biochemical measures as a method of validation has been a controversial issue (Velicer, Prochaska, Rossi, & Snow, 1992). There can be great difficulty in obtaining the appropriate sample of exhaled air or bodily fluids from subjects. Also, the monetary cost in assessing these biochemical samples is an important consideration. Because of these drawbacks, disagreement exists about the necessity of obtaining biochemical validation and about using biochemical measures to gauge habit strength. In their brief review of measures of smoking habit strength, Velicer et al. (1996) concluded that no single measure is a satisfactory measure of habit strength. The results of this study may help to clear up the confusion surrounding these measures.
Carbon Monoxide. Carbon monoxide (CO) can be measured in expired air or in the form of carboxyhemoglobin in blood. Both sources are highly correlated (Velicer et al., 1992). During smoke inhalation, CO is absorbed rapidly into the bloodstream and produces a sharp increase in CO level. It has a relatively short half-life of four to five hours in sedentary adults (Stewart, 1975). Given this short half-life, levels can be influenced by time of day and time elapsed since the last cigarette. Assessments late in the day have been found to be more valid (Benowitz, 1983b), and self-report of recency of smoking can increase sensitivity (Bauman, Koch, & Bryan, 1982). For light smokers, sensitivity of CO is not as good as with heavier smokers (Vogt, 1982). Specificity is between 84% and 98% but can be reduced by exposure to CO present in the environment as a result of pollution and combustion sources.
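The effect of this short half-life on measurement timing can be sketched with a simple exponential-decay calculation. This is an idealized model under the half-life stated above; actual elimination varies with activity level:

```python
def fraction_remaining(hours, half_life):
    """Fraction of an initial CO boost still present after `hours`,
    assuming simple exponential decay with the given half-life."""
    return 0.5 ** (hours / half_life)

# with a ~4.5 hour half-life, little of a cigarette's CO boost
# survives to a measurement taken half a day later
early = fraction_remaining(2, half_life=4.5)    # shortly after smoking
late = fraction_remaining(12, half_life=4.5)    # twelve hours later
```

This is why assessments late in the day, closer to a full day of smoking, tend to be more informative than early-morning readings.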
Number of Cigarettes Smoked. In measuring the number of cigarettes smoked, subjects simply monitor their own smoking behavior. This self-report has been criticized by some researchers for underestimating the amount of cigarettes smoked (Haley & Hoffman, 1985; Warner, 1978). Many suggest that subjects often will under-report levels of cigarette use (Luepker, Pallonen, Murray, & Pirie, 1989; Murray & Perry, 1987; Russell, 1982; Stookey, Katz, Olson, Drook, & Cohen, 1987). Under-reporting may also occur as a result of the tendency of subjects to report cigarettes smoked in multiples of five or ten (Pechacek, Fox, Murray, & Luepker, 1984).

Cotinine. Cotinine is often regarded as the preferred biochemical measure (Abrams et al., 1987; Haley et al., 1983; Jarvis, Tunstall-Pedoe, Feyerabend, Vesey, & Salloojee, 1984; Knight et al., 1985; Pojer, Whitfield, Poulos, Echard, Richmond, & Hensley, 1984) because of its superior sensitivity and specificity. It is generally more expensive than other biochemical measures and more analytically complex (Velicer, Prochaska, Rossi, & Snow, 1992). Carbon monoxide, on the other hand, is easily assessed and does not require the storage of bodily fluids. Because of its shorter half-life, CO would be more sensitive to recent changes in smoking.

Nicotine Regulation Models
Three different models have been examined in the literature to account for nicotine's effectiveness in maintaining smoking: a) the fixed-effect model, b) the nicotine regulation model, and c) the multiple regulation model. A more detailed review and description of these three models is provided by Leventhal and Cleary (1980). One of the purposes of this study is to test the hypothesis that smoking habit strength can be best described by a multiple regulation model, as shown by another study using this same data (Velicer, Redding, Richmond, Greeley, & Swift, 1992). This can be done by examining the autocorrelations, cross-correlations, and the direction of the factor loadings for the habit strength measures. Verification of the results found by Velicer et al. (1992) would be an important contribution.
Nicotine Regulation Model. This model posits that smoking serves to regulate or titrate the smoker's level of nicotine. It assumes that smokers have a personal "set point" of smoking, which their body is accustomed to. The model suggests that any increase or decrease in smoking caused by events in a person's environment should be temporary.
When the environment returns to normal, a person should immediately return to their personal set point. In terms of dynamic factor modeling, this model would result in cross- and auto-correlations of approximately zero at all lags greater than zero. Hence, a dynamic factor analysis solution would probably not show distinct factors at higher-order lags.

Nicotine Fixed-Effect Model. This model assumes that smoking is reinforced because nicotine stimulates specific reward-inducing centers of the nervous system.
These have been identified as either autonomic arousal or feelings of mental alertness and relaxation, or both. Since these are relatively short-term effects, an increase on one occasion should be followed by an increase on the next occasion. Similarly, a decrease on one occasion should be followed by a decrease on the next occasion if the same level of arousal is to be maintained. This model would result in positive auto- and cross-correlations at low lags, with a gradual reduction in the size of the loadings at higher lags.
Subsequently, the dynamic factor solutions would show a distinct habit strength factor with positive loadings, but only at low lags. Table 3 demonstrates how this factor structure might appear.
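The autocorrelation signature this model predicts, positive at low lags and fading at higher lags, can be mimicked by a simple first-order autoregressive process. The simulation below is only an illustration of that signature under assumed parameters, not an analysis of the study data:

```python
import numpy as np

def ar1(phi, T, seed=0):
    """Simulate an AR(1) series x_t = phi * x_{t-1} + noise."""
    rng = np.random.default_rng(seed)
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

def autocorr(x, lag):
    """Sample autocorrelation of a series at the given lag."""
    x = x - x.mean()
    return (x[lag:] @ x[:len(x) - lag]) / (x @ x)

x = ar1(phi=0.6, T=2000)
r = [autocorr(x, k) for k in (1, 2, 3, 4)]   # declines roughly geometrically
```

A nicotine regulation process, by contrast, would produce autocorrelations near zero at every lag greater than zero.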

Hypotheses And Goals
The main hypothesis of this study is that across time, smoking habit strength can best be described by a multiple regulation model. This was previously shown with this same data set using traditional time-series procedures (Velicer, Redding, Richmond, Greeley, & Swift, 1992). The hypothesis will be evaluated by examining the factor structure produced by the dynamic factor models across all subjects and lags. This will be a qualitative examination of the loadings; no significance tests will be performed. Table 2 provides an illustration of how the factor structure might appear in a multiple regulation model.
There are also several secondary goals for this study. First, the factor loadings of the five variables were investigated to see which variables form a clear habit strength factor, if any. This involves a qualitative examination rather than significance testing. There are many possible combinations of factor loadings. One model would involve the three measures of smoking habit strength all measuring a single factor. This single factor can be interpreted as habit strength. If two of the habit strength variables form a distinct factor, it suggests that the third variable may not be measuring the same phenomenon. If only one of the variables has a distinctively higher loading than the rest, that single item could be interpreted as the best single measure of habit strength. Alternatively, if only two of the variables have only medium loadings on a factor, naming the factor "habit strength" may not be the best interpretation.

A second focus will be determining across how many lags the habit strength factor can be identified. As the number of lags (which relates to the amount of time) increases, it is expected that the habit strength factor will weaken. The half-lives of the three habit strength variables should play an important role in higher-lagged models. Measures with a greater half-life should be identifiable at higher lags. However, the greatest half-life for the variables used is 40 hours, so it is expected that no habit strength factor will be identified at lags greater than three.

The third research goal is dependent on the previous goals. If a clear habit strength factor is found, it is important to know which of the variables contributed the most to this factor. Again, the size of the factor loadings would provide this information. It may be true that number of cigarettes smoked consistently has the highest loading across lags and subjects. This result would suggest that this measure may be the best measure of smoking habit strength among the three being examined.

A fourth research goal is to critically evaluate the utilization of dynamic factor analysis. As stated previously, little documentation and published research on dynamic factor analysis is available. Without prior knowledge, one might believe that the method is simply too difficult or requires too much time to learn and use. This study hopes to show how true or false this belief may be. The current status of the method as a generally disseminable research tool will be evaluated.

A fifth research goal is to compare the output of the dynamic factor models provided by LISREL with output from the Wood & Brown SAS macros. No published study has demonstrated that the two programs produce the same results using the same data and models. It is expected that the results will be the same, but verification is still important.

METHOD

Subjects
Data was collected twice per day for a maximum of 62 days. Each subject could therefore have up to 124 data points. However, data was not collected in exact 12 hour intervals because of scheduling difficulties. For most subjects, the data was collected once in the morning and once again in the afternoon. The researchers attempted to keep the time subjects could possibly smoke (disregarding eating and sleeping) between intervals as equal as possible. One subject had only 97 data points, while all other subjects had at least 115 of the possible 124 data points.
Using single subject multivariate repeated measures (SSMRM) data, time series analysis typically requires a minimum of 100 data points (Box & Jenkins, 1976; Glass et al., 1975) in order to achieve stable autocorrelations. However, results of simulation studies have not provided strong recommendations regarding a minimum number of data points needed to obtain stable correlations for dynamic factor modeling. One study showed that when using LISREL to examine zero-lag factor analysis models with a sample size of 50, nonconvergence and improper solutions often resulted (Boomsma, 1982). Improper solutions have standardized factor loadings or error terms outside the bounds of acceptable values. In a nonconverging model, the analysis program is simply not able to identify a solution. Ding, Velicer, and Harlow (1995) provide strong opposition to the use of improper solutions. Their simulation study showed that when sample size is 50, the effect of improper solutions on fit indices "becomes very serious". They also advise that using structural equation modeling procedures with a sample size of 50 is "very questionable". However, in dynamic factor analysis, sample sizes of 50 have been shown to yield parameter estimates close to their true values. Wood & Brown (1994) conducted Monte Carlo simulations with differential factor loadings and sample sizes using dynamic factor analysis. Using randomly generated data, nearly identical rates of improper solutions were found compared to traditional P-technique factor analysis.
Improper solutions are those in which loadings, regression paths, or error terms fall outside the proper standardized range (between -1.0 and 1.0). There is still information to be gained from improper solutions, but the resulting fit indices and factor loadings should not be treated as precise.
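The bounds check that distinguishes proper from improper standardized solutions can be sketched in a few lines. This is a generic illustration only; the function name and inputs are hypothetical and do not come from LISREL or the Wood & Brown macros.

```python
import numpy as np

def is_proper_solution(loadings, error_variances):
    """Check whether a standardized factor solution is 'proper':
    all loadings lie within [-1, 1] and all standardized error
    (unique) variances lie within [0, 1]."""
    loadings = np.asarray(loadings, dtype=float)
    error_variances = np.asarray(error_variances, dtype=float)
    loadings_ok = np.all(np.abs(loadings) <= 1.0)
    errors_ok = np.all((error_variances >= 0.0) & (error_variances <= 1.0))
    return bool(loadings_ok and errors_ok)

# A Heywood case: a loading above 1.0 paired with a negative error variance
print(is_proper_solution([0.6, 1.08], [0.64, -0.17]))  # False
print(is_proper_solution([0.6, 0.8], [0.64, 0.36]))    # True
```

In practice an analysis program flags such cases itself; the point of the sketch is only that "improper" is a simple range condition on the standardized estimates.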
The Wood & Brown (1994) simulation study mentioned above also found that rates of nonconvergence in dynamic factor models were higher than those found with traditional P-technique factor analysis. A model is said to be nonconvergent when the statistical program examining it is unable to converge to a single solution. With ten subjects in this study, it is of interest to track when nonconvergence and improper solutions occur.

Procedure
During the first session, subjects completed a detailed demographic and smoking history questionnaire. Level of dependence was assessed by Fagerstrom's (1978)

Tolerance Questionnaire and the Addiction Research Unit Smoking Motivation
Questionnaire (West & Russell, 1985). A payment of $100 was made on completion of the study, with a bonus of $20 provided for each week the diary was kept. Thus, the total amount payable was $140.

Measures
At each session, self-report and physiological measurements were taken. Self-report was collected on the number of cigarettes smoked. Carbon monoxide levels in exhaled breath and cotinine in saliva were also measured. The entire measurement procedure took between 5 and 10 minutes on each occasion. Physiological measures were generally taken in a fixed order, except when more than one person arrived simultaneously, in which case equipment was utilized as it became available.
An EC 50 Carboximeter was used to measure CO concentrations in end-expired breath. Subjects were instructed to take a deep breath, expire all air from the lungs, take a second deep breath, hold it for 20 seconds, and then blow into the instrument's mouthpiece.
Cotinine was measured from two mL samples of saliva provided by the subjects in sterile plastic sampling tubes. Subjects were requested to produce enough saliva from their mouths for this purpose and to deposit this in the tubes, being careful not to contaminate the sample with phlegm. All assays were conducted at Royal Prince Alfred Hospital, where salivary cotinine was measured by gas chromatography with an ionic diffusion of 12.5 m × 0.2 mm (Thompson, Ho, & Peterson, 1982).
The number of cigarettes smoked from one measurement period to the next was assessed by having subjects self-monitor their smoking behavior. This was achieved by having the subject use tally cards which they kept in their cigarette packets. Each time they smoked a cigarette, they ticked off a number on the card. These cards were collected each morning, at which time a new one was issued. A small number of subjects preferred to self-monitor by counting the number of cigarettes left in their packet at the end of each day.

RESULTS
Descriptive statistics for the raw data, which showed extreme violations of skewness and kurtosis assumptions for most subjects, are presented in Table 4.
Because of these extreme violations of skewness and kurtosis assumptions, the data was transformed by a square root transformation and a natural log transformation.
Comparing the resulting skewness and kurtosis values after the two transformations clearly showed that the natural log transformation was better at correcting the extreme skewness and kurtosis violations. It was decided at this point that all subsequent analyses would use the log-transformed data. For the log-transformed data, normalcy assumptions were still severely violated in six subjects for the cotinine variable. Normalcy for skin temperature was violated in three subjects. Normalcy for number of cigarettes smoked and CO were violated in one subject. Body temperature was the only variable in which normalcy assumptions were not severely violated in the log-transformed data. The skewness and kurtosis values, as well as other descriptive statistics of the log-transformed data, can be found in Table 5.
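The comparison between the square root and natural log transformations can be illustrated with simulated positively skewed data. The actual cotinine values are not reproduced here; the moment formulas below are just the standard sample estimators, applied to a lognormal stand-in series.

```python
import numpy as np

def skew_kurtosis(x):
    """Sample skewness and excess kurtosis from standardized moments."""
    z = (np.asarray(x, float) - np.mean(x)) / np.std(x)
    return float(np.mean(z**3)), float(np.mean(z**4) - 3.0)

rng = np.random.default_rng(0)
raw = rng.lognormal(mean=3.0, sigma=0.6, size=124)  # positively skewed series

g1_raw, _ = skew_kurtosis(raw)
g1_sqrt, _ = skew_kurtosis(np.sqrt(raw))  # square root transformation
g1_log, _ = skew_kurtosis(np.log(raw))    # natural log transformation

# For lognormal-like data the log transform pulls skewness toward zero,
# more so than the square root transform does.
print(round(g1_raw, 2), round(g1_sqrt, 2), round(g1_log, 2))
```

Repeating such a comparison for each variable and subject is essentially what the text describes: whichever transformation leaves skewness and kurtosis closest to zero is retained for subsequent analyses.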
The auto and cross-correlations were examined up to five lags, or six time points. For the three smoking habit strength variables, significant autocorrelations were found that confirm the previous study employing this data (Velicer, Redding, Richmond, Greeley, & Swift, 1992). Careful examination of the cross-correlations showed that there was one consistent relationship within the five variables. This was an oscillating correlation between cigarettes and CO that occurred in nine of the ten subjects. In these 9 subjects the 0-lag correlation was consistently positive, the first-lag correlation negative, the second-lag positive, the third-lag negative, etc. In the tenth subject (JBW), the opposite effect was found. The correlations for each variable across 5 lags are presented in Table 6. The cross-lag correlations for these nine subjects were averaged. Examination of these averages showed that the relationship between cigarettes and CO was the only clear relationship that existed. These averaged correlations can be found in Table 7.
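The lagged correlations described above are computed pairwise between series, shifting one series against the other. A minimal sketch of such a computation (the variable names are hypothetical; this is not the analysis code actually used):

```python
import numpy as np

def lagged_corr(x, y, max_lag=5):
    """Correlation between x_t and y_{t-k} for k = 0..max_lag.
    Passing the same series twice yields the autocorrelations."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    n = len(x)
    return [float(np.corrcoef(x[k:], y[:n - k])[0, 1])
            for k in range(max_lag + 1)]

# Stand-in series: a first-order autoregressive process, so that the
# lagged correlations decay geometrically as lags increase.
rng = np.random.default_rng(1)
e = rng.normal(size=124)
cigs = np.zeros(124)
for t in range(1, 124):
    cigs[t] = 0.6 * cigs[t - 1] + e[t]

acf = lagged_corr(cigs, cigs)
print([round(r, 2) for r in acf])  # lag-0 autocorrelation is 1.0 by definition
```

With two different series as arguments, the same function yields the cross-correlations whose alternating signs are discussed above.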
A consistent pattern for the cross-correlation between body temperature and skin temperature was observed. For all ten subjects, the 0-lag correlation was moderately positive in comparison to the correlations between other variables (range = .13 to .38, mean r = .22). This correlation was significant at the .05 level in six subjects. Higher-order lagged correlations between these two variables were almost all positive but few were significant. For all other pairs of variables, correlation patterns were not consistent across subjects and the correlations were, in most cases, not significant.
At this point in the analysis, there is evidence for a factor structure being shown by the dynamic factor analysis, considering the relationships between cigarettes and CO and between body and skin temperature. However, the cross-lagged correlations are low, which may lead to a factor structure that is not stable when multiple lags are added to the basic model.

Dynamic Factor Analyses
In the second stage of the analysis, dynamic factor models were run using LISREL.
The LISREL procedures for utilizing dynamic factor analysis models are outlined in Hershberger, Corneal, & Molenaar (1994). In order to run dynamic factor analyses in a structural equation modeling program, a Toeplitz-transformed covariance matrix was calculated through a SAS macro provided by Wood & Brown (1994). One and two factor models with lags from zero to five were examined. Thus, 12 models were examined for each subject, or 120 models total. LISREL was able to provide fit indices for 102 models (85%) but was only able to produce standardized factor loadings for 66 of the 120 models (55%). For most models, the Theta EPS matrix was not positive definite or the admissibility test failed. A non-positive definite Theta matrix is a result of negative eigenvalues or negative determinants of submatrices. Most models had to be repeated at least once with better starting values in order to obtain a proper standardized solution.
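The Toeplitz-transformed covariance matrix arranges the lagged covariances C_k = cov(y_t, y_{t-k}) into a block-Toeplitz layout, which is what allows a standard structural equation modeling program to fit a lagged factor model. The following is a simplified sketch of that construction under my own naming, not the Wood & Brown macro code:

```python
import numpy as np

def lagged_cov_blocks(data, max_lag):
    """Build a block-Toeplitz lagged covariance matrix: block (i, j)
    holds C_{i-j} = cov(y_t, y_{t-(i-j)}), with C_{-k} = C_k^T."""
    T, p = data.shape
    data = data - data.mean(axis=0)
    # lagged covariance matrices C_0 .. C_max_lag
    C = [data[k:].T @ data[:T - k] / (T - k) for k in range(max_lag + 1)]
    n = max_lag + 1
    big = np.zeros((n * p, n * p))
    for i in range(n):
        for j in range(n):
            k = i - j
            block = C[k] if k >= 0 else C[-k].T
            big[i * p:(i + 1) * p, j * p:(j + 1) * p] = block
    return big

rng = np.random.default_rng(2)
series = rng.normal(size=(124, 5))          # 124 occasions, 5 variables
big = lagged_cov_blocks(series, max_lag=2)
print(big.shape)  # (15, 15): (max_lag + 1) * 5 rows and columns
```

The resulting matrix is symmetric by construction, which is why it can be passed to a covariance-structure program such as LISREL as if it were an ordinary input matrix.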
For the one-factor models, 43 out of a possible 60 models (72%) produced proper standardized solutions. For the two-factor models, proper standardized solutions were found for only 23 out of 60 models (38%). In both the one and two-factor models, there was a slight trend for higher-ordered lagged models to have greater difficulty producing proper standardized loadings, compared with lower-lagged models. When more factors and lags were added to the models, there was clearly a greater chance of the model not converging at all. Table 8 shows the frequencies of proper, improper, and nonconverging solutions for each model in the LISREL analysis.

These same 120 dynamic factor models were then analyzed with SAS macros provided by Wood & Brown (1994). The results from these analyses were slightly more encouraging. For the one-factor models, 46 out of a possible 60 models (77%) produced proper standardized solutions. For the two-factor models, proper standardized solutions were found for 30 out of 60 models (50%). Overall, 76 out of 120 models produced proper standardized solutions (63%). For the analyses in which proper solutions were not found, the solutions either did not converge (41 cases, 34%) or were improper (four cases, 3%). Table 8 also shows the frequencies of proper, improper, and non-converging solutions for each model in the SAS analysis.
Of the 14 one-factor models that had improper or nonconverging solutions with the SAS macros, eight (57%) also had improper or nonconverging solutions in the LISREL analyses. For the two-factor models, this was the case with 24 of the 31 (77%) improper or nonconverging solutions in the SAS macro analyses. This indicates that the two methods of executing dynamic factor analysis generally fail to converge upon proper solutions in the same models. However, the overlap in improper or nonconverging solutions was not complete; utilizing both programs will therefore produce higher rates of proper solutions.
Each subject will be examined separately to determine what factor structure, if any, best describes the data. Dynamic factor models for zero, one, and two-lag solutions are presented for each subject in Figures 5-14. In addition, factor loadings are presented in tabular format for subject JBW in Table 11.

Subject ABN

Subject ABN had the fewest number of data points (97). In the one-factor solutions, a habit strength factor was identified by the Cotinine and CO variables, but only at the zero lag. There was no clear interpretable factor at lags one and two. For the three, four, and five-lag solutions, the factor structure identified in the zero, one, and two-lag solutions was less evident. The loadings at each successive lag alternated in direction, suggesting the effect of the nicotine regulation model.
For the two-factor models, only the one, three, and four-lag models had proper solutions. For the one-lag solution at the zero-lag, a habit strength factor emerged (all three habit strength variable loadings > .37) as well as a temperature factor (both temperature variables > .40). At the 1-lag, only the temperature factor was present (BODY, SKIN > .40). For the three and four-lag solutions, the factor structure was not interpretable, with fewer than two variables having significant loadings. As with the one-factor solutions, the higher-order lagged models were characterized by inconsistent loadings, leaving the models uninterpretable.

Subject JBW
Subject JBW had 113 complete data points, with 11 data points missing on at least one of the five variables. Nevertheless, this subject represented the clearest and most consistent factor structure of any of the subjects. For the zero and one-lag solutions, there existed a clear habit strength factor at lag zero, as all three habit strength variables had loadings greater than .50. At lag one, the factor was still present, although the loadings were somewhat weaker (all three> .20). This same structure was consistent across all solutions even including the four and five-lag solutions. No factor was evident in any lags greater than lag one.

All two-factor solutions also supported the presence of a habit strength factor at lag zero. For the zero-lag solution, all three habit strength variables had loadings greater than .40. For all two-factor solutions, no factor was evident at higher-ordered lags. There was no evidence at all for the existence of a temperature factor.

Subject EBE
As was true with subject JBW, subject EBE also had a very consistent solution across all lags. For the one-factor solutions, a habit strength factor existed at only the zero-lag.
For the zero-lag solution, all habit strength loadings were greater than .30. At higher-ordered lags, the solutions were very similar.
The two-factor solutions were similar to the one-factor solutions. For the two-factor, zero-lag solution, all habit strength loadings were greater than .31. For higher-lagged solutions, the same factor was found at the zero-lag, but no evidence existed for factors at any higher-ordered lags, or for the existence of multiple factors.

Subject KTN
Subject KTN is one of two subjects whose solutions support the existence of a temperature factor. Consistent with all proper solutions, a temperature factor existed at the zero-lag. For the one-factor and zero-lag solution, the variables BODY and SKIN had loadings greater than .54. Higher-lagged one-factor solutions were not quite as clear.
The two-factor solutions also showed that a temperature factor existed. At the zero-lag, BODY and SKIN had loadings of at least .55. Of the higher-ordered two-factor solutions, only the two-lag solution supported this. All other two-factor solutions were either improper or simply uninterpretable. There was no evidence for the existence of multiple factors.

Subject LRD
Unfortunately, the data for subject LRD was plagued by improper solutions. For the one-factor solutions, only the two-lagged solution was proper. For the two-factor solutions, only the two, three, and five-lagged solutions were proper.
The existence of a temperature factor (including both BODY and SKIN) is supported by the one-factor two-lag model as well as the two-factor five-lag model. Because of the frequency of improper solutions and the lack of support from the other proper solutions, this finding should hardly be considered a stable solution for this subject. Again, no evidence existed for the presence of multiple factors.

Subject WSS
Subject WSS did not seem to have any consistency within the one and two-factor solutions. Only the one-factor, one-lag solution had an interpretable factor structure: a complex factor with the loadings of CIGS = .40, and SKIN = -.31. Considering that all other solutions were either improper or uninterpretable, the results for subject WSS should be considered inconclusive.

Subject RTS
Similar to subject WSS, subject RTS had solutions with low loadings which were inconsistent when lags were added to the model. For every one-factor solution, the variable CO had a high loading at the zero-lag, but all other variables had extremely low loadings.
For this subject, within the two-factor models only the one-lag and three-lag models were able to produce proper solutions.

Subject BER
With subject BER, only the zero-lag model was able to produce a proper solution within the six one-factor models. Of the two-factor models, again only the zero-lag solution was proper. Although these solutions could possibly be interpreted as having a habit strength factor at lag zero, it is probably best to not consider these solutions because they cannot be supported by other solutions. It should be noted that the five-lag models failed to converge with both one and two-factor models.

Subject RWF
For subject RWF, within all one-factor models, body temperature seemed to carry the weight of the zero-lag factor. Loadings at higher order lags were extremely low.
Only two two-factor models had proper solutions (zero and three-lag), both of which were inconsistent and difficult to interpret.

Subject JWN
One-factor models were not at all consistent when additional lags were added.
Loadings were generally very low. The two-factor models were also very difficult to interpret. Examining each dynamic factor solution on its own, one might conclude that a habit strength or temperature factor existed. But upon looking at all proper solutions, there was clearly not a consistent factor pattern.

Evaluation of the Two Dynamic Factor Analysis Programs Utilized
Another research goal was to critically evaluate the implementation of dynamic factor analysis. Using the SAS macros required only basic knowledge of setting up data and data analysis files in SAS. Utilizing the macros was relatively simple, as the instructions provided by Wood & Brown (1994) were helpful. The output of the macros was difficult to understand, as little explanation is documented in the output itself. It is organized well, but information is often repetitive and/or unnecessary. The macros are inflexible in the sense that without programming knowledge of SAS macros and in-depth knowledge of the mathematics behind dynamic factor analysis, evaluating alternative models is not possible. Unfortunately this does not allow most researchers to evaluate correlated factor models or models with additional correlated error terms. It is also not clear exactly which error terms and factors are being correlated; because of this lack of documentation, it is up to the researcher to figure this out. The amount of time required to evaluate models with multiple factors and lags is also a disadvantage of the macros. It often required hours of computer time to evaluate models with three to five lags, requiring late night runs on a mainframe computer.
All dynamic factor models were also evaluated through LISREL version 7. The LISREL programming for dynamic factor analysis is outlined in Hershberger et al. (1994). For researchers who are already familiar with LISREL, the programming is not difficult but can be time-consuming for models with multiple factors and lags. Although it was not performed in this study, evaluating a model with five or six factors and four or five lags could require up to 500 lines of programming code. However, compared to the Wood & Brown SAS macros, the LISREL output is easier to understand. LISREL is also flexible in that different starting values can be used. The percentage of proper and converging solutions was lower than that of the Wood & Brown SAS macros.
Documentation exists for LISREL 7, allowing researchers to specify many parameters of the program within any model. Multiple models can be evaluated in a much shorter time span compared with the SAS macros. The ratio of computational time for the two programs was approximately 10 to 1. Programmed models with four to five lags and two factors took only 10-20 minutes to run, rather than two or more hours with the SAS macros.
The methods in which the SAS macros and LISREL obtain solutions differ slightly.
As a result, differing rates of proper solutions were obtained. The SAS macros outperformed LISREL in this respect, obtaining a greater percentage of proper solutions in one-factor models (77% vs. 72%), two-factor models (50% vs. 38%), and overall (63% vs. 55%).
When a proper solution was found, the SAS macros and LISREL were generally in agreement. This often required multiple attempts using LISREL with different starting values, particularly with models of higher-ordered lags. LISREL was unable to converge upon a proper solution in 51 of the 120 models. Out of these 51 models, the SAS macros obtained a proper solution in 19 (37%). Conversely, the SAS macros were unable to converge upon a proper solution in 44 models. Out of these 44 models, LISREL obtained a proper solution in 12 (27%). This suggests that knowledge of both programs would be beneficial toward obtaining the greatest number of proper solutions. These results are presented in Table 9.
Overall, the LISREL program is preferred for evaluating dynamic factor models.
Flexibility and time are the key advantages of LISREL compared to the SAS macros. The same approximate solutions were found for each of the two programs, although different starting values affected the results a great deal in some models.

Interpretations
Before the analyses were carried out, it was assumed that there would be a clear habit strength factor emerging from each lag. Evaluating the results, it was assumed, would depend on the magnitude and direction of each of the loadings. However, in all but three subjects the habit strength factor did not emerge, and consequently the magnitude and direction of the variable loadings were of no interest. It was believed that with the two-factor models, the second factor would be a temperature factor if the two temperature variables were correlated. This hypothesis was not supported either, as only one subject (ABN) had a clear interpretable two-factor solution. It was assumed that the post-analysis interpretation would consist of determining which of the three nicotine regulation models the data represented. However, because of the many inconsistencies within and across subjects, the bulk of the interpretation instead became figuring out why these inconsistencies exist.
Only four of the subjects had multiple solutions which were interpretable (ABN, JBW, EBE, KTN). Unexpectedly, these four subjects had the fewest number of data points. Three of these subjects had at least one variable which severely violated the kurtosis assumption. However, these four subjects did have clearly identifiable ARIMA models associated with each of the smoking habit strength variables. These results are summarized in Table 10. It should be noted again that the solutions for these four subjects were not extremely consistent with the addition of lags to the models. The solutions found for these subjects were not as robust and clear as hypothesized, but they are the only subjects whose solutions did not appear to have random loadings of varying magnitude and direction across lags. The factor loadings for subject JBW are presented in Table 11. It was concluded that subject JBW had a habit strength factor, represented by the three hypothesized habit strength variables, which had a negative autocorrelation. No conclusions could be drawn from the solutions of the remaining six subjects. The solutions presented in Figures 9-14 do not support the existence of a habit strength factor indicated by two or more variables. Table 12 presents a summary of all subjects' dynamic factor analysis results for one-factor models, with the associated nicotine regulation model which was interpreted from the entire set of solutions for each subject.
One goal of the study was to search for evidence which may suggest which of the three smoking habit strength measures are most reliable and stable across time. This could not be adequately assessed because there was not a clear habit strength factor in most of the solutions. As defined previously, a clear factor should be indicated by at least 2 variables with at least medium loadings (> .50). Across all solutions, it did not occur often that two of the three habit strength measures had loadings greater than .50. Of the 46 proper solutions obtained in the analyses of one-factor models, only 68 (out of a possible 1050) loadings were greater than .50. Fifteen of these were cigarettes smoked, 19 were cotinine, 14 were CO, 13 were body temperature and 7 were skin temperature.
Examining the correlations among the habit strength variables clearly shows that the variables do not have the strong relationship that is generally accepted in the literature. Because of the half-lives of cotinine and CO, correlations might be expected to be low at lags 1 and above. However, even at the 0-lag the average correlations between these 3 variables were much lower than expected. Number of cigarettes smoked and cotinine had virtually no relationship (average r = .01). Number of cigarettes and CO had a nonsignificant relationship as well (-.12). Even the 2 biochemical measures had a nonsignificant average correlation (.17). Considering the number of data points studied across 10 subjects, these correlations should raise serious doubt about the validity of using biochemical measures in place of self-report measures, and vice versa. These correlations may differ depending on the interval length used, but even with the data collection interval used in this study, the obtained correlations are alarmingly low.

Conclusions
From the solutions presented from the dynamic factor models, one of two conclusions could be drawn from this data: 1) The relationships between the variables across time are such that dynamic factor modeling could not show support for a clear, consistent factor structure; or 2) The relationships between the variables are unstable as a result of data collection techniques used and the interval length between data collections.
Considering that the data collection outlined earlier seemed to be extremely standardized and done in a professional manner, and that Velicer et al. (1992) were able to find clear ARIMA models to represent most variables' performance over time, the second conclusion from above is probably not valid. Dynamic factor analysis is a descriptive technique. It is the conclusion of this researcher that this technique was simply not able to describe the data in a way that was hypothesized.
There are many characteristics of the data that contribute to this. The most significant characteristic was encountered when examining each individual's correlation matrix. The correlations between variables were extremely low, especially across lags. When this occurs, it is difficult to obtain stable estimates across similar models. Utilizing uncorrelated variables in a factor analysis exemplifies what has been coined the GIGO principle, or garbage in, garbage out (Kachigan, 1986). Although many naive investigators believe otherwise, factor analysis (and dynamic factor analysis) does not create new information. The technique merely organizes and summarizes existing information. Consequently, if the input information is inadequate, the final analysis will be inadequate. As stated previously, the average correlations between the three habit strength variables were nonsignificant, even at the 0-lag.
The low cross-lagged correlations not only contributed to the failure to produce interpretable solutions, but also to the high rates of improper and non-converging solutions. The results of the Wood & Brown simulation study of improper and nonconverging dynamic factor models found that adding lags to a two-factor model will decrease the percentage of nonconverging solutions, while the percentage of improper solutions will remain stable. This was not found to be the case in this study, as only 2 of 60 (3%) two-factor models did not converge. Of those solutions that did converge, rates of improper solutions seemed to be unrelated to the number of lags in the model. This result is consistent with the Wood & Brown simulation study. It is important to note that their simulation study employed 100 replications per condition and used simulation data, while this study had only 10 replications and used actual data. The unstable correlation matrices produced by this data led to many problems in the LISREL analyses. For many models, LISREL output suggested that the Theta EPS matrix was not positive definite and that admissibility tests failed.
Despite a great deal of effort to employ differential starting values in the LISREL analysis, it is somewhat of a mystery why LISREL was not able to provide a higher rate of proper standardized solutions compared to the Wood & Brown SAS macros.
However, it is important to document that the two procedures produce the same results, with some variation due to rounding error and the instability of higher-lagged and multiple-factor models.
It is not coincidence that the same subjects who had clear identifiable ARIMA models also had clear identifiable dynamic factor models. This is more evidence for the importance of the input correlation matrix. With extremely low-magnitude correlations, identifying longitudinal relationships is difficult no matter what procedure is being employed.
It is also worth noting that the LISREL programming for dynamic factor analysis was extensive. It is recommended that only seasoned users of structural modeling employ LISREL to evaluate dynamic factor models. To those not fully comfortable with structural modeling and dynamic factor analysis, the SAS macros are invaluable. To utilize the macros, only the variables of interest need to be specified along with the maximum number of factors and lags. The output from the macros is extensive, and large models can take hours to run on a mainframe computer. However, the ease of use makes up for these drawbacks: the macros could hardly be more user-friendly, although the computation time and output have room for improvement.

Study Shortcomings and Recommendations for Further Research
LISREL was chosen over EQS because of the author's familiarity with it. Evaluating dynamic factor models with EQS could prove to be more rewarding.
This study examined only general dynamic factor analysis models, in which each variable loaded onto each factor. More specific confirmatory dynamic factor models could be examined as well. Such models may have specified factor loading paths restricted to be zero. Other models could have correlated factors or even be hierarchical models. Of course, programming and interpreting becomes increasingly difficult as models become more complex.
It was a possible shortcoming of the study that the data was collected twice per day. The data was collected so that the available smoking time was similar between data collections. However, for many cigarette smokers smoking is a daily physiological process. Because of the nature of the data (time series) and the type of analysis, it was determined that a maximum of 64 data points per subject were too few to produce stable lagged correlations. The half-lives of the biochemical measures also affected the decision to use all available data points. Using only a single data point per day would certainly eliminate all factor loadings at any lag greater than one.
However, even with this method of data collection it was surprising that the correlations between variables were so low, especially at higher-ordered lags. This suggests that these three variables may be measuring entirely different aspects of the same phenomenon. This has far-reaching implications, especially for those who insist that biochemical validation is necessary to confirm self-report measures of smoking. If there is no relationship between the two, or inconsistent relationships across subjects, then biochemical measures cannot validate self-report measures at all.
Further research quantifying the relationships between these variables appears to be necessary.
Most subjects had complete data for at least 120 out of a possible 128 time points.
However, this may have not been enough to obtain consistent dynamic factor model solutions for higher-lagged models. As additional lags are added to dynamic factor models, having a complete data set with no missing values becomes more important. It is also questionable if 128 complete time points is enough for higher-lagged models. It is difficult to know how many time points are necessary, because studies employing models with greater than 2 lags are virtually non-existent. In their simulation study, Wood & Brown obtained decent rates of convergence for higher-lagged models, but the stability of the solutions was not in question. This is an important topic which should be addressed in future research.
This research might have been an example of a data set being applied to a data analysis technique rather than a technique being applied to the data. The latter is theoretically sound and will usually provide the most information. As with regular factor analysis, dynamic factor analysis should be the technique of choice when there are many time points and many variables. The stability of a factor analysis solution is dependent on the saturation of variables (correlations) (Guadagnoli & Velicer, 1991) as well as the number of variables per factor (Velicer & Fava, 1987). With only a few variables per factor and low saturation, a factor analysis solution should be examined with extreme caution. This was the case with the data employed in this study. In retrospect, extremely low factor loadings and uninterpretable factors should not have been a surprise. If the same study were to be conducted again, at least eight or ten measures of habit strength would be recommended.
This study did provide an extensive review of dynamic factor analysis. It is clear that many issues regarding its application remain unresolved. Until these issues are explored in great detail, dynamic factor analysis will remain a great analysis tool with unfulfilled potential.