The Transtheoretical Model and Exercise Behavior: A Comparison of Five Staging Methods

This project is an examination of one of the first studies that applied the Transtheoretical Model of Behavior Change to the area of exercise. A core concept of the Transtheoretical Model is the temporal dimension represented by the stages of change. A variety of alternative staging methods have been developed. This study compared a continuous measure of stage membership and four discrete algorithms to stage exercise behavior in the context of a worksite program. In Study I, a previously developed continuous measure of stage membership, the University of Rhode Island Change Assessment (URICA), was adapted to the area of exercise behavior (URICA-E). The structure of the instrument was replicated using Confirmatory Factor Analysis. One, two, three and four factor models were compared, and a correlated four factor model, representing the four stages of Precontemplation, Contemplation, Action and Maintenance, was found to have the best fit. Fit was improved by reducing the number of items. The 16 item version was confirmed in a second sample. A Cluster Analysis was performed using the four standardized scale scores of the 16 item version of the URICA-E. Nine distinct clusters were found and replicated in a cross validation. Profiles were interpreted and found to have a number of similarities when compared to the profiles previously reported in a psychotherapy population using the URICA. In Study II, four discrete algorithms were examined

List of Tables
Table 1 Model Comparison using CFA (31 Items) on Exploratory Sample
Table 2 Results of 31 Item CFA on Exploratory Sample
Table 3 Confirmatory Factor Analysis: Comparison of Fit Indices
Table 4 Model Comparison using CFA (16 Items) on Confirmatory Sample
Table 5 Stage Distributions of the 4 Algorithms
Table 6 Pladder by Pexscale Frequency Percentages
Table 7 Pladder by Pexscpo Frequency Percentages
Table 8 Pladder by Pproscal Frequency Percentages
Table 9 Pexscale by Pproscal Frequency Percentages
Table 11 Discriminant Function Analysis: Pproscal as Group and 31 Items of URICA-E as Predictors, Classification Results
Table 12 Discriminant Function Analysis: Pproscal as Group and 16 Items of URICA-E as Predictors, Classification Results
Table 13 Discriminant Function Analysis:
The Transtheoretical Model of behavior change (Prochaska & DiClemente, 1983) uses an amalgamation of latent constructs (Decisional Balance, Self-efficacy, Processes of Change and the Stages of Change) derived from a variety of sources. Each construct is operationalized by a measure composed of a series of items unique to a problem behavior.
These items have been tested and refined to develop highly reliable instruments for a number of problem behaviors, with the most extensive work involving smoking cessation. This paper focuses on the measurement of the key organizing construct, the Stages of Change. Several alternative measures will be compared within the area of exercise.
The Transtheoretical Model.
Within the Transtheoretical Model, the dependent measures include the two scales from the Decisional Balance Measure, the Pros and Cons (Velicer, DiClemente, Prochaska, & Brandenburg, 1985). These concepts are based on Janis and Mann's (1977) concept of decisional balance. A second set of dependent measures is the three scales of the Temptation or Self-efficacy measures (Velicer, DiClemente, Rossi, & Prochaska, 1990). This measure is based on Bandura's (1977) self-efficacy construct, which involves the degree of confidence a subject has that they will not engage in a problem behavior in tempting situations. The dependent measures also include the behaviors appropriate for a specific problem area.
The independent measures include the influences from the internal and external environment (including interventions) and ten Processes of Change (Prochaska, Velicer, DiClemente, & Fava, 1988). The ten processes of change measured in the model are garnered from a review of psychotherapy techniques (Prochaska, 1984) and represent the behaviors, cognitions and emotions which the subjects engage in during the course of changing a behavior.
A core organizing concept used in the Transtheoretical Model is the temporal dimension represented by the Stages of Change. In recent work, Velicer, Prochaska, Rossi & Snow (1992) conceptualize the five stages of change as: Precontemplation (PC), a stage where no change in behavior is planned for at least the next 6 months; Contemplation (C), where change is planned within the next 6 months; Preparation (P), where change is planned in the next 30 days and some type of action has been attempted in the last year; Action (A), where change has begun and has been sustained for less than 6 months; and Maintenance (M), where change has been maintained for longer than 6 months.
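Read as a decision rule, the five definitions above might be sketched as follows. This is a minimal illustration only, not one of the staging algorithms examined in this study: the function name `stage_of_change`, its arguments, and the treatment of a period of exactly 6 months as Maintenance are assumptions made for the example.

```python
def stage_of_change(months_sustained, plans_within_6_months,
                    plans_within_30_days, attempted_in_last_year):
    """Return 'PC', 'C', 'P', 'A' or 'M' from the stage definitions in the text.

    months_sustained: months of sustained behavior change (0 if none).
    The three boolean flags are hypothetical questionnaire answers.
    """
    if months_sustained >= 6:      # assumption: exactly 6 months counts as M
        return "M"
    if months_sustained > 0:       # change begun, sustained less than 6 months
        return "A"
    if plans_within_30_days and attempted_in_last_year:
        return "P"
    # assumption: a 30-day intention without a past attempt is treated as C
    if plans_within_6_months or plans_within_30_days:
        return "C"
    return "PC"


# Hypothetical respondents, one per stage.
examples = [
    stage_of_change(0, False, False, False),
    stage_of_change(0, True, False, False),
    stage_of_change(0, True, True, True),
    stage_of_change(2, True, True, True),
    stage_of_change(12, True, True, True),
]
```

A rule of this kind assigns each respondent to exactly one stage, which is the defining property of a discrete algorithm as opposed to a continuous measure.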
Several prominent theories employ a stage concept as a central organizing construct. It has been used to organize and track the process of development. Piaget (1960, 1972) presented cognitive development as a series of 4 stages. Kohlberg (1976) laid out moral development as a series of 7 stages. Stage has also been used to break a complicated topic into more manageable units. Kubler-Ross (1969) used stages to analyze the complex period of dying. Her 5 stages could be moved through sequentially but, more often than not, the progress was variable. Some would become stuck in a single stage, others would fluctuate back and forth between stages. Stage can also be used to differentiate treatment modalities. The medical profession stages serious illnesses such as cancer in order to determine what protocol will be used as an intervention. Stage is used as an organizing tool, as an analytical instrument, and as an intervention guide by the Transtheoretical Model.
When the Transtheoretical Model moves into a new behavior area, such as exercise, the first task is to develop an efficient staging tool. Within the model, the stages have been measured in either of two ways: by a series of discrete questions (algorithm) or by a continuous measure. However, the relationship between these two different ways of assessing stage membership had not been empirically investigated. This study investigated the relationship between discrete and continuous staging methods by using a secondary analysis of several exercise data sets gathered as part of a larger worksite smoking cessation study (Marcus, Rossi, Selby, Niaura, & Abrams, 1992). The worksite study was one of the first uses of the Transtheoretical Model in the area of exercise.
Exercise. When translating the Transtheoretical Model into a new area, such as exercise, it is important to take into account the ways that exercise differs from smoking, the behavior on which the model was developed. Exercise, unlike smoking, is a positive behavior that people are attempting to incorporate into their lives. It is not an easy behavior to maintain. Research shows that adherence to exercise is a major problem, with 50% of people who start programs quitting within a year (Dishman, 1988). This implies that Maintenance is not a stable stage, as it often is for smokers. It is also suggested that exercise is not an all-or-nothing phenomenon and that individuals who stop performing may intend to start again (Sonstroem, 1988). It seems that exercise cannot easily reach termination.
The URICA (McConnaughy, DiClemente, Prochaska, & Velicer, 1989; McConnaughy, Prochaska, & Velicer, 1983) is a short 32 item inventory which yields four highly reliable scales. It was used during psychotherapy to stage clients on whatever problem they were in therapy for. Appendix A presents a copy of these items.
Initially the stages of change were theorized as PC, C, P, A, and M, but P was eliminated on the basis of the analysis. The four components accounted for 58% of the total variance and coefficient alphas for the four scales ranged from .88 to .89 (McConnaughy et al., 1983). These findings were replicated: the same four components accounted for 45% of the total variance and the Cronbach's reliability coefficients for the four scales ranged from .79 to .84 (McConnaughy et al., 1989).
McConnaughy and colleagues performed cluster analysis on both the initial sample and the replication sample. The initial work found 18 clusters. They named 7 major and 2
Since the data set was large, any subjects with missing data were deleted from the analysis. The data set was then split into two samples: odd identification numbers becoming the first or exploratory sample (N = 474) and even numbers becoming the second or confirmatory sample (N = 462). The split produced two extremely similar samples with regard to demographic characteristics.
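The deletion of incomplete cases and the odd/even split can be sketched as below. This is an illustration under assumptions: the record layout (a dict with an integer `id` and item responses, with `None` marking a missing answer) and the function name `split_sample` are hypothetical and are not taken from the original data files.

```python
def split_sample(records):
    """Drop subjects with any missing response, then split by ID parity.

    records: list of dicts, each with an integer 'id' plus item responses;
    None marks a missing response (hypothetical layout).
    Returns (exploratory, confirmatory): odd IDs first, even IDs second.
    """
    complete = [r for r in records if all(v is not None for v in r.values())]
    exploratory = [r for r in complete if r["id"] % 2 == 1]
    confirmatory = [r for r in complete if r["id"] % 2 == 0]
    return exploratory, confirmatory


# Toy data: subject 3 has a missing item and is removed before the split.
toy = [
    {"id": 1, "item1": 4, "item2": 2},
    {"id": 2, "item1": 5, "item2": 1},
    {"id": 3, "item1": None, "item2": 3},
    {"id": 4, "item1": 2, "item2": 2},
]
exploratory, confirmatory = split_sample(toy)
```

Splitting on identification number rather than at random keeps the assignment reproducible, at the cost of relying on IDs being unrelated to the variables of interest.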
Using the exploratory sample, the component structure was analyzed using Principal Components Analysis (PCA) in replication of the work done by McConnaughy et al. (1983) and McConnaughy et al. (1989). The number of components to extract was based on the MAP (minimum average partial) procedure (Velicer, 1976) and Horn's (1965) Parallel Analysis
(Bentler, , 1989; Joreskog & Sorbom, 1989). The procedure was to use LISREL VII (Joreskog & Sorbom, 1990) to fit the data to a correlated four factor model and to test it for goodness of fit in comparison to competing models. A correlated four factor model is consistent with the results found for other health behaviors (McConnaughy et al., 1983; McConnaughy et al., 1989; DiClemente & Hughes, 1990).
In addition to the hypothesized correlated four factor model, six other possible models, using all 31 items, were tested on the exploratory sample. The models tested were: (1) a one factor model that conceptualizes change as a single dimension; (2) an uncorrelated and (3)
The modification indices in general represent the "expected drop in Chi-square if a particular parameter were freely estimated" (Byrne, 1989, p. 57)
(Bollen, 1989).
The modification indices for the factor loadings can be interpreted as a measure of complexity. Each non-estimated item is given a modification index for the loadings on the four factors.
High values for modification indices on nonloading factors indicate an item that shows complexity.
Removal of complex items improves both the fit and the reliability of a construct.
The matrix of modification indices for the measurement errors pinpoints pairs of items for which freeing the correlation of the residuals would reduce the Chi-square. The maximum modification index identifies the two items whose residual correlation, if freed, would produce the largest change in the Chi-square. Deletion of at least one of these items can reduce the Chi-square value.
The use of modification indices is a procedure that capitalizes on error variance to improve the fit of the model to the data. Uncritical reliance on modification indices to modify a model can therefore have serious consequences and lead to acceptance of an incorrectly specified model (Kaplan, 1989; MacCallum, 1986; Silva & MacCallum, 1988). Cross validation of the final model by replicating it in a separate sample can protect against this danger (Marcus, Rossi, Selby, Niaura & Abrams, 1992).
Using the revised, parsimonious version which had been developed, the other models were tested again using the confirmatory sample to ensure that the correlated four factor model was indeed the best fitting model.
The same seven models were tested as described previously. The seven models were compared using the previously described fit indices.
Lastly, scale scores for the URICA-E were formed by calculating the unweighted sum of the scores on the 4 items allocated to each stage then dividing the total by 4. This score was standardized to a T-score metric (mean of 50 and a standard deviation of 10). Each subject therefore had 4 standardized T-scores, one for each stage.
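The scoring steps described above can be illustrated with a short sketch. The responses below are made up, the function names are hypothetical, and the use of the population standard deviation for standardizing is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def scale_score(item_responses):
    """Unweighted sum of the items for one stage, divided by the item count."""
    return sum(item_responses) / len(item_responses)


def to_t_scores(raw_scores):
    """Rescale raw scale scores to a T-score metric (mean 50, SD 10).

    Assumption: the population SD (pstdev) is used for standardizing.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [50 + 10 * (x - m) / sd for x in raw_scores]


# Hypothetical responses of three subjects to the 4 items of one stage scale.
responses = [[1, 2, 3, 4], [5, 5, 4, 4], [3, 3, 3, 3]]
raw = [scale_score(r) for r in responses]
t = to_t_scores(raw)
```

Repeating this for each of the four scales gives every subject a profile of four T-scores, which is the input to the cluster analysis.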
A Cluster Analysis was performed on these data to determine if different types of changers exist, following McConnaughy et al. (1983, 1989). Although the four scales were somewhat correlated, a Euclidean distance measure was employed. The clustering method was Ward's (1963). Three through fourteen clusters were examined. Decisions on how many clusters to interpret were made using the cubic clustering criterion, investigation of the dendrogram, and comparison to the previous profiles (McConnaughy et al., 1983; McConnaughy et al., 1989).
(Velicer, 1976), and Parallel Analysis (Horn, 1965; Lautenschlager, 1989)
Model Testing. Seven models were fit to the data. Table 1 presents the five goodness of fit indices for the seven models. The seven models were (1) a correlated four factor model; (2) an uncorrelated four factor model; (3) a correlated three factor model; (4) an uncorrelated three factor model; (5) a correlated two factor model; (6) an uncorrelated two factor model; and (7) a one factor model.
The correlated four factor model, although a poor fit, did a better job of fitting the data than any of the 6 other models. It is recommended that the ratio of Chi-square to df be less than 2 to 1 (Joreskog & Sorbom, 1979) or at least less than 5 to 1 (Hayduk, 1987).
The RMSR (.112) was well over .06, which is the acceptable limit for good fit (Hayduk, 1987). The Comparative Fit Index (.740) was poor, being nowhere near .90, the minimum desired value for good fit (Bentler, 1990).
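The cutoffs cited above can be collected into a small check. This is a sketch only: the function name `assess_fit` is hypothetical, treating an RMSR of exactly .06 as acceptable is an assumption, and the Chi-square and df values in the example are placeholders because they are not given in this passage. Only the RMSR (.112) and CFI (.740) come from the text.

```python
def assess_fit(chi_square, df, rmsr, cfi):
    """Compare fit statistics with the rules of thumb cited in the text."""
    ratio = chi_square / df
    return {
        "chi2_df_ratio": ratio,
        "ratio_below_2": ratio < 2,        # Joreskog & Sorbom (1979)
        "ratio_below_5": ratio < 5,        # Hayduk (1987)
        "rmsr_acceptable": rmsr <= 0.06,   # Hayduk (1987); boundary is assumed
        "cfi_acceptable": cfi >= 0.90,     # Bentler (1990)
    }


# chi_square and df are placeholder values, NOT results from the study.
result = assess_fit(chi_square=2000.0, df=400, rmsr=0.112, cfi=0.740)
```

With the reported RMSR and CFI, both index checks fail, which matches the description of the 31 item model as a poor fit.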
The standardized solution did produce four fairly clear correlated factors (see Table 2).
are not contributing to the overall fit of the model.
See Table 3 for a comparison of model fit for the different number of items. With each deletion, the fit had improved, but the number of items per construct also had to be taken into account.
See Table 4 for a comparison of the models tested and correlations between factors for the correlated four factor model.
(McConnaughy et al., 1983; McConnaughy et al., 1989). The degree of replication across the three samples was remarkable.
Choosing the correct number of profiles is a difficult task for which no single method is broadly accepted as correct. Two methods are the cubic clustering criterion and the dendrogram. The cubic clustering criterion is a numeric value that starts out as a positive number, descends to zero, and then becomes increasingly negative. The point at which this number departs from its linear sequence and starts to fluctuate is around the number of clusters that should be interpreted.
Interpreting the dendrogram is also an inexact exercise. The number of first level breaks, depicted on a schematic representation of the scores, indicates approximately the number of clusters to interpret. With the numeric criteria so vague, more dependence was put on choosing the number of clusters that kept strong profiles intact. Based on these three criteria, eleven clusters were retained in each sample, but only nine were interpreted. The tenth and eleventh clusters were each found in only one sample. This failure to replicate precluded interpretation.
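Naming of clusters is influenced heavily by previous work and the researcher's personal interpretation. The nine distinct subtypes were named: (1) Maintenance, (2) Action, (3) Decision Making, (4) Contemplation 1, (5) Contemplation 2, (6) Precontemplation 1, (7) Precontemplation 2, (8) Precontemplation 3, and (9) Uninvolved.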

found and labeled Action (see Figure 3). In this profile, the score on PC is slightly below the mean. The scores on C are slightly above the mean, whereas the A and M scale scores are almost equal and approximately one standard deviation above the mean. Subjects with this profile are exercising regularly, but the struggle to maintain this behavior still remains something to think about.

was found and was labeled Contemplation 1 (see Figure 5). In this profile, the score on PC is nearly half a standard deviation below the mean. The scores on C are nearly half a standard deviation above the mean. A and M scale scores are also below the mean. Subjects with this profile are thinking a great deal about exercise, but they are not yet doing anything.
It was labeled Contemplation 2 (see Figure 6). In this profile, the score on PC is more than half a standard deviation below the mean. The scores on C are more than half a standard deviation above the mean.
In this profile, the score on PC is more than a standard deviation above the mean. The scores on C are more than a standard deviation below the mean. Scale scores for A and M are also below but closer to the mean. Subjects in this profile evidence difficulty in coming to the realization that exercise is a problem for them. They are doing some thinking. Turmoil comes to mind when examining this profile.
Precontemplation 3 (see Figure 9). In this profile, the score on PC is extremely high, from one to two standard deviations above the mean. The scores on C, A and M are more than two standard deviations below the mean.
(see Figure 9). In this profile, the scores on PC, C, A and M all hover around the mean. Subjects in this profile are doing so very little of anything that they can best be described as uninvolved.
definition. An attempt to produce a 5 factor solution (the addition of P) had to be abandoned when the MAP procedure (Velicer, 1976) and Parallel Analysis (Horn, 1965; Lautenschlager, 1989)
Participation in her first study (McConnaughy et al., 1983). The Action cluster for exercise bears a resemblance to McConnaughy's Participation profile in the second sample (McConnaughy et al., 1989). The Decision Making cluster for exercise is very similar to the cluster of the same name in both papers. The Contemplation 1 cluster for exercise follows the pattern of McConnaughy's profile for the second sample, also named Contemplation (McConnaughy et al., 1989). There is no match for the Contemplation 2 cluster for exercise. The Precontemplation 1 cluster for exercise most closely resembles McConnaughy's second sample Immotive profile (McConnaughy et al., 1989). The Precontemplation 2 cluster for exercise echoes McConnaughy's second sample Precontemplation profile (McConnaughy et al., 1989). The Precontemplation 3 cluster for exercise has no match. The Uninvolved cluster for exercise mimics McConnaughy's profiles of the same name for both samples (McConnaughy et al., 1989; McConnaughy et al., 1983).
The Pladder. The first algorithm, the Pladder, was modeled after a smoking algorithm (Biener & Abrams, 1991). It consisted of a question above a drawing of 2 ladders side by side. The initial question was: "Now and in the past five years, have there been any times when you did regular exercise?" If you answered "Yes" you were asked to mark
The Pproscal. The fourth algorithm, the Pproscal, was a set of 5 questions that were answered by "True" or "False".
This is the procedure that most closely resembles the algorithm employed for smoking. See Appendix E for the questions and the formula for scoring the algorithm.

Results
Quantitative Methods. A comparison was made of the stage distribution frequencies (percentages) of the 4 algorithms (see Table 5). Intuition would support the premise that the more stable stages (PC and M) would show higher percentages of people than the more dynamic stages (C, P and A), where subjects generally stay a shorter amount of time. A U-shaped curve would graphically capture this image. Only one algorithm shows this expected U-shaped curve, the Pproscal.
It should be noted that the single question format does not include a behavioral component in the definition of P, only the intention of starting to exercise within the next 30 days.
As can be seen above, the single question format is not unlike the Pproscal. Because of the qualitative superiority and the similarity of distribution with the single question, the Pproscal was chosen as the discrete algorithm to use in Study III where its relationship with the continuous measure of change, the URICA-E will be studied.
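The U-shaped expectation used when comparing the stage distributions can be stated as a simple check. The sketch below is an illustration under assumptions: "U-shaped" is taken to mean that both end stages (PC and M) hold a larger percentage than any of the middle stages (C, P and A), the function name `is_u_shaped` is hypothetical, and the two distributions are invented for the example and are not the values in Table 5.

```python
def is_u_shaped(percentages):
    """True if both end stages (PC, M) exceed every middle stage (C, P, A)."""
    lowest_end = min(percentages["PC"], percentages["M"])
    highest_middle = max(percentages[s] for s in ("C", "P", "A"))
    return lowest_end > highest_middle


# Hypothetical stage distributions (percentages), NOT the study's results.
example_u = {"PC": 30.0, "C": 12.0, "P": 10.0, "A": 13.0, "M": 35.0}
example_flat = {"PC": 15.0, "C": 25.0, "P": 20.0, "A": 20.0, "M": 20.0}
```

Applying the same check to each algorithm's distribution makes the comparison explicit rather than a matter of visual judgment.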

Study III. Relationship between URICA-E and Algorithms
The URICA-E is a staging instrument adapted for exercise behavior and based on the University of Rhode Island Change Assessment (URICA) (McConnaughy, DiClemente, Prochaska, & Velicer, 1989; McConnaughy, Prochaska, & Velicer, 1983).
In Study III, there will be a comparison of the short form algorithm, Pproscal, and the continuous measure, the URICA-E. Two techniques will be used. Discriminant function analysis is a way to quantify the principles of human decision making (Norusis, 1990). With information from a set of cases for which the outcome is known, equations can be derived to separate the cases into groups.
In discriminant analysis, coefficients are selected so that the scores are similar within a group but differ as much as

Results
Comparison of Profiles of URICA-E and Pproscal.
The cross classification matrix (see Table 10) revealed that the Maintenance profile had 67% correct classification when compared with the Pproscal stage M. It also had a 33% misclassification with PC. This is basically correct classification, but a problem between PC and M appears.
The Action profile had a 67% misclassification with Pproscal stage M and a 33% misclassification with PC. This profile is clearly not picking up the same staging criteria as the Pproscal.
The Decision Making profile had 32% correct classification when compared with Pproscal stage £ and 30% with ~. Although there is some misclassification with A and M, this profile is mainly in agreement.
The Contemplation 1 profile is ambiguous. It had a 36% misclassification with Pproscal stage PC, but a 33% correct classification with C. PC and C are such different stages that this is a real problem for this profile.
The Contemplation 2 profile had a 56% misclassification with Pproscal stage M. This could be viewed as an endorsement for the interpretation of this profile as representing maintenance people who are in temporary lapse.
The Precontemplation 1 profile had correct classification of 28% with Pproscal PC, but a 38% misclassification with stage C, and a 26% misclassification with P. This is a problem when PC is confused with C and P.
i.e., walking on a job, while it is not, other people may exercise at rates exceeding established standards but wish to achieve a much higher personal level.
The profiles did a credible job of replicating
validation against the Pproscal algorithm produced very mixed results. The conclusion is that the URICA-E is not just an alternative staging algorithm, but is something more complex.
This function only involved a small incremental contribution and was, therefore, difficult to interpret.
Precontemplation is seen. The chance level of prediction for a five group discriminant function is 20%, so the 16 items were almost two and a half times better at predicting group membership correctly than chance. Only two of the four functions were found to be significant.
Comparison of URICA-E (scale scores) and Pproscal
The discriminant function correctly classified 42.37% of the subjects. Table 13 presents the cross-classification table. The percent predicted group membership for PC, C, A, and M is larger for the stage it corresponds with than for off stages. As can be seen, the majority of those whose actual group is Preparation do not get classified as P for their predicted group. This is probably explained by the fact that the URICA-E has no Preparation stage. There is again some confusion of classification with adjacent stages.

Again the problem of confusion between Maintenance and Precontemplation is seen for both these groups. The chance level of prediction for a five group discriminant function is 20%, so the 4 scale scores did more than twice as well (42.37%) at predicting group membership correctly. The lower values produced by the scale scores underline the fact that some of the items that were deleted from the 31 were accounting for some of the variance. Only two of the four functions were found to be significantly different from each other.
Since differences were noted between the item results and the scale score results, an adjustment for the difference in the number of variables was made by adjusting the canonical correlations for the functions with a shrinkage formula for R squared (Kerlinger & Pedhazur, 1973). Adjustment using the shrinkage formula resulted in a very small change in the differences between the item results and the scale score results.
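The two calculations mentioned above can be sketched as follows. The text does not print the shrinkage formula, so the familiar adjusted R squared form, 1 - (1 - R²)(N - 1)/(N - k - 1), is assumed here. The function names are hypothetical, and the R squared value and sample size in the shrinkage example are placeholders, not results from the study. The 42.37% and the five groups do come from the text.

```python
def times_better_than_chance(percent_correct, n_groups):
    """Ratio of the hit rate to the equal-priors chance rate (100 / groups)."""
    chance = 100.0 / n_groups
    return percent_correct / chance


def shrunken_r_squared(r_squared, n_subjects, n_predictors):
    """Adjusted R squared (assumed form of the shrinkage formula)."""
    return 1 - (1 - r_squared) * (n_subjects - 1) / (n_subjects - n_predictors - 1)


# 42.37% correct across five groups, as reported for the scale scores.
ratio = times_better_than_chance(42.37, 5)

# Placeholder R squared and N: the penalty grows with the number of predictors.
adjusted_16_items = shrunken_r_squared(0.30, 500, 16)
adjusted_4_scales = shrunken_r_squared(0.30, 500, 4)
```

Because the adjustment penalizes the 16 item solution more than the 4 scale solution, it offers a fairer comparison between the two sets of predictors.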

Discussion
To summarize Study III, there was substantial disagreement between classification by the profiles of the URICA-E and the discrete stages of the short algorithm, Pproscal. It is concluded that the continuous measure of stage of change, the URICA-E, is substantially different from and more complex than the algorithm. The discriminant function,
The research questions that this study attempted to answer are:

1) Do the algorithms stage subjects in a similar way?
The different algorithms did not stage subjects in exactly the same way.

2) Do different formats and wording of algorithms change a subject's choice of stage?
The answer to this is yes.

3) Does an algorithm (the Pproscal) stage a subject the same way as a continuous measure?
There was a difference in the way the Pproscal staged subjects and the way the URICA-E staged them. The continuous measure seems to be something different than a discrete algorithm.

4) Can richer information be obtained from a continuous measure?
It is intuitive that a profile which provides data on all four stages of a subject has richer information than a discrete algorithm that consigns a subject to a single stage.

5) Is the response burden of answering 32 questions too great?
The obvious answer is that 936 people answered the questionnaire, so it is not too great, but the reduced number of items produced a better fit of the data to a correlated four factor model. However, the reduced item set was not as accurate in classifying people.

Precontemplation
Item:
1. As far as I'm concerned, I don't have any problems that need changing.
5. I'm not the problem one. It doesn't make sense for me to be here.
11. Being here is pretty much of a waste of time for me because the problem doesn't have to do with me.
13. I guess I have faults, but there's nothing that I really need to change.
23. I may be part of the problem, but I don't really think I am.
26. All this talk about psychology is boring. Why can't people just forget about their problems?
29. I have worries but so does the next person. Why spend time thinking about them?
31. I would rather cope with my faults than try to change them.

Contemplation
Item:
2. I think I might be ready for some self-improvement.
4. It might be worthwhile to work on my problem.
8. I've been thinking that I might want to change something about myself.
12. I'm hoping this place will help me to better understand myself.
15. I have a problem and I really think I should work on it.
19. I wish I had more ideas on how to solve my problem.
21. Maybe this place will be able to help me.
24. I hope that someone here will have some good advice for me.

Action
Item:
3. I am doing something about the problems that had been bothering me.
7. I am finally doing some work on my problems.
10. At times my problem is difficult, but I'm working on it.
14. I am really working hard to change.
17. Even though I'm not always successful in changing, I am at least working on my problem.

Maintenance
Item:
6. It worries me that I might slip back on a problem I have already changed, so I am here to seek help.
28. It is frustrating, but I feel I might be having a recurrence of a problem I thought I had resolved.
32. After all I had done to try to change my problem, every now and again it comes back to haunt me.