A Meta-Analytic Examination of Decisional Balance Across Stage Transitions: A Cross-Sectional Analysis and Cross-Sequential Cross-Validation

Interventions to decrease unhealthy behaviors and to increase healthy behaviors are crucial for health promotion and disease prevention. Computerized tailored interventions provide a promising approach for creating positive health behavior change. The Transtheoretical Model (TTM) delineates a way to conceptualize behavior change, provides the foundation for developing assessments of an individual's readiness to change, and is utilized in tailoring interventions for actualizing behavior change. To produce optimally tailored interventions: (1) theory that guides interventions needs to be comprehensively tested; and (2) empirical data that drive the tailored interventions need to be systematically generated. This program of research tests theoretical assumptions of the model and begins to outline necessary tailoring data in three main phases: (1) a comparison of effect size procedures; (2) a cross-sectional meta-analytic investigation of the Stages of Change and Decisional Balance; and (3) a cross-sequential cross-validation study of the cross-sectional study. In meta-analyses, essential information for calculating effect size is often missing in published studies. Such missing data preclude the inclusion of studies in the meta-analysis, thereby introducing bias to the review. Methods to utilize as much data as possible are invaluable to meta-analytic work. A comparison of effect size procedures was conducted in order to identify a bias correction for studies otherwise unusable. The study included 38 studies, with a total of 46 datasets for each of the Decisional Balance measures. Hedges' g provided a 10% larger effect size than a standard score method of effect size estimation for both the Pros and Cons measures. A single correction equation was developed to allow the use of standard score estimation in future meta-analytic work with these constructs.
The second phase of this program of research utilized meta-analytic procedures with longitudinal data to examine 146 datasets across 55 behaviors spanning 18 countries and including 85,272 participants. This study identified a 2-factor structure for the Pros and Cons in 96% of the studies, with a crossover occurring before the Action Stage in the majority of studies. Overall magnitudes of effect were larger in the earlier Stages for Pros and in the later Stages for Cons. Heterogeneity of effect size distribution was found and moderators were assessed. Generalizability of these constructs was supported, especially across behaviors and populations. Moderators of effect were found to be differentially related across particular Stage transitions but in no readily apparent pattern. Thirdly, longitudinal changes in Decisional Balance across the Stages of Change were assessed across the Stage transitions using a cross-sequential approach. Overall magnitudes of effect were somewhat larger for the longitudinal data. Similar magnitudes of effects for cross-sectional adjacent stage transitions and longitudinal stage movements in the early stages were found between cross-sectional and cross-sequential profiles for the two behaviors examined. Overall, the use of preexisting studies can lead to groundbreaking empirical data for the development of more comprehensive and more precisely tailored health promotion and disease prevention interventions, and further research will continue to delineate important patterns of relationships between these variables.


Introduction
Interventions to decrease unhealthy behaviors (e.g., smoking and sun exposure) and to increase healthy behaviors (e.g., exercise and mammography screening) are crucial for health promotion and disease prevention. Tailored interventions provide a promising approach for creating positive health behavior change. In order to produce optimal tailored interventions: (1) the theory that guides interventions must be comprehensively tested; and (2) empirical data designed to drive the tailored interventions must be systematically gathered and analyzed.
Many theories have been developed to explain and to assist in behavior change. One theory, the Transtheoretical Model (TTM), has drawn from some of the most powerful processes and principles of change from across multiple theories of behavior and behavior change to form one comprehensive model. The model delineates a way to conceptualize behavior change, provides the foundation for developing assessments of an individual's readiness to change, and is utilized in tailoring interventions for actualizing behavior change. Since the development of the TTM, hundreds of health promotion and disease prevention studies around the world have used the model, yet many theoretical assumptions of the relationships between TTM constructs have not been thoroughly tested.
To date there has been no comprehensive examination of TTM constructs across behaviors. The common measures utilized in hundreds of studies, across the many different content areas, research designs and populations, provide a unique opportunity for comparative and integrative analyses of important behavior change factors. Unique meta-analytic techniques will be used to analyze and compare cross-sectional and longitudinal datasets between and within multiple behaviors on two behavior change factors across five Stages of Change. This quantitative review aims to: (1) comprehensively identify if and when theoretical prescriptions of the TTM are maintained or if modifications of the model need to be made; and (2) update and expand the data that drive intervention development of tailored behavior change systems. Overall, this study will help bridge the gap in knowledge between theoretical prescriptions and intervention applications to create more efficient and effective population-based health promotion and disease prevention interventions, ultimately reducing the incidence of disease, such as cancer.

Health Promotion and Disease Prevention
The development of many diseases is often rooted in unhealthy behaviors. For instance, many behaviors that can lead to cancer, such as smoking and alcohol abuse, are preventable and many behaviors that reduce the risk of cancer, such as exercise and healthy diet, are readily attainable. Unfortunately, it is often difficult for individuals to change well-established but unhealthy behaviors, even when the need to do so is apparent and acknowledged by the individual, thus emphasizing the need for behavior change interventions.

Computerized Tailored Intervention Systems
Computerized systems that deliver population-based health promotion and disease prevention interventions have proven particularly effective when systems provide tailored feedback to each participant. The combination of advances in both the behavioral sciences and in computer technology has enabled the development of expert system interventions that guide and motivate individuals to change. Expert systems are broadly defined as computer programs that mimic the reasoning and problem solving of a human 'expert'. Early programs implemented the advice of an individual without the inclusion of empirical data (Negotia, 1985). In contrast, empirically based systems generate interventions driven more by science than opinion. Expert systems start with basic principles and theoretical decision rules and are then enriched with heuristics and empirical data.
Current expert systems provide tailored print feedback or interactive and immediate tailored feedback to each individual (Velicer et al., 1993). Feedback includes both normative feedback, which compares the individual to other successful and unsuccessful individuals, and ipsative feedback, which compares the individual to his or her own previous responses.

Importance of Well-Tested Theory for Interventions (TTM)
To ensure maximum effectiveness of tailored interventions for behavior change, the theory that guides interventions must be comprehensively tested. Since many theoretical assumptions of the TTM have not been thoroughly tested, developing interventions based primarily on theory rather than empirical data may prove problematic. This program of research will explore theoretical prescriptions and determine what, if any, modifications are appropriate to enhance this behavior change model.

Empirical Data for Optimally Tailoring Interventions
To date, no study has produced a set of empirical data to drive complex behavior change intervention systems; the paucity of available data results in interventions guided primarily by theoretical principles alone. This study provides an empirical review of several TTM variables across a wide range of behaviors. Traditionally, the empirical basis for expert systems has been small-sample pilot data. In order to approach truly optimal tailoring, the data that currently drive expert systems need updating. Additionally, a more precise understanding of where, when and on what to tailor needs to be generated. An optimal tailoring approach can reduce demands on participants and costs to providers while increasing intervention effectiveness. Optimal tailoring can be achieved in part by systematically and quantitatively analyzing key relationships between various populations and behaviors across hundreds of available studies, both cross-sectionally and longitudinally.
Several studies have shown differences in intervention outcomes for adolescents vs. adults (Spencer et al., 2002) and males vs. females that suggest tailoring on particular demographics may be warranted. For example, in reexamining the work of

CHAPTER TWO
THE TRANSTHEORETICAL MODEL OF BEHAVIOR CHANGE

The Transtheoretical Model
Many different theories are utilized in behavior change interventions and serve as the basis for tailored interventions. One theory, the Transtheoretical Model (TTM) of behavior change, is particularly suited to provide the framework for individualized interventions using expert system approaches. The TTM involves three dimensions: the temporal dimension, the independent variable dimension, and the intermediate variable dimension. The central organizing construct of the TTM characterizes behavior change over time through five distinct Stages of Change: Precontemplation, Contemplation, Preparation, Action, and Maintenance. The independent variable dimension of the TTM identifies behavior change strategies through ten primary Processes of Change that are grouped into two higher order factors: the behavioral and experiential Processes. Two intermediate indicators of when the Stage of Change will occur are Decisional Balance (weighing of Pros and Cons) and Self-efficacy (Situational Confidence or Temptation).

Stages of Change
In health psychology, the concept of change occurring in a series of stages has been examined in efforts to understand the temporal aspects of change in human nature or behavior. An example of the stage concept can be seen in Horn's (1976; Horn & Waingrow, 1966) work with smoking behavior and cessation. Horn developed a four-stage process of change for smoking behavior, which consisted of 1) contemplation of change; 2) the decision to change; 3) short-term change; and 4) long-term change. Although arising independently and in a different context, the Stages of Change as conceptualized by the TTM are similar to Horn's stages. Over the years, through its own evolution, the TTM ultimately identified five Stages (DiClemente, . Historically, Prochaska and DiClemente (1982) identified the five Stages. Due to what appeared at the time to be a lack of empirical support, Decision Making was dropped (McConnaughy, 1989) and four Stages became the primary focus of subsequent study. In 1991 the Preparation Stage, similar to the Decision Making Stage, was reinstituted into the Stage of Change construct and in general has remained a part of the model (DiClemente et al., 1991).
The algorithms for the five Stages of Change are specific for each behavior, but usually follow these general Stage concepts. Participants are considered to be in the Precontemplation Stage if they report an undesired status, that is, the presence of a problem behavior or the lack of a healthy one, and express no intention of changing in the next six months. Participants are considered to be in the Contemplation Stage if they intend to change in the next six months. Participants in the Preparation Stage plan to change in the next month and have begun to engage in target behaviors, but have not yet met particular criteria. Participants reach the Action Stage once they have met the given behavioral criteria. Lastly, if the participant has met the specified behavioral criteria for more than six months, they have reached the Maintenance Stage.

Processes
The TTM also includes a series of independent variables, the Processes of Change (Prochaska, Velicer, DiClemente, & Fava, 1988). The Processes represent strategies for changing one's behavior. The Processes of Change instruments that measure this construct assess affective, cognitive, evaluative, experiential, and behavioral activities as one moves through the Stages of Change. The Processes have been found to have a correlated higher order factor structure and measure change Processes that represent two primary dimensions, experiential and behavioral. In general, the experiential Processes include consciousness raising, dramatic relief, environmental reevaluation, self-reevaluation, and social liberation. The behavioral Processes include stimulus control, counter conditioning, reinforcement management, self-liberation, and helping relationships. Experiential Processes generally are considered to be more salient in the earlier Stages of Change whereas the behavioral Processes are considered to be more salient in the later Stages.
Most typically, when measuring Self-efficacy, a general Self-efficacy scale is reported. If subscales are found or reported, they most commonly are Positive/Social and Negative Affect. Additional examples of subscales representing different types of situations include Emotional Situations, Skill Applications, Relapse Recovery (Dijkstra & de Vries, 2000), Excuse Making, Resistance from Others, and Habit/Addictive (Velicer et al., 1990). For Temptations, in addition to the most common subscales (Positive/Social and Negative Affect), other examples include Curiosity and Habit/Addictive (Velicer et al., 1990).

Decisional Balance
The TTM originated by integrating theories of psychotherapy as well as incorporating constructs from alternative models. One of the most important and reliable TTM constructs, Decisional Balance, was inspired by a conflict model of decision-making that proposed a descriptive schema called a "balance sheet" of incentives. The four main categories of consideration for decisional conflicts are: a) utilitarian gains and losses for self; b) utilitarian gains and losses for significant others; c) self-approval or -disapproval; d) approval or disapproval by significant others. These four comparative categories of potential positive and negative incentives involve both instrumental effects of utilitarian objectives and nonutilitarian considerations such as issues of self-esteem and value-based determinations.
The development of the TTM Decisional Balance measure (Velicer, DiClemente, Prochaska, & Brandenburg, 1985) was based on the 8 factors (4 gains and 4 losses) of this balance sheet schema. The researchers constructed the scale to study the decision-making process across the Stages for smoking cessation. Instead of yielding the anticipated 8-factor structure, principal components analysis identified two orthogonal components. These two components were called the Pros and Cons of Smoking.

Stage of Change and Decisional Balance
The Stages of Change and Decisional Balance will be the focus of the subsequent studies in this program of study; therefore, special emphasis is given to these constructs.

Stages: The Central Organizing Construct
The Stages of Change are considered the central organizing construct because many of the theoretical assumptions and construct relationships are framed around this temporal process of change. Therefore, the role of the Stages of Change in the context of the Transtheoretical Model is important theoretically and practically.

Measuring Stage of Change
The Stages of Change can be measured in a variety of ways. These measures can be divided into two broad categories: continuous and discrete. Most typically, though not exclusively, continuous measures utilize clustering procedures to create Stage profiles. These approaches usually involve lengthier questionnaires than the discrete or algorithm procedures and include "clusters" of statements that represent different Stages, which are subsequently analyzed based on profiles using cluster analysis. Discrete Stage measures appear to be used more frequently by researchers, since they are more prominent in the literature. These procedures are guided by decision rules and typically examine behavioral intentions as well as actions regarding behavior change.

Discrete Stage Approaches
Discrete Stage measures are designed to unambiguously classify individuals into one of the defined Stages. The algorithms for the five Stages of Change are specific for each targeted behavior. In general they adhere to the following Stage concepts, though particular aspects of the staging (e.g., time frame, behavioral criteria) vary among studies. Participants are considered to be in the Precontemplation Stage if they report an undesired status, that is, the presence of a problem behavior or the lack of a healthy one, and express no intention of changing in the next six months. The Contemplation Stage indicates that the participant intends to change in the next six months. Participants in the Preparation Stage plan to change in the next month and have begun to engage in target behaviors, but have not yet met particular criteria. For well-studied behaviors these criteria are set according to gold standards in their given field. The Action Stage is attained once the participant reaches the given behavioral criteria. Lastly, if the participant has met the specified behavioral criteria for more than six months, they have reached the Maintenance Stage.
The stage concepts guide the staging algorithms but typically vary in time frame, behavior criteria, action criteria, and response format. Staging also can vary in clarity of behavioral operationalization, in complexity of target behavior and in consistency in stage description.
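The general Stage concepts described above can be sketched as a set of decision rules. The following is a minimal illustration only, not the algorithm of any particular study: the function name, the argument names, and the use of a six-month and 30-day time frame as fixed thresholds are assumptions made for the example, and actual staging algorithms vary in time frame, behavioral criteria, and response format.

```python
from enum import Enum


class Stage(Enum):
    PRECONTEMPLATION = 1
    CONTEMPLATION = 2
    PREPARATION = 3
    ACTION = 4
    MAINTENANCE = 5


def classify_stage(meets_criteria, months_at_criteria=0.0,
                   intends_within_6_months=False,
                   intends_within_30_days=False,
                   has_taken_steps=False):
    """Assign one discrete Stage from hypothetical questionnaire answers.

    All argument names are illustrative assumptions, not items from a
    published staging measure.
    """
    if meets_criteria:
        # Behavioral criteria met: duration separates Action from Maintenance.
        if months_at_criteria > 6:
            return Stage.MAINTENANCE
        return Stage.ACTION
    # Criteria not met: intention (and, here, a behavioral step) decides.
    if intends_within_30_days and has_taken_steps:
        return Stage.PREPARATION
    if intends_within_6_months or intends_within_30_days:
        return Stage.CONTEMPLATION
    return Stage.PRECONTEMPLATION


# Example: intends to change within 30 days and has begun target behaviors.
example = classify_stage(False, intends_within_30_days=True,
                         has_taken_steps=True)
print(example.name)  # PREPARATION
```

In this sketch a participant who intends to change within 30 days but has taken no steps is placed in Contemplation; that is one possible design choice, and algorithms that omit a behavioral criterion for Preparation would classify such a participant differently.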

Time Frame. Most, but not all, Stage algorithms include a time frame. The most typical time frames used in staging algorithms are: Precontemplation = no intention in the next 6 months; Contemplation = seriously considering in the next 6 months; Preparation = considering in the next 30 days; Action = changed behavior within the past 6 months; Maintenance = changed behavior for more than 6 months.
The use of the time frame across studies is inconsistent, however, and variations include: (1) time frame is not used at all (e.g., never having had a prior mammogram nor having high intentions to have one; Lauver et al., 2003); (2) time frame is only used in some Stages (e.g., Wyse, 1995); (3) staging includes additional time frame criteria (e.g., smokers who planned to quit in the next 30 days and had made an attempt to quit in the past 12 months; Etter et al., 1997); (4) staging includes alternative placements of time frame (e.g., Action = reported regular vigorous exercise for at least one month; Armstrong, 1993); and (5) staging includes "unstandard" time frames (e.g., intending to use CPAP in the next two weeks; Stepnowsky, 2002; and no screening mammogram 2-4 years ago, with one in the past 2 years, intends to have mammogram in future; Chamot et al., 2001). Further exploration is needed for such issues as the impact of the wording of the time frame (one month versus 30 days), possible patterns in the use of time frame, and the relevance of particular time frames to particular behaviors (e.g., 2 weeks = CPAP and 2 years = mammography).
Critics of the TTM consider the use of time frames (e.g., 6 months) "arbitrary" (Bandura, 1997; Sutton, 1996; Weinstein et al., 1998) and indicate that this is a contradiction to the idea that Stages differ based on qualitatively different attributes (Bandura, 1997; Weinstein et al., 1998). It is also thought that altering the time frame may alter the Stage distribution (Weinstein et al., 1998).
Behavioral Criteria. In some staging algorithms, Contemplation and Preparation are differentiated by the time frame criteria (seriously considering changing in the next six months versus the next month). In order to create a greater distinction between these two Stages a behavioral criterion is sometimes used. If a behavioral criterion is used, it is typically included in the Preparation Stage.
Behavioral criteria demonstrate an attempt by the individual to take action. For instance, in a recent study, Keller et al. (2000) defined the Preparation Stage with the typical intention statement, time frame and then set behavioral criteria. The Preparation statement reads: I intend to change my behavior in the next 30 days and in the last 6 months, have made steps to actively deal with the topic of good body posture, like reading a book or watching a TV program about it.
Action Criteria. "Action criteria" typically indicate the particular behavioral criteria necessary for a participant to reach the Action Stage. This seemingly simple criterion can be the source of much debate when developing or evaluating Stage measures. When a behavior is clear, discrete, and agreed upon by a given field, then setting a behavior criterion can be simple, for instance "quitting smoking". An action criterion of "reduction of cigarettes smoked" complicates the measurement of the behavior and behavior change. There is no clear consensus among professionals or in the field on how much reducing the number of cigarettes per day helps an individual. Choosing this criterion could minimize the value of the overall study outcomes or could introduce ethical concerns if in fact this strategy proves unhelpful or more harmful (e.g., as in the example of menthol and light cigarettes). Additional complications arise concerning the need for tracking or recall by participants. For instance, it is much easier to remember whether one did or did not smoke than to remember precisely the number of cigarettes smoked over the course of a day, a week or a month.
Utilizing gold standards in the field is often a good approach for setting an action criterion. But not all behaviors have established gold standards; for instance, areas such as diet (healthy eating) and weight loss face much debate regarding the best strategies. For example, in the area of weight loss, it is unclear what the best strategy is: should one eat a low fat diet or the Atkins diet? In the area of nutrition, should individuals eat 5, 7, 9, or 11 fruits and vegetables a day? As another example, in the treatment of sleep apnea it is unclear what treatment is best, how often the treatment should be given, or how long the treatment should last.
Response Format. Perhaps one of the most variable aspects of staging (aside from behavior) is the response format of the algorithms. Response formats include various combinations of characteristics, such as single-item vs. multiple-item format. The multitude of combinations introduces a significant concern for comparability of measures. Studies have shown that different response formats can differentially impact Stage distributions; for example, algorithms using longer, more complete definitions of exercise produced a larger number of participants in the earlier Stages, whereas a Likert scale format resulted in a lower percentage of participants in Precontemplation and Maintenance than a fixed format. Additionally, in one study (Reed et al., 1997), the algorithm that provided a long definition and measured vigorous exercise using a 5-choice format revealed a pattern that most closely matched the theoretical prescription of the Pros and Cons across Stages as well as the pattern for hours of exercise across Stages (i.e., hours increasing across Stages). In some cases similar results were found between response formats. For example, effect size comparisons found both long definitions measuring vigorous exercise using a 5-choice and true/false format equally effective across Pros, Cons, confidence, and hours of exercise. Lastly, true/false and choice formats were found to be comparable.

Continuous Stage Approaches
A variety of continuous measures have been developed to assess Stages of Change. The primary continuous Stage measures are considered clustering approaches (e.g., URICA, Socrates). These staging procedures do not offer consistent methods and staging (as described below) and therefore will not be included in the dissertation project analyses. Other times the URICA (and similar approaches) includes alternative categories called "non-reflective action", "taking steps", "recognition", "ambivalent", "uninvolved/discouraged" or "participation". The numbers of categories or clusters also vary, including 5- (DiClemente & Hughes, 1990), 8- (McConnaughy et al., 1983) and 18-cluster solutions (McConnaughy et al., 1989) found in some studies.
Scale Construction Problems. A review by Littell and Girvin (2002) summarizes a variety of problems seen in cluster-type staging. Examples of scale construction issues include: (1) items in each scale are scored in the same direction, increasing the likelihood of response sets; (2) many double-barreled items are included in the scales; (3) the scales use many double negative items; (4) some items are awkwardly worded (e.g., "I'm not following through with what I had already changed as well as I had hoped, and I'm here to prevent a relapse of a problem"; Jefferson, 1991); (5) uncommonly used phrases are included in items; (6) overlap among items causes possible inflation of internal consistency measures; and (7) generalizability is potentially limited since the scales were normed on middle-class, Caucasian participants.
Scoring issues. The cluster-type measures utilize a variety of methods for "scoring" or identifying Stages. The simplest method involves identifying the highest raw score or standardized score of a given scale and classifying an individual into a stage based on that scale. One complication of this method is that more than one scale can be tied. Strategies for handling ties include: (1) placing the participant in the more advanced stage; (2) identifying ties as a new stage; or (3) a combination of approaches (Heather, Rollnick & Bell, 1993). These strategies are rather arbitrary and clearly result in different stage distributions (Littell & Girvin, 2002). Prochaska and DiClemente (1998) indicate that regardless of an individual's Stage of Change, participants may experience attitudes described across the stage-designated items; therefore, more complex cluster approaches are likely to be the best method for interpreting the continuous instruments. Currently, however, there is no definitive answer as to which of these procedures is best.
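The highest-score method and the first two tie strategies described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the four scale names, their ordering from least to most advanced, and the function and parameter names are hypothetical and are not taken from any published scoring manual.

```python
# Assumed ordering of scales from least to most advanced (hypothetical).
SCALE_ORDER = ["Precontemplation", "Contemplation", "Action", "Maintenance"]


def classify_by_highest_score(scores, tie_rule="advanced"):
    """Classify a participant by the highest scale score.

    scores   : dict mapping scale name -> raw or standardized score
    tie_rule : "advanced"  -> place in the more advanced tied stage
               "new_stage" -> label the tie as its own combined stage
    """
    known = [s for s in SCALE_ORDER if s in scores]
    if not known:
        raise ValueError("no recognized scale scores supplied")
    top = max(scores[s] for s in known)
    tied = [s for s in known if scores[s] == top]
    if len(tied) == 1:
        return tied[0]
    if tie_rule == "advanced":
        return tied[-1]          # last in SCALE_ORDER is most advanced
    if tie_rule == "new_stage":
        return "/".join(tied)    # e.g. "Contemplation/Action"
    raise ValueError(f"unknown tie_rule: {tie_rule!r}")


# Illustrative scores (not data from any study) with a two-way tie.
scores = {"Precontemplation": 40, "Contemplation": 55,
          "Action": 55, "Maintenance": 30}
print(classify_by_highest_score(scores, "advanced"))   # Action
print(classify_by_highest_score(scores, "new_stage"))  # Contemplation/Action
```

The same tied participant lands in a different category under each rule, which is how the choice of tie strategy alone can change the resulting stage distribution.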

Stage of Change: Continuous or Discrete?
In the current studies only discrete measures of Stage of Change will be used. This will eliminate the methodological issues discussed above regarding the continuous clustering measures of Stage of Change and will increase consistency across Stages as much as possible. Additionally, there has been an ongoing debate regarding the theoretical conceptualization of stage as continuous versus discrete. Regardless of the outcome of this debate, and regardless of when or if stage is measured on a continuous scale, discrete stages are needed in practice in order to tailor interventions.

Decisional Balance: Pros and Cons of Change
Following the original study, the use of the construct began to expand, encompassing an array of behaviors such as exercise, condom use, and mammography screening. This early work culminated in a paper that looked at patterns in Decisional Balance across Stages in 12 behaviors. This integrative study investigated: 1) the generalizability of the TTM for Stage and Decisional Balance across behaviors; 2) the generalizability of the TTM for a variety of populations; 3) the number of components in Decisional Balance and their respective internal consistencies; 4) patterns of Pros and Cons across Stages; and 5) the Stage of crossover between standardized Pros and Cons scores. More recently, a meta-analysis examining 37 behaviors utilizing 81 datasets including nearly 40,000 participants was conducted, re-examining these relationships and exploring additional ones. Both studies found clear support for the generalizability of the Stages of Change, the Pros and Cons, and the integration between them. The researchers additionally found that these constructs generalized across a variety of populations and behaviors.

Structure of Decisional Balance
Consistent with the two-factor structure identified by Velicer et al. (1985), and

Functional Aspects of Decisional Balance
Decisional Balance Crossovers. In Kurt Lewin's (1948) expectancy theory, it is postulated that behavior changes as a function of the increases and decreases in motivation to contemplate gains and losses. The TTM builds on this notion by suggesting a clear directionality to the function as well as a characteristic way of examining it. This function is based on the relationship of when and how much the Pros increase and the Cons decrease, and is examined by identifying a graphical crossover of the Pros and Cons as they change relative to each other. It is believed that the interaction of the Pros and Cons may increase or decrease the likelihood of a cognitive conflict between the benefits and disadvantages of a particular behavior. One possible interaction occurs when a person believes there are greater personal gains and fewer disadvantages associated with a behavior. It is hypothesized that at this time, the Pros increase and the Cons decrease, making it more likely that an individual will change their behavior. The crossover indicates an equal weighting in a person's cognitive framing of the gains and losses of engaging in a particular behavior. This equal weighting can create a state of ambivalence towards change.
This crossover therefore is a theoretically important marker for the success of the functional relationship between the Stages and Decisional Balance.  and  found that the Decisional Balance crossovers occurred during the Contemplation Stage for 58% and 53% of the studies, respectively. Based on the timing of the crossover, the researchers suggest that progress from Precontemplation to Contemplation involves an increase in Pros whereas progress from Contemplation to Action involves a decrease in Cons.

Patterns across Stage. Several additional patterns in the relationship between Stage of Change and Decisional Balance were found in the two studies. For example, the Cons of Changing were higher than the Pros of Changing in the Precontemplation Stage for all datasets in both studies. Prochaska found the opposite true for 11 out of 12 behaviors in the Action Stage.

Decisional Balance (Pros and Cons) serves as an intermediate indicator of when change will occur and has generally been thought to be especially salient in the earlier Stages of Change.  also illustrated that the relationship between the Stages and Decisional Balance for an unhealthy behavior is different than for a healthy behavior. Specifically, the pattern for an unhealthy behavior was such that the Cons decreased across the Stages whereas the Pros displayed a curvilinear pattern, which paralleled the decline of the Cons in the later Stages. In contrast, the healthy behavior showed more of an X configuration, with the Pros continuing to increase across the Stages whereas the Cons decrease across the Stages.

Strong and Weak Principles. Across twelve studies, mathematical relationships were found between the Pros and Cons of Changing and progress across the early Stages into Action. The Strong Principle of Change states that PC → A ≈ 1 SD ↑ PROS: progress from Precontemplation to Action involves approximately one standard deviation increase in the Pros of Changing. The Weak Principle of Change states that PC → A ≈ 0.5 SD ↓ CONS: progress from Precontemplation to Action involves approximately 0.5 SD decrease in the Cons of Changing.
In re-examination of the Strong and Weak Principles, the magnitude of the maximum increase in the Pros of change was again found to be greater than the maximum decrease in the Cons of change from Precontemplation to Action across 37 different health behaviors. Consistent with Prochaska's (1994) Strong Principle, the average effect size for the Pros was approximately one standard deviation (d = 1.05, SD = 0.45), almost identical to the original finding (d = 1.06, SD = 0.26). The re-examination also revealed that Prochaska's Weak Principle may not be so weak. That is, the average effect size for the Cons was stronger (d = 0.62, SD = 0.38) than was found in the previous study (d = 0.45, SD = 0.22), though clearly the Cons remain weak relative to the Pros. A practical implication of these principles is that the Pros of Changing must increase twice as much as the Cons must decrease, suggesting that an intervention place twice as much emphasis on raising the benefits as on reducing the costs or barriers.
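To make the effect size comparison concrete, the following is a minimal sketch of how a standardized mean difference between the Precontemplation and Action Stages could be computed, assuming Cohen's d with a pooled standard deviation. The function name `cohens_d` and all scores are hypothetical and invented for illustration; they are not data from the studies reviewed here.

```python
import math
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Standardized mean difference (group_b minus group_a) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2
                  + (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / math.sqrt(pooled_var)

# Hypothetical raw scale scores (illustration only).
pros_pc = [20, 22, 25, 27, 30]       # Pros, Precontemplation
pros_action = [25, 27, 30, 32, 35]   # Pros, Action
cons_pc = [30, 32, 35, 37, 40]       # Cons, Precontemplation
cons_action = [28, 30, 33, 35, 38]   # Cons, Action

d_pros = cohens_d(pros_pc, pros_action)
d_cons = cohens_d(cons_pc, cons_action)
print(f"Pros: d = {d_pros:.2f}, Cons: d = {d_cons:.2f}")
```

Under the Strong and Weak Principles, one would expect values in the neighborhood of +1.0 for the Pros and -0.5 for the Cons.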

Stage Transitions
The Strong and Weak Principles are useful for understanding the amount of work generally needed to move from Precontemplation to Action. Although theoretically the principles have been important in conceptualizing and understanding the relationship between Decisional Balance and Stages of Change, these principles are essentially action-oriented when applied to interventions. That is, by examining characteristics of the transition only from Precontemplation to Action, and using these to tailor interventions, one is potentially neglecting three Stages: program of research directly answers to this important goal by testing several theoretical assumptions from a health behavior theory using innovative and unique methods in order to make recommendations for revising or rejecting parts of the theory. Furthermore, at this workshop, Weinstein indicated that "all current theories of health behavior have significant limitations" and that "progress in improving theories that explain or encourage health behavior has been slow" (National Cancer Institute: Division of Cancer Control and Population Sciences, 2002). This program of research seeks to speed that process. Weinstein continues by explaining, "The scientific progress, in which theories are tested, weaknesses exposed, inadequate theories rejected, and better theories arise to take their place has not been taking place." This program of research intends to advance this scientific progress.

Integrative Approaches
There are a variety of approaches that can be employed to test theory. In fact, most empirical studies attempt to test some theory at least in part. Powerful approaches to assess theoretical models involve integrative approaches. These are approaches which attempt to combine the data of previously conducted studies, such as combining longitudinal secondary data or employing meta-analytic strategies.

Meta-Analysis for Theory Testing
One particularly effective integrative tool is meta-analysis. Researchers encourage the use of meta-analysis for informing theory as well as testing and advancing theory. This is in part because meta-analyses have the power to test large sets of explanatory mechanisms by examining moderators more broadly (Marsh et al., 2001) than either traditional qualitative reviews or individual empirical studies. Therefore, meta-analyses have the potential to more comprehensively examine aspects of theories that are oftentimes not testable due to practical constraints. Researchers have begun utilizing traditional meta-analytic techniques as a basis of theory testing (e.g., Marsh et al., 2001).
Meta-analyses can be used for theory testing in varying degrees. A simple meta-analysis combines the results of previous studies, examining a single question, the same question asked by previous researchers. This type of meta-analysis is considered the lowest level and provides the least theoretical advance. That is, the simple meta-analysis can be an essential tool for clarifying inconsistencies in the literature, but typically introduces no new ideas and offers little additional information to the field. A more sophisticated method of employing meta-analysis for theory testing is to build on the aforementioned method by aggregating the effect sizes from the literature and subsequently assessing homogeneity. If heterogeneity is discovered, searches for moderators become possible. (It is also possible to specify moderator tests a priori, which removes the need to find heterogeneity first.) The introduction of moderator analyses enables the possibility of generating new theoretical evidence. Lastly, the most important method of conducting meta-analysis for theory testing involves testing theoretical relationships or concepts not previously tested in primary-level studies. This creative use of previously conducted studies offers the highest potential for theoretical advance. This dissertation seeks to conduct this highest-level meta-analysis.
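The homogeneity assessment described above can be sketched as follows. This is a minimal illustration that assumes the conventional inverse-variance weighted Q statistic (Cochran's Q) and the I² index; the text does not state which homogeneity test is intended, and the function name, effect sizes, and variances are hypothetical.

```python
def cochran_q(effects, variances):
    """Return (Q, df, I-squared in percent) for a set of study effect sizes."""
    weights = [1.0 / v for v in variances]            # inverse-variance weights
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributable to heterogeneity.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared

# Hypothetical effect sizes (d) and sampling variances for three studies.
effects = [0.6, 1.0, 1.4]
variances = [0.04, 0.04, 0.04]

q, df, i_squared = cochran_q(effects, variances)
print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared:.0f}%")
```

Q is referred to a chi-square distribution with k - 1 degrees of freedom. For 2 degrees of freedom the .05 critical value is 5.99, so the hypothetical studies above would be judged heterogeneous and a moderator search would be warranted.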
Additionally, meta-analysis can be used to clarify conflicting results due to particular failings of null hypothesis testing, especially with single studies. The use of effect sizes can provide important information about effects that hypothesis testing does not offer, namely, the magnitude of an effect. Confidence intervals can provide a means to test statistical significance in meta-analysis and become more precise with each study added to a meta-analysis. Moreover, meta-analysis has typically been used as a method to simply combine results of studies based on null hypothesis testing. More sophisticated uses of meta-analysis can go beyond the simplicity of null hypothesis testing to identify results not previously reported in the literature, explore moderators of effect that may not be able to be examined in a single study, and even provide a tool for advancing theory. This program of research aims to provide results that overcome many of these hypothesis testing failings by synthesizing data from nearly 150 studies utilizing effect sizes and confidence intervals. Additionally, this meta-analysis aims to go beyond the simplicity of hypothesis testing by exploring new relationships in the data and exploring moderators with the goal of advancing theory.
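The claim that confidence intervals tighten as studies accumulate can be illustrated with a short sketch. It assumes a fixed-effect, inverse-variance weighted pooled estimate with a normal-theory 95% interval, which is one common choice rather than necessarily the procedure used in this program of research. The function name and all numbers are hypothetical.

```python
import math

def pooled_estimate(effects, variances, z_crit=1.96):
    """Fixed-effect pooled effect size with a normal-theory confidence interval."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))   # standard error shrinks as weights accumulate
    return pooled, pooled - z_crit * se, pooled + z_crit * se

# Hypothetical effect sizes and sampling variances for four studies.
effects = [1.10, 0.90, 1.00, 1.20]
variances = [0.05, 0.04, 0.06, 0.05]

widths = []
for k in range(1, len(effects) + 1):
    estimate, lower, upper = pooled_estimate(effects[:k], variances[:k])
    widths.append(upper - lower)
    print(f"k = {k}: d = {estimate:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Because every added study contributes a positive weight, the interval width in this model can only decrease as k grows.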

Importance of Comprehensive and Systematic Review Strategies
Although seemingly obvious, careful implementation of integration strategies is essential to the success of qualitative and quantitative reviews. Comprehensive and systematic review strategies have not often been found in previous reviews examining the TTM. For instance, hundreds of studies related to behavior change, and more specifically health-related behaviors, have been conducted using the TTM, yet no study has integrated this vast body of literature. A small number of studies have attempted to review the TTM literature, but these studies have either: (1) limited their review to one behavior (e.g., Horowitz, 2003; Horwath, 1999; Marshall & Biddle, 2001; Spencer, Pagell, Hellion & Abrams, 2002); (2) qualitatively compiled studies (e.g., Horowitz, 2003; Horwath, 1999; Spencer et al., 2002); or (3) failed to systematically or comprehensively gather and analyze all available data (Spencer et al., 2002).
Many meaningful relationships are lost in reviews that limit analysis to one behavior. Efforts in health promotion and disease prevention can best be served by understanding the similarities and differences in behavior change factors across behaviors. The advent of recent efforts to focus on multiple behavior change, where constellations of related (or unrelated) behaviors are treated simultaneously, makes understanding the inter- and intra-relationships of behavior change factors increasingly critical for developing effective interventions.

Most literature reviews examining the TTM have focused on overall intervention effects or study design and methodology rather than quantitatively examining process-to-outcome relationships that can aid in improving intervention development. Additionally, the role of sample size and power has not always been fully appreciated as a factor in these reviews. Sample size is often a source of discrepancy in the results of outcome studies, especially as the effect size for most such interventions tends to be fairly small, particularly for population- or community-based interventions. A meta-analytic review of the literature is especially well suited to evaluate the role of sample size, effect size, and power in the generation of discrepant study outcomes.
Inadequate review strategies can produce misleading results. For example, TTM tailoring is usually incorporated into interventions in one of three ways: 1) stage only: providing feedback specific only to an individual's SOC; 2) partial tailoring: providing individual feedback on SOC, Decisional Balance, and/or Self-efficacy; and 3) full tailoring: providing individual feedback on all TTM constructs, including SOC, Decisional Balance, Self-efficacy, and processes of change. In a qualitative review of smoking, Spencer et al. (2002) identified 22 studies evaluating TTM tailored or stage-matched interventions. The researchers concluded that the body of literature on stage-matched interventions was "acceptable" but not "conclusive." Upon closer examination, it appears that the authors did not perform the review in a careful, systematic way. New and informative patterns were revealed by simply re-grouping the studies by the type of tailoring: 13 of the studies used stage only tailoring, 4 of which had positive results (Leed-Kelly, Russell, Bobo, 1996; Valanis et al., 2001; Wang, 1994), 5 of which had negative results (Lancaster et al., 1999; Lenox et al., 1998; Steptoe et al., 2001), and 4 of which were unclear (Bernstein & Stoduto, 1999; Goldberg et al., 1994). In contrast, the five partial tailoring studies produced positive results on a 3 to 2 ratio. The positive studies included Coleman-Wallace et al. (1999) and Dijkstra et al. (1998), and the negative studies included Dijkstra et al. (1998).
Lastly, the five studies using full TTM tailoring produced positive results on a 4 to 1 ratio. The four positive studies were  and the negative study was . In sum, partial tailoring studies were 2 times more likely and full tailoring studies were 2.7 times more likely to find significant treatment effects than stage only intervention studies.
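The "times more likely" comparison can be reproduced from the study counts given above. The sketch below assumes the comparison is a simple ratio of the proportions of studies with positive results; the variable names are illustrative.

```python
from fractions import Fraction

# (studies with positive results, total studies), as reported in the re-grouping above.
results = {"stage only": (4, 13), "partial": (3, 5), "full": (4, 5)}

def success_rate(kind):
    """Proportion of studies of a given tailoring type with positive results."""
    positive, total = results[kind]
    return Fraction(positive, total)

baseline = success_rate("stage only")
ratios = {kind: success_rate(kind) / baseline for kind in ("partial", "full")}

for kind, ratio in ratios.items():
    print(f"{kind} tailoring: {float(ratio):.2f} times the stage only rate")
```

Computed exactly, the ratios are 1.95 and 2.60. The reported figure of 2.7 for full tailoring would follow if the stage only success rate were first rounded to 30% (80% / 30% ≈ 2.67). With so few studies per group, such ratios are best read as descriptive rather than precise.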
Additionally, in Spencer et al.'s (2002) review, length of follow-up was not accounted for when comparing studies. The five partial tailoring studies were limited to shorter-term effects, with follow-ups ranging from only 10 weeks to 6 months, while the five full tailoring studies demonstrated more sustained effects, with follow-ups ranging from 18 to 24 months. Also, none of the partial tailoring studies involved population cessation, while four of the full tailoring studies were population-based. Hence, it appears the more demanding studies (e.g., full tailoring, long follow-ups, and population-based) produced the highest rates of significant results and had the highest impacts on smoking. This example suggests that the variations in treatment effectiveness are possibly due to systematic differences in study and intervention designs. More specifically, variation in use, application, and measurement of the TTM may significantly contribute to the discrepancies in assessment of the overall effectiveness of TTM-based tailoring. In a meta-analytic review, these factors would be coded and analyzed to preclude such vague, contradictory, and misleading results. It is clear that more care and rigor are needed when compiling such rich data and that systematic analysis is critical to discovering such essential findings regarding key factors of successful interventions.
The use of existing datasets for secondary data analysis is often thought of as a simple procedure, but this is not necessarily the case. For instance, both Spencer et al. (2002) and  performed a "comprehensive" review of smoking studies utilizing the TTM. Spencer et al. (2002) identified 22 stage-matched intervention studies and  identified 23 such studies over similar time frames. Amazingly, only 7 of the studies were the same. Inadequate or selective data collection inherently leads to bias. Therefore, it is clear that sufficient time and resources need to be devoted to systematic data collection in order to attain a minimally biased and maximally complete set of datasets for a meta-analytic review.

Integrative and Meta-analytic Approaches to Testing the TTM
In addition to reviews by Spencer et al. (2002) and  a variety of studies have used integrative approaches to examine the TTM; some of these are highlighted below.

Predicting Behavior Change
A powerful question that drives psychology is: can we predict behavior? And if so, how? The Transtheoretical Model explains that individuals move through a series of Stages as they prepare to change, and eventually succeed in changing, a given behavior. In this way the model can predict behavior. Two ways to explore these predictions are to examine the relationship of the constructs to stage movements and to stage effects.
Stage Movement. In the context of the Transtheoretical Model, the concept that is primarily being predicted is stage movement. That is, can the model help predict the movement of individuals from one stage to the next? As described above, the model also predicts that particular constructs serve as indicators for change and play a part in predicting stage movement. For instance, one may ask: do the processes of change predict movement from Action to Maintenance? More specifically, since experiential processes have been shown to be more salient in the earlier Stages and the behavioral processes more salient in the later Stages, we might refine that question to: do the behavioral processes predict stage movement from Action to Maintenance? These predictive questions have been at the heart of recent debates. Overall, the model's ability to make these predictions has been demonstrated cross-sectionally, cross-sequentially, and longitudinally. Although it is beyond the scope of this essay to review the body of literature on the predictive nature of the model, one particular integrative method, examining stage effects, will be highlighted. In sum, it is clear that the Stages of Change play the key role in understanding this aspect of the model.

Stage Effects. Examining stage effects is another means to explore the predictive nature of the model, and more importantly the role of stage in the context of the model. Stage effects involve the prediction of a person's behavior over time depending on which stage the individual is in at baseline. More specifically, a person further along in the Stages is predicted to be more likely to move to Action or Maintenance than a person in the earlier Stages. Overall, stage effects across behaviors such as sun, smoking, and diet have been supported (Krebs, 2003; Prochaska et al., 2003). This suggests that persons in Precontemplation were less likely to take Action than those in Contemplation, who were, in turn, less likely to take Action than those in Preparation. An important implication of these findings is that moving forward one stage can increase the adoption of the target behavior. For instance, recent research has demonstrated that stage movement from Precontemplation to Contemplation increased the chances that smokers quit by 75%. Again, we see the importance of the Stages of Change in the context of the model.

Moderators of Stage Distributions
In a preliminary review of the literature, a variety of moderators of stage distributions have been found. Examples include study characteristics and sample characteristics as well as behavior-specific variables. More specifically, these moderators have included variables such as behavior, age, gender, setting, income, education, race, ethnicity, country, recruitment method, response format, and activity criteria. For the sake of brevity, several of the main moderating variables will be discussed in more detail below.
Behavior. Overall, stage distribution is typically examined across behaviors in the context of other potentially moderating variables and therefore will be touched upon in most other sections. The only review identified in this preliminary examination of stage distributions that specifically compared stage distributions across behaviors was a study conducted by Nigg, Burbank, Padula, Dufresne, Rossi, Velicer, LaForge, & Prochaska (1999). This study examined Stage distributions across ten health behaviors in older adults (many of its findings are highlighted in the age section). In the broadest scope, the researchers discovered that the highest percentages of individuals across the age groups were in either the Precontemplation or Maintenance Stages across all behaviors. This finding indicates that, for all behaviors, getting people to think about change may be the most important place to start in population-based interventions.
When examining across behaviors, we can assess trends in readiness to change, adoption/cessation of a behavior, and maintenance, and then use these trends to help determine where the greatest emphasis should be placed for health interventions. One example might be utilizing stage distribution information to determine the order in which behaviors will be intervened upon in a multiple behavior intervention. An intervention targeting a large population may want to begin with a behavior where the most individuals are ready to change. Successful change in one behavior may increase participant confidence and potentially lead to greater success for subsequent behaviors. If this idea is valid, accomplishing this may best be done by tailoring the order of behaviors at the individual level. But individual tailoring to this extent may not be possible or practical. For instance, many multi-level multibehavioral interventions include group-wide (e.g., school or workplace) intervention activities or campaigns for a given behavior. In this case, it likely would be best to simultaneously work on the same behavior at the group level and individual level.

Age. In a meta-analytic study examining exercise, Marshall and Biddle (2001) found that the youngest group (<25) had the lowest percentage of individuals in Precontemplation and the highest proportion in Preparation. In contrast, the oldest group (55+) had the highest proportion in the extreme Stages, Precontemplation and Maintenance. Velicer, Fava, Prochaska, Abrams, Emmons, and Pierce (1995) found stable patterns in stage distributions across three samples of smokers for three of four age categories. The age categories consisted of 18-24 years, 24-44 years, 45-64 years, and 64+ years. Of these, the oldest category, 64+ years, was consistently different from the other three categories. Nigg et al. (1999) looked specifically at stage distributions in various older samples. Differential distributions were identified for these age groups, especially in comparison to distributions in younger adults. Some interesting patterns were identified across behaviors for age. For instance, the prevalence of maintaining a low-fat and high-fiber diet increased as age increased, whereas for exercise, distributions increasingly stratified to the extremes as age increased (consistent with findings by Marshall & Biddle, 2001). Overall, of the 10 behaviors, all but two showed higher prevalence of individuals maintaining behaviors in one of the two oldest age groups (65-74 and 75+). This may be due to the fact that individuals who maintain healthy behaviors in fact live longer. Therefore, it seems this level of specificity is important for guiding age-related interventions, especially with regard to the older end of the spectrum. For instance, an intervention developed for an assisted living facility would best be informed with data specific to this age group rather than with data from the general population.
Study Characteristics. Marshall and Biddle (2001) conducted a study examining stage distributions for exercise in a variety of different samples. Their results indicate that stage distributions were influenced by factors such as sampling method (random/nonrandom) and recruitment strategy (active/passive). More specifically, passive recruitment and nonrandom sampling strategies resulted in a higher percentage of participants in the later Stages, whereas active recruitment and random sampling strategies resulted in higher percentages of participants in the earlier Stages. A Likert scale format for staging resulted in a lower percentage of participants in Precontemplation and Maintenance than a fixed format.
Ultimately, it is important to be able to know how a particular study design impacts stage distributions to enable researchers to more accurately interpret findings of a particular study or more adequately design a new study.

Gaps. Overall, more studies examining stage distributions are warranted. Specifically, a comprehensive review examining multiple moderators across a variety of behaviors would be appropriate. A variety of limitations in previously conducted studies have left gaps in the exploration of stage distributions. For example, these studies have either: (1) limited their review to one behavior (e.g., Marshall & Biddle, 2001; Velicer et al., 1995), one age group, or one specific staging algorithm (Velicer et al., 1995); (2) not included all Stages; (3) been unable to explore the influence of study design (Velicer et al., 1995); (4) examined only US samples (Velicer et al., 1995); or (5) failed to account for multiple operationism (Marshall & Biddle, 2001).

Hall and Rossi (2004a) re-examined Prochaska's (1994) Strong and Weak Principles for Decisional Balance by updating the number of studies and expanding the types of analyses conducted. Hall and Rossi (2004a) investigated the crossover profile patterns and effect sizes for Decisional Balance across the Stages of Change for 37 different health behaviors in 81 independent datasets including nearly 40,000 participants. Behaviors such as exercise, smoking, and condom use were represented by more than 10 datasets each, enabling preliminary exploration of each of these behaviors in more detail.

Strong and Weak Principles
Patterns consistent with those reported by  were found for Pros and Cons; for instance, in all studies the Cons of Changing outweighed the Pros of Changing during the Precontemplation stage. The "crossover" pattern of the Pros and Cons appears when the standardized Decisional Balance scores are graphed by stage. Conceptually, the crossover point is thought to represent the ambivalence individuals feel when they begin to seriously consider changing their behavior.  and Hall and Rossi (2004a) found that the Decisional Balance crossovers occurred during the Contemplation stage for 58% and 53% of the studies, respectively. However, although the crossover is often thought of as a simple function, Hall and Rossi (2004a) discovered complexities in the crossover patterns. For example, 16% of the datasets revealed multiple crossovers of the Pros and Cons, suggesting that further examinations are needed. One methodological consideration of the crossover function is that many of the studies found very unequal numbers of participants across the Stages of Change. Unequal sample size distributions affect the standardized score estimates of the Pros and Cons and tend to distort the crossover profile pattern, typically by "pulling" the crossover point towards either Precontemplation or Maintenance, since it is usually one of these two Stages that contains disproportionately greater numbers of participants. Recent work by Hall and Rossi (2004b) has begun the development of strategies to standardize these unequal sample size distributions, which will allow a "truer" pattern to be revealed, enabling more accurate testing of theory and shedding more light on the meaning of the crossovers and what influences them.
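The crossover analysis and its sensitivity to unequal stage sizes can be sketched in a few lines. The example assumes that standardization means converting stage means to T-scores (M = 50, SD = 10) against the full-sample mean and standard deviation, and it defines a crossover as a sign change in the Pros minus Cons difference between adjacent Stages. The stage abbreviations, function names, and all scores are hypothetical; this is not the correction procedure of Hall and Rossi (2004b).

```python
from statistics import mean, pstdev

STAGES = ["PC", "C", "PR", "A", "M"]  # Precontemplation ... Maintenance

def stage_t_scores(scores_by_stage):
    """Convert each stage mean to a T-score using the full-sample mean and SD."""
    all_scores = [s for stage in STAGES for s in scores_by_stage[stage]]
    grand_mean, grand_sd = mean(all_scores), pstdev(all_scores)
    return {stage: 50 + 10 * (mean(scores_by_stage[stage]) - grand_mean) / grand_sd
            for stage in STAGES}

def find_crossovers(pros_t, cons_t):
    """Adjacent stage pairs between which the sign of (Pros - Cons) flips."""
    diffs = [pros_t[s] - cons_t[s] for s in STAGES]
    return [(STAGES[i], STAGES[i + 1])
            for i in range(len(STAGES) - 1) if diffs[i] * diffs[i + 1] < 0]

def oversample_pc(scores, times=3):
    """Mimic a sample dominated by Precontemplators by repeating the PC scores."""
    return {**scores, "PC": scores["PC"] * times}

# Hypothetical raw scores, three participants per stage.
pros = {"PC": [10, 12, 14], "C": [14, 16, 18], "PR": [16, 18, 20],
        "A": [18, 20, 22], "M": [18, 20, 22]}
cons = {"PC": [18, 20, 22], "C": [16, 18, 20], "PR": [14, 16, 18],
        "A": [12, 14, 16], "M": [10, 12, 14]}

equal_pros, equal_cons = stage_t_scores(pros), stage_t_scores(cons)
skew_pros, skew_cons = stage_t_scores(oversample_pc(pros)), stage_t_scores(oversample_pc(cons))

print("Crossovers (equal n):", find_crossovers(equal_pros, equal_cons))
print("Pros - Cons at C, equal n:  ", round(equal_pros["C"] - equal_cons["C"], 2))
print("Pros - Cons at C, unequal n:", round(skew_pros["C"] - skew_cons["C"], 2))
```

In this hypothetical example the stage means are identical in both runs, yet tripling the Precontemplation group narrows the standardized Pros and Cons gap at Contemplation, which moves the graphed crossover towards the overrepresented stage. Because `find_crossovers` returns every sign change, it would also flag datasets with multiple crossovers.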
In re-examining the strong and weak principles, the magnitude of the maximum increase in the Pros of Change was again found to be greater than the maximum decrease in the Cons of Change from Precontemplation to Action (Hall & Rossi, 2004a). Consistent with  strong principle, the average effect size for the pros was approximately one standard deviation (d = 1.05, SD = 0.45), almost identical to    (1994), though clearly the cons remain weak relative to the pros.
Heterogeneity of the distribution of effect size across all datasets was found, which signifies the presence of moderators of effect size. Preliminary exploration of the moderators was conducted, beginning with behavior. Behaviors represented by more than 10 datasets were compared; four behaviors met this criterion (smoking, exercise, diet, condom use). Significant differences between behaviors were found for both Pros and Cons of change. The magnitude of effect for pros of exercise was significantly greater than for smoking, condom use, and diet. The effect size for cons of smoking was significantly greater than for diet and condom use. Heterogeneity within each behavior was also significant; therefore, additional moderators were explored.
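A categorical moderator analysis of this kind is commonly carried out by partitioning the total heterogeneity statistic into between-group and within-group components. The sketch below assumes that fixed-effect partition (Q total = Q between + Q within), which may differ from the analysis actually run; the function names, behavior groupings, and numbers are hypothetical.

```python
def q_statistic(effects, variances):
    """Inverse-variance weighted heterogeneity statistic Q for one set of studies."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    return sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))

def partition_q(groups):
    """Split total Q into (Q_between, Q_within, df_between) for a categorical moderator."""
    all_effects = [d for effects, _ in groups.values() for d in effects]
    all_variances = [v for _, variances in groups.values() for v in variances]
    q_total = q_statistic(all_effects, all_variances)
    q_within = sum(q_statistic(e, v) for e, v in groups.values())
    return q_total - q_within, q_within, len(groups) - 1

# Hypothetical Pros effect sizes and sampling variances, grouped by behavior.
groups = {
    "exercise": ([1.3, 1.5], [0.02, 0.02]),
    "smoking": ([0.9, 1.1], [0.02, 0.02]),
}

q_between, q_within, df_between = partition_q(groups)
print(f"Q_between = {q_between:.2f} on {df_between} df, Q_within = {q_within:.2f}")
```

Q between is referred to a chi-square distribution with (number of groups - 1) degrees of freedom; the .05 critical value for 1 degree of freedom is 3.84. A significant Q within, as reported above, signals that further moderators remain to be explored inside each behavior.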
Studies were characterized as either focusing on the cessation or acquisition of a behavior. Analyses revealed that cessation and acquisition behaviors did not differ significantly. One explanation could be that, conceptually, the distinction between cessation and acquisition may not be as straightforward as it initially appears. For instance, with a behavior like exercise, the emphasis may be on increasing or encouraging a person to exercise regularly, but this also means the person ultimately must decrease their sedentary behavior. Therefore, although measures of Decisional Balance and other TTM constructs may focus on the acquisition of a particular behavior, achieving acquisition may involve the cessation of other behaviors as well. More elaborate coding methodology may be necessary in order to differentiate more accurately the cessation and acquisition characteristics of the behaviors.
"Framing" Decisional Balance in the context of a healthy versus an unhealthy behavior also showed no significant relationship to effect size (e.g., pros of quitting smoking versus the pros of smoking). The number of items in the Decisional Balance scale was also unrelated to effect size. Effect size was related to gender for cons but not pros. One implication of this finding is that expert system scoring algorithms might need to be different for men and women, at least for the Cons of Change.
Experimenter bias was also examined as a potential moderator of effect size. All 81 datasets were divided into three groups: studies performed by investigators at the Cancer Prevention Research Center (CPRC) at the University of Rhode Island (the originators of the Stages of Change and Decisional Balance measures); studies performed by CPRC-trained researchers but conducted outside the center; and studies conducted by researchers with no formal CPRC affiliation. Bias was assessed by examining effect sizes rather than significance levels (p-values). This method provides a more sensitive context within which to investigate bias since, unlike significance tests, these indices are independent of sample size. So far as we know, this was the first time experimenter bias has been evaluated in this way. Analyses showed that there were no significant differences in effect sizes between the three groups. These findings suggest that the effect size results are not biased towards the researchers who originally developed the Decisional Balance measure and that the measures can be modified and applied by a variety of researchers with equal effectiveness.
The main limitation of this study was that all of the datasets were cross-sectional. We cannot assume the same relationships will necessarily occur when examining the movement of participants from Precontemplation to Action with longitudinal data; therefore, it is important that these results be replicated using longitudinal data. Additionally, longitudinal data allow for more elaborate exploration of moderators and subsamples, which can increase the richness of the results. The analyses of potential moderators in this study were conducted using correlational and univariate statistics. More sensitive tools such as random effects modeling need to be utilized to advance such research.
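As a rough illustration of what random effects modeling adds, the sketch below uses the DerSimonian-Laird moment estimator of the between-study variance. This estimator is an assumption chosen for its simplicity; the text does not name a specific random effects procedure, and the function name and numbers are hypothetical. The function needs at least two studies.

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau-squared."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                    # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]   # weights include tau-squared
    pooled = sum(wi * d for wi, d in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical effect sizes and sampling variances for three studies.
effects = [0.5, 1.0, 1.5]
variances = [0.04, 0.04, 0.04]

pooled, se_random, tau2 = dersimonian_laird(effects, variances)
se_fixed = math.sqrt(1.0 / sum(1.0 / v for v in variances))
print(f"tau^2 = {tau2:.2f}, pooled d = {pooled:.2f}")
print(f"SE random = {se_random:.3f} vs SE fixed = {se_fixed:.3f}")
```

When studies are heterogeneous, the random-effects standard error is larger than the fixed-effect one, so the resulting confidence intervals reflect the variation between studies as well as sampling error within them.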
Only 4 of the 37 behaviors had sufficient numbers of datasets to perform subanalyses by behavior. More comprehensive data collection and updated searches of published and unpublished research will increase these numbers, greatly enhancing the depth of the results. Since average effect size varied significantly by behavior, it is important to look carefully at the effect sizes for individual behaviors, rather than just across all behaviors, as more studies are conducted and identified for each behavior.

Self-efficacy Across the Stages
Both Self-efficacy and the Stages of Change serve as important aspects of the design and evaluation of clinic-based and population-based health promotion interventions.  examined the relationship between Self-efficacy and the Stages in 28 independent studies across 14 behaviors (total N = 21,244), including cancer-related behaviors such as smoking cessation and prevention, exercise adoption, dietary fat reduction, fruit and vegetable consumption, sun exposure, cocaine use, binge drinking, weight control, contraceptive use, high-risk sex, and condom use.
The functional relationship between Self-efficacy and the Stages of Change varied across behaviors but typically was monotonically increasing and linear across Stages. Studies that assessed both situational confidence and temptations displayed a consistent profile across Stages with a characteristic "crossover" towards the Action stage.
The magnitude of effect was examined in a similar fashion as the strong and weak principles for Decisional Balance (described above). The magnitude of the maximum increase from Precontemplation to Maintenance was strong and fairly consistent across behaviors, about 1.5 standard deviations, suggesting that Stage of Change accounts for about 36% of the variance in Self-efficacy.
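The correspondence between a difference of about 1.5 standard deviations and about 36% of the variance can be checked with the common conversion from a standardized mean difference to a correlation. The sketch below assumes two groups of equal size; whether the original analysis used exactly this conversion is an assumption, and the function name is illustrative.

```python
import math


def d_to_r(d: float) -> float:
    """Convert a standardized mean difference to a correlation.

    Assumes two groups of equal size: r = d / sqrt(d^2 + 4).
    """
    return d / math.sqrt(d * d + 4.0)


d = 1.5                      # difference reported in the text, in SD units
r = d_to_r(d)
variance_explained = r ** 2  # proportion of variance accounted for

print(f"d = {d}, r = {r:.2f}, r^2 = {variance_explained:.2f}")
```

Under this assumption, d = 1.5 corresponds to r = 0.60 and r squared = 0.36, which matches the figure quoted above.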
This preliminary study again demonstrates the use of effect sizes to quantify theoretical relationships. Unfortunately, this study is insufficiently powered and limited in scope. Rossi (2001) conducted a very preliminary study of stage transition effect sizes for the processes of change in a single longitudinal dataset. In general, sample sizes in population-based health promotion research are often so large that statistical power can be quite high even for fairly small effect sizes. Consequently, all or most statistical tests may be significant. This can lead to difficulty in distinguishing between important effects and trivial ones. The distinction often cannot be made simply on the basis of the magnitude of effect sizes but must be grounded in theory. For the transition from Precontemplation to Contemplation, the experiential processes of change are expected to be more salient. For the transition from Contemplation to Preparation, the behavioral processes of change are expected to be more important. These transitions were examined using data from a smoking cessation study. Because the sample size for this study was large (N = 1466), nearly all statistical tests across stage transitions were significant. However, calculation of contrast effect sizes for each stage transition showed a clear pattern that was consistent with transtheoretical model predictions: for the early stage transition, effect sizes were larger for experiential than for behavioral processes; for the later stage transition, effect sizes were larger for behavioral than for experiential processes. The utility of effect sizes for the analysis of population-based research is evident, especially as compared to standard significance testing approaches. The use of such procedures within the context of theory-based research promises to be of even greater utility in the development of more quantitative models of health behavior change.

Stage Transitions
The strong and weak principles are useful for understanding the amount of work generally needed to move from Precontemplation to Action. Therefore, although the strong and weak principles have been important in conceptualizing and understanding the relationship between Decisional Balance and Stages of Change, these two principles have an action-oriented influence when applied to interventions.
That is, by virtue of examining characteristics of the transition from Precontemplation to Action and using these to tailor interventions, one is gathering information which potentially neglects three Stages: Contemplation, Preparation and Maintenance. If the ultimate goal is to advance the quantitative resources for development of stage-matched tailored interventions in order to move people from Precontemplation to Contemplation to Preparation to Action and then finally to Maintenance, more specific transitions should be examined. It then becomes imperative to examine the magnitude of effect for each stage transition in order to explore the possibility of more complex impact patterns. For instance, some behaviors maintain a classic crossover profile, but that does not necessarily mean that their patterns of effect are straightforward; the patterns of effect may be curvilinear rather than linear. In order to identify curvilinear patterns in the relationship between Decisional Balance and Stages of Change, one would need to examine the magnitude of adjacent stage transitions.
Stage transition analyses could identify the most effective strategy for tailoring behavior change interventions. For example, if for a given behavior the effect size from Precontemplation to Contemplation is large for the Pros but very small for the Cons, an intervention for precontemplators could target the Pros and omit the Cons. From Contemplation to Preparation, if both the Pros and Cons effect sizes were large, both might be emphasized at this transition in an intervention. Following this strategy across each stage transition would create the most efficacious and efficient application of the measure, maximizing the impact and minimizing the resource expenditure.
Ultimately, careful and systematic investigation of the changes in these measures across stage transitions by behavior can provide even more detailed evidence for exactly how to use these measures most efficiently in future stage-matched interventions.
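As a minimal sketch of what an adjacent stage transition analysis involves, the following computes a standardized mean difference for each pair of neighboring Stages from per-stage summary statistics. The stage means, standard deviations and sample sizes are invented for illustration only, and the function names are hypothetical; the pooled standard deviation formula is the conventional one and is assumed here rather than taken from the study.

```python
import math

STAGES = ["Precontemplation", "Contemplation", "Preparation",
          "Action", "Maintenance"]


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled within-group standard deviation for two groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                     / (n1 + n2 - 2))


def adjacent_effect_sizes(summary):
    """summary maps stage -> (mean, sd, n); returns [(from, to, d), ...]."""
    results = []
    for earlier, later in zip(STAGES, STAGES[1:]):
        m1, s1, n1 = summary[earlier]
        m2, s2, n2 = summary[later]
        d = (m2 - m1) / pooled_sd(s1, n1, s2, n2)
        results.append((earlier, later, d))
    return results


# Hypothetical Pros scores by stage (illustrative numbers, not real data).
pros_summary = {
    "Precontemplation": (42.0, 10.0, 100),
    "Contemplation": (50.0, 10.0, 100),
    "Preparation": (52.0, 10.0, 100),
    "Action": (53.0, 10.0, 100),
    "Maintenance": (53.0, 10.0, 100),
}

transitions = adjacent_effect_sizes(pros_summary)
for earlier, later, d in transitions:
    print(f"{earlier} -> {later}: d = {d:.2f}")
```

In this invented example most of the change in the Pros is concentrated in the first transition, a curvilinear pattern that would be hidden by a single Precontemplation to Action contrast.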

Integrative Analyses: Current Project
The current project was conducted in three main phases: (1) a comparison of effect size procedures; (2) a cross-sectional meta-analytic investigation of the Stages of Change and Decisional Balance; and (3) a cross-sequential cross-validation study of the cross-sectional study. The first phase is a preliminary study to help inform and facilitate a more comprehensive exploration of the Stages of Change and Decisional Balance in phases two and three. The second and third phases will each contribute a set of results, which will uniquely contribute to the aims of the project. Additionally, phase three integrates the results of the cross-sectional and longitudinal studies by comparing similarities and differences between the two. That is, in phase three, the longitudinal data will serve as a cross-validation of the cross-sectional data among common relationships between the two types of data.

Brief History
Methods of combining data were being developed as early as 1904, when Karl Pearson combined average correlations across five independent samples, and continued to be developed into the 1930s, but these integration techniques were rarely applied. In 1976, Gene Glass coined the term "meta-analysis". Together with the increasing need for integrating research findings, Glass's fitting term seemed to finally launch this methodology into the field. Since then, meta-analytic techniques have continued to grow more complex and have slowly been expanding to meet the needs of the growing applications across the social and medical sciences.

Failings of the Null
In the behavioral and social sciences, empirical questions have been explored and theories guiding those questions have been tested primarily using null hypothesis procedures. As the field progresses, it is becoming clearer that this indirect procedure is an inadequate tool for efficiently answering research questions or for testing, modifying or rejecting theories. The focus on rejection of the lack of a relationship among variables as a means to clarify a research question not only obfuscates the objective, but also can ultimately be misleading.
The failure to reject the null hypothesis can occur for a variety of reasons, for example, insufficient power due to small sample sizes. Additionally, sample sizes can be so large that statistical power essentially becomes "too high", which can lead to all or most statistical tests being statistically significant and thereby make it difficult to distinguish between important effects and trivial ones. Furthermore, null hypothesis testing can only provide ordinal or directional characterization of a relationship between variables, not the magnitude of particular relationships.

The Role of Meta-analysis in the Significance Testing Debate
Meta-analytic techniques can be used to help overcome some of the limitations of null hypothesis testing as well as to move beyond null hypothesis testing. Meta-analysis can help resolve conflicting results, reduce variability in results, and increase the precision of reported results. Meta-analysis can best achieve this by synthesizing studies using point estimates and confidence intervals.
Meta-analysis can be used to synthesize the results of a body of conflicting studies and avoid some of the pitfalls of relying on traditional hypothesis testing. For example, low power can result for a variety of reasons such as not considering issues of power when planning a study, low retention during the course of a study and insufficient resources to achieve adequate power. Low power typically will create conflicting results within a body of literature due to statistical probability in null hypothesis testing. In fact, the effect size of two studies can be equal even when their level of significance may differ (i.e., one significant, the other not) due to variations in sample size. Meta-analyses can combine the results of multiple studies to overcome issues of low power. By placing confidence intervals around effect sizes the overall significance of a body of research can be determined and thereby clarify discrepant results found among individual studies.
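The point that equal effect sizes can yield different significance levels can be illustrated with a short sketch. It assumes a standardized mean difference, the usual large-sample variance formula for that index, and a normal approximation for the test; the sample sizes and the effect size of 0.30 are invented for illustration.

```python
import math
from statistics import NormalDist


def smd_variance(d, n1, n2):
    """Large-sample variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d * d / (2.0 * (n1 + n2))


def two_sided_p(d, n1, n2):
    """Two-sided p-value for d using a normal approximation."""
    z = d / math.sqrt(smd_variance(d, n1, n2))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


d = 0.30                              # identical effect size in both studies
p_small = two_sided_p(d, 20, 20)      # hypothetical small study
p_large = two_sided_p(d, 200, 200)    # hypothetical large study

print(f"n = 20 per group:  p = {p_small:.3f}")
print(f"n = 200 per group: p = {p_large:.4f}")
```

With these illustrative numbers the small study falls short of the conventional .05 threshold while the large study passes it, although both estimate the same magnitude of effect.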
Additionally, meta-analysis can be seen as overcoming failings of traditional hypothesis testing by increasing the precision of results by accounting for sampling error and other sources of variance. Meta-analytic techniques, by controlling for sampling error, can reduce variability in results. Likewise, corrections for study artifacts such as measurement error can be implemented using meta-analytic techniques. Various methods for modeling these types of error (i.e., fixed, random, and mixed effects modeling) can further increase precision in meta-analyses.
In sum, meta-analysis can be used to clarify conflicting results due to particular failings of null hypothesis testing, especially with single studies. The use of effect sizes can provide important information about effects that hypothesis testing does not offer, namely, the magnitude of an effect. Confidence intervals can provide a means to test statistical significance in meta-analysis and increase precision with each study added to a meta-analysis. Moreover, meta-analysis has typically been used as a method to simply combine results of studies based on null hypothesis testing. More sophisticated uses of meta-analysis can go beyond the simplicity of the null hypothesis testing to identify results not previously reported in the literature, explore moderators of effect that may not be able to be examined in a single study, and even provide a tool for advancing theory.

When to Use Meta-Analytic Techniques
Many literature reviews can and should consider quantitatively synthesizing results so long as they meet certain basic criteria. First, the review must include empirical studies. Second, these studies must include, at least in part, quantitative data. Beyond these minimum requirements, several additional issues should be considered. First, the research question(s) driving the research synthesis should be clearly defined. The question should be narrow enough to provide meaningful results from the literature, but broad enough to advance theory. A question too broadly defined risks creating the "apples and oranges" problem, in other words, summarizing studies that are not really dealing with the same constructs or relationships. One must use keen judgment to differentiate an "apples and oranges" study from a study aptly examining "fruit". In addition to these considerations of the research question, ultimately, data must be available in order to transform data or summary statistics into comparable statistical forms for the successful integration of research findings.
Although it has been stated that "The level of quantitative skill and training required to use basic meta-analytic procedures is so modest that researchers capable of readily learning the small number of calculations required to answer standard meta-analytic questions", this idea is quickly changing. Increased attention to meta-analysis and to meta-analytic techniques has led to many advances and refinements. The resulting strategies are more statistically demanding, and these more demanding techniques are becoming more and more standard in practice. This does mean that more sophisticated and user-friendly software is being developed to aid the conduct of meta-analyses, though clearly this does not eliminate the need to understand many complex issues conceptually.

Literature searches
The advent of computerized reference databases has made searching for available studies more "up-to-date" and much easier in many respects, but it does have its caveats. For instance, the seeming ease of searching computerized reference databases may provide researchers with a false sense of security regarding their success in collecting available studies. Researchers should be aware that there are currently many available reference databases (e.g., some health-related databases: MEDLINE, PsycINFO, CANCERLIT, CINAHL, Health and Wellness Resource Center, PubMed). It is important for researchers to determine which of these are relevant to their particular review and, since some databases are more inclusive than others, researchers should not rely on just one or two. Each field of study has a handful of primary databases (e.g., MEDLINE, PsycINFO) that are typically used. Unfortunately, the quality of these databases can vary from institution to institution. That is, a literature search conducted on a reference database at one institution can yield different results than at another institution, since institutions can purchase different versions or levels of a particular database. Therefore, discrepancies will exist in the level of inclusiveness yielded by searches. Despite these cautions, the use of reference databases is an essential tool for literature reviews.
At the outset of the computerized search, keywords that represent the area of interest should be identified and entered into the database. These keywords may need to be modified as the search progresses in order to include important keywords that were overlooked or unknown to the reviewer at the onset. Documentation of keywords, reference databases, and other search strategies should be kept during the conduct of the meta-analysis.
In addition to computerized reference databases, search strategies include (1) manual searches of relevant (and typically the most prominent) journals; (2) careful examination of reference lists from review articles and acquired articles; (3) conference programs and proceedings; and (4) contacting authors and experts. Thorough use of each of these procedures is important to minimize bias in the data collection phase.

Identifying the Studies
Several strategies can be used for determining which studies will be included in the meta-analysis. Usually preliminary screening of studies occurs at the literature search level. Oftentimes abstracts are used to give a gross assessment of applicability of the study to the meta-analytic review. Any study that is considered potentially relevant is then collected and should be carefully reviewed to further assess if each study meets the complete inclusionary criteria.
Inclusionary criteria should be explicit and directly related to the overall study objectives. Researchers should consider criteria such as: (1) Does the study include particular research participant groups (based on demographics or other characteristics) relevant to the research questions? (2) Are the key variables present? (3) Is there adequate information regarding the variables to calculate the needed effects? (4) Does the study use a design (e.g., RCT) that is acceptable to examine the question at hand? And (5) Are there particular time frames that the studies are limited to? These types of issues should be considered in terms of the specific research question for the meta-analytic study. Decisions regarding the precise research question should be made as early in the process as possible to avoid the need to redo the literature search.

Software.
Coding of study results, constructs, characteristics, treatment, samples and design is the essence of meta-analysis. To most efficiently and effectively create a database, several software options should be considered, including software for reference management, database production, meta-analysis computation and graphical display.
Firstly, literature searches and bibliographic information form the core of data collection. In order to manage the oftentimes unwieldy number of studies that are gathered during the collection phase, a reference management software program is recommended. Several different programs are available, such as Reference Manager and RefWorks, which aid in cataloging each study (whether it was included or excluded) and save considerable time when referencing the studies during the write-up phase of the meta-analysis.
Next, one must decide whether a meta-analytic software package is needed. The size of the meta-analysis, the type of effect sizes, and the type of analyses can impact this decision. A variety of meta-analytic software programs are now available, such as Comprehensive Meta-Analysis, Review Manager, MetaGraphs, SAS macros, STATA, ARCUS, DSTAT, Meta-Analyst, Easy MA, Fast Pro, True Epistat, and DESCARTES, as well as macros for SPSS. Each of these software packages or macros has advantages and disadvantages and should be carefully reviewed before purchasing.
Types of features to consider include: (1) Does the software package include the appropriate statistical model (e.g., fixed, random, mixed effects models)? (2) How are data entered into the program (e.g., forms, spreadsheets)? (3) What effect size measures are available? (4) Can the software perform advanced procedures? (5) What types of graphics, if any, does it offer? (6) Can the graphics be easily imported into another program (e.g., Word or PowerPoint) so they can be shared with other researchers? (7) Does the program have a test of homogeneity? and (8) Is the software flexible enough to allow adaptations of the algorithms? Typically, a given software package will not have all of the features needed. One must assess the particular needs of the individual meta-analysis and base decisions on the most important needs.
A viable option for many researchers may be to develop their own software using a program such as EXCEL. Algorithms for the calculation of effect sizes and bias corrections can be programmed into the spreadsheet. The advantage of this option is that data entry and preliminary computations and analyses can be formatted in a way that is most intuitive or useful for the individual researcher. More importantly, many programs are not sufficiently flexible to allow specialized calculations not included in the software to be added, or algorithms to be adapted to the specific needs of the researchers. Building your own program can give you that flexibility.
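As an illustration of the kind of algorithm a researcher might program, the sketch below computes a standardized mean difference and applies the widely used small-sample correction that yields Hedges' g. It is written in Python rather than as spreadsheet formulas, the input values are invented, and the correction shown is the conventional approximation, not the correction equation developed in this program of research.

```python
import math


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                       / (n1 + n2 - 2))
    return (m1 - m2) / pooled


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d multiplied by the approximate small-sample correction J."""
    df = n1 + n2 - 2
    j = 1.0 - 3.0 / (4.0 * df - 1.0)   # shrinks d slightly toward zero
    return j * cohens_d(m1, sd1, n1, m2, sd2, n2)


# Hypothetical group summaries (illustrative numbers only).
d = cohens_d(55.0, 10.0, 10, 50.0, 10.0, 10)
g = hedges_g(55.0, 10.0, 10, 50.0, 10.0, 10)
print(f"d = {d:.3f}, g = {g:.3f}")
```

The same two formulas translate directly into spreadsheet cells, which is one reason a hand-built workbook can be adapted when a packaged program lacks a needed calculation.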
Whether you create your own program, or which meta-analysis software package you choose, will play a part in determining how data entry will take place. For instance, Version 1 of Comprehensive Meta-Analysis is designed to have data entered directly into the program using a form template. In this case, data should be entered directly into the program at the start of the meta-analysis in order to maximize the features of the program. Other programs allow data to be entered in spreadsheets, such as EXCEL, and later imported into the software program.
No matter which way the researcher chooses, data entry should be conducted directly into a computerized spreadsheet or form rather than on paper and pencil forms. Using paper and pencil forms that are subsequently keyed into a computer database merely introduces additional error. It is also likely that this would require additional data entry procedures in order to verify the accuracy of the keyed data. This will increase the overall amount of work.

Coding Manuals and Forms.
A coding manual outlines the items that should be extracted from each study, gives detailed definitions of each item, and specifies codes for each item. Coding forms traditionally were paper and pencil, and each study was summarized by filling in the form. The contemporary method is to develop a spreadsheet that contains places for each item to be entered.
The coding manual is typically a work in progress. That is, the researcher first carefully considers the data that should be extracted from the studies. Subsequently, as the coders are trained and as the coding manuals are being used, modifications and refinements will be made. For instance, a researcher may too broadly or too narrowly define a variable in the coding manual. This may result in inconsistent data extraction or an unusually high rate of missing data. The researcher must then reassess the item in question and modify the manual accordingly. Many of the coding manual refinements will occur during the training of the coders. The coding of studies has been described as "one of the most technically demanding aspects of meta-analysis". Therefore, the training of coders should be given careful attention. At the onset of the project, several instructional seminars, including readings, lectures, and discussions, should be provided. The extent of the instructional seminars will depend largely on the knowledge and sophistication of the coders. Subsequently, coding manuals should be reviewed line by line and various aspects of the coding manual discussed. Coders should practice using the manuals until all researchers are comfortable with the coding procedures. Periodic meetings should be conducted to review progress and discuss any difficulties encountered.

Data Extraction and Coding
Choosing variables to extract and code in a meta-analysis should be done with careful thought and planning. The coded variables provide a description of the set of studies included in the meta-analysis and serve as potential moderating variables for analysis. In general, coding is done at the study level. Oftentimes a given study will offer more than one type of result, relationship, sample, construct, or time point. Each of these can result in more than one effect size that can be included in the meta-analysis. Care needs to be taken to provide study characteristics specific to each effect size included. Variables related to each effect size include characteristics such as measure or construct, effect-size-specific sample size, calculation procedures, and scale reliability.
Study characteristics such as date of publication, form of publication (e.g., dissertation, peer-reviewed, conference presentation), sample size, demographics (e.g., age, gender, education), population description and setting should be included. Variables related to methods and procedures, such as sampling procedure, attrition and survey design, are also important. Additionally, variables specific to the research question or particular body of literature that may be theoretically relevant should be identified.
Studies are sometimes coded for methodological soundness. This type of coding is particularly relevant for comparisons between groups (e.g., treatment and control) across studies that may be directly impacted by study design. Setting explicit criteria for assessing methodological soundness is essential for avoiding the pitfalls of subjectivity. Using guidelines for methodological soundness set by the field as much as possible can help increase objectivity. Two independent coders should rate the quality of studies, and the coders should be blinded to names of authors and journals if possible.

Effect Sizes and Confidence Intervals
Effect sizes and confidence intervals are the most central elements of meta-analysis. Effect sizes provide valuable measures of the magnitude of an effect, whereas confidence intervals illustrate the precision of the given parameter estimate.

Effect sizes
Effect sizes are the primary index for reporting results in a meta-analysis. Effect sizes provide an estimate of the effect of an independent variable on a dependent variable. In meta-analysis, this relationship between variables most often describes a study outcome or treatment effect, though this is by no means the only relationship examined in meta-analytic studies. Although an effect size can represent either the direction or the magnitude of an effect, the most typical (and richest) effect sizes measure both direction and magnitude. Because the effect size is the central statistic in meta-analysis, understanding effect sizes and their possible variations proves essential for conceptualizing and starting a meta-analysis. Knowing the precise information needed to calculate effect sizes at the outset enables the researcher to identify more accurately which studies can be included in the meta-analysis. Although authors can be contacted for some missing information, this can prove unsuccessful for a variety of reasons, and decisions about what type of information will be gathered directly from authors should be made carefully.

One-variable relationships
One-variable relationships are the least used measure of effect size, and meta-analyses focusing on these relationships are not typical. In general, this refers to the pattern of observations across a variable, reported as a central tendency (mean, median or mode) or as a distribution of values (frequencies, proportions or sums). An example of a one-variable relationship is the comparison of scores between two types of measures that represent the same construct. These scores would need to be reported in or converted to the same metric (e.g., percentiles), and associated standard errors calculated, in order to be compared as an effect size.

Two-variable relationships
Two-variable relationships are the most common in meta-analyses. One common type of two-variable relationship is the pre-post contrast, which involves the comparison of two central tendencies. These comparisons can be made on unstandardized means when the metric is the same for all measures. Otherwise, comparisons can be made with standardized mean scores.
Another common two-variable relationship is the group contrast. Group contrasts involve a variable that is measured in two or more groups. Means, standard deviations, and sample sizes for each group on each variable, or proportions based on the characteristic of interest, can be used for comparisons. There are a variety of effect size measures that can be used for group comparisons, such as the unstandardized mean difference, the standardized mean difference, the proportion difference and the odds ratio.
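For the two proportion-based indices named above, a brief sketch follows. It assumes a 2 x 2 layout of events and non-events in each group and uses the conventional large-sample standard error of the log odds ratio for the confidence interval; the counts are invented for illustration and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_difference(events1, n1, events2, n2):
    """Difference between two group proportions."""
    return events1 / n1 - events2 / n2


def odds_ratio(events1, n1, events2, n2):
    """Odds of the event in group 1 relative to group 2."""
    a, b = events1, n1 - events1
    c, d = events2, n2 - events2
    return (a * d) / (b * c)


def odds_ratio_ci(events1, n1, events2, n2, level=0.95):
    """Confidence interval computed on the log odds ratio scale."""
    a, b = events1, n1 - events1
    c, d = events2, n2 - events2
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts: 30 of 100 in one group versus 20 of 100 in the other.
diff = proportion_difference(30, 100, 20, 100)
ratio = odds_ratio(30, 100, 20, 100)
lower, upper = odds_ratio_ci(30, 100, 20, 100)
print(f"difference = {diff:.2f}, OR = {ratio:.2f}, "
      f"95% CI = ({lower:.2f}, {upper:.2f})")
```

The sketch does not handle cells with zero counts, which would require a continuity adjustment before the log transformation.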

Association Between Variables
The association between any two variables can serve as the basis of the meta-analysis. These bivariate relationships involve the covariation between two variables. The primary effect size in this context is the Pearson product-moment correlation coefficient. In some situations one could include the odds ratio and the standardized mean difference. The standard Pearson product-moment correlation coefficient is used as an effect size index for two continuous variables. In contrast, a point-biserial correlation coefficient is used for dichotomous-continuous variable relationships.
Another possibility includes the two-dichotomous variable relationships assessed with the odds-ratio and phi coefficient. The odds-ratio is generally used to describe the comparison between two subgroups whereas the phi coefficient describes the relationship or predictive strength in a group.
When mixing continuous and dichotomous variables, one must first decide if the dichotomous variables are inherently dichotomous or artificially dichotomous. If one variable is inherently continuous and the other is artificially dichotomous, then the product-moment correlation is used, but it should be corrected for the artificial dichotomization, or a z-transformed r can be used as the effect size index. On the other hand, if one variable is inherently continuous and the other is inherently dichotomous, either a biserial coefficient with a correction for dichotomy or a standardized mean difference with probit, logit or arcsine correction is used.
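The z-transformed r mentioned above is usually taken to mean Fisher's transformation, and the sketch below makes that assumption. It transforms a correlation, builds a confidence interval on the transformed scale using the standard error 1 / sqrt(n - 3), and converts the limits back to the correlation metric. The values of r and n are illustrative only.

```python
import math
from statistics import NormalDist


def fisher_z(r):
    """Fisher's variance-stabilizing transformation of a correlation."""
    return math.atanh(r)


def correlation_ci(r, n, level=0.95):
    """Confidence interval for r built on the Fisher z scale."""
    z = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


r, n = 0.60, 103            # hypothetical correlation and sample size
z = fisher_z(r)
lower, upper = correlation_ci(r, n)
print(f"r = {r}, z = {z:.3f}, 95% CI for r = ({lower:.2f}, {upper:.2f})")
```

Working on the transformed scale keeps the interval inside the range of -1 to 1 and makes the standard error depend only on sample size.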

Multivariate Relationships
Multivariate relationships are the least common relationships examined. The methods for examining multivariate relationships are few and the applications limited. Various types of multivariate relationships, such as multiple regression, structural equation modeling and factor analysis, have been discussed in the literature. The most challenging problem with exploring multivariate relationships is the lack of data needed to calculate standard errors. Researchers have begun to explore strategies such as synthesizing correlation statistics and meta-analyzing them by doing multivariate analyses on synthesized matrices. Ultimately, the most feasible approach identified so far is to conduct meta-analyses on individual studies that have conducted multivariate analyses and to examine the multivariate relationships by combining correlations of reported predictors. The limited proportion of studies that use multivariate statistics to examine particular relationships in a body of literature poses a significant limitation. Additionally, multivariate meta-analyses are limited due to the lack of full correlation matrices reported in studies. Much more research is needed in this area to identify methods for working with available data. More importantly, standards for reporting needed information, such as correlation matrices, should be set by journal editors to help facilitate the use of these advanced procedures, which will no doubt have important impacts on the field.
In sum, there are advantages and disadvantages to each of the effect size indices discussed. When conducting a meta-analysis, the relevant indices need to be identified. Potential effect size indices should be examined to determine which is most relevant, which requires information most attainable from the particular body of literature in question, and which will provide the most reliable measure possible.

Confidence Intervals
Confidence intervals illustrate the precision of a parameter estimate. The confidence interval provides an additional dimension to reporting effect sizes by indicating how "confident" one can be of the measure of magnitude obtained. The researcher has the flexibility to choose the level of probability, and this is done by setting the probability percentage for the confidence interval. For instance, a 95% confidence interval is constructed so that, over repeated sampling, 95% of such intervals would contain the population parameter.
The width of the confidence interval is directly related to: (1) the amount of data used to generate a given effect size; (2) the confidence level chosen by the researcher; and (3) the computational model used. The larger the amount of data used, the more precise the measure of effect size and therefore the narrower the confidence interval. Conversely, the greater the confidence level chosen by the researcher, the wider the confidence interval will be. Lastly, when variance across studies is attributed entirely to random sampling influences, a fixed effects model is used. Under this model, the width of the confidence interval can approach zero as data accumulate. On the other hand, if between-studies variance is assumed in addition to random sampling variance, then a random effects model is typically employed. This model tends to produce wider confidence intervals and limits the ability of the interval width to approach zero.
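The influence of confidence level and computational model on interval width can be sketched as follows. The example pools a handful of invented effect sizes under a fixed effects model and under a random effects model, using the DerSimonian-Laird estimator of between-studies variance; that estimator is one common choice and is an assumption here, not necessarily the method used in this project.

```python
import math
from statistics import NormalDist


def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of between-studies variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total
    q = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(effects) - 1)) / c)


def pooled_interval(effects, variances, tau2=0.0, level=0.95):
    """Pooled confidence interval; tau2 = 0 gives the fixed effects model."""
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = crit * math.sqrt(1.0 / total)
    return mean - half_width, mean + half_width


def width(interval):
    return interval[1] - interval[0]


# Hypothetical study effect sizes and their sampling variances.
effects = [0.2, 0.5, 0.8, 0.3]
variances = [0.01, 0.02, 0.015, 0.01]

tau2 = dersimonian_laird_tau2(effects, variances)
fixed_95 = pooled_interval(effects, variances)
fixed_99 = pooled_interval(effects, variances, level=0.99)
random_95 = pooled_interval(effects, variances, tau2=tau2)

print(f"fixed 95% width:  {width(fixed_95):.3f}")
print(f"fixed 99% width:  {width(fixed_99):.3f}")
print(f"random 95% width: {width(random_95):.3f}")
```

With heterogeneous inputs such as these, the estimated between-studies variance is positive, so the random effects interval is wider than the fixed effects interval at the same confidence level.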
In a traditional meta-analysis, interpretation of confidence intervals can be guided by clinical utility. In the medical field, for instance, some research questions may be focused on reduction of mortality. These studies will hold different criteria than a treatment effect for less "critical" outcomes. Such practical issues illustrate the need for flexibility when setting particular confidence interval levels or for which results are considered "significant" or "not significant". The most important consideration is whether the differences are clinically important, not important, or yet unresolved. In general, caution should be used in interpreting point estimates. An emphasis is placed on the use of confidence intervals to increase the meaningfulness of interpretation.
Additionally, in meta-analytic reviews, confidence intervals should be reported for the individual studies as well as for the combined effects. Graphical representations of these confidence intervals provide an important tool for interpretations of the overall effects and the influences of individual studies on the overall effects. These simple but informative schematics, such as Forest Plots, provide the first step in the exploration of moderating variables of the effects.

Issues of Bias
It is sometimes necessary to correct the meta-analysis for bias. This can happen at two levels, either by correcting effect sizes with study-level information or by correcting more globally with meta-analysis-level bias corrections. Most typically, bias is corrected using study-level information. Important types of bias to consider include sample size bias, artifact biases (e.g., measurement bias), publication bias, and the "file drawer" problem.

Sample Size Bias
Simple pooling of data occurs when the data from each study are pooled together without regard to differences in sample size. As a result, a study with 10,000 participants carries the same weight in the analysis as a study with 10 participants. To avoid this problem, effect sizes can be corrected for sample size bias by weighting each data point by its respective sample size. The inverse variance weight reflects the precision of the effect size estimate, which varies as a function of sample size. This method is used to weight the contribution of each study effect size so that larger studies are given more weight in the calculation of the overall meta-analysis effect size.
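A minimal sketch of inverse variance weighting follows. It assumes standardized mean differences and uses the usual large-sample approximation for their sampling variance; the function names and the two example studies are hypothetical.

```python
def d_variance(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))


def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted mean effect size and its standard error."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    std_error = (1.0 / sum(weights)) ** 0.5
    return mean, std_error


# Hypothetical example: one very large study and one very small study.
effects = [0.20, 0.80]
variances = [d_variance(0.20, 5000, 5000), d_variance(0.80, 5, 5)]
simple_mean = sum(effects) / len(effects)            # ignores sample size
weighted_mean, se = weighted_mean_effect(effects, variances)
```

In this made-up example the simple mean sits halfway between the two studies, whereas the weighted mean stays very close to the effect from the large study.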

Artifact Adjustments
Artifacts are imperfections that arise when conducting studies. Adjustments can be made to account for study artifacts such as sampling error, variable reliability, restriction of range of variables, dichotomization of continuous variables, and imperfect construct validity. Most often the information needed to make such corrections is not available in published reports, and even when the information is available for many studies, it is likely not available for all. Efforts should be made to identify alternative sources of information for conducting bias corrections. For instance, one may turn to manuals for a given measure (if available) in order to identify the overall reliability coefficient. Another way to deal with the lack of consistent data is by conducting artifact distribution analyses to correct for biases using distributions. This method allows for bias correction at the meta-analysis level rather than at the study level. Ultimately, bias corrections can increase accuracy in some areas but may decrease accuracy in others. For instance, sampling error is larger for effect sizes that have been individually corrected (e.g., for measurement error). Additionally, the researcher must decide if he or she is comfortable making interpretations using adjustments or basing interpretations on unadjusted data. If the researcher chooses to correct for bias, comparisons of effect sizes can always be made by reporting effect sizes both with and without the corrections.

Measurement bias
An example of an artifact bias is measurement or scale reliability bias. Effect sizes can be corrected for attenuation due to scale unreliability using Hunter and Schmidt's (1990) correction:

$$ES' = \frac{ES}{\sqrt{r_{YY}}}$$

where ES = the observed (attenuated) effect size estimate, ES' = the disattenuated effect size estimate, and r_YY = the reliability of the dependent variable measure, which was estimated using the reported value of the scale internal consistency coefficient Alpha.
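As a hedged illustration, the correction for attenuation is commonly written as ES' = ES / sqrt(r_YY), and the sketch below assumes that form. The function name and the example values are hypothetical.

```python
import math


def disattenuate(es, r_yy):
    """Correct an observed effect size for unreliability in the dependent variable.

    Assumes the common form ES' = ES / sqrt(r_YY), where r_YY is a reliability
    coefficient (for example, coefficient alpha) between 0 and 1.
    """
    if not 0.0 < r_yy <= 1.0:
        raise ValueError("reliability must be in the interval (0, 1]")
    return es / math.sqrt(r_yy)


# Hypothetical example: an observed effect of 0.40 on a scale with alpha = .64.
corrected = disattenuate(0.40, 0.64)
```

Because the square root of a reliability below 1 is also below 1, the corrected value is never smaller in magnitude than the observed value, and a perfectly reliable scale leaves the effect size unchanged.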

The "File Drawer" Problem
When data are missing at the study level due to non-publication, it is called the "file drawer" problem and represents a form of publication bias. The most common procedure to estimate the impact of this sampling bias is computing the "fail-safe K". This is an estimate of the number of undiscovered (presumably mostly unpublished) studies that, if known, would in aggregate reduce the overall meta-analysis effect size so that it was not statistically significant.
Additional methods of assessing the robustness of the meta-analytic results in efforts to account for unpublished or missing studies include: (1) file drawer estimated by ; (2) missing studies estimate method by Gleser and Olkin (1996); and (3) trim and fill.
Although the "fail-safe K" has been a widely used procedure, the Iyengar and Greenhouse (1988) method has been found to provide a better estimate of missing studies. The trim and fill method has also been reported to provide an adequate estimate of the number of missing studies. Again, more research and consensus are warranted.
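The fail-safe idea can be sketched as follows. This assumes Rosenthal's formulation, in which study-level Z scores are combined and the question is how many additional studies averaging Z = 0 would bring the combined result down to the one-tailed .05 criterion (z = 1.645). The text does not state which formulation was used, so the formula, function name, and example values are assumptions.

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal-style fail-safe number (assumed formulation).

    Returns the number of additional null-result studies needed to reduce the
    combined Z to z_alpha: N = (sum Z)^2 / z_alpha^2 - k.
    """
    k = len(z_scores)
    total = sum(z_scores)
    return (total ** 2) / (z_alpha ** 2) - k


# Hypothetical example: five studies, each with Z = 2.0.
n_missing = fail_safe_n([2.0] * 5)
```

In this made-up example roughly 32 unpublished null studies would be needed, which conveys how the statistic is read: larger values suggest a result that is more robust to unpublished findings.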

Publication bias.
Historically there has been a tendency to publish studies with statistically positive results as well as larger effects. Including "grey literature" and/or unpublished studies alongside published studies can result in an attenuated effect size estimate. McAuley, Pham, Tugwell, and Moher (2000) suggest that the inclusion of grey literature can reduce the effect size estimates by 12-50%. Publication bias is a form of sampling bias at the study level. [...] or may be due to both heterogeneity and publication bias. Therefore, these tests should be conducted in conjunction with other tests to increase the accuracy of interpretation. One additional solution is to perform a full artifact distribution interim meta-analysis to determine a meta-analysis-level correction rather than study-level corrections.
In sum. Ultimately, a meta-analyst must take into account a variety of information before deciding when to correct for bias, how many different bias corrections are needed (if any), and which methods to use to correct the bias.

Researchers need to consider that corrections of certain biases can increase others.
For instance, a large overall sample size (i.e., number of studies) can decrease the sampling error, so the researcher may feel more inclined to correct for measurement error in these more "robust" situations. On the other hand, although the sample size may be large, if subgroup analyses are intended, this may in fact increase the sampling error within subgroup analyses, and the researcher may then decide against correcting for measurement error. As another example, the range of reliability measures may be quite small and the measures on the whole quite reliable; in this case the researcher may choose not to correct for measurement error in order to avoid the drawbacks associated with individually correcting effect sizes.
Ideally, if bias corrections are needed, researchers of "primary" studies will report all information needed to perform comprehensive meta-analyses. This can enable meta-analysts to individually correct for bias rather than rely on interim meta-analyses to correct for artifacts at the meta-analytic level rather than the individual study level. If bias corrections are used, it is important to know that the meta-analytic formulas used for corrected effect sizes typically differ from those used for uncorrected effect sizes.

Homogeneity of Effect Size
In meta-analytic studies, variation among effect sizes sometimes occurs due to random error, though oftentimes this variation becomes statistically larger than one would expect (due to random or sampling error). Tests of homogeneity attempt to identify the difference between variation due to random error and variation due to systematic differences between study design and participant characteristics or other theoretically relevant variables. Ultimately, assessing heterogeneity of effect size is important because the lack of a homogeneous distribution typically suggests the presence of possible predictors or moderators of the effect size magnitude. A variety of methods for testing homogeneity are available, including tests of homogeneity designed to assess variation in treatment effects with odds ratios, percent of variance accounted for measures of effect size, and standardized effect size measures.

Modeling Variance
There are three primary methods of modeling variance: fixed, random and mixed effects modeling. The fixed effects model assumes the only source of variance is subject-level sampling error, whereas the random effects model assumes that the source of variance includes subject-level sampling error and study-level sampling error (i.e., random error). The fixed effects model can also include systematic variance due to identifiable moderating variables. A mixed model assumes all three sources of error, sampling error, random error and systematic error.
In a fixed effects model, it is assumed that the variance in the effect size distribution is only due to subject-level sampling error. So it is assumed that an effect size from an individual study represents the population effect with only random sampling error associated with chance factors. Tests of homogeneity examine the assumption that error is only due to sampling error and thereby serve as an assessment of whether the fixed effects model holds, though as discussed below, evidence of heterogeneity does not necessarily rule out a fixed effects model.
If the fixed effects model does not hold then another model is sought. The random effects model assumes that sampling error is accompanied by other sources of variability randomly distributed. The sampling error is considered to be from the subject level whereas the random error is an estimate of between-studies variance. This between-studies error is thought to be analogous to study-level sampling error. Identifying the random effects variance component is the more difficult of the two components and can be assessed using noniterative methods based on method of moments and iterative methods based on maximum likelihood.
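One widely used noniterative, method-of-moments estimator of the between-studies variance is the DerSimonian and Laird estimator. The sketch below assumes that estimator purely as an illustration, since the text does not name a specific one; the function names and example data are hypothetical.

```python
def q_statistic(effects, variances):
    """Homogeneity statistic Q: weighted squared deviations from the weighted mean."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    return sum(w * (es - mean) ** 2 for w, es in zip(weights, effects))


def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of between-studies variance (truncated at 0)."""
    if len(effects) < 2:
        return 0.0
    weights = [1.0 / v for v in variances]
    q = q_statistic(effects, variances)
    df = len(effects) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - df) / c)


def random_effects_mean(effects, variances):
    """Random effects mean and SE: weights are 1 / (sampling variance + tau2)."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    weights = [1.0 / (v + tau2) for v in variances]
    mean = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    std_error = (1.0 / sum(weights)) ** 0.5
    return mean, std_error, tau2
```

When Q does not exceed its degrees of freedom the estimate is truncated to zero and the random effects result coincides with the fixed effects result; otherwise the added variance component widens the standard error.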
If one assumes that the fixed effects model holds but additional error is due to a systematic difference in the coded variables, then one can attempt to partition the effect size variance. This can be tested by performing an analog to analysis of variance. This ANOVA analog will be used to explore variables by partitioning the total variance Q (Total) into Q (Between) and Q (Within). Q (Between) values will be used to test between-group differences by using a Chi-square with df = p-1, where p = the number of groups being compared.
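The partition of Q (Total) into Q (Between) and Q (Within) can be sketched as below, under fixed effects weights. The sketch assumes the usual convention that the between-groups degrees of freedom equal the number of groups minus one; the function names, group labels, and example data are hypothetical.

```python
def q_statistic(pairs):
    """Q for a list of (effect size, sampling variance) pairs."""
    weights = [1.0 / v for _, v in pairs]
    mean = sum(w * es for w, (es, _) in zip(weights, pairs)) / sum(weights)
    return sum(w * (es - mean) ** 2 for w, (es, _) in zip(weights, pairs))


def partition_q(groups):
    """Split Q(Total) into Q(Between) and Q(Within) for a categorical moderator.

    groups maps a moderator level to its list of (effect size, variance) pairs.
    """
    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    q_total = q_statistic(all_pairs)
    q_within = sum(q_statistic(pairs) for pairs in groups.values())
    return {
        "q_total": q_total,
        "q_within": q_within,
        "q_between": q_total - q_within,
        "df_between": len(groups) - 1,
        "df_within": len(all_pairs) - len(groups),
    }


# Hypothetical example with two moderator levels.
result = partition_q({
    "A": [(0.2, 0.01), (0.4, 0.01)],
    "B": [(0.8, 0.01), (1.0, 0.01)],
})
```

Each Q component would then be compared against a chi-square distribution with the corresponding degrees of freedom; a large Q (Between) with a small Q (Within) suggests the moderator accounts for most of the heterogeneity.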
If the variance is assumed to be due to subject-level sampling error, study-level sampling error (or random error), and some systematic variance, then the model would be a mixed effects one. This means that the component of error not accounted for by sampling or systematic variance must be incorporated into the effect size analysis by creating a weighting function. In this case, the residual Q (Between) or Q (Within) would still be heterogeneous. Fitting the mixed effects model is similar to fitting the random effects model.
The mixed effects model is used less often than fixed and random effects models due to the complexity of the modeling procedures. Ultimately, the fixed effects model is more likely to identify systematic variance than the mixed effects model since it has more power to do so. On the other hand, mixed effects models have more accuracy in terms of type I error. On the whole, these models require much more research to understand how best to account for and assess variance, especially in relation to accounting for and calculating the study-level sampling or random error.

Meta-Regression
In addition to ANOVA analogs using fixed and random effects models, one can employ meta-regression to examine the association of effect size with study characteristics. Meta-regression is a more sophisticated approach to assessing moderators when heterogeneity is observed. As with other regression analyses, care should be taken to assess collinearity by examining correlations between variables. Overall fit of the meta-regression can be calculated with a Q_R for the regression and a Q_E for the residual error, which are distributed as a chi-square.
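A minimal sketch of a fixed effects meta-regression with a single moderator follows. It assumes weighted least squares with inverse variance weights and reports the regression and residual components of Q; the function name, the single-moderator restriction, and the example data are simplifying assumptions for illustration.

```python
def meta_regression(effects, variances, moderator):
    """Weighted least squares of effect size on one moderator (fixed effects weights).

    Returns slope, intercept, Q_R (df = 1), and Q_E (df = k - 2).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, moderator)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, moderator))
    sxy = sum(wi * (x - x_bar) * (y - y_bar)
              for wi, x, y in zip(w, moderator, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    q_total = sum(wi * (y - y_bar) ** 2 for wi, y in zip(w, effects))
    q_e = sum(wi * (y - (intercept + slope * x)) ** 2
              for wi, x, y in zip(w, moderator, effects))
    return {"slope": slope, "intercept": intercept,
            "q_r": q_total - q_e, "q_e": q_e,
            "df_r": 1, "df_e": len(effects) - 2}


# Hypothetical example: effect sizes that rise linearly with a coded moderator.
fit = meta_regression([0.1, 0.3, 0.5, 0.7], [0.01] * 4, [0, 1, 2, 3])
```

A significant Q_R indicates that the moderator explains variation in effect sizes, while a significant Q_E indicates that unexplained heterogeneity remains after the moderator is taken into account.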

Power Analysis
When conducting a meta-analysis that is primarily aimed at resolving conflicting studies in the literature, namely those using null hypothesis testing, meta-analysts should perform power analyses. In this case, using the obtained overall effect size and assuming it is equivalent to the population effect size, the use of Cohen's power tables is appropriate. In other situations, the principal question of power for meta-analysis involves not so much whether or not the overall effect size is statistically significant but rather whether there is sufficient power to determine if the effect size distribution is heterogeneous. Power for the Q test depends on the ratio of between- to within-study variance. Unfortunately, it is difficult to know what this ratio might be a priori. A range of 0.33 to 1.0 has been reported across a large number of meta-analyses. It has been suggested that variance ratios of 0.33, 0.67, and 1.00 be considered small, medium, and large degrees of heterogeneity, respectively.

Outliers
Effect size distributions should be examined for extreme high or low scores in order to identify outliers. Outliers should be examined carefully in efforts to identify any potential errors (e.g., transcription errors) on the part of the meta-analyst that can be corrected. In some cases summary data in published studies may appear to be causing the outlier, and the meta-analyst can attempt to contact authors for verification.
In order to identify the outliers, several methods can be employed. For instance, standardized residuals can be used to identify outliers when using models (e.g., the standardized difference from the mean, using fixed or random effects), though the guidelines for cutoffs are most appropriate when the underlying distribution is homogeneous. Another method is to test the suspected outliers by eliminating one effect size at a time and assessing the change in Q (homogeneity statistic). If underlying heterogeneity exists, Forest plots or other subgroup analyses can be employed to assess the outlier within the context of a variety of moderators and to identify the outlier with respect to potential associated clusters of effect sizes. Another method includes identifying effect sizes that are 2 to 3 standard deviations from the mean to be considered for removal or recoding.
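Two of the screening approaches described above can be sketched briefly: dropping one effect size at a time and recording the change in Q, and flagging effect sizes more than a chosen number of standard deviations from the mean. The function names, the cutoff of 2, and the example data are hypothetical, and the standard deviation rule here uses the unweighted mean as a simplifying assumption.

```python
from statistics import mean, stdev


def q_statistic(effects, variances):
    """Homogeneity statistic Q under inverse variance weights."""
    weights = [1.0 / v for v in variances]
    centre = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    return sum(w * (es - centre) ** 2 for w, es in zip(weights, effects))


def leave_one_out_q_change(effects, variances):
    """Reduction in Q when each effect size is removed in turn."""
    full_q = q_statistic(effects, variances)
    changes = []
    for i in range(len(effects)):
        rest_es = effects[:i] + effects[i + 1:]
        rest_var = variances[:i] + variances[i + 1:]
        changes.append(full_q - q_statistic(rest_es, rest_var))
    return changes


def flag_by_sd(effects, cutoff=2.0):
    """Indices of effect sizes more than `cutoff` SDs from the unweighted mean."""
    centre, spread = mean(effects), stdev(effects)
    return [i for i, es in enumerate(effects) if abs(es - centre) > cutoff * spread]


# Hypothetical effect sizes with one extreme value at the end.
effects = [0.40, 0.50, 0.45, 0.55, 0.50, 0.45, 0.50, 2.00]
variances = [0.01] * len(effects)
q_changes = leave_one_out_q_change(effects, variances)
flagged = flag_by_sd(effects)
```

Both screens point to the last value in this made-up example, but, as the text notes, a flagged value still calls for judgment (for example, checking for transcription errors) before removal or recoding.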

Once identified, outliers are most typically removed from further analyses. If a researcher does not wish to remove studies from the meta-analysis, a more inclusive approach can be employed. This involves identifying a break in the effect size distribution and coding outliers back to the next largest cluster of effect sizes. The advantage of this method is that it ensures as much data as possible is maintained and allows the extreme values to be retained without distorting other effect sizes in the distribution.
Each of these methods requires a degree of researcher judgment or subjectivity when determining which values are outliers and when those outliers should be removed. Furthermore, these methods have typically not been developed specifically for meta-analysis. An outlier statistic has been developed for meta-analytic data, called SAMD (sample-adjusted meta-analytic deviancy statistic). Unfortunately, it requires a high number of computations and is typically not appropriate for coefficients that are individually corrected for bias (e.g., measurement error; though in some instances with additional adjustments these can be accounted for). Additionally, recent research has raised critical questions about its potential utility. In particular, when SAMD was used for removing outliers with correlational data, small correlation outliers were more likely to be removed than larger correlation outliers, which introduces yet another bias problem. It is unclear if this problem persists in the examination of outliers for standardized mean difference scores. To date, published recommendations for resolving the asymmetric outlier identification have not yet been explored; therefore more research should be done before this procedure is used.

Interpretation Issues
The interpretation of results in a meta-analysis largely depends on the research question that the meta-analysis was designed to examine. A simple meta-analysis will often seek to determine if a body of literature, when combined, produces significant results. Overall assessments of significance allow interpretations that parallel the individual studies on which the meta-analysis was based. A more sophisticated meta-analysis will examine moderating variables and explore aspects of theories, and interpretations will be couched in underlying theories or new discoveries.
At a more basic level, the index of effect size is the central measure in the meta-analysis. Effect sizes have general rules of thumb for interpretation. Although important, these should merely serve as guidelines and ultimately need to be interpreted within the context of the measured relationships. For instance, Cohen (1977) suggests that effect size magnitudes range from small (d = .20; r = .10; η² = .010) to medium (d = .50; r = .25; η² = .059) and large (d = .80; r = .40; η² = .138). These can also be interpreted as 1%, 6%, and 14% of the variance accounted for, respectively, for small, medium, and large effects. A recent study found that typical effects for public health interventions were closer to 0.5%, 1.0%, and 1.5% of the variance for small, medium, and large effects. These values are dramatically smaller than the rules of thumb set by Cohen due to differences in the context of their development. Similarly, when analyses are dealing with clinically and practically significant results, it is up to the researcher to determine the significance of the findings within their particular context.
Currently, meta-analysis has much growing to do and the field has much to do to embrace and foster that growth. In such an environment, meta-analysts should be particularly thorough in reporting methods used to conduct meta-analyses, in providing data in a variety of forms (e.g., with and without bias corrections or fixed versus random effects modeling), and in avoiding interpretation of results with unfounded confidence (e.g., without sufficient consideration of limitations). These strategies enable other researchers to replicate meta-analytic findings, and help provide the data needed to replicate findings in the face of new, refined, or more advanced techniques while minimizing the risk of unwarranted controversy over a technique that, as it matures, will be more and more important to the field.

Introduction
Effect sizes are the primary index for reporting results in a meta-analysis. Effect sizes provide an estimate of the effect of an independent variable on a dependent variable. In meta-analysis this relationship most often describes a study outcome or treatment effect, though this is by no means the only relationship examined in meta-analytic studies. Because effect size is the central statistic, determining it is essential for meta-analysis.
Although relatively little information is typically needed to compute the most common measures of effect size, the necessary data are not always reported in research articles. Useful summaries of methods for computing effect sizes under a variety of situations have been provided in the literature. For example, to compute Cohen's d or Hedges' g it is necessary to know the means, standard deviations (SD), and sample sizes of the groups that are being compared. If some or all of these data are unavailable, one could estimate effect size from t or F test results as long as the degrees of freedom of the test are known.
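As an illustration of recovering an effect size from test statistics, the sketch below uses the standard conversion for two independent groups, d = t * sqrt(1/n1 + 1/n2), and treats a two-group F as t squared. The function names and example values are hypothetical, and the conversion from F cannot recover the sign of the difference.

```python
import math


def d_from_t(t, n1, n2):
    """Standardized mean difference from an independent-samples t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_f(f, n1, n2):
    """Magnitude of d from a one-way F with two groups (F = t squared).

    The direction of the difference must be taken from the reported means.
    """
    return math.sqrt(f) * math.sqrt(1.0 / n1 + 1.0 / n2)


# Hypothetical example: t(98) = 2.0 with 50 participants per group.
d_example = d_from_t(2.0, 50, 50)
```

With equal groups of 50, a t of 2.0 corresponds to a d of about 0.40; group sizes can usually be inferred from the reported degrees of freedom when they are not given directly.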
A different type of problem occurs when reported data appear to support the calculation of effect size but instead lead to inappropriate or misleading results. This might happen when data are reported as standard scores, such as z (M = 0, SD = 1) or T-scores (M = 50, SD = 10). For example, a previous meta-analysis included a study by Velicer et al. (1985) that used T-scores to describe the benefits of smoking. The effect size comparing smokers and ex-smokers was determined by subtracting T-scores and dividing by 10, resulting in an effect size of 0.43. This procedure appears correct because it is known that the SD of T-scores equals 10. Therefore, dividing the T-score difference by 10 should result in the standardized difference between the means. However, this procedure is sub-optimal because 10 is the total group SD, not the pooled within-group SD used to compute Hedges' g or Cohen's d. Since Velicer et al. (1985) provided group n's and SD's, we were able to compute an effect size based on the pooled within-group SD. This resulted in an effect size of 0.55, nearly 30% larger than the value reported in the meta-analysis.
The within-group variance will be less than the total group variance whenever the grouping variable has a non-zero relationship with the dependent variable, since this relationship reduces within-group variance relative to total variance (i.e., SS_error = SS_total - SS_between). Thus, computing effect size by subtracting T-scores should underestimate Cohen's d or Hedges' g. This study examined this possibility using two meta-analyses that were being conducted, including only studies that provided enough information to compute effect size using both methods. It was predicted that the method of subtracting T-scores would underestimate the effect size based on the pooled within-group SD.
Conceptual Framework for Effect Size. The effect sizes computed for this study represent the cross-sectional "change" in Pros and Cons scores across the stages from Precontemplation to Action. Previous research found that the effect size for Pros and Cons across the stages from Precontemplation to Action was approximately 1 SD for Pros and 0.5 SD for Cons. This relationship was termed the Strong and Weak Principles. The Strong and Weak Principles were re-examined in this study using three different methods of effect size calculation that served as the basis of the effect size comparison. Additionally, correction factors to account for differences between the calculation methods for each of the constructs were considered and a final bias correction was developed.

Method
Two sets of data from an ongoing study were considered for analysis. The primary inclusion criteria for the current study required that articles provide: 1) enough information to compute the pooled SD; and 2) mean T-scores for each group (either in the text, in a table, or in a graph depicting T-scores across groups).

Analyses
Effect Size Calculations. Three methods of determining effect size were used in the analysis: Hedges' g, standard score, and graphical standard score estimation.
Hedges' g requires means, SD's, and n's for each group (either raw scores or T-scores). Hedges' g is the difference between the sample means divided by the pooled sample standard deviation:

$$g = \frac{M_1 - M_2}{S_{pooled}}$$

Both the standard score and the graphical standard score method utilize the same equation to compute effect size. This equation simply requires standard T-scores (M = 50, SD = 10) for each group, and is defined as the difference between the sample means divided by the standard deviation:

$$ES = \frac{T_1 - T_2}{10}$$
The only difference between the standard score and the graphical method is that group means for the standard score method were obtained from the text or a table, whereas the graphical method used scores estimated from a graph.
Estimations of T-scores from graphs were made by using a ruler to measure the point against the y-axis where the T-scores are graphed. The ruler measurements were then used to interpolate T-scores between adjacent tick marks on the y-axis. The scores were interpolated to the nearest hundredth. It should be noted that the graph method was used only when Hedges' g could not be calculated and the actual standardized scores were not available.
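The two calculations compared in this study can be sketched side by side. The sketch follows the definitions given in the text: Hedges' g as the mean difference divided by the pooled within-group SD, and the standard score effect size as the T-score difference divided by 10. The pooled SD uses the standard degrees-of-freedom weighted formula, which is an assumption, as is the omission of any small-sample correction; the function names and example values are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled within-group SD, weighting each variance by its degrees of freedom."""
    numerator = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    return math.sqrt(numerator / (n1 + n2 - 2))


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Mean difference divided by the pooled within-group SD (no small-sample correction)."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def standard_score_es(t1, t2, total_sd=10.0):
    """T-score difference divided by the total-group SD of the T-score metric."""
    return (t1 - t2) / total_sd


# Hypothetical T-score data: group means of 54 and 50, within-group SDs of 8.
g = hedges_g(54.0, 8.0, 100, 50.0, 8.0, 100)
es = standard_score_es(54.0, 50.0)
```

In this made-up example the within-group SD (8) is smaller than the total-group SD (10), so the standard score method yields 0.40 where the pooled SD method yields 0.50, which is the direction of underestimation the study predicts.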

Effect Size Differences. Paired-samples t-tests were conducted between the T-score and graphical methods for each of the constructs. Additionally, paired-samples t-tests were conducted between the standard score approach and the Hedges' g for each of the constructs.

Effect Size Correlations. Correlations were performed between the effect size methods for each of the constructs.

Regression Analyses. Regression analyses were conducted with the Pros and Cons data. Regression analyses were performed including the variables: sample size, standard score approaches (T-score and graphical standard score estimation), Hedges' g and standard score, to assess the best fit. Mean differences in effect sizes between the Hedges' g and the standard score calculations for each study were assessed.
A second set of analyses was run following the removal of outliers. Differences of 0.5 or greater between the Hedges' g scores and the scores estimated via the regression formula were considered outliers and were removed from the analyses.
Correction Formulas. Regression analyses and trendlines for each set of data were compared to identify the best fit for subsequent use as regression correction formulas.
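The outlier rule and refit described above can be sketched with a simple least squares line predicting Hedges' g from the standard score effect size. This is a simplified, single-predictor, single-pass illustration: the function names are hypothetical, the data are made up for demonstration, and the coefficients it produces are not the study's correction formula.

```python
def fit_line(x, y):
    """Ordinary least squares slope and intercept for y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_correction(standard_es, hedges_es, threshold=0.5):
    """Fit, drop datasets whose |observed g - predicted g| >= threshold, then refit."""
    slope, intercept = fit_line(standard_es, hedges_es)
    removed = [i for i, (s, h) in enumerate(zip(standard_es, hedges_es))
               if abs(h - (intercept + slope * s)) >= threshold]
    kept_s = [s for i, s in enumerate(standard_es) if i not in removed]
    kept_h = [h for i, h in enumerate(hedges_es) if i not in removed]
    slope, intercept = fit_line(kept_s, kept_h)
    return slope, intercept, removed


# Made-up illustration: g is 10% larger than the standard score ES, plus one outlier.
standard = [0.2, 0.4, 0.6, 0.8, 1.0, 0.6]
hedges = [0.22, 0.44, 0.66, 0.88, 1.10, 2.50]
slope, intercept, removed = fit_correction(standard, hedges)
```

A corrected estimate for a new standard score effect size would then be `intercept + slope * es`. With these invented numbers the refit recovers a slope of 1.1 once the single discrepant dataset is removed.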

Results
Thirty-eight studies investigating Decisional Balance (Pros and Cons) and satisfying all inclusionary criteria were identified. Some studies examined multiple health behaviors; therefore the total number of datasets was 46 for each of the constructs.
Prior to comparing Hedges' g to the standard score method, the comparability of the two different techniques for obtaining the standard score was examined. This comparison was conducted for the subset of studies that provided information on T-scores both in text and in graphs. Paired-samples t-tests were conducted to test the equivalency of these two approaches. There was no significant difference between using reported T-scores and estimating scores from a graph for Pros, t(24) = 1.201, or Cons, t(24) = -1.160, with correlations of 0.965 and 0.981, respectively. Therefore, these two methods were collapsed. T-scores were estimated from graphs only when the actual standardized scores were not available.
The standard score approach was then compared to the Hedges' g method using paired-samples t-tests. As expected, a significant difference was found for Pros, t(45) = 3.896, p < .05, d = 0.193, and for Cons, t(45) = 3.807, p < .05, d = 0.105. Table 1 presents the means, SD's, and n's for Hedges' g and standard score effect sizes for the two constructs. The correlation between the two effect size methods was nearly the same for the two variables: 0.924 for Pros, 0.944 for Cons.
where ES = standard score effect size. Diagnostics were then assessed, and difference scores were computed between the Hedges' g effect size and the effect size estimated from the regressions reported above. Four sets of data showed differences of 0.5 or greater and therefore were considered outliers. These data were removed from the analyses, including one dataset for Cons (Marcus et al., 1994, exercise) and three for Pros (King et al., 1996, smoking; Jordon, 2002, bulimia; Grimley et al., 1995, condom use - main).
Regressions were then run on the data with the outliers removed. This

Lastly, since the relationships between the Hedges' g effect sizes and the standard score effect sizes revealed similar patterns for the Pros and Cons (see Figure 1 and Figure 2), the two sets of data for these constructs were combined (see Figure 3) and regression analyses were performed. This resulted in the best fit (adjusted R² = 95.5%), with an overall significant relationship (F(4, 41) = 1826.575, p < .01). In the resulting equation, ES = standard score effect size estimate.
Our results showed that Hedges' g consistently provided a larger effect size estimate than the standard score method for both constructs. For Pros and Cons, Hedges' g was approximately 10% greater than the standard score effect size. In fact, not only were there overall differences but the standard score method underestimated effect size for 87% of the data included in the analysis. An implication of this finding is that one might consider using a correction for standard score estimates to better approximate the effect size.
A single correction equation was achieved by combining the data from both constructs that provides a good effect size adjustment. Unfortunately, it is unclear if the same correction will suffice for other constructs. Thus, meta-analysts may need to determine their own correction factor specific to their given measures.
It is clear from the data that the regression line fit small effect sizes better than larger effect sizes; therefore the correction formula is likely to be more accurate for smaller effect sizes. More exploration is needed to understand this relationship.
An important limitation of our study is the use of only two variables. Our preliminary findings suggest that a more extensive examination is warranted in order to identify possible systematic variation in effect size procedures. Simulation studies may provide an effective way to examine more complex patterns and relationships in the data to clarify what factors affect the magnitude of the discrepancy between these two approaches of estimating effect size.
In sum, a reasonably accurate correction factor was achieved by creating a single regression equation for both constructs. Effect sizes based on either the standard score or the graphical standard score methods can subsequently be corrected for bias using the regression equation. In addition to reducing the underestimation in the magnitude of effect sizes based on standard scores, these procedures will also help reduce the number of studies that must be discarded due to insufficient reporting of data on which to compute effect sizes, thus reducing a potential source of bias.

One of the most important and reliable TTM constructs, Decisional Balance, was inspired by Janis and Mann's (1977) conflict model of decision-making. The development of the TTM Decisional Balance measure (Velicer, DiClemente, Prochaska, & Brandenburg, 1985) was based on the 8 factors (4 gains and 4 losses) of Janis and Mann's model. The researchers constructed the scale to study the decision-making process across the Stages for smoking cessation. Instead of achieving an 8-factor structure as anticipated, principal components analysis identified two orthogonal components. These two components were called the Pros and Cons of Smoking.
Following the original study by Velicer et al. (1985), the use of the construct began to expand, encompassing an array of behaviors such as exercise, condom use, and mammography screening. This early work culminated in a paper that looked at patterns in Decisional Balance across Stages in 12 behaviors. This integrative study investigated: 1) the generalizability of the TTM for Stage and Decisional Balance across behaviors; 2) the generalizability of the TTM for a variety of populations; 3) the number of components in Decisional Balance and their respective internal consistencies; 4) patterns of Pros and Cons across Stages; and 5) the Stage of crossover between standardized Pros and Cons scores. More recently, a meta-analysis examining 37 behaviors, utilizing 81 datasets and including nearly 40,000 participants, re-examined these relationships and explored additional ones. Both the integrative study and Hall & Rossi (2004) found clear support for the generalizability of the Stages of Change, the Pros and Cons, and the integration between them. The researchers additionally found that these constructs generalized across a variety of populations and behaviors.

Structure of Decisional Balance
Consistent with the two-factor structure identified by ,  and

Functional Aspects of Decisional Balance
In Kurt Lewin's expectancy theory, it is postulated that behavior changes as a function of the increases and decreases in motivation to contemplate gains and losses. The TTM builds on this notion by suggesting a clear directionality to the function as well as a characteristic way of examining it. The function is based on the relationship of when and how much the Pros increase and the Cons decrease, and it is examined by identifying a graphical crossover of the Pros and Cons as they change relative to each other. More specifically, the crossover is identified when the standardized Decisional Balance scores are graphed by Stage. The two studies reviewed above found that the Decisional Balance crossovers occurred during the Contemplation Stage for 58% and 53% of the studies, respectively. Based on the timing of the crossover, the researchers suggest that progress from Precontemplation to Contemplation involves an increase in Pros, whereas progress from Contemplation to Action involves a decrease in Cons.

Patterns across Stage
Several additional patterns in the relationship between Stage of Change and Decisional Balance were found in the two studies. For example, the Cons of Changing were higher than the Pros of Changing in the Precontemplation Stage for all datasets in both studies. Prochaska found the opposite to be true for 11 out of 12 behaviors in the Action Stage.

Decisional Balance (Pros and Cons) serves as an intermediate indicator of when change will occur and has generally been thought to be especially salient in the earlier Stages of Change. It has also been illustrated that the relationship between the Stages and Decisional Balance for an unhealthy behavior is different than for a healthy behavior. That is, the pattern for an unhealthy behavior was such that the Cons decreased across the Stages, whereas the Pros displayed a curvilinear pattern which paralleled the decline of the Cons in the later Stages. In contrast, the healthy behavior showed more of an X configuration, with the Pros continuing to increase across the Stages whereas the Cons decreased across the Stages.

Strong and Weak Principles
Across twelve studies, mathematical relationships were found between the Pros and Cons of Changing and progress across the early Stages into Action. In a re-examination of the Strong and Weak Principles, the magnitude of the maximum increase in the Pros of change was again found to be greater than the maximum decrease in the Cons of change from Precontemplation to Action across 37 different health behaviors, though clearly the Cons remains weak relative to the Pros. A practical implication of these principles is that the Pros of Changing must increase twice as much as the Cons must decrease, suggesting that an intervention should place twice as much emphasis on raising the benefits as on reducing the costs or barriers.

Stage Transitions
The Strong and Weak Principles are useful for understanding the amount of work generally needed to move from Precontemplation to Action. Although theoretically the principles have been important in conceptualizing and understanding the relationship between Decisional Balance and Stages of Change, these principles are essentially action-oriented when applied to interventions. That is, by examining characteristics of the transition only from Precontemplation to Action, and using these to tailor interventions, one is potentially neglecting three Stages: Contemplation, Preparation and Maintenance. Therefore more practically, the examination of each Stage transition can help identify the most effective strategy for tailoring behavior change interventions.

Research Hypotheses and Predictions
The aims of this study are focused on the two constructs, Stages of Change Precontemplation to Action . Since the Strong and Weak Principles measure the maximum increase and decrease of the Pros and Cons from Precontemplation to Action , rather than an absolute difference , these two principles are "biased ". That is, the principles result in a potential over-estimation of the cumulative effect size across the three transitions PC-C, C-PR, and PR-A.

Prediction 2. The Pros and Cons appear to be most salient in the earlier Stages of Change; therefore it is predicted that the greatest magnitude of effect will be seen in the transition from PC-C for both the Pros and Cons. The transition from PC-C is anticipated to be approximately .5 SD, with the transitions C-PR and PR-A each approximately .2 SD. Finally, since it is believed that the earlier Stages are most salient, and since one would anticipate that more "work" would be needed to move from pre-Action Stages towards Action than from Action to continuing to maintain that Action, the transition from A-M for Pros and Cons is predicted to have the smallest effect size.
Prediction 3. In an examination of the Strong and Weak Principles, the distribution of effect sizes was found to be heterogeneous for both Pros and Cons; therefore it is predicted that the distribution of effect sizes for each Stage transition will also be heterogeneous and that there will be several moderators of the effect size distributions.
3a. Although it is clear that the model generalizes to a variety of behaviors, there are many factors within the characteristics of the behaviors and the studies that may contribute to variation (e.g., similarity of measures within some behaviors but not across behaviors). Therefore, as seen in previous studies, it is anticipated that behavior category will be a moderator variable and that there will be heterogeneity within behaviors, indicating additional moderators.
3b. Studies have shown that patterns of TTM variables vary by age group; therefore it is predicted that mean age of study participants will be a moderator of effect sizes.
3c. A recent study found significant effect size differences between response formats for the Pros and Cons; therefore it is predicted that response format will be a moderating variable.
3d. Publication status has not previously been examined as a potential moderator of effect size distribution. Datasets will be gathered from sources such as dissertations and conference presentations. Since these types of publications are "refereed", although perhaps not as rigorously as peer-reviewed publications, it is anticipated that publication status will not be a moderating variable.

Literature Searches
The datasets for this study were identified through literature searches on computerized databases, PUBMED, Cancerlit, Cinahl, Health and Wellness

There are a variety of ways in which researchers assess Stage of Change. Two main categories of Stage assessment are algorithm-type staging and clustering-type staging. For inclusion in the current study, the Stages must have been assessed by an algorithm or Likert procedure and not by cluster procedures (e.g., URICA, Socrates, or a cluster analysis approach). First, procedures such as the URICA often do not yield traditional Stage categories (e.g., they often include categories called "immotive" and "non-reflective action"). Ensuring consistent interpretation of the profiles would be a complex procedure and would require sufficient reporting of profiles to "re-categorize" them and link them to the Stages or order them from "least ready" to "most ready" to change. Additionally, profiles across studies are inconsistent, which means interpretation would have to be done on a case-by-case basis. Lastly, scoring for cluster approaches, specifically the URICA, is complicated and is sometimes done improperly or inconsistently; therefore raw data and reanalysis would likely be necessary (D. A. Levesque, personal communication, December 17, 2003).

Training of Coders
Lipsey and Wilson (2001)  Subsequently, the coding of variables was reviewed and procedures for the coding process were discussed. Coders practiced coding variables until researchers were comfortable with the coding procedures. Periodic meetings were conducted to review progress and discuss any difficulties encountered. In some cases significant discrepancies occurred between researchers for particular variables. These variables were then redefined and recoded for all studies.

Coding
A coding manual (see Appendix A) was developed during the course of the project to delineate the coding for study design, participant, measurement, publication, and research characteristic variables. The typical rate of recording errors has been reported to be approximately 1%, though this rate can be as high as 48%. In order to minimize transcription error, data were entered directly into an Excel spreadsheet by the researcher and by trained research assistants. To minimize clerical error and reduce subjective bias in coding, all data were checked at least twice. A third researcher reviewed any discrepancies and consensus was reached. Table 2 provides a summary of the primary variables extracted and coded.
The coding manual found in Appendix A provides more detailed descriptions of coded variables. Additionally, key variables are expanded upon below.  ., 1998), the given behaviors were considered separate target behaviors.

Data Reversal
In general, researchers define the Pros and Cons consistently. In some cases, however, the Pros and Cons are inversely defined; that is, the measures focus on the positive aspects of an unhealthy rather than of a healthy behavior (e.g., Borland & Segan, 2000). For example, a study may examine the Pros of smoking (Velicer et al., 1985) rather than the Pros of quitting smoking. In this type of study, the Pros and Cons will be reversed for analytic purposes.

Acquisition versus Cessation
Behavior change can happen in two main ways, acquiring new behaviors (e.g., begin to exercise regularly) or ending an existing behavior (e.g., quit smoking).

Structure
It was predicted that the majority of the studies identified in the meta-analysis would identify or utilize a two-factor structure for Decisional Balance. The dimensional structure of Decisional Balance was determined by identifying results of analyses such as principal components analysis (PCA) or structural equation modeling. The factor structures reported in the studies are summarized descriptively.
Additionally, it was predicted that the two dimensions, Pros and Cons, would not be correlated. The strength of association between the Pros and Cons measures was extracted when either the scale or construct correlation coefficient was reported in the studies.
The internal consistency of the Decisional Balance measures was also examined; for those studies that provided alpha coefficients, the alphas were averaged and their range reported.

Function
The functional relationship between the Stages and Decisional Balance was explored by examining the given graphs or constructing new graphs with Stage plotted on the x-axis and the mean T-scores plotted on the y-axis. It was predicted that the majority of the crossovers would occur during the Contemplation and Preparation Stages. The functional relationships in Decisional Balance were examined descriptively and tabulations of crossovers were conducted.

Magnitude
Effect size magnitude was investigated for each Stage transition: 1) Precontemplation to Contemplation; 2) Contemplation to Preparation; 3) Preparation to Action; 4) Action to Maintenance. Stage transition effect sizes were calculated using Hedges' g, standard score, or graphical standard score estimation.

Calculation Method
The preferred and most accurate method of estimating effect size is the Hedges' g method. Hedges' g requires means, SDs, and n's for each group (either raw scores or T-scores), and is defined as the difference between the sample means divided by the pooled sample standard deviation.

$$ g = \frac{M_1 - M_2}{S_{pooled}} \qquad (7) $$

If sufficient data were not available to compute Hedges' g, one of the standard score methods was used to compute effect size. Both the standard score and the graphical standard score methods require the standard (T) scores (M = 50, SD = 10) for each group, and were calculated by subtracting the sample means and dividing by the standard deviation. The only difference between the standard score and the graphical method was that the sample means for the standard score method were obtained from the text or a table, whereas the graphical method used scores estimated from a graph.
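The two calculation methods described above can be contrasted in a short sketch. This is an illustration only, not the code used in the study: the function names are hypothetical, the pooled standard deviation uses the usual degrees-of-freedom weighting (an assumption, since the text does not spell the formula out), and which Stage is treated as group 1 is left to the caller.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled within-group SD (assumed: usual df-weighted formula)."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Difference between group means divided by the pooled within-group SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def standard_score_es(t1, t2, total_sd=10.0):
    """Standard score method: T-score difference divided by the total-group SD."""
    return (t1 - t2) / total_sd


# Hypothetical data for two adjacent Stages, reported as T-scores.
g = hedges_g(52.0, 9.0, 100, 47.0, 9.0, 100)
es = standard_score_es(52.0, 47.0)
print(round(g, 3), round(es, 3))
```

With these made-up values the within-group SD (9) is smaller than the total-group SD (10), so the standard score estimate is smaller than Hedges' g, which is the direction of bias discussed later in this chapter.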

Graphical Estimation Procedure
Estimations of T-scores from graphs were made by using a ruler to measure the position of each plotted point against the y-axis on which the T-scores are graphed. The ruler measurements were then used to interpolate T-scores between adjacent tick marks on the y-axis.
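The interpolation step can be sketched as simple linear interpolation between two labeled tick marks. The function name, the millimeter units, and the example measurements are all hypothetical; the sketch assumes a linear y-axis.

```python
def interpolate_t_score(point_mm, low_tick_mm, high_tick_mm, low_tick_t, high_tick_t):
    """Linearly interpolate a T-score from ruler measurements.

    point_mm: ruler reading of the plotted point along the y-axis
    low_tick_mm / high_tick_mm: ruler readings of the adjacent tick marks
    low_tick_t / high_tick_t: T-score labels of those tick marks
    """
    fraction = (point_mm - low_tick_mm) / (high_tick_mm - low_tick_mm)
    return low_tick_t + fraction * (high_tick_t - low_tick_t)


# Hypothetical reading: ticks for T = 45 and T = 50 sit at 30 mm and 50 mm,
# and the plotted point sits at 42 mm.
estimated_t = interpolate_t_score(42.0, 30.0, 50.0, 45.0, 50.0)
print(estimated_t)
```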
Chapter Five illustrated that T-scores taken from text or tables and T-scores estimated from a graph for the Pros and Cons measures were very highly correlated, and no significant differences were found between the two methods, thereby demonstrating that the graphical method provides a good estimation of T-scores. It should be noted that the graphical method was used only when Hedges' g could not be calculated and the actual standardized scores were not available in text or table.

Calculation Method Adjustment
When data are reported as standard scores, such as z-scores (M = 0, SD = 1) or T-scores (M = 50, SD = 10), effect sizes are determined by subtracting T-scores and dividing by 10. This procedure appears correct because the SD of T-scores is known to equal 10; dividing the T-score difference by 10 should therefore yield the standardized difference between the means. However, the procedure is suboptimal because 10 is the total-group SD, not the pooled within-group SD used to compute Hedges' g or Cohen's d. In most cases the within-group variance will be less than the total-group variance (i.e., whenever the grouping variable has a non-zero relationship with the dependent variable, since $SS_{error} = SS_{total} - SS_{between}$). Because the divisor is then too large, computing effect size by subtracting T-scores underestimates Hedges' g.
As described in Chapter 5, regression equations specifically designed to correct for this bias in the Pros and Cons of Change were developed. Therefore, effect sizes based on either the standard score or the graphical standard score methods were subsequently corrected for bias using the following equation:

$$ \text{Pros and Cons Correction} = (ES)(1.151) - 0.044 $$

where ES = the standard score effect size estimate.
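As a minimal sketch, the correction amounts to one linear transformation of the standard score estimate. The slope and intercept are the values reported above; the function name is hypothetical, and the text does not state how negative effect sizes (e.g., decreasing Cons) were handled, so the sketch applies the equation literally.

```python
def correct_standard_score_es(es, slope=1.151, intercept=-0.044):
    """Apply the reported Pros and Cons correction to a standard score effect size."""
    return es * slope + intercept


# Hypothetical standard score estimate of 0.50 SD.
print(round(correct_standard_score_es(0.50), 4))
```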
Additional analyses were conducted to verify the effectiveness of this bias correction, including analog ANOVA's. In the event that calculation method appears to moderate the effect size distribution, all cross-sectional data analyses including effect size calculations would be conducted separately for the effect size methods.
Based on previous work , it was anticipated that 70% of the effect sizes would be calculated using Hedges' g, whereas 30% would be calculated using standard scores.
Sample Size Bias. Simple pooling of data occurs when each set of data from a particular study is pooled together without regard to differences in sample size. Under simple pooling, a study that includes 10,000 participants carries the same weight in the analysis as a study with 10 participants. To avoid this problem, all effect sizes were corrected for sample size bias by weighting each data point according to its respective sample size (see Equations 9-11). Another concern regarding sample size is that Hedges' g has been shown to be upwardly biased with small sample sizes, especially those less than 20. Although the overall n's for the datasets were quite large, the effect sizes for the Stage transitions were based on the sample size of a given Stage, so these n's could be reasonably small. Therefore, the obtained effect size was corrected for small sample bias. This correction was computed using Hedges' formula:

$$ ES' = \left[ 1 - \frac{3}{4N - 9} \right] ES \qquad (9) $$

$$ SE = \sqrt{\frac{n_1 + n_2}{n_1 n_2} + \frac{(ES')^2}{2(n_1 + n_2)}} \qquad (10) $$

$$ \omega = \frac{1}{SE^2} = \frac{2 n_1 n_2 (n_1 + n_2)}{2(n_1 + n_2)^2 + n_1 n_2 (ES')^2} \qquad (11) $$

where ES = the observed (uncorrected) effect size estimate, ES' = the corrected effect size estimate, N = the total sample size, SE = the standard error of the corrected effect size estimate, $n_1$ and $n_2$ = the sample sizes of the two groups (adjacent Stages) being compared, and $\omega$ = the inverse variance weight. The inverse variance weight reflects the precision of the effect size estimate, which varies as a function of sample size. This method was used to weight the contribution of each study effect size so that larger studies were given more weight in the calculation of the overall meta-analysis effect size.
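The small-sample correction, standard error, and inverse variance weight can be sketched as three small functions. The formulas are the standard Hedges expressions for a standardized mean difference, which match the symbol definitions given above; the function names and the example sample sizes are hypothetical.

```python
import math


def small_sample_correction(es, n_total):
    """Hedges' small-sample correction: ES' = [1 - 3 / (4N - 9)] * ES."""
    return (1.0 - 3.0 / (4.0 * n_total - 9.0)) * es


def standard_error(es_corrected, n1, n2):
    """Standard error of the corrected effect size for two groups."""
    return math.sqrt((n1 + n2) / (n1 * n2) + es_corrected ** 2 / (2.0 * (n1 + n2)))


def inverse_variance_weight(es_corrected, n1, n2):
    """Inverse variance weight: 1 / SE^2."""
    return 1.0 / standard_error(es_corrected, n1, n2) ** 2


# Hypothetical transition between two adjacent Stages with 15 and 25 people.
n1, n2 = 15, 25
es_corrected = small_sample_correction(0.50, n1 + n2)
weight = inverse_variance_weight(es_corrected, n1, n2)
print(round(es_corrected, 4), round(weight, 3))
```

The correction always shrinks the estimate toward zero, and the shrinkage becomes negligible as N grows, which is why it matters mainly for Stages with few participants.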

Confidence Intervals
Confidence intervals illustrate the precision of a parameter estimate. The confidence interval provides an additional dimension to reporting effect sizes by indicating how "confident" one can be in the measure of magnitude obtained. The researcher has the flexibility to choose the level of probability, and this is done by setting the probability percentage for the confidence interval. For instance, a 95% confidence interval indicates that the confidence interval has a 95% probability of containing the population parameter.
The width of the confidence interval is directly related to: (1) the amount of data used to generate a given effect size; (2) the level of confidence chosen by the researcher; and (3) the computational model used (Borenstein & Rothstein, 1999). The larger the amount of data used, the more precise the measure of effect size and therefore the narrower the confidence interval. Conversely, the greater the confidence level chosen by the researcher, the wider the confidence interval will be.
Lastly, variance across studies that is attributed entirely to random influences is modeled with a fixed effects model. This model allows for confidence intervals whose width can actually reach zero (i.e., no confidence interval but rather a point "estimate"). On the other hand, if both random variance and between-studies variance are assumed, then a random effects model is typically employed. This model tends to produce wider confidence intervals and limits the ability of the confidence interval width to approach zero.
Ninety-five percent confidence intervals were calculated around the mean effect sizes. Based on the standard error of the mean and a critical value from the z-distribution, the confidence intervals were calculated as follows:

$$ SE_{\overline{ES}} = \sqrt{\frac{1}{\sum_{i=1}^{k} \omega_i}} $$

$$ \overline{ES}_{lower} = \overline{ES} - z_{(1-\alpha)} \, SE_{\overline{ES}}, \qquad \overline{ES}_{upper} = \overline{ES} + z_{(1-\alpha)} \, SE_{\overline{ES}} $$

where $SE_{\overline{ES}}$ is the standard error of the effect size mean, $\omega_i$ is the inverse variance weight associated with effect size i (with i = 1 to k effect sizes included in the mean), $\overline{ES}$ is the mean effect size, and $z_{(1-\alpha)}$ is the critical value for the z-distribution (1.96 for $\alpha$ = .05). Confidence intervals were reported for combined effect sizes.
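A minimal sketch of the confidence interval calculation follows. It assumes the mean effect size is the usual inverse-variance weighted mean (the weighted-mean formula itself is not shown in this section), and the function name and data are hypothetical.

```python
import math


def weighted_mean_ci(effect_sizes, weights, z=1.96):
    """Return (mean, lower, upper) for an inverse-variance weighted mean effect size."""
    total_weight = sum(weights)
    mean_es = sum(w * es for es, w in zip(effect_sizes, weights)) / total_weight
    se_mean = math.sqrt(1.0 / total_weight)  # standard error of the weighted mean
    return mean_es, mean_es - z * se_mean, mean_es + z * se_mean


# Hypothetical effect sizes and inverse variance weights for three datasets.
mean_es, lower, upper = weighted_mean_ci([0.2, 0.4, 0.6], [10.0, 20.0, 10.0])
print(round(mean_es, 3), round(lower, 3), round(upper, 3))
```

Because the standard error depends only on the sum of the weights, adding datasets (or larger datasets) narrows the interval, in line with the discussion of interval width above.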

Homogeneity of Effect Size
In meta-analytic studies, variation among effect sizes sometimes occurs due to random error, though oftentimes this variation is statistically larger than one would expect from random or sampling error alone. Tests of homogeneity attempt to distinguish variation due to random error from variation due to systematic differences in study design and participant characteristics.
Ultimately, assessing heterogeneity of effect size is important because the lack of a homogeneous distribution suggests the presence of possible predictors or moderators of the effect size magnitude.
A variety of methods of assessing heterogeneity are available, though the majority are designed for assessing variation in treatment effects with odds ratios and percent-of-variance-accounted-for measures of effect size. Since this study utilized standardized effect size measures, the test of homogeneity employed in this study was based on the Q statistic.
The Q statistic is distributed as a chi-square with k-1 degrees of freedom (k = number of effect sizes) (Lipsey & Wilson, 2000). The homogeneity analysis will be calculated using the equation:

$$ Q = \sum_{i=1}^{k} \omega_i \left( ES_i - \overline{ES} \right)^2 $$

where $ES_i$ is the individual effect size for i = 1 to k (the number of effect sizes), $\overline{ES}$ is the weighted mean effect size, and $\omega_i$ is the individual weight for $ES_i$, calculated using Equation 11 defined above.
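The homogeneity statistic can be sketched as a weighted sum of squared deviations from the weighted mean. The function name and data are hypothetical; the sketch returns Q and its degrees of freedom and leaves the chi-square lookup to a statistics package or table.

```python
def q_statistic(effect_sizes, weights):
    """Return (Q, df): weighted squared deviations around the weighted mean, df = k - 1."""
    total_weight = sum(weights)
    mean_es = sum(w * es for es, w in zip(effect_sizes, weights)) / total_weight
    q = sum(w * (es - mean_es) ** 2 for es, w in zip(effect_sizes, weights))
    return q, len(effect_sizes) - 1


# Hypothetical effect sizes and inverse variance weights.
q, df = q_statistic([0.2, 0.4, 0.6], [10.0, 20.0, 10.0])
print(round(q, 3), df)
```

Q is then compared against the chi-square distribution with df degrees of freedom; a value exceeding the critical value suggests more variation than sampling error alone would produce.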
The Q statistic is most accurate for sample sizes greater than 10 or 20 (Takkouche, Cadarso-Suarez, & Spiegelman, 1999) and increases in accuracy as sample size increases. Low power can result in statistical tests that fail to detect heterogeneity. Since many of the subgroups included fewer than 20 studies (e.g., many of the behavior categories have fewer than 20), a priori subgroup analyses were established to ensure that the primary theoretically-based hypotheses were thoroughly tested.
First, subgroup analyses between behavior categories with more than 5 studies were conducted, as well as additional behavioral subgroup analyses within each of these behavior categories (e.g., smoking cessation vs. smoking acquisition). Second, regardless of identified heterogeneity, subgroup analyses were performed for categorical moderators. Finally, other methods of assessing heterogeneity were also employed; in particular, graphical methods for examining heterogeneity were used.
Forest plots displaying point estimates and confidence intervals for individual studies and summary estimates were created. Excel macros to create forest plots were developed. Forest plots were examined for systematic patterns in study or construct characteristics.
Tests of homogeneity of the distribution of effect size were conducted on all effect sizes for each transition for both the Pros and the Cons. Fixed effects and Random effects models were tested (as described below). Additionally, tests of homogeneity were conducted on all subgroup and moderator analyses.
Due to the variability in the dimensions of each study, it was anticipated that the effect sizes would be heterogeneous. Upon the discovery of heterogeneity, the follow-up tests were conducted initially for calculation method and then subsequently for the remaining potential moderator variables.

Modeling Variance
Two methods of modeling variance , fixed and random effects modeling , were used in this study. The fixed effects model assumes the only source of variance is subject-level sampling error, whereas the random effects model assumes that the source of variance includes subject-level sampling error and study-level sampling error (i.e., random error). The fixed effects model can also include systematic variance due to identifiable moderating variables.
Although a third model, the mixed effects model, exists, it is not employed in this study. Theoretically, a mixed model assumes all three sources of error: sampling error, random error, and systematic error. The mixed effects model is typically used less often than fixed and random effects models due to the complexity of the modeling procedures and the lack of consensus in the field regarding these procedures.
In a fixed effects model , it is assumed that the variance in the effect size distribution is only due to subject-level sampling error. So it is assumed that an effect size from an individual study represents the population effect with only random sampling error associated with chance factors. Tests of homogeneity examine the assumption that error is only due to sampling error and thereby serve as an assessment for whether the fixed effects model holds , though as discussed below , evidence of heterogeneity does not necessarily rule out a fixed effects model.
If one assumes that the fixed effects model holds but additional error is due to a systematic difference in the coded variables, then one can attempt to partition the effect size variance. This assumption can be tested by performing an analog to analysis of variance. This analog ANOVA was used to explore variables by partitioning the total variance Q (Total) into Q (Between) and Q (Within). Q (Between) values were used to test between-group differences by using a chi-square with df = p-1.
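The meta-analytic analog to the analysis of variance for the fixed effects model is calculated as follows:

$$ Q_B = \sum_{j} \omega_j \overline{ES}_j^{\,2} - \frac{\left( \sum_{j} \omega_j \overline{ES}_j \right)^2}{\sum_{j} \omega_j} $$

where $Q_B$ is the between-groups Q, $\overline{ES}_j$ is the weighted mean effect size for each group, $\omega_j$ is the sum of the weights within each group, and j runs from 1 to the number of groups.

$$ Q_W = \sum_{j} \sum_{i} \omega_i \left( ES_i - \overline{ES}_j \right)^2 $$

where $Q_W$ is the pooled Q within groups, $ES_i$ is the individual effect size, $\overline{ES}_j$ is the weighted mean effect size for each group, $\omega_i$ is the weight for each individual effect size, i runs from 1 to the number of effect sizes in a group, and j runs from 1 to the number of groups.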
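The partition of Q (Total) into Q (Between) and Q (Within) can be sketched as follows. This is an illustrative fixed effects computation under the standard formulas, with hypothetical function names and made-up moderator groups; it is not the macro code used in the study.

```python
def weighted_mean(effect_sizes, weights):
    """Inverse-variance weighted mean effect size."""
    return sum(w * es for es, w in zip(effect_sizes, weights)) / sum(weights)


def q_statistic(effect_sizes, weights):
    """Weighted sum of squared deviations around the weighted mean."""
    mean_es = weighted_mean(effect_sizes, weights)
    return sum(w * (es - mean_es) ** 2 for es, w in zip(effect_sizes, weights))


def analog_anova(groups):
    """groups maps a label to (effect_sizes, weights); returns (Q_total, Q_between, Q_within)."""
    all_es = [es for sizes, _ in groups.values() for es in sizes]
    all_w = [w for _, ws in groups.values() for w in ws]
    q_total = q_statistic(all_es, all_w)
    # Q within: pooled homogeneity statistic computed inside each group.
    q_within = sum(q_statistic(sizes, ws) for sizes, ws in groups.values())
    # Q between: computed from group means and group weight sums.
    group_w = [sum(ws) for _, ws in groups.values()]
    group_means = [weighted_mean(sizes, ws) for sizes, ws in groups.values()]
    sum_w_mean = sum(w * m for w, m in zip(group_w, group_means))
    q_between = sum(w * m ** 2 for w, m in zip(group_w, group_means)) - sum_w_mean ** 2 / sum(group_w)
    return q_total, q_between, q_within


# Hypothetical moderator with two levels.
groups = {
    "format_a": ([0.2, 0.4], [10.0, 10.0]),
    "format_b": ([0.6, 0.8], [10.0, 10.0]),
}
q_total, q_between, q_within = analog_anova(groups)
print(round(q_total, 3), round(q_between, 3), round(q_within, 3))
```

Computing Q (Between) directly, rather than as Q (Total) minus Q (Within), gives a built-in check that the two components sum to the total.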
If the fixed effects model does not hold then another model is sought. The random effects model assumes that sampling error is accompanied by other sources of variability randomly distributed . The sampling error is considered to be from the subject-level whereas the random error is an estimate of between-studies variance.
This between-studies error is thought to be analogous to study-level sampling error.
The variance components can be illustrated by the following equation:

$$ v_i^{*} = v_\theta + v_i $$

where $v_i^{*}$ is the total variance associated with effect size i, $v_\theta$ is the random or between-studies component, and $v_i$ is the subject-level sampling error.
Estimating the random effects variance component is the more difficult of the two and can be done using noniterative methods based on the method of moments or iterative methods based on maximum likelihood. The method of moments estimate is calculated using the following formula:

$$ v_\theta = \frac{Q - (k - 1)}{\sum \omega_i - \left( \sum \omega_i^2 / \sum \omega_i \right)} $$

where Q is the value of the homogeneity test, k is the number of effect sizes, and $\omega_i$ is the inverse variance weight for each effect size. The iterative maximum likelihood random effects variance estimate is a more accurate measure and was calculated using SPSS macros written by David Wilson. Additionally, it is suggested that a random effects model is appropriate for studies pooled across populations with pre-existing differences.
Additional justification for the random effects model, in cases where the random effects variance component is found to be significant, comes from the wide range of age groups and recruitment settings from which the populations in the combined studies are drawn.
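The noniterative method of moments estimate can be sketched as below. The function names are hypothetical, truncating negative estimates at zero is a common convention assumed here rather than something stated in the text, and the random effects weights follow the usual approach of adding the between-studies component to each effect size's sampling variance. The maximum likelihood estimate actually used in the study is not reproduced.

```python
def method_of_moments_variance(q, weights):
    """Method of moments estimate of the between-studies variance component."""
    k = len(weights)
    sum_w = sum(weights)
    sum_w_squared = sum(w * w for w in weights)
    estimate = (q - (k - 1)) / (sum_w - sum_w_squared / sum_w)
    return max(0.0, estimate)  # assumption: negative estimates are set to zero


def random_effects_weights(weights, v_theta):
    """Recompute each weight with v_theta added to the sampling variance (1 / w)."""
    return [1.0 / (1.0 / w + v_theta) for w in weights]


# Hypothetical fixed effects weights and homogeneity statistic.
weights = [10.0, 20.0, 10.0]
v_theta = method_of_moments_variance(10.0, weights)
print(round(v_theta, 3), [round(w, 3) for w in random_effects_weights(weights, v_theta)])
```

Adding the between-studies component shrinks and equalizes the weights, which is why random effects confidence intervals tend to be wider than fixed effects intervals.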

Follow-up Comparisons
When conducting follow-up comparisons for the ANOVA analog, a Bonferroni test is sometimes recommended, since typical follow-up tests for ANOVA such as Tukey are not available. A Bonferroni correction would be used for data given a fixed effects model. Since the random effects model is already a conservative approach to examining relationships among the effect sizes, Bonferroni corrections were not used for follow-ups to random effects comparisons.

Publication Bias or the "File Drawer" Problem
Publication bias is always a potential problem when conducting meta-analyses. One of the most concerning problems occurs when a meta-analysis examines intervention effects, since historically there has been a propensity to publish studies with statistically significant positive results. One common method of testing for publication bias is the "fail-safe K" method, in which the impact of any remaining sampling bias is estimated by computing the fail-safe K. This is an estimate of the number of undiscovered studies that, if known, would in aggregate reduce the overall meta-analysis effect size so that it was not statistically significant. Fail-safe K is computed using the following equation:

$$ K_0 = K \left[ \frac{\overline{ES}}{ES_c} - 1 \right] $$

where $K_0$ = the number of studies with an effect size of zero required to reduce the overall effect size to nonsignificance, K = the observed number of studies, $\overline{ES}$ = the observed mean effect size, and $ES_c$ = the minimum effect size deemed significant. In the current study it is not of theoretical importance to test whether the effect size estimates are significantly different from zero; in fact, some effect sizes are anticipated to approach zero. In the context of this study $ES_c$ is not relevant, and therefore the fail-safe K is not an appropriate measure of publication bias.
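For readers unfamiliar with the statistic, the fail-safe K can be sketched in one function, following the symbol definitions given above. The function name and input values are hypothetical and purely illustrative; as noted, the statistic was not applied in this study.

```python
def fail_safe_k(k_observed, mean_es, criterion_es):
    """Number of zero-effect studies needed to pull the mean effect size down to the criterion."""
    return k_observed * (mean_es / criterion_es - 1.0)


# Hypothetical values: 40 studies, mean effect size .50, criterion effect size .10.
print(fail_safe_k(40, 0.50, 0.10))
```

The calculation requires a criterion effect size; when effect sizes near zero are themselves of theoretical interest, no meaningful criterion exists, which is the reason given above for setting the statistic aside.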
Since the current study is not directly examining overall treatment effects a more important bias measure in this context will be the examination of the publication characteristics as potential moderator variables. Therefore, effect sizes for published and unpublished studies will be compared to assess the potential impact of publication bias on effect sizes . Additionally, publication status will be treated as a potential moderator variable and will be subjected to the moderator analyses described above .

Missing Data Procedures.
A variety of strategies have been developed to deal statistically with missing data; unfortunately, very few have been specifically designed for meta-analytic studies. Three types of missing data are found when conducting meta-analyses: 1) entire studies; 2) essential information for computing effect sizes; and 3) characteristic variables.
When data are missing at the study level due to non-publication, this is typically considered publication bias, often called the "file drawer" problem, and is typically assessed using the fail-safe K. Unfortunately, the fail-safe K is not relevant to the current study; therefore the impact of the missing studies cannot be assessed in this way.
Some studies were missing information for calculating effect sizes. When possible, information was extracted from alternative sources. For instance, in one study sample sizes by Stage were not reported, precluding calculation of the pooled variance. In this case, alternative sources of data (i.e., the mean squared error term) were used instead. Additionally, for some studies alternative calculation methods were used to compute effect sizes (as described above) and adjustments were made to minimize potential bias. Otherwise, studies missing essential information to compute effect sizes were excluded from the study.
Data missing from characteristic variables, that is, potential moderator variables can be handled in a variety of ways: complete case analysis, mean substitution, and available case analysis . Available case analysis is the most common procedure used in meta-analytic studies and allows for the most data to be retained thereby increasing power. Available case analysis was used in this study.

Outliers
Since such a wide range of potential moderating variables was likely to be examined and large variations in effect sizes across the transitions were anticipated, only the most extreme effect sizes (2 to 3 standard deviations from the mean) were considered for removal or recoding. First, any identified outliers were scrutinized to assess possible sources of error. Only variables with clear sources of error (that could not be corrected) were considered for removal. Next, effect sizes were compared to the mean effect size, and those two and three standard deviations from the mean were identified. Studies were then coded as containing outliers or not, and crosstabs were conducted with potential moderating variables.

Graphing Techniques
In order to compare results between moderator variables visually, several graphing techniques were employed, including T-score by Stage graphs and forest plots.

T-score by Stage Graphs
In order to create graphical representations in the "classic" T-score by Stage This midpoint then functions as the T-score of 50 (once the score is converted as described below) and is subtracted from all Stage scores. This gives the number of standard deviations above and below the midpoint for each Stage. The scores are then converted to T-scores by multiplying by 10 and adding 50. Finally, these T-scores are plotted by Stage.

Forest Plots
Forest plots provide a visual tool for assessing differences between studies and study characteristics. The primary data provided in a forest plot are the effect size estimate and the confidence interval around that estimate. Typically the means and confidence intervals are plotted around an axis point that is set at zero; this provides a visual aid for assessing significance of the mean. That is, if the confidence interval around the mean does not cross the zero axis, then the mean is significantly different from zero. For treatment effects, a confidence interval that lies entirely above zero would indicate that the mean is significant with respect to the null hypothesis. In this study, the point of interest is the comparative value of an individual mean to the overall mean of each effect size estimate for each transition, since there is no null hypothesis being tested. Therefore, forest plots will utilize the mean effect size estimate of the given transition for the axis rather than zero (i.e., the null).

Power Analysis
The principal question of power for meta-analysis involves not so much whether or not the overall effect size is statistically significant but rather whether there is sufficient power to determine if the effect size distribution is heterogeneous.
Power for the Q test depends on the ratio of between- to within-study variance. It is difficult to know what this ratio might be a priori. As an estimate, a range of .33 to 1.0 has been reported across a large number of meta-analyses, and variance ratios of .33, .67, and 1.00 have been suggested as small, medium, and large degrees of heterogeneity, respectively. The figure below gives power for this range of variance ratios for sample sizes of 10 to 150 studies and alpha = .05. These results suggest that for the overall meta-analysis (across behaviors), power would be excellent for medium and large degrees of heterogeneity and would be very good (at least .80) even for a small degree of heterogeneity, since the sample size is nearly 150. For Q tests within behaviors, the degree of heterogeneity would have to be in the medium to large range for power to be good for the expected sample sizes. Based on the results of a previous study on Decisional Balance, the degree of heterogeneity within behaviors was anticipated to be fairly large, since significant Q test statistics were obtained for each of the five behaviors that were tested individually, with sample sizes ranging from only 9 to 15.
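One common way to express this dependence is sketched below. It rests on a simplifying assumption that is not stated in the text, namely that all k studies share a common within-study variance v; the symbols ρ (the variance ratio), F (the central chi-square distribution function) and c_α (its critical value) are introduced here only for illustration.

```latex
% Assumption: k studies, common within-study variance v,
% between-study variance \tau^2, variance ratio \rho = \tau^2 / v.
Q = \sum_{i=1}^{k} \frac{(T_i - \bar{T})^2}{v},
\qquad
\frac{Q}{1 + \rho} \sim \chi^2_{k-1}

% Power of the Q test at level \alpha, where c_\alpha is the
% (1 - \alpha) quantile of \chi^2_{k-1} and F_{k-1} is its CDF:
\text{Power} = 1 - F_{k-1}\!\left( \frac{c_\alpha}{1 + \rho} \right)
```

Under this sketch, power rises with both the number of studies k and the variance ratio ρ, which is the pattern described above.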

General Characteristics of the Studies
One hundred sixteen studies encompassing 55 different target behaviors (Table 3) were examined in this meta-analysis. Some reports contained multiple studies or multiple behaviors (e.g., Herrick et al., 1997); therefore, a total of 146 cross-sectional datasets were included. One hundred forty-four reported sufficient data for the Pros of behavior change and 144 reported sufficient data for the Cons of behavior change. Approximately 32,000 pieces of data were extracted, coded or computed to serve as the basis for subsequent analyses. The studies utilized three types of sampling procedures: convenience samples (85%), randomly sampled participants (14%), and mixed procedures (1%), that is, a combination of the first two. One study did not provide sufficient information to determine sampling method. The total number of participants for the meta-analysis was 85,272. The sample sizes for the studies ranged from 19 to 21,535 participants. The datasets came from a variety of sources, including peer-reviewed journals (55%), dissertations or theses (21%), manuscripts in preparation (9%), conference presentations (8%), unpublished data (5%), and technical manuals (3%). The author, year of study, target behavior, sample size, description of participants, mean age, age group, sampling method and country for each study can be found in Table 7. Additionally, information regarding each of the studies can be found in Appendix B.

The two-factor structure of the Pros and Cons of Change was maintained for 96% of the datasets. One dataset (Bane et al., 1999) reported 2 Pros and 2 Cons; three others

Crossovers. Crossovers occur when the standardized Pros and Cons scores change position relative to each other. If at one Stage the Cons are higher than the Pros and at the next Stage the Pros are higher than the Cons, then a literal crossing of lines occurs on a graph, hence the name crossover. Therefore, it can be said that the crossover is a function of how much and when the Pros increase and the Cons decrease. In this study, crossovers were determined for all datasets that reported T-scores or graphs based on T-scores; crossovers could be determined for 86% of the datasets.

with quite small mean sample sizes for Precontemplation, Contemplation and Preparation (n = 15, 25, 9, respectively). Following a similar trend, the C-A transition had the largest sample size in Maintenance (n = 83), with notably smaller averages for the previous three Stages (n = 23, 29, 28). Lastly, the studies that displayed multiple crossovers displayed a somewhat unique pattern of mean sample sizes, with the largest mean sample sizes on the polar ends of the Stage distribution (Precontemplation, n = 112 and Maintenance, n = 131) and similar mean sample sizes for the middle three Stages (n = 73, 67, 65). Table 8 displays the mean sample sizes across the Stages of Change for each of the primary crossover transitions.

Effect sizes that were 2 SD or 3 SD from the mean were carefully examined. Once outliers were corrected for clerical or calculation error, preliminary analyses of the outliers were conducted. Between 4 and 9 effect sizes were identified as 2 or 3 SD above the mean for each of the transitions, with a total of 27 for Pros and 21 for Cons. The 20 effect sizes for Pros and 14 for Cons that were greater than 2 SD above the mean (but less than 3 SD above the mean) were examined by Stage transition. Essentially all of these effect sizes showed clear clusters.
For example, 7 effect sizes were identified as 2 SD above the mean for the Pros PC-C transition; four of these effect sizes (including Schumann et al., 2003; Banikarim et al., 2003) were clustered between -.3 and -.4, and three (including Callaghan et al., 2002) were clustered between 1.6 and 1.8. Interestingly, the three studies clustered between 1.6 and 1.8 examined exercise, which is a behavior previously found to have, overall, the largest effect sizes across the Stages and in particular for the PC-C transition.

Therefore, since essentially all of these effect sizes clustered together, often in a theoretically predictable manner, the standard of 2 SD above the mean was considered insufficient for identifying outliers in this study. Further analysis was then conducted.
Effect sizes found to be 3 SD or more above the mean were examined for possible removal. In total 14 effect sizes were identified, 7 for Pros and 7 for Cons.
These outliers were examined to assess possible relationships with various potential moderator variables. No significant relationships were found between the outliers and most of the moderating variables, including behavior category, cessation versus acquisition behaviors, age, calculation method, and publication status. Significant differences were found for recruitment setting (χ² = 15.002, df = 6, p < .020). In particular, a large portion of the outliers were studies that recruited participants from clinic settings. Analysis revealed that these datasets typically were small in sample size. Although all datasets included in this study had a sample size of 19 or greater, the total sample size for each study is divided across the Stages of Change in a given dataset, so some of these studies were found to have remarkably small sample sizes per Stage.
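The screening logic of this section can be sketched in a few lines: effect sizes are flagged by their distance from the mean in SD units, and a flagged value is removed only if it is at least 3 SD out and also rests on a very small Stage sample or sits apart from the nearest cluster. The function names, the use of the sample standard deviation, and the example data are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def sd_distances(effect_sizes):
    """Return each effect size's distance from the mean in SD units."""
    m = mean(effect_sizes)
    s = stdev(effect_sizes)  # sample SD; the study's exact choice is not stated
    return [(es - m) / s for es in effect_sizes]


def should_remove(z, min_stage_n, gap_to_nearest_cluster):
    """Apply the removal rule: 3 SD AND (tiny Stage n OR isolated value)."""
    extreme = abs(z) >= 3
    tiny_stage = min_stage_n < 5
    isolated = gap_to_nearest_cluster > 0.2
    return extreme and (tiny_stage or isolated)


# Hypothetical effect sizes: nine similar values and one extreme value.
data = [0.1] * 9 + [2.0]
z_scores = sd_distances(data)
flag_2sd = [abs(z) >= 2 for z in z_scores]
flag_3sd = [abs(z) >= 3 for z in z_scores]
print(flag_2sd, flag_3sd)
```

In this example the extreme value is flagged at 2 SD but not at 3 SD, so under the rule sketched here it would be examined but retained.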
In the end, the outlier analyses provided the basis for three criteria to be used to determine the removal of outliers: (1) effect sizes must be at least 3 SD from the mean for a given Stage; and (2) effect sizes must be based on Stage sample sizes of less than 5; and/or (3) effect sizes must be more than .2 from the next closest effect size cluster. Therefore, if an effect size was 3 SD above the mean and met at least one of the other two criteria, it was removed. Of the 14 outliers identified as 3 SD above the mean, six Pros and six Cons were removed and one Pros and one Cons were retained (see Table 10 below).

The largest ES average was found for the Pros of Change, specifically in the transition between the Precontemplation and Contemplation Stages. This effect size was found to be significantly greater than zero for both the fixed and random effects models, with a mean of .63 (p < .01; CI = .61 to .66) for the fixed effects model and a mean of .65 (p < .01; CI = .60 to .70) for the random effects model. The random effects variance component and overall test for heterogeneity were found to be significant (v = .06; Q (df = 135) = 504.60, p < .01).
A smaller increase was seen in the transition from Contemplation to Preparation for the Pros of Change, with significant effects for the fixed effects (M = .14, p < .01; CI = .11 to .17) and random effects models (M = .17, p < .01; CI = .12 to .21). The random effects variance component and overall test for heterogeneity for this transition were found to be significant (v = .03; Q (df = 121) = 275.01, p < .01). Again, significant (p < .01) but smaller increases were seen for the Pros of Change for the transition from Preparation to Action for both models. A mean of .12 was found for both models, though with a narrower confidence interval for the fixed effects (CI = .08 to .16) than for the random effects (CI = .06 to .18) model. The random effects variance component and overall test for heterogeneity were found to be significant (v = .04; Q (df = 99) = 206.65, p < .01). Essentially no effect was found from Action to Maintenance. The fixed effects (M = .02, p = .17; CI = -.01 to .06) and random effects (M = .02, p = .60; CI = -.05 to .08) means were not found to be significantly different from zero. As with the other transitions, the random effects variance component and overall test for heterogeneity were found to be significant (v = .06; Q (df = 107) = 290.01, p < .01).
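The quantities reported for each transition (fixed and random effects means, the variance component v, and the Q statistic) can be illustrated with a minimal sketch. It assumes inverse-variance weighting and a method-of-moments (DerSimonian-Laird type) estimate of the variance component; the estimator actually used in this study is not stated in this section, and the function name `pool_effects` and the input values are hypothetical.

```python
import math


def pool_effects(effects, variances):
    """Pool effect sizes under fixed and random effects models."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed effects weights
    sum_w = sum(w)
    fixed_mean = sum(wi * es for wi, es in zip(w, effects)) / sum_w

    # Q: weighted sum of squared deviations from the fixed effects mean.
    q = sum(wi * (es - fixed_mean) ** 2 for wi, es in zip(w, effects))
    df = k - 1

    # Method-of-moments variance component (called v in the text).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random effects weights add the variance component to each variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random_mean = sum(wi * es for wi, es in zip(w_re, effects)) / sum_w_re

    return {
        "fixed_mean": fixed_mean,
        "fixed_se": math.sqrt(1.0 / sum_w),
        "random_mean": random_mean,
        "random_se": math.sqrt(1.0 / sum_w_re),
        "q": q,
        "df": df,
        "tau2": tau2,
    }


# Two hypothetical effect sizes with equal sampling variances.
result = pool_effects([0.2, 0.6], [0.01, 0.01])
print(result)
```

A 95% confidence interval follows as the mean plus or minus 1.96 standard errors. Because the random effects standard error includes the variance component, its interval is at least as wide as the fixed effects interval, which matches the pattern in the results above.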
The Cons of Change showed somewhat of an inverse pattern across transitions, though with overall smaller effect sizes. The smallest effect for the Cons of Change was found for the transition from Precontemplation to Contemplation.
Small, but relatively larger decreases were found for the Cons from

A summary of all transitions for Pros and Cons of Change for both fixed and random effects models is found in Table 11.

Calculation Method. In order to ensure that the regression correction developed in Chapter 5 for calculation method was effective, the first potential moderator assessed was calculation method. As described previously, effect sizes were calculated by two primary methods: Hedges' g and Standard Score. Seventy percent of the datasets were calculated using Hedges' g and 30% were based on the Standard Score method, which was adjusted using the regression correction. No significant differences were found between the two methods using random effects modeling. The between-group tests of homogeneity for calculation methods for the Pros and Cons of Change across Stage transitions are reported in Table 13. Additionally, using random effects modeling, no within-group tests of heterogeneity were found to be significant. The sample sizes, mean effect sizes and 95% confidence intervals are presented in Table 14.
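For reference, the conventional form of Hedges' g for a single Stage transition can be sketched as below. The direction of the difference (later Stage minus earlier Stage), the function name `hedges_g`, and the example values are assumptions made for illustration; the Standard Score method and the regression correction from Chapter 5 are not shown in this section and are therefore not reproduced.

```python
import math


def hedges_g(mean_early, sd_early, n_early, mean_late, sd_late, n_late):
    """Standardized mean difference between adjacent Stages.

    Uses the pooled standard deviation and the usual small-sample
    correction factor J = 1 - 3 / (4 * df - 1).
    """
    df = n_early + n_late - 2
    pooled_var = ((n_early - 1) * sd_early ** 2 +
                  (n_late - 1) * sd_late ** 2) / df
    d = (mean_late - mean_early) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * df - 1)
    return correction * d


# Hypothetical Pros scores for two adjacent Stages.
g = hedges_g(50.0, 10.0, 20, 55.0, 10.0, 20)
print(round(g, 3))
```

The correction factor shrinks the uncorrected difference slightly, and more so when the Stage samples are small, which is relevant here because the total sample of each dataset is divided across the Stages.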

Magnitudes within Behaviors.
Since heterogeneity was found overall and no significant differences were found for calculation method, behavior categories were explored as possible moderator variables.
Many of the 55 target behaviors included in this study are represented by only one dataset, thereby precluding meaningful subgroup analyses. Thirty-three of the target behaviors had data for each of the transitions for both Pros and Cons. For descriptive purposes, the magnitudes of effects for each of the transitions for these target behaviors are depicted in composite graphics and displayed in Appendix C.
Primary behavior categories were created (see Appendix D) in order to group together studies that examined similar behaviors. In total, 10 behavior categories included 3 or more datasets: contraception, condom use, diet, drugs, exercise, medical screening, organ donation, smoking, sex decisions, and stress.
Mean effect sizes, standard errors, 95% confidence intervals, and p-values for behavior categories with 3 or more datasets can be found in Table 15 and Table 16.

The reference line in each of the forest plots represents the overall random effects mean for all the behaviors combined (see Table 11). The mean and confidence interval for each of the behaviors are then plotted in comparison to the overall mean.
The forest plots show that the Pros for exercise are significantly greater than the overall mean in the PC-C and PR-A transitions. The Pros for smoking were found to be significantly less than the overall mean for PR-A, and the Pros of medical screening were found to be significantly less than the overall mean for the A-M transition. The Cons for diet were found to be significantly greater than the overall mean for the PC-C transition, and the Cons for exercise were found to be significantly less than the overall mean for the C-PR transition.
Within-group tests of homogeneity for each of the five major behavior categories were calculated (Table 17).

Smoking Cessation versus Smoking Acquisition
The behavior category of smoking was made up primarily of smoking cessation and smoking acquisition. Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for smoking cessation and acquisition were computed (Table 18) and composite graphics were created (Figure 8 and Figure 9).
In order to determine potential differences between these behaviors, an ANOVA analog using random effects modeling was performed on the two groups. No significant differences were found for the Pros of Change. Significant differences were found between smoking cessation and smoking acquisition for the Cons of Change in the PC-C transition (Q (df = 1) = 8.50, p < .01) and in the PR-A transition (Q (df = 1) = 7.18, p < .01). Due to a lack of data for smoking acquisition in the A-M transition, differences between smoking cessation and smoking acquisition could not be computed for that transition.
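The between-group statistic of an ANOVA analog can be sketched as follows. The sketch assumes that each group is summarized by a mean effect size and a standard error that already reflect the random effects weights; the function names and input values are hypothetical, and the p-value helper covers only the two-group case (df = 1).

```python
import math


def q_between(group_means, group_ses):
    """Between-group Q: weighted squared deviations of group means
    from the weighted grand mean. Returns (Q, df)."""
    weights = [1.0 / se ** 2 for se in group_ses]
    grand_mean = (sum(w * m for w, m in zip(weights, group_means))
                  / sum(weights))
    q = sum(w * (m - grand_mean) ** 2
            for w, m in zip(weights, group_means))
    return q, len(group_means) - 1


def p_value_df1(q):
    """Upper-tail probability of a chi-square with 1 df.

    For 1 df, P(chi-square > q) equals erfc(sqrt(q / 2)).
    """
    return math.erfc(math.sqrt(q / 2.0))


# Two hypothetical groups with equal standard errors.
q, df = q_between([0.1, 0.5], [0.1, 0.1])
print(q, df, p_value_df1(q))
```

Q is referred to a chi-square distribution with one fewer degree of freedom than the number of groups, which is why the two-group comparisons in this chapter are reported with df = 1 and the three-group comparisons with df = 2.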

Smoking Cessation - Pros and Cons of Smoking versus Quitting
The behavior category of smoking cessation was made up of studies that (Table 19) and composite graphics were created (Figure 10 and Figure 11). In order to determine potential differences between these behaviors, an ANOVA analog using random effects modeling was performed on the two groups.

Condom Use - General, Main, Other
The behavior category of condom use was made up primarily of studies that examined either condom use in general or, more specifically, condom use with a main partner or other partner. Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for these three subsets of condom use were computed (Table 20) and composite graphics were created (Figure 12, Figure 13 and Figure 14). In order to determine potential differences between these behavioral subsets, an ANOVA analog using random effects modeling was performed on the three groups.

Dietary Fat Reduction versus Fruit and Vegetable Consumption
The behavior category of diet was made up primarily of studies that examined either dietary fat reduction or fruit and vegetable consumption. Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for dietary fat reduction and fruit and vegetable consumption separately were computed (Table 21) and composite graphics were created (Figure 15 and Figure 16). In order to determine potential differences between these two behavioral subsets, an ANOVA analog using random effects modeling was performed. No significant differences were found between these two groups for the Pros or Cons of Change for any of the transitions.
Note. N = sample size; ES = mean effect size; L = lower bound; U = upper bound; p = significance of effect as compared to zero; * indicates overall significant differences between groups at p < .05.

Additional Moderators of Effect Size Distribution
Since overall effect sizes were heterogeneous and only a few instances of heterogeneity were found for Stage transitions within behaviors, additional moderators of effect size distribution were explored.

Age Group - Adolescents, College, and Adults
Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for age group were computed (Table 22) and composite graphics were created (Figure 17, Figure 18 and Figure 19). Differences among age groups were assessed for the Pros and Cons effect sizes across each of the four Stage transitions using random effects modeling. No significant differences were found between adolescents, college students and adults for any of the transitions for Pros.
Significant differences for Cons of Change were only found for the PC-C transition (Q (df= 2) = 13.54, p < .01). Follow-up analogs showed that Cons of Change were significantly greater for adolescents than college students (p < .01) and adults (p < .01).

Cessation versus Acquisition
Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for cessation and acquisition behaviors were computed (Table 23)

Healthy versus Unhealthy
Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for healthy and unhealthy behaviors were computed (Table 24) and composite graphics were created (Figure 22 and Figure 23). Effect sizes for healthy and unhealthy behaviors were compared across the four Stage transitions using random effects modeling. Mean effect sizes for healthy behaviors were significantly greater than those for unhealthy behaviors for the Pros of Change in the PR-A transition (Q (df = 1) = 9.08, p = .003) and the A-M transition (Q (df = 1) = 5.82, p = .007). No significant differences were found between healthy and unhealthy behaviors for the Cons of Change.

Response Format
Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for the two response formats were computed (Table 25) and composite graphics were created (Figure 24 and Figure 25). Effect sizes for Decisional Balance measures using "How Important" and "Agree/Disagree" Likert scales were compared across the four Stage transitions using random effects modeling. No significant differences were found between response formats for the Pros of Change. The mean effect sizes for the "Agree/Disagree" format were significantly greater than those for the "How Important" format for the Cons of Change in the PR-A transition (Q (df = 1) = 3.99, p = .046).

Country - US versus Non-US
Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for US and non-US studies were computed (Table 26) and composite graphics were created (Figure 26 and Figure 27). In order to determine potential differences between these subgroups, an ANOVA analog using random effects modeling was performed. No significant differences were found for the Pros or Cons of Change across the Stage transitions between studies conducted in the US and those conducted outside the US.

Language - English versus Non-English
Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for studies conducted in English and in languages other than English were computed (Table 27) and composite graphics were created (Figure 28 and Figure 29). In order to determine potential differences between these subgroups, an ANOVA analog using random effects modeling was performed. No significant differences were found for the Pros or Cons of Change across the Stage transitions between studies conducted in English and those conducted in a language other than English.

Publication Status
Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for peer-reviewed journals and non-peer-reviewed sources were computed (Table 28) and composite graphics were created (Figure 30 and Figure 31). Effect sizes for datasets from peer-reviewed journals and from non-peer-reviewed sources were compared across the four Stage transitions using random effects modeling. No significant differences in effect sizes were found between data from peer-reviewed journals and non-peer-reviewed sources for the Pros of Change. Mean effect sizes for peer-reviewed journals were significantly greater than those for non-peer-reviewed sources for the Cons of Change in the PR-A transition (Q (df = 1) = 4.10, p = .043).

Actual versus Reversed
In some cases, the Pros and Cons are inversely defined; that is, the measures focus on the positive aspects of an unhealthy rather than of a healthy behavior (e.g., Borland & Segan, 2000). For example, a study may examine the Pros of smoking rather than the Pros of quitting smoking. Actual and reversed refer to the way in which the data were used in the Pros and Cons analyses. Effect size means, sample sizes, standard error, 95% confidence intervals, and p-values for actual and reversed behaviors were computed (Table 29) and composite graphics were created (Figure 32 and Figure 33). Effect sizes for the behaviors with Pros and Cons reported in the "actual" direction and behaviors with the measures reversed were compared across the four Stage transitions using random effects modeling. For the Pros of Change in the PR-A transition, the mean effect size for the "actual" measures was significantly greater than the mean for the "reversed" measures (Q (df = 1) = 5.73, p = .02). For the Cons of Change, the "reversed" measures showed a larger decrease than the "actual" measures for the A-M transition.

Overview of Moderators
For descriptive and summary purposes, the mean effect sizes for each of the primary moderator variables are listed in Table 30, Overview of Effect Sizes by Moderator Variable.

Discussion
In examining Decisional Balance and Stages of Change for 55 different behaviors across a variety of populations, several common characteristics were found. The structure, function and generalizability of the Stages of Change and Decisional Balance measures, as well as the magnitude of effect and homogeneity of the effect size distributions across Stage transitions for the Pros and Cons, are discussed.

Structure
The internal validity of the two-factor structure of Pros and Cons of Change identified by Velicer et al. (1985) was maintained for 96% of the datasets, which is consistent with the 92% and 94% found in previous studies. Although a large percentage of the studies support a much simpler configuration than has been posited elsewhere, it should be noted that it is unclear whether the measurement development in each study included a sufficient variety of items to test this comparison effectively.
One hundred twelve datasets included alpha coefficients as a measure of internal consistency. For Pros the average coefficient alpha was .82 (SD = .09), with a range from .58 to .96, whereas the average coefficient alpha for Cons was a bit lower at .76 (SD = .10) and ranged from .41 to .94. The internal consistency measures found in this study are slightly lower overall, with wider ranges, than those of previous studies. In comparison, previous studies found measures of internal consistency ranging from .75 to .95 and from .41 to .95, whereas the current study found an overall range of .41 to .96. The similarity between the current study and the latter of these is somewhat surprising since the current study has nearly twice as many studies. It is likely that future studies will find internal consistencies well within this range.

Additionally, many of the studies have large numbers of participants in one Stage and not another, and these unequal distributions of sample size across the Stages may "pull" on the crossover, resulting in a less stable pattern. Preliminary examinations of the use of effect sizes to graph crossovers reveal that this procedure could help resolve this problem, especially in the context of meta-analytic studies that rely on secondary data. Although not pursued in the present study, further exploration of the ability of the effect size graphic procedure to correct this problem is needed. In particular, a comparison study of various weighted transformations of crossover graphs with the procedure used in this study is warranted. It appears that the use of weighting procedures may help elucidate the apparent crossover biases.
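The crossover definition used in this study (the Stage transition at which the standardized Pros and Cons change position relative to each other) can be sketched as a simple sign-change check. The function name `find_crossovers`, the Stage abbreviations as list labels, and the example T-scores are hypothetical; exact ties between Pros and Cons are not treated as crossings in this sketch.

```python
STAGES = ["PC", "C", "PR", "A", "M"]


def find_crossovers(pros, cons, stages=STAGES):
    """Return the Stage transitions at which Pros and Cons swap order.

    pros, cons: T-scores by Stage, in the order given by `stages`.
    """
    crossovers = []
    for i in range(len(stages) - 1):
        before = pros[i] - cons[i]
        after = pros[i + 1] - cons[i + 1]
        # A strict sign change in (Pros - Cons) marks a crossing of lines.
        if before * after < 0:
            crossovers.append(f"{stages[i]}-{stages[i + 1]}")
    return crossovers


# Hypothetical T-scores: Cons start higher, Pros overtake by Preparation.
pros_t = [40, 48, 52, 55, 55]
cons_t = [55, 52, 50, 47, 46]
print(find_crossovers(pros_t, cons_t))
```

A dataset with more than one entry in the returned list would correspond to the multiple-crossover pattern described in the results. Because the check uses unweighted Stage means, it inherits the sensitivity to unequal Stage sample sizes discussed above.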

Modeling Variance
Random and fixed effects models were fit to the effect size distributions for the Pros and Cons of Change across the Stage transitions. Statistically, for each of these analyses on the total sample, the random effect variance component was found to be significant. The significant random effects components for each of the Stage transitions for the Pros and Cons support the use of the random effects model.
Theoretically, due to the large variation in populations across the 146 datasets, a random effects modeling approach was warranted. Another important distinction between fixed and random effects models is the overall inferential goal. A fixed effects model aims to make inferences regarding the particular observed parameters, whereas the random effects model aims to make inferences regarding the distribution of effects. Therefore, the random effects model enables greater generalization beyond the observed studies, which is an important goal of this study.
The decision to use a fixed versus random effects model is an important one, and was so especially in the context of this study. Table 31 and Table 32 show the pattern of results for the primary moderators using fixed effects versus random effects modeling. Between-group heterogeneity is displayed for the major moderator categories (in bold) and within-group heterogeneity is displayed for the subgroups (non-bolded text). It is clear that the pattern of effects is distinct between the two models. A much larger percentage of heterogeneity is identified with the fixed effects model than with the random effects model. Overall, this pattern of effects is as predicted.
Increased standard error in the random effects model produces wider confidence intervals; therefore, in the context of hypothesis testing, there is a greater likelihood of retaining the null with random effects relative to fixed effects. In sum, a random effects model is theoretically most consistent with the inferential goals of this study and was statistically supported.

Magnitude of Effects
The largest effect size average was found for the Pros of Change, specifically in the transition between the Precontemplation and Contemplation Stages. The magnitude of this effect was .65 standard deviations. Additionally, smaller increases were seen in transitions from Contemplation to Preparation (Hedges' g = .17) and Preparation to Action (g = .12), with essentially no effect found from Action to Maintenance (g = .02). The Cons of Change showed an inverse pattern across transitions, though with overall smaller effect sizes. Essentially no effect was found from Precontemplation to Contemplation (g = -.08), with relatively small decreases in Cons from Contemplation to Preparation (g = -.15), Preparation to Action (g = - Precontemplation to Action, whereas the Weak principle is approximately a ½ standard deviation or .62 SD (as found by decrease in the Cons of Change as one progresses from Precontemplation to Action).
Since the Strong and Weak Principles measure the maximum increase and decrease of the Pros and Cons from Precontemplation to Action, rather than an absolute difference, these two principles are "biased". That is, the principles result in a potential over-estimation of the cumulative effect size across the three transitions PC-C, C-PR, and PR-A. Therefore it was predicted that (PC-C) + (C-PR) + (PR-A) < 1 SD (standard deviation) for Pros and (PC-C) + (C-PR) + (PR-A) < .62 SD for Cons. Based on the overall effect size estimates across the Stage transitions, this prediction was not entirely supported. The total cumulative effect size for Pros = .96 and the cumulative effect for Cons = .70. Although the total for the Pros of Change was just slightly below 1 SD, the Cons of Change was actually greater than .62 SD.
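The cumulative comparison for the Pros can be checked with simple arithmetic on the rounded random effects means reported above. Summing those rounded values gives .94, slightly below the .96 stated; the difference may reflect rounding or the use of unrounded estimates, which is an assumption here. The Cons are omitted because the PR-A estimate for Cons is not available in this passage.

```python
# Rounded random effects means for the Pros of Change, as reported above.
pros_transitions = {"PC-C": 0.65, "C-PR": 0.17, "PR-A": 0.12}

# Cumulative effect from Precontemplation to Action.
pros_cumulative = round(sum(pros_transitions.values()), 2)

# Predicted upper bound under the Strong Principle: 1 SD.
strong_principle_bound = 1.0

print(pros_cumulative, pros_cumulative < strong_principle_bound)
```

Either figure (.94 or .96) falls just under the 1 SD bound, so the conclusion drawn for the Pros does not depend on which is used.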
In general, use of effect sizes to assess the magnitude of Stage transitions can provide guidance in the development of tailored interventions by suggesting how much intervention resources should be allocated for each construct at each Stage transition. But most importantly, in order to best utilize effect size estimates for intervention purposes, detailed analyses of moderators of heterogeneity are essential for identifying where and when these effect sizes may vary.

Heterogeneity
As predicted, the overall effect size distributions for each of the Stage transitions for both the Pros and Cons were found to be heterogeneous. This warranted investigation of potential moderators of effect size distribution. Graphical and statistical approaches were used to explore moderators.

Calculation Method
No significant differences were found between the two primary calculation methods. This finding provides additional support for the effectiveness of the adjustment for calculation method developed in Chapter 5. The regression adjustment for calculation method enabled 30% of the studies to be retained in this meta-analysis that otherwise would have been excluded. Overall, this increased the power of the study and allowed more thorough moderator analyses to be conducted.

Target Behaviors and Behavior Categories
The overall heterogeneity of the distribution (found for the effect sizes for both the Pros and Cons for each of the Stage transitions) signified that modifiers of magnitude were likely present across the datasets. The most apparent potential moderator of effect size was thought to be behavior.
Due to the large number of target behaviors represented by only one or two studies, analyses by target behavior were limited. Additionally, an important consideration for the interpretation of magnitude across moderators, in relation to the small target behavior sample sizes, is the potential for differential impact of unequal representation of moderators. For example, in this meta-analysis 21 datasets investigated exercise, yet only one study examined calcium intake. Exercise studies were generally found to have greater magnitudes than other behaviors, specifically in the PC-C and PR-A transitions (see Figure 6), and therefore a disproportionate number of exercise studies in an alternate moderator analysis, such as publication status, could create an upward bias in a given subset for these transitions. Therefore, unequal representation of moderators such as behavior may influence the differences observed on another dimension of the constructs.
Behavior categories were created to aggregate similar behaviors and thereby enhance the ability to conduct sub-group analyses. Preliminary descriptive analyses were performed on the ten behavior categories with more than 3 datasets. In general, the mean effect sizes for the behavior categories did not vary greatly from the overall means (for all 146 datasets) for the Pros and Cons of each Stage transition. Fifty percent of the mean effect sizes for the behavior categories varied less than .1 SD from their respective Stage transition mean. No mean effect size differed by more than .42 SD from the mean for the given Stage transition. The largest difference in means between any two behavior categories was .65 SD. These relationships can be seen graphically in the forest plots above (see Figure 6).
Five behavior categories had sample sizes of greater than five. These five behaviors (condom use, diet, exercise, smoking, stress) were included in analog ANOVA analyses to assess differences among the behaviors. Significant differences between behavior categories were found only for the Pros of Change. These differences were in the PC-C and PR-A transitions. Exercise was found to be significantly greater than each of the four other behavior categories for PC-C and significantly greater than smoking and diet in the PR-A transition. Interestingly, previous research found exercise to have the largest magnitude of effect from Precontemplation to Action for the Pros of Change in comparison to other behavior categories examined (specifically diet, smoking, and condom use). It appears that instead of having larger effects across all Stage transitions, the differences for exercise are concentrated in the PC-C and PR-A transitions. One important consideration is the potential difference in algorithms found in the exercise studies. Marcus et al. (1992) used a measure called the Contemplation Ladder. This measure was subsequently used by other researchers.
Future studies should compare effect size differences between these two types of algorithms. Lastly, condom use was found to be significantly greater than smoking and stress in the PR-A transition for the Pros.
Within-group heterogeneity was assessed for each of the behavior categories to determine whether variation existed within the categories. Of the 45 heterogeneity tests, only 2 were found to be significant: the distribution of effect sizes for the C-PR transition of exercise and the PR-A transition for smoking. This indicates that the behavior categories are largely homogeneous and that few moderators are likely present within them.
Although it is beyond the scope of this study to conduct full moderator analyses of each of these behaviors, a priori sub-group analyses were conducted and a variety of comparisons explored. First, since the smoking behavior category was primarily made up of smoking acquisition and smoking cessation, these two behaviors were compared. Smoking acquisition and cessation were in fact found to differ in the PR-A Stage transition. Interestingly, differences were also found between these two behaviors in the PC-C Stage transition. In efforts to understand these differences, general differences between these behaviors were considered. For instance, all smoking acquisition studies were conducted with adolescent samples, so comparisons between age groups within the smoking category were not possible. In the larger study context, when all studies were combined, no differences were found between age groups; therefore it is believed that age is not moderating this effect.
Clearly more research is needed to understand the influence of age, since age is nested in these analyses.
Unlike smoking cessation, smoking acquisition is a prevention behavior. That is, by examining behavior change in smoking acquisition one is looking at ... More simply, the overall pattern of differences between smoking acquisition and smoking cessation was consistent with the differences found between acquisition (e.g., exercise) and cessation (e.g., quitting drugs) behaviors across all the datasets. As with smoking, comparisons revealed significant differences for the PC-C and PR-A transitions.
Smoking cessation was examined more closely in efforts to determine ...
Another a priori behavioral comparison involved the behavior category of condom use. Condom use comprised three primary sub-groups: condom use in general, with main partner, and with "other" partner. Condom use in general and condom use with main partner were found to be significantly greater than condom use with "other" partner for the C-PR transition for the Pros. Interestingly, this difference creates a composite profile for condom use with "other" partner that is similar to that of infrequently or "yearly" performed behaviors (as seen in Figure 36). In these profiles the Pros and Cons appear to have essentially the same T-scores and show essentially no change from C-PR. This similarity of pattern could be attributed to the fact that condom use with an "other" or secondary partner may be as infrequent as other "infrequent" behaviors.
The final a priori behavioral comparison was between the two diet subgroups, dietary fat reduction and fruit and vegetable consumption. No significant differences between these two groups were found.
In addition to the a priori behavioral comparisons and the moderators discussed above, several other moderators were explored. Velicer et al. (2000) illustrated that the relationship between the Stages and Decisional Balance for an unhealthy behavior is different than for a healthy behavior. That is, the pattern for an unhealthy behavior was such that the Cons decreased across the Stages whereas the Pros displayed a curvilinear pattern, which paralleled the decline of the Cons in the later Stages. In contrast, healthy behaviors showed more of an X configuration, with the Pros continuing to increase across the Stages whereas the Cons decreased. This finding was replicated in the current study. Unhealthy behaviors showed a significant decrease in the Pros of Change in the PR-A and A-M transitions, producing the curvilinear pattern that Velicer et al. (2000) describe.
Two response formats were used for the Pros and Cons measures in the majority of studies. Significant differences between the formats were found only for the Pros of Change in the PR-A transition. No significant differences were found for the Cons of Change. It is unclear what this difference indicates, and it warrants further exploration. Unfortunately, the majority of the studies utilized the How Important format, and this imbalance may have influenced the degree of significance found between the two.

Generalizability
Overall, the generalizability of the Transtheoretical Model across a variety of populations was supported with this large set of studies examining Stage of Change, Pros and Cons, and the relationship between them. A variety of moderators were explored but most notably, no significant differences in effect size distributions were found across the Stage transitions for country or language. These homogeneous findings lend important credence to the generalizability of the observed constructs.
Although an attempt was made to examine demographic variables within studies of a particular behavior, these analyses were plagued by small sample sizes. Therefore, as more studies emerge, it will be necessary to continue to look at the Transtheoretical constructs for particular demographic variables across behaviors. The populations varied greatly across studies, and although many subject characteristics were examined as moderator variables, some of these sample characteristics (e.g., gender) are best examined using longitudinal techniques and therefore interpretation should be made with caution. It is important that population characteristics continue to be examined to make a more definitive statement of generalizability. Earlier work originally found support for the strong generalizability of the TTM for Stage of Change and Pros and Cons across 12 behaviors. Subsequent work provided additional support for the generalizability of these constructs across 37 behaviors. The current study continues to provide support for the generalizability of the model for a growing number of behaviors, 55 behaviors in total. Although there is strong support on the whole for the generalizability of the Transtheoretical Model and the integration of the Decisional Balance measures across the Stages of Change, some study characteristics should be examined more closely to assess potential limitations of the model or necessary adaptations of the constructs. For instance, when averaged, behaviors identified as infrequent or "yearly" (e.g., mammography) seemed to show an overall lack of change between C-PR (see Figure 36) for both the Pros and Cons. In this case, the typical staging (e.g., time frame) for Contemplation and Preparation may not be sufficient.
It will be important in the future to assess the relationships of other TTM constructs across the Stages for moderators such as frequency of behavior (i.e., "yearly") in order to establish whether these relationships persist before making clear statements regarding generalizability. That is, if this pattern of essentially no change for the Pros and Cons from Contemplation to Preparation for behaviors that are performed "yearly" is also found for other constructs, such as Self-efficacy and Processes, this would raise concern about the generalizability of the Contemplation and Preparation Stages for behaviors that occur infrequently. Further research is needed to identify such patterns more clearly. Table 33 provides an overview of the between-group heterogeneity across the moderator variables. Although overall generalizability for the model was found, it is clear that a variety of characteristics can influence the degree of change in the Pros and Cons during a particular Stage of the change process. These results provide a guide for where future studies can focus in order to establish more definitive relationships between variables and more fine-grained analyses of these nuances.

Table 33
Between Groups Heterogeneity for Primary Moderators

In sum, the results suggest interventions may provide the greatest impact with maximum efficiency by placing the most emphasis on increasing the Pros during the Precontemplation Stage, followed by additional efforts during the Contemplation and Preparation Stages. In contrast, efforts to decrease the Cons are most likely to be useful during the Preparation Stage with continued emphasis into the Action Stage.
Contrary to prediction, these results are only partly consistent with previous studies, which argue that Decisional Balance is especially salient in the earlier Stages of Change. Instead it appears that Pros of ...
Although a large number of studies overall were included in this meta-analysis, small sample sizes often precluded meaningful analyses of moderators. Continued efforts to gather studies to increase these sample sizes should be made.
The primary analyses conducted in the study used ANOVA analogs to compare moderator subgroups. In addition to ANOVA analogs, meta-regression can be used to examine the association between effect size and study characteristics.
Meta-regression is a more sophisticated approach to assessing moderators when heterogeneity is observed. Future studies should examine moderators using meta-regression in order to determine which moderators contribute to the variance in the effect size distributions.
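As an illustration of the ANOVA analog referred to above, the following minimal sketch partitions the total heterogeneity statistic Q into within-group and between-group components under a fixed-effect, inverse-variance weighting scheme. The function names, the data layout, and the example values are hypothetical and are not taken from the study; the exact weighting used in the original analyses is not stated here, so this should be read as one common formulation rather than the procedure actually applied.

```python
from typing import Dict, List, Tuple


def weighted_mean_and_q(effects: List[float],
                        variances: List[float]) -> Tuple[float, float, float]:
    """Return (inverse-variance weighted mean, Q, sum of weights) for one set of effect sizes."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total_w
    q = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    return mean, q, total_w


def analog_anova(groups: Dict[str, Tuple[List[float], List[float]]]) -> Dict[str, float]:
    """Partition total Q into within-group and between-group parts (fixed-effect model).

    `groups` maps a moderator level (e.g., a behavior category) to a pair of
    lists: effect sizes and their sampling variances.
    """
    all_effects = [e for effects, _ in groups.values() for e in effects]
    all_vars = [v for _, variances in groups.values() for v in variances]
    _, q_total, _ = weighted_mean_and_q(all_effects, all_vars)
    q_within = sum(weighted_mean_and_q(es, vs)[1] for es, vs in groups.values())
    return {
        "q_total": q_total,
        "q_within": q_within,
        "q_between": q_total - q_within,  # tested against chi-square with df_between
        "df_between": len(groups) - 1,
    }


# Hypothetical example values, for illustration only.
example = analog_anova({
    "exercise": ([1.0, 1.0], [0.1, 0.1]),
    "smoking": ([0.5, 0.5], [0.1, 0.1]),
})
```

A significant between-group Q indicates that the moderator levels differ in mean effect size; a non-significant within-group Q indicates that each level is homogeneous.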

All datasets utilized cross-sectional data. Therefore, although the effect sizes captured the increases and decreases across the Stage transitions between participants in different Stages, it cannot be assumed that the same relationships will necessarily occur when examining the movement of participants from one Stage to another with longitudinal data. It is therefore important that these results be compared with an examination of these relationships using longitudinal data.

Introduction
The primary goal of examining cross-sectional data is to use readily available or easily attainable data to explore theoretical relationships that explain a phenomenon across a temporal dimension. It is anticipated that cross-sectional ... Cons) and Self-efficacy (Situational Confidence or Temptation). Additionally, the TTM explains behavior change strategies through ten Processes of Change.
A previous study examining the TTM (Prochaska, Velicer, Guadagnoli, Rossi, & DiClemente, 1991) has demonstrated the ability of longitudinal studies to validate cross-sectional results. To date, no study has investigated the ability of longitudinal data to validate cross-sectional results in the context of TTM Stage transitions.
Additionally, there are currently no established techniques for examining the effect sizes of Stage transitions in longitudinal data; this study was therefore treated as an exploratory examination of longitudinal datasets. Previously examined cross-sectional relationships will also be compared to these longitudinal results.

Stages of Change
In health psychology, the concept of change occurring in a series of stages has been examined in efforts to understand the temporal aspects of change in human nature or behavior. An example of the stage concept can be seen in Horn's (1976; Horn & Waingrow, 1966) work with smoking behavior and cessation. Horn developed a four-stage process of change for smoking behavior, which consisted of 1) contemplation of change; 2) the decision to change; 3) short-term change; and 4) long-term change. Although arising independently and in a different context, the Stages of Change as conceptualized by the TTM are similar to Horn's stages. Over the years, through its own evolution, the TTM ultimately identified five Stages (DiClemente, ...).
The algorithms for the five Stages of Change are specific to each behavior, but usually follow these general Stage concepts. Participants are considered to be in the Precontemplation Stage if they report an undesired status, that is, the presence of a problem behavior or the lack of a healthy one, and express no intention of changing in the next six months. Participants are considered to be in the Contemplation Stage if they intend to change in the next six months. Participants in the Preparation Stage plan to change in the next month and have begun to engage in target behaviors, but have not yet met particular criteria. Participants reach the Action Stage once they have met the given behavioral criteria. Lastly, if participants have met the specified behavioral criteria for more than six months, they have reached the Maintenance Stage.
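The general Stage concepts above can be sketched as a simple decision rule. This is an illustration only: the function name, the argument names, and the handling of participants who intend to change within a month but have not yet begun the behavior (assigned here to Contemplation) are assumptions, and actual staging algorithms are specific to each behavior.

```python
def classify_stage(meets_criteria: bool,
                   months_at_criteria: float,
                   intends_within_6_months: bool,
                   intends_within_1_month: bool,
                   has_begun_behavior: bool) -> str:
    """Assign one of the five Stages of Change from generic staging responses.

    All argument names are hypothetical; real algorithms define the behavioral
    criteria and time frames separately for each target behavior.
    """
    if meets_criteria:
        # Criteria met for more than six months -> Maintenance, otherwise Action.
        return "Maintenance" if months_at_criteria > 6 else "Action"
    if intends_within_1_month and has_begun_behavior:
        return "Preparation"
    if intends_within_6_months or intends_within_1_month:
        # Assumption: intention without initial behavior is treated as Contemplation.
        return "Contemplation"
    return "Precontemplation"


# Example: intends to change within six months, no steps taken yet.
stage = classify_stage(False, 0, True, False, False)
```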

Decisional Balance
The TTM originated by integrating theories of psychotherapy as well as incorporating constructs from alternative models. One of the most important and reliable TTM constructs, Decisional Balance, was inspired by Janis and Mann's (1977) conflict model of decision-making. The development of the TTM Decisional Balance measure (Velicer, DiClemente, Prochaska, & Brandenburg, 1985) was based on the 8 factors (4 gains and 4 losses) of Janis and Mann's model. The researchers constructed the scale to study the decision-making process across the Stages for smoking cessation. Instead of achieving an 8-factor structure as anticipated, principal components analysis identified two orthogonal components. These two components were called the Pros and Cons of Smoking.
Following the original study by Velicer et al. (1985), the use of the construct began to expand, encompassing an array of behaviors such as exercise, condom use, and mammography screening. This early work culminated in a paper that looked at patterns in Decisional Balance across Stages in 12 behaviors.

Strong and Weak Principles
Across twelve studies, mathematical relationships were found between the pros and cons of changing and progress across the early Stages into Action. In a re-examination of the Strong and Weak Principles, the magnitude of the maximum increase in the Pros of Change was again found to be greater than the maximum decrease in the Cons of Change from Precontemplation to Action across 37 different health behaviors. Consistent with Prochaska's (1994) Strong Principle, the average effect size for the pros was approximately one standard deviation (d = 1.05, SD = .45), almost identical to the original finding (d = 1.06, SD = .26). The findings also revealed that Prochaska's Weak Principle might not be so weak. That is, the average effect size for the cons was stronger (d = .62, SD = .38) than was found in the previous study (d = .45, SD = .22), though clearly the cons remain weak relative to the pros. A practical implication of these principles is that the pros of changing must increase twice as much as the cons must decrease, suggesting that an intervention should place twice as much emphasis on raising the benefits as on reducing the costs or barriers.

Stage Transitions
... across 18 countries utilizing measures in more than 10 languages. The largest effect size was found for the Pros (see Table 1), specifically in the earliest transition, Precontemplation to Contemplation (PC-C). Smaller increases were seen in the following two Stage transitions, Contemplation-Preparation (C-PR) and Preparation-Action (PR-A), and essentially no effect was found in the Action-Maintenance (A-M) transition.
These remarkable relationships between the Pros and Cons and the Stages of Change have been examined cross-sectionally. The cross-sectional relationships are an essential part of this program of study and clearly provide important insights into the theoretical model and its practical applications. In efforts to continue to expand this research, to broaden its generalizability, and to strengthen its potential for application, it is important to begin to explore these relationships in a longitudinal context. The current study aims to begin to establish methods for exploring these relationships and preliminarily compares cross-sectional data with longitudinal data in order to help establish how well cross-sectional data can predict longitudinal relationships in the context of Stage transitions.

Research Hypotheses and Predictions
Hypothesis 1. There is a relationship between the cross-sectional and longitudinal Stage transition effect sizes for Decisional Balance.
Prediction 1. A previous study  demonstrated that cross-sectional and longitudinal data examining patterns of change were comparable.
It is predicted that the magnitudes of effect for the cross-sectional and longitudinal data will show similar patterns for Decisional Balance across Stage transitions.
Prediction 2. It is anticipated that the effect size index for the longitudinal data will produce larger effect sizes than the effect size index for independent groups (used for the cross-sectional data) due to the dependency in the longitudinal data. Therefore it is predicted that the longitudinal Stage transition effect sizes will be larger than the cross-sectional Stage transition effect sizes.
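For reference, the independent-groups index can be sketched as follows. The sketch uses the standard form of Hedges' g (the pooled-standard-deviation mean difference multiplied by a small-sample correction); the function and argument names are hypothetical, and it is an assumption that adjacent Stage groups were compared in exactly this way in the cross-sectional analyses. The index for dependent (longitudinal) data is not reproduced here.

```python
import math


def hedges_g(mean_earlier: float, sd_earlier: float, n_earlier: int,
             mean_later: float, sd_later: float, n_later: int) -> float:
    """Hedges' g for two independent groups (e.g., adjacent Stages).

    A positive value means the later Stage group scores higher
    than the earlier Stage group.
    """
    df = n_earlier + n_later - 2
    pooled_sd = math.sqrt(
        ((n_earlier - 1) * sd_earlier ** 2 + (n_later - 1) * sd_later ** 2) / df
    )
    d = (mean_later - mean_earlier) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # small-sample bias correction
    return correction * d


# Hypothetical example: Pros scores for two adjacent Stage groups.
g_example = hedges_g(50.0, 10.0, 50, 55.0, 10.0, 50)
```

Because the correction factor approaches 1 as the group sizes grow, g and the uncorrected standardized mean difference are nearly identical for large samples.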

Procedure
Since there are no established techniques for examining the effect size of

A Population-based, Stage-based Expert System Intervention for Smoking Cessation: A random-digit dialing procedure was used to recruit a representative sample of smokers from Rhode Island.
A total of 32,456 calls were conducted, 14,266 participants were identified as eligible, and 12,109 agreed to complete a preliminary phone survey. Of these, 4209 were smokers and 7813 were non-smokers. A total of 4144 participants were included. These participants were randomly assigned to intervention or assessment only. The intervention group received Expert System materials in the mail, including a feedback report and stage-matched self-help manuals. The assessment only group was assessed at 6-month intervals. Participants in the intervention group received progress questionnaires at 3 and 6 months. Phone surveys were given to participants who did not respond via mail within 2 weeks. All participants were assessed by mail or phone at 12, 18, and 24 months.

Impact of Simultaneous Stage-matched Expert System Interventions for Smoking, High Fat Diet and Sun Exposure on a Population of Parents (Prochaska, Velicer, Rossi, Redding, Greene, Rossi, Sun, Fava, LaForge, & Plummer, in press): Schools participating in an ongoing health promotion study provided a list of parents of 9th graders. Initial screening identified 3507 potential households. A total of 2931 respondents were contacted by phone, with one parent recruited from each eligible household. Eligibility required participants to be at risk for at least one of the three health risk behaviors (sun, smoking, and diet). Four ... Participants were randomly assigned to one of two groups, Intervention (N = 1209) or Assessment Only (N = 1251). Participants in both groups were administered follow-up assessments at 12 and 24 months.

Randomized Controlled Community Trial of the Efficacy of Multi-Component Stage-Matched Intervention to Increase Sun Protection among Beachgoers (Weinstock, Redding, Rossi, & Maddock, 2001): Seven coastal beaches in Rhode Island were selected for the study. Participants were randomly assigned to intervention or control conditions. Follow-up assessments were conducted for all study participants at 2, 12, and 24 months after baseline by mail or telephone. Study retention rates were 83% of baseline at 2 months (N = 1930), 70% of baseline at 12 months (N = 1628), and 62% of baseline at 24 months (N = 1449). Participants completed a survey on the beach with a trained interviewer. Ages of the participants ranged from 16 to 65, with an average age of 33 years. The sample was primarily female (60%), white (94%), single (51%) or married (40%), with at least a high school education (88%), and with a median income of $45,000-65,000 per year.

Analysis
The longitudinal data were examined by behavior with treatment and control groups combined. Data were analyzed separately by behavior for each study sample.
First, behaviors were examined separately by adjacent time points (e.g., baseline to 6 months, 6 months to 12 months). Then the longitudinal data were combined across time points by behavior separately for each sample.

Stage Transition Membership. Stage transition membership was established by identifying subjects that moved from one Stage to the next from one time interval to the next. Figure 38 ... where Sgain = the standard deviation of the mean of the difference between the two time points, and r = the correlation of the two means.

Finally, the cross-sectional and longitudinal Stage transition effect sizes for
Pros and Cons will be descriptively compared.
T-score by Stage Graphs. In ... This midpoint then functions as the T-score of 50 (once the score is converted as described below) and is subtracted from all Stage scores. This gives the number of standard deviations above or below the midpoint for each Stage. The scores are then converted to T-scores by multiplying by 10 and adding 50. Finally, these T-scores are plotted by Stage.
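The conversion described above can be sketched as follows. Only the final steps (subtract the midpoint, multiply by 10, add 50) come from the text; how the Stage scores and the midpoint were obtained is not recoverable from the passage. The sketch therefore assumes, purely for illustration, that Stage scores in standard deviation units are built by accumulating transition effect sizes starting from zero at Precontemplation, and that the midpoint defaults to halfway between the lowest and highest Stage score. The function name and both assumptions are hypothetical.

```python
from itertools import accumulate
from typing import List, Optional


def stage_t_scores(transition_effects: List[float],
                   midpoint: Optional[float] = None) -> List[float]:
    """Convert Stage transition effect sizes (SD units) into T-scores by Stage.

    Assumption: Stage scores are the running sum of the transition effects,
    with the first Stage fixed at 0.
    Assumption: if no midpoint is supplied, it is taken as halfway between
    the lowest and highest Stage score.
    """
    stage_scores = [0.0] + list(accumulate(transition_effects))
    if midpoint is None:
        midpoint = (min(stage_scores) + max(stage_scores)) / 2
    # Center on the midpoint, then rescale: T = 50 + 10 * (SD units from midpoint).
    return [50 + 10 * (score - midpoint) for score in stage_scores]


# Hypothetical effect sizes for PC-C, C-PR, PR-A, and A-M.
pros_t_scores = stage_t_scores([0.5, 0.3, 0.2, 0.0])
```

Passing an explicit `midpoint` allows a different centering rule to be substituted without changing the rescaling step.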

Results
Overview. Two behaviors were examined across three longitudinal datasets with a total of 8534 participants. Multiple time points were examined across the behaviors for a total of 17 sets of comparisons. A total of 136 effect sizes were computed using Hedges' g.

Smoking Cessation - "Random Digit Dial" Sample
Mean effect size estimates of the Pros and Cons for each of the Stage transitions for smoking cessation with the "random digit-dial" (RDD) sample across adjacent time points were calculated and are displayed in Table 35. Across the Stages from baseline to six months, small increases were seen from PC-C as well as C-PR for the Pros of Change. A larger magnitude is seen in the opposite direction ...
... Table 37. Across the Stages from baseline to six months, a large increase was found in the PC-C transition with a smaller increase found for the C-PR transition, followed again by a large increase in the PR-A transition. For the Cons of Change, small increases were seen from PC-C as well as C-PR. A large decrease was found for the PR-A transition. As with the RDD sample, because only participants in the pre-Action Stages (Precontemplation, Contemplation, and Preparation) were recruited into the original study, no participants were available for the A-M transition for the baseline to 6-month time points. A graphical depiction of these transitions can be seen in Figure 44.
For time points 6 months to 12 months for the Pros of Change, essentially no change occurred from PC-C. A small increase was found from C-PR, followed by a large increase for the PR-A transition. Finally, a large decrease was seen for the A-M transition. For the Cons of Change, small decreases were found for the PC-C and C-

Smoking Cessation - All Time Points Combined for Parent Sample
Participants in each of the four transition groups were combined across time points in order to increase sample size for each group. Once combined, mean effect sizes and 95% confidence intervals for each transition for the Pros and Cons were computed (Table 38). The largest increase for Pros was found for the PC-C transition followed by a small increase in C-PR. The magnitude of effect decreased for the PR-A transition and continued to decrease in A-M. Essentially no change was found in the PC-C and C-PR transitions for the Cons of Change, followed by a large decrease from PR-A. Finally, a small increase was found from A-M. These transitions for Pros and Cons are graphically depicted in Figure 47.
... Table 39. Across the Stages from baseline to two months, essentially no change was seen from PC-C. A large increase was found for C-PR for the Pros of Change, with smaller increases from PR-A and A-M. For the Cons of Change, a small decrease was found in the PC-C transition with essentially no change found for the C-PR transition. From PR-A a small increase was found, followed by a moderate decrease. A graphical depiction of these transitions can be seen in Figure 48.
The transitions for time points 2 months to 12 months for the Pros of Change showed quite a different pattern than baseline to 2 months, with a small increase in the PC-C transition followed by a small decrease in the Pros of Change. The last two transitions, PR-A and A-M, showed essentially no change. In contrast, a small increase for the Cons of Change was found for the PC-C transition. The Cons then decreased slightly in C-PR, followed by a small increase in PR-A. Finally, essentially no change was seen from A-M. The magnitudes of effect for each of the transitions for Pros and Cons are illustrated in Figure 49.
In the final time points, 12 months to 24 months, for the Pros of Change a moderate increase was found for the PC-C transition. Due to a sample size of one, no effect size could be calculated for the transition from C-PR. Essentially no change occurred in the Pros of Change for the transitions from A-M. A large decrease was found for the transition from PC-C for the Cons of Change. Again, no data were available for the transition from C-PR. Finally, a small decrease was seen in PR-A, followed by essentially no change for A-M. Due to missing data for the transition from C-PR, these changes are not illustrated.

Sun Protection - All Time Points Combined for Beach Sample
Participants in each of the four transition groups were combined across time points in order to increase sample size for each group. Once combined, mean effect sizes and 95% confidence intervals for each transition for the Pros and Cons were computed (Table 40). Although it appears that small changes occurred for many of the Stage transitions, confidence intervals indicate that the only significant change occurred for the Pros in the PR-A transition. This change was a small increase in the Pros. The magnitudes of effect for each of the transitions for Pros and Cons are graphically depicted in Figure 50.
... Table 41. Across time points baseline to 6 months for the Pros of Change, a moderate decrease was found in the PC-C transition with essentially no change for the C-PR transition. Lastly, a slight increase occurred in the PR-A transition. For the Cons of Change, small decreases were seen from PC-C whereas essentially no change occurred from C-PR. A small increase in magnitude of effect was found from PR-A. As mentioned previously, because only participants in the pre-Action Stages (Precontemplation, Contemplation, and Preparation) were recruited into the original study, no participants were available for the A-M transition for the baseline to 6-month time points. A graphical depiction of the magnitudes of effect for the Stage transitions can be seen in Figure 51.
Across time points 6 months to 12 months for the Pros of Change, a small increase was found in the PC-C transition, followed by a moderate increase in C-PR.

Discussion
Two behaviors, sun protection and smoking cessation, were examined in this study. Two different samples were examined for each of the behaviors. Smoking cessation was examined in a sample of adults who were randomly contacted by phone using a random digit dial procedure, as well as in a sample of parents who were identified in connection with a larger health behavior study with adolescents in schools. This same parent sample was analyzed separately for sun protection. Lastly, sun protection was examined in a sample of beachgoers.
... (e.g., baseline to 6 months with 6 months to 12 months) participants may be included in more than one time point and therefore contribute more than one set of data. This type of dependency in the combined data was not accounted for in the cross-sequential analyses that combined sets of time points. Therefore, although the combined data analyses have the advantage of larger sample size, the results should be interpreted with caution due to the dependency in the data. More research is needed to understand the impact of this dependency, and methods need to be developed to account for it.
As predicted, the overall effect sizes appear larger for the longitudinal study ... size measure, making it difficult to rely on the effect size patterns. Additionally, the effect size measure used in the cross-sectional study is calculated using a different procedure. It is unclear how this difference affects the comparison of the two types of data. More research is essential in order to resolve these important issues.
Despite these caveats, composite profiles for each of the behaviors are somewhat consistent with the cross-sectional data. The cross-sectional profile of the ...
This investigation serves as a very preliminary look into the comparison of cross-sectional and longitudinal data. These preliminary examinations found that overall the cross-sectional and longitudinal data were reasonably comparable, with the earlier Stages looking more similar between the two than the later Stages. This study is the first step to understanding the utility of cross-sectional data for ... Such an endeavor is likely to increase the effectiveness and utility of an already successful model.

APPENDICES
Appendix A

Acquisition
Identify the target behavior according to whether it is described as ending a current behavior (cessation, e.g., quitting smoking) or starting a new one (acquisition, e.g., bone density testing). Code using the following: 1 = acquisition 2 = cessation

Healthy, Unhealthy, Prevention or Non-Health related
Code the target behavior according to whether the behavior is rooted in a healthy (e.g., exercise) or unhealthy (e.g., binge drinking) behavior. If a behavior does not examine a health related behavior (administrative change) code accordingly. Code using the following: 1 = healthy 2 = unhealthy 3 = non-health related
Behaviors that examine prevention behaviors should be coded separately. Prevention of a behavior typically involves individuals who are likely not to be engaging in or exhibiting a particular behavior, but may or may not in the future (e.g., smoking acquisition, depression prevention). Code using the following: = prevention

Frequency
Identify the frequency with which the behavior is generally performed. A behavior is coded as daily/regularly if the behavior is generally performed daily or multiple times a week (e.g., flossing teeth, exercise). Situational describes the frequency of behaviors performed when the situation arises (e.g., condom use). If a behavior is typically performed once a year (e.g., STD screening) or less (e.g., radon testing) it should be coded as yearly. Code into one of the following categories: 1 = daily/regularly 2 = situational 3 = yearly+

Sample Descriptors

Sample Description
Enter the population description as described in the article/study.

Percent Female
Enter the percentage of female participants in each study as stated in the article . If not reported specifically and sufficient data is available, calculate percentage of females.

Percent Male
Enter the percentage of male participants in each study as stated in the article. If not reported specifically and sufficient data is available, calculate percentage of males.

Mean Age
Report the mean age for the entire sample as directly stated in the article. If not reported specifically and sufficient data is available, calculate the mean age.

Age Group
Enter the age group of the participants. This will be based on the description of the sample, as well as the setting of the study. Code age group into one of the following categories: 1 = adolescents 2 = college 3 = adults 4 = mixed

Educational Level
Report the mean years of education based on the entire sample. It should be entered exactly as it is reported in the text.

Sampling Method
Enter the way in which the sample of participants in each study was selected.

Stage Variables and Descriptors
Cross-Over Stages
Enter the Stages between which the Pros and Cons scores graphically cross. This must be based on T-score values. If T-scores are not reported in a ...

Response Format Category
The two most typical response formats for Decisional Balance are Likert scales that ask participants to indicate either "how important" the given items are to them or the degree to which they agree or disagree with each of the items. Code response format into one of the following categories:
1 = "How Important"
2 = "Agree/Disagree"
3 = other

Number of Pros Items
Enter the number of pros items indicated by the author that were used in the final Decisional Balance measure.

Number of Cons Items
Enter the number of cons items indicated by the author that were used in the final Decisional Balance measure.
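The Cross-Over Stages variable described earlier in this section asks for the pair of adjacent Stages between which the Pros and Cons T-scores cross. One way to locate that pair is to look for a sign change in the Pros minus Cons difference. The sketch below is an assumption about how a coder might automate the check, not the procedure used in the study; the function name and the T-score values are hypothetical.

```python
STAGES = ["Precontemplation", "Contemplation", "Preparation", "Action", "Maintenance"]


def crossover_stages(pros_t, cons_t):
    """Return the first adjacent stage pair where (Pros - Cons) changes sign.

    pros_t and cons_t are T-scores ordered by stage. Returns None when the
    two profiles never cross, which a coder might record as "NR".
    """
    if len(pros_t) != len(STAGES) or len(cons_t) != len(STAGES):
        raise ValueError("Expected one T-score per stage")
    diffs = [p - c for p, c in zip(pros_t, cons_t)]
    for i in range(len(diffs) - 1):
        before, after = diffs[i], diffs[i + 1]
        # A crossing occurs when the difference moves from one side of zero to the other.
        if (before < 0 <= after) or (before > 0 >= after):
            return STAGES[i], STAGES[i + 1]
    return None


# Hypothetical T-scores, for illustration only.
example = crossover_stages([42, 49, 52, 54, 53], [56, 53, 50, 46, 45])
```

With these illustrative values the Pros profile overtakes the Cons profile between Contemplation and Preparation.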

Alpha Pros
Enter the coefficient alpha for the Pros scale calculated specifically for the study. Do not report alphas that were reported based on previous studies.

Alpha Cons
Enter the coefficient alpha for the Cons scale calculated specifically for the study. Do not report alphas that were reported based on previous studies.

Note. Any data that cannot be identified will be indicated in the database by "NR" (not reported).

Precontemplation: Not seriously considering quitting within the next 6 months
Contemplation: Seriously considering quitting within the next 6 months, however they were not considering quitting within the next 30 days, had not made a quit attempt of 24 hours in past year, or both
Preparation: Seriously considering quitting in the next 6 months and were planning to quit within the next 30 days, in addition they made a 24 hour quit attempt in past year
Action: Has quit smoking for under 6 months
Maintenance: Has quit smoking for at least 6 months

Precontemplation: Not intending to quit smoking in the next six months
Contemplation: Intending to quit smoking in the next six months but not intending to quit smoking in the next month
Preparation: Intending to quit smoking in the next month and tried to quit in the past year
Action: Not reported
Maintenance: Not reported

Precontemplation: Reported smoking at least one cigarette per day during the past 7 days and not seriously considering quitting within the next 6 months
Contemplation: Seriously considering quitting smoking within the next 6 months or considering quitting within the next 30 days, but had not made any quit attempts
Preparation: Planning to quit within the next 30 days and had made at least one 24-hour quit attempt in the past year
Action: Quit within the past 6 months
Maintenance: Quit more than 6 months before baseline

Precontemplation: Women not ready to obtain gonorrhea and Chlamydia screening every time they ever changed partners or had unprotected sex with their current main partner or current side partner
Contemplation: Individuals thinking about obtaining gonorrhea and Chlamydia screening every time they ever changed partners or had unprotected sex with their current main partner or current side partner
Preparation: Individuals planning to obtain gonorrhea and Chlamydia screening every time they ever changed partners or had unprotected sex with their current main partner or current side partner
Action: Individuals who had been obtaining gonorrhea and Chlamydia screening for 6 months or less every time they ever changed partners or had unprotected sex with their current main partner or current side partner
Maintenance: Individuals who had been receiving gonorrhea and Chlamydia screening for more than 6 months every time they had ever changed partners or had unprotected sex with their current main partner or current side partner
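Stage definitions such as the first smoking definition above amount to a short decision rule. The sketch below encodes that one definition only; it is an illustration, not the staging algorithm of any particular study, and the function and parameter names are hypothetical.

```python
def smoking_stage(
    is_smoking: bool,
    months_quit: float,
    considering_quit_6mo: bool,
    planning_quit_30d: bool,
    quit_attempt_24h_past_year: bool,
) -> str:
    """Assign a Stage of Change using the first smoking definition listed above."""
    if not is_smoking:
        # Former smokers are split at the 6-month mark.
        return "Maintenance" if months_quit >= 6 else "Action"
    if not considering_quit_6mo:
        return "Precontemplation"
    # Preparation requires both a 30-day plan and a 24-hour quit attempt in the past year.
    if planning_quit_30d and quit_attempt_24h_past_year:
        return "Preparation"
    # Considering quitting within 6 months but missing the plan, the attempt, or both.
    return "Contemplation"


# Hypothetical participant: plans to quit in 30 days but has made no quit attempt.
stage = smoking_stage(True, 0, True, True, False)
```

Other definitions in the table use different time frames or omit the quit-attempt criterion, so each would need its own rule.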

Contemplation: Individuals thinking about obtaining gonorrhea and Chlamydia screening every time they ever changed partners or had unprotected sex with their current main partner or current side partner
Preparation: Individuals planning to obtain gonorrhea and Chlamydia screening every time they ever changed partners or had unprotected sex with their current main partner or current side partner
Action: Individuals who had been obtaining gonorrhea and Chlamydia screening for 6 months or less every time they ever changed partners or had unprotected sex with their current main partner or current side partner
Maintenance: Individuals who had been receiving gonorrhea and Chlamydia screening for more than 6 months every time they had ever changed partners or had unprotected sex with their current main partner or current side partner

Contemplation: Intention to change within the next 6 months.
Preparation: Serious intention to change in the next 30 days.
Action: Initiation of overt behavioral change.
Maintenance: Sustaining behavioral change for 6 months or more.

Contemplation: Acknowledge they have a problem, but are not quite ready to do anything about it; may have indefinite plans to take action in the next 6 months (in the next 6 months)
Preparation: Plans to take action within a definitive time-frame
Action: Modifying their behavior and environment (within the past 6 months)
Maintenance: The individual continues to resist temptation and to reinforce action

Contemplation: Seriously contemplating quitting smoking in the next 6 months and not in preparation
Preparation: Planning to quit in the next 30 days and also having made a quit attempt of at least 24 hours in the last year.
Action: Quit for less than 6 months
Maintenance: Quit for more than 6 months

Contemplation: Thinking about quitting smoking in the next 6 months
Preparation: Planning to quit smoking in the next 30 days and already made a recent quit attempt
Action: Continuous cessation for greater than 1 day, but less than 6 months
Maintenance: Continuous cessation for greater than 6 months

Contemplation: I currently do not exercise, but am thinking of starting to exercise in the next 6 months
Preparation: I currently exercise some, but not regularly
Action: I currently exercise regularly, but have only begun doing so within the last 6 months
Maintenance: I currently exercise regularly, but have only begun doing so within the last 6 months

Contemplation: I currently do not exercise, but I am thinking about starting to exercise in the next 6 months
Preparation: I currently exercise some, but not regularly
Action: I currently exercise regularly and have been exercising at the recommended level for 6 months or less
Maintenance: I currently exercise regularly and have done so for longer than 6 months

Contemplation: No screening mammogram in past 2 years, but plans to have one in the next 6 months
Preparation: Not used
Action: No screening mammogram 2-4 years ago, with one in the past 2 years, intends to have mammogram in future
Maintenance: At least two mammograms in last 4 years, one in last two years, with plans to have mammograms in the future

Contemplation: Seriously considering quitting within the next 6 months, however they were not considering quitting within the next 30 days, had not made a quit attempt of 24 hours in past year, or both
Preparation: Seriously considering quitting in the next 6 months and were planning to quit within the next 30 days, in addition they made a 24 hour quit attempt in past year
Action: Not reported
Maintenance: Not reported

Contemplation: Has been smoking in the past 24 hr and is planning to quit within the next 6 months but not yet within the next month
Preparation: Has been smoking in the past 24 hr and is planning to quit within the next month
Action: Has not been smoking in the past 24 hr but not longer than 6 months
Maintenance: Has not been smoking for the past 6 months

Contemplation: Planning to quit within the next 6 months
Preparation: Planning to quit within the next month
Action: Not reported
Maintenance: Not reported

Contemplation: Do you regularly follow your glucose self-testing plan? No, but I intend to in the next 6 months.
Preparation: Do you regularly follow your glucose self-testing plan? No, but I intend to in the next 30 days.
Action: Do you regularly follow your glucose self-testing plan? Yes, and I have been, but for less than 6 months.
Maintenance: Do you regularly follow your glucose self-testing plan? Yes, and I have been for more than 6 months.

Precontemplation: Do you regularly take your insulin as you were told to by your health care provider? No, and I do not intend to in the next 6 months.
Contemplation: Do you regularly take your insulin as you were told to by your health care provider? No, but I intend to in the next 6 months.
Preparation: Do you regularly take your insulin as you were told to by your health care provider? No, but I intend to in the next 30 days.
Action: Do you regularly take your insulin as you were told to by your health care provider? Yes, and I have been, but for less than 6 months.
Maintenance: Do you regularly take your insulin as you were told to by your health care provider? Yes, and I have been for more than 6 months.
Not reported Not reported Not reported Not reported Not reported is an interest in acquiring is an interest in acquiring regularly , less than 6 behavior in next six behavior in next month months months

Contemplation: Smokers who have smoked for more than six months without history of serious quit attempts and are thinking about quitting
Preparation: Smoked for more than six months with at least one serious quitting attempt and is thinking about quitting within the next 30 days
Action: Has quit smoking within the last six months
Maintenance: Has quit smoking for more than six months

Contemplation: Considering adopting SCIs for more than 80% of their patients who smoked within the next 6 months
Preparation: Planning to adopt SCIs for more than 80% of their patients in the next month
Action: Had been providing SCIs, for less than 6 months, to more than 80% of their patients who smoked
Maintenance: Had been providing SCIs to more than 80% of their patients who smoked, and had been doing so for more than 6 months

Contemplation: Thinking of consistently using condoms in the next 6 months
Preparation: Planning to consistently use condoms in the next 30 days
Action: Have been consistently using condoms for less than 6 months
Maintenance: Have been consistently using condoms for more than 6 months

Contemplation: Never smoked but there is an interest in acquiring behavior in next six months
Preparation: Never smoked but there is an interest in acquiring behavior in next month
Action: Not reported
Maintenance: Not reported

Contemplation: Smokers who have smoked for more than six months without history of serious quit attempts and are thinking about quitting
Preparation: Smoked for more than six months with at least one serious quitting attempt and is thinking about quitting within the next 30 days
Action: Has quit smoking within the last six months
Maintenance: Has quit smoking for more than six months

Contemplation: Thinking about eating five fruits and vegetables a day and intend to do so in the next six months
Preparation: Ready to eat five fruits and vegetables a day and intend to do so in the next thirty days
Action: Have recently begun to eat five fruits and vegetables a day, but have been doing so for six months or less
Maintenance: Have sustained consumption of five fruits and vegetables daily for more than six months

Contemplation: Not reported
Preparation: Not used
Action: Not reported
Maintenance: Not reported

Contemplation: Seriously considering quitting in the next 6 months or during pregnancy
Preparation: Considering quitting in the next 30 days, has attempted to quit in the last 24 hours during current pregnancy or the past year
Action: Not reported
Maintenance: Not reported

Contemplation: Seriously intending to take action in the foreseeable future
Preparation: Seriously intending to take action in the immediate future
Action: Recently reached the goal behavior
Maintenance: Continued the goal behavior for at least 6 months

Precontemplation: Does not exercise and has no intention to do so in the next 6 months
Contemplation: Does not exercise but intends to do so in next 6 months
Preparation: Does not exercise but intends to in next 30 days
Action: Does exercise, but for less than the last 6 months
Maintenance: Does exercise and has been for more than the last 6 months

Contemplation: Planned to quit smoking in the next 6 months, but not in the next 30 days or planned to quit in the next 30 days, but had not made a serious cessation attempt.
Preparation: Planned to quit in the next 30 days, and has had a serious cessation attempt.
Action: Not reported
Maintenance: Not reported

Contemplation: Planned to quit smoking in the next 6 months, but not in the next 30 days or planned to quit in the next 30 days, but had not made a serious cessation attempt.
Preparation: Planned to quit in the next 30 days, and has had a serious cessation attempt.
Action: Not reported
Maintenance: Not reported

Contemplation: Intending to quit smoking within the next 6 months
Preparation: Intending to quit smoking within the next 4 weeks and reporting a serious quit attempt in the past year
Action: Reporting abstinence from smoking for less than 6 months
Maintenance: Reporting abstinence from smoking for more than 6 months

Contemplation: No, but I am planning to start to always use them within the next 6 months
Preparation: No, but I am planning to start within the next 30 days
Action: Yes, and I have been doing so for 6 months or less
Maintenance: Yes, and I have been doing so for 6 months or longer

Contemplation: No, (I do not always use condoms when I have sex) but I am planning to start to always use them within the next 6 months
Preparation: No, (I do not always use condoms when I have sex) but I am planning to start within the next 30 days
Action: Yes, (I do not always use condoms when I have sex) and I have been doing so for 6 months or less
Maintenance: Yes, (I do not always use condoms when I have sex) and I have been doing so for 6 months or longer

Contemplation: No, (I do not always use condoms when I have sex) but I am planning to start to always use them within the next 6 months
Preparation: No, (I do not always use condoms when I have sex) but I am planning to start within the next 30 days
Action: Yes, (I do not always use condoms when I have sex) and I have been doing so for 6 months or less
Maintenance: Yes, (I do not always use condoms when I have sex) and I have been doing so for 6 months or longer

Contemplation: Intending to quit within the next 6 months
Preparation: Intention to quit within the next 30 days
Action: Not reported
Maintenance: Not reported

Contemplation: Intended consistent condom use within the next 6 months
Preparation: Were not using condoms consistently, but intended to do so within the next month
Action: Using condoms consistently, but for 6 months or less
Maintenance: Reported using condoms consistently (every time they had intercourse) for more than 6 months

Contemplation: Intended consistent condom use within the next 6 months
Preparation: Were not using condoms consistently, but intended to do so within the next month
Action: Using condoms consistently, but for 6 months or less
Maintenance: Reported using condoms consistently (every time they had intercourse) for more than 6 months

Contemplation: No, but I am planning to start to always use them within the next 6 months
Preparation: No, but I am planning to start within the next 30 days
Action: Yes, and I have been doing so for 6 months or less
Maintenance: Yes, and I have been doing so for 6 months or longer

Contemplation: Currently do not exercise, but am thinking about starting to exercise in the next 6 months
Preparation: Currently exercise some, but not regularly
Action: Currently exercise regularly but have only begun doing so within the last 6 months
Maintenance: Currently exercise regularly and have done so for longer than 6 months

Precontemplation: I do not intend to stay off drugs completely in the next 6 months
Contemplation: I intend to stay off drugs completely in the next six months but not in the next 30 days
Preparation: I intend to stay off drugs completely in the next 30 days
Action: I have stayed off any drug use for less than 6 months
Maintenance: I have stayed off any drug use for more than 6 months

Precontemplation: Involved in at least one role related to bullying and not intending to change their behavior in the next 6 months
Contemplation: Involved in at least one role related to bullying and intending to change their behavior in the next 6 months
Preparation: Involved in at least one role related to bullying and intending to change in the next 30 days
Action: Not involved in any of the three roles of bullying
Maintenance: Not involved in any of the three roles of bullying

Precontemplation: No intention of quitting in the near future
Contemplation: Currently smoking and intending to quit in the next year
Preparation: Currently smoking, but had quit for a period of at least 24 hours within the last 6 months
Action: Currently not smoking, but had smoked within the last 6 months
Maintenance: Currently not smoking, and had not smoked for at least 6 months

Precontemplation: "I presently do not exercise and do not plan to start exercising in the next six months"
Contemplation: "I presently do not exercise but I have been thinking about starting to exercise within the next six months"
Preparation: "I presently get some exercise, but not regularly"
Action: "I presently exercise on a regular basis, but I began only within the past six months"
Maintenance: "I presently get some exercise and have been exercising regularly for longer than six months"

Contemplation: Not currently limiting dietary fat intake, but individuals have thought about doing so in the past month and are either mildly confident or not at all confident about making changes in the next month
Preparation: Not currently limiting dietary fat intake, but individuals have thought about doing so in the past month and are somewhat or very confident that they will make some of these changes in the next month
Action: Limiting dietary fat intake for 6 months or less
Maintenance: Limiting dietary fat intake for longer than 6 months

Precontemplation: Engaging in any activity done on the feet with the bones supporting the body's weight for at least 30 min a day, and at least three times per week: "No, and I don't intend to in the next six months"
Contemplation: Engaging in any activity done on the feet with the bones supporting the body's weight for at least 30 min a day, and at least three times per week: "No, but I intend to in the next six months"
Preparation: Engaging in any activity done on the feet with the bones supporting the body's weight for at least 30 min a day, and at least three times per week: "No, but I intend to in the next 30 days"
Action: Engaging in any activity done on the feet with the bones supporting the body's weight for at least 30 min a day, and at least three times per week: "Yes, I have been but for less than six months"
Maintenance: Engaging in any activity done on the feet with the bones supporting the body's weight for at least 30 min a day, and at least three times per week: "Yes, I have been for more than six months"

Organ Donation - Intentions (N=4)