The Potential Clinical Utility of Methods for Estimating Prior Standing in Specific Cognitive Domains: A Feasibility and Illustration Study

This study evaluated the viability of a potential strategy for improving the detection of cognitive decline over currently available IQ-based methods. The proposed strategy makes use of differential cognitive effects across different neurocognitive disorders. It involves examining estimated/obtained difference scores (D-scores) for the specific cognitive domain(s) (SCDs) most affected by a disorder. The current study undertook a broad feasibility test of the strategy as a preliminary step in the development of specific cognitive domain estimation methods (SCDEMs). Clinical and control group score distributions were reconstructed from IQ and SCD test (SCDT) means and standard deviations reported in previously published studies of mild Alzheimer's disease (mild AD), chronic alcohol abuse (CAA), and mild traumatic brain injury (mTBI). For each test, the percentage of area shared by the reconstructed clinical and control distribution curves was calculated. Percent overlap values for tests measuring the same SCD were then pooled across studies of the same disorder and averaged, thereby forming indexes that served to estimate SCD sensitivity. Comparable IQ indexes were also formed. The average SCD and IQ overlap values were then compared. The main result suggests that diagnostic accuracy could be improved considerably for mild AD, and, to a lesser extent, for CAA and mTBI, by using SCD rather than IQ D-scores. The development of SCDEMs appears clinically worthwhile, especially given their potential application to a disorder of such importance as AD, although their utility may be lower, or considerably lower, for other disorders.
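The overlap calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's actual procedure: it assumes both reconstructed distributions are normal, it defines "percent overlap" as the area under the pointwise minimum of the two density curves, and the function names and integration settings are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def percent_overlap(mean_a, sd_a, mean_b, sd_b, steps=20000):
    """Percentage of area shared by two normal curves (assumed definition).

    Integrates min(pdf_a, pdf_b) with the midpoint rule over a range that
    extends 8 SDs beyond both means, so the neglected tails are negligible.
    """
    lo = min(mean_a - 8 * sd_a, mean_b - 8 * sd_b)
    hi = max(mean_a + 8 * sd_a, mean_b + 8 * sd_b)
    width = (hi - lo) / steps
    area = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        area += min(normal_pdf(x, mean_a, sd_a), normal_pdf(x, mean_b, sd_b))
    return 100.0 * area * width
```

For example, with hypothetical control and clinical means of 100 and 85 and a common SD of 15, the shared area is roughly 62%. Lower overlap means better separation of the two groups, which is why the overlap index can serve as a rough proxy for a test's sensitivity.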

Often, the best baseline data are prior cognitive test results. Unfortunately, such data are frequently either unavailable or cannot be obtained in a timely manner. Even when they are available, clinicians must carefully consider whether the data provide a representative baseline for comparison. The representativeness of prior test data depends on a number of factors, such as the reliability and temporal stability of the previously administered tests. Also, a person's pattern of strengths and weaknesses may change with age and education (Kaufman, 1990; Reynolds, 1997; Woodruff-Pak, 1997). Additionally, events intervening between the time of the previous evaluation and the onset of an event or condition prompting a referral may limit the value of prior test results. Consider, for example, a 40-year-old patient with 12 years of education who is referred for assessment of cognitive functioning following recent mild to moderate traumatic brain injury (TBI).
High school IQ scores might provide a potentially helpful baseline against which to compare the patient's current performance. However, if the patient had been a heavy drinker since the late teens, the high school IQ scores might no longer be representative of the more recent pre-TBI baseline. That is, they might not aid in disentangling the relative contribution of alcohol abuse from TBI in accounting for cognitive decline, especially if current performance falls below that of the high school IQ scores.
Given the potential limitations of accessing or using prior cognitive test results, clinicians frequently estimate a patient's cognitive baseline. Typically, this is done using indirect, inferential methods, both formal and informal, to estimate a patient's prior IQ score. IQ scores are popular for this purpose because they provide an index of global cognitive ability and correlate more or less strongly with virtually every cognitive domain (Crawford, 1992; Kaufman, 1990; Schinka & Vanderploeg, 2000). Thus, one score can assist in setting expectations for performance across many cognitive domains.
Accordingly, clinicians typically use the IQ estimate as a benchmark for forming expectations about a patient's performance in other cognitive domains, and take deviations from these expectations as potential indicators of disorder or loss. Thus, the diagnostic process often involves two inferential links: the estimation of prior IQ, and then the use of the estimated prior IQ to formulate expectancies for performance in specific cognitive domains.
Estimated (or actual) baseline IQ (i.e., before the onset of a condition affecting cognitive functioning) is typically referred to in the literature as "premorbid" IQ.
According to Graves et al. (1999), a better term might be "no-morbid" IQ. In most cases, I will refer to it as "prior" IQ. This avoids the implicit suggestion raised by the term "premorbid" that disorder is now present. It also avoids the suggestion raised by the term "no-morbid" that the estimate reflects a person's pristine intellectual ability in the absence of any previous cognitive disorder or risk factors.

Informal Methods of Estimating Prior IQ
Because prior IQ estimation is such a common practice, a review of the literature on this topic is warranted, especially because alternative procedures will be proposed. In formulating prior IQ estimates, practitioners often rely on clinical judgment (Smith-Seemiller, Franzen, Burgess, & Prieto, 1997). However, a voluminous literature suggests that cognitive data integration of this sort is limited (e.g., Arkes, 1981; Dawes, Faust, & Meehl, 1989; Faust et al., 1988; Kareken & Williams, 1994; Meehl, 1954). First, it may be difficult to subjectively evaluate the validity of the data used in forming the estimates.
This may be particularly true of data bearing an uncertain relationship with IQ (e.g., occupational history, military history, or hobbies). Second, information bearing on the strength of association between extra-test data and IQ often is not readily available or is not broadly recognized. In the absence of this knowledge, clinicians might rely too heavily on information that is weakly correlated with IQ, and too little on more valuable information. Third, even when clinicians are aware of the correlations among variables, they tend not to adjust their estimates accordingly or optimally (Kareken & Williams, 1994), something that can be very difficult to do subjectively. For example, optimal combination might involve weighting Variable X 2.7 times more than Variable Y and 1.63 times less than Variable Z. Fourth, clinicians tend to be overconfident in their judgments (Kareken & Williams, 1994), and the more data they have, the more overconfident they tend to become. However, judgment accuracy does not necessarily depend on the quantity of data per se, but rather on the degree to which the data are valid and non-redundant (Dawes et al., 1989). Overconfidence tends to lead to overly extreme predictions, rather than properly regressed predictions.
Actuarial judgment involves the use of a predetermined or prespecified means of data combination, and is based on empirically established relations (Meehl, 1954). Well over 100 studies (see Dawes et al., 1989) show that actuarial methods almost always equal, and frequently exceed, the accuracy of clinical judgment, thereby making them the better overall method. Accordingly, actuarial estimates of prior IQ ought to be at least as accurate as, and quite possibly more accurate than, estimates based on clinical judgment alone. Certainly, on a patient-by-patient basis, there will be instances in which clinical judgment is more accurate than actuarial judgment. For example, a clinician might reject a prior IQ estimate derived from a demographic regression formula when the patient's history clearly suggests high intellectual ability in the presence of low standing on the variables that enter the formulae (Schinka & Vanderploeg, 2000). However, determining when to countervail an actuarial conclusion appears to be much more difficult than it might seem, and in many instances such countervailing overturns an otherwise correct decision (Faust et al., 1988). Studies addressing this issue show that clinicians tend to call exceptions too frequently and that, for each incorrect actuarial conclusion that is corrected, one or more correct actuarial conclusions will be spoiled, leading to no net gain or, worse, a loss in overall judgmental accuracy (Dawes et al., 1989). Thus, there are strong grounds to believe that formal or actuarial methods will not decrease, and may well increase, the accuracy of prior ability estimates. As will be discussed later, the methods described here incorporate actuarial or statistical judgment methods.

Formal Methods of Estimating Prior IQ
There are four general formal, or quasi-formal, approaches for estimating prior IQ (Spreen & Strauss, 1998). The first employs actuarial regression formulae based on a patient's standing on background demographic variables. The second is based on current performance on WIS subtests (e.g., Vocabulary) thought to be resistant to brain damage.
The third takes the highest current WIS subtest score or highest level of achievement in everyday tasks as a marker of prior IQ and as the standard against which performance on other tests is compared. The fourth is also an actuarial regression model but instead uses performance on word-reading tests. Other methods include combinations or hybrids of the four methods just mentioned. All of these methods attempt to capitalize on markers of cognitive ability that are, or are thought to be, highly correlated with prior IQ, yet minimally or relatively unaffected by brain impairment.
Demographically Based Actuarial Methods for Estimating Prior IQ
Using data from the 1955 WAIS standardization sample, Wilson, Rosenbaum, Brown, Rourke, and Whitman (1978) developed demographic regression formulae for estimating WAIS IQ scores. Following Wilson et al.'s (1978) lead, Barona, Reynolds, and Chastain (1984) developed demographic regression formulae for estimating WAIS-R IQ scores. The formulae were based on the data contained in the WAIS-R standardization sample. Barona et al. added the variables of geographical region of residence and urban versus rural residence to those that Wilson et al. (1978) used. The squared multiple correlations for the Barona formulae were .38, .24, and .36 for VIQ, PIQ, and FSIQ, respectively, slightly lower than those for the Wilson formulae. The standard errors of estimate (SEEs) of the formulae were 11.79, 13.23, and 12.14 for VIQ, PIQ, and FSIQ, respectively, slightly larger than those for the Wilson formulae. Due to the effects of regression toward the mean, Barona et al. warned that serious over- or under-estimation could occur for individuals with IQs below 69 or above 120, respectively. A subsequent revision to these formulae (Barona & Chastain, 1986), based on a subset of the WAIS-R sample, had similar SEEs and generally failed to improve predictive accuracy over the original 1984 formulae.
Studies on the Barona and Wilson formulae have generally shown that they perform only slightly better than chance in estimating patients' prior IQ (e.g., Bolter, Gouvier, Veneklasen, & Long, 1982; Goldstein, Gary, & Levin, 1986; Hawkins, 1995; Schinka, 2000; Karzmark, Heaton, Grant, & Matthews, 1985; Klesges, Fisher, Vasey, & Pheley, 1985; Silverstein, 1987; Sweet, Moberg, & Tovain, 1990). That is, they do not provide much improvement over assuming that a patient's prior IQ was in the Average range. This assumption will be correct about 50% of the time (Hawkins, 1995; Schinka & Vanderploeg, 2000), assuming a normal distribution of scores. The reported standard errors of estimate are somewhat smaller for the Wilson formulae (Karzmark et al., 1985; Wilson et al., 1978) and about 12 points for the Barona formulae (Barona et al., 1984; Eppinger, Craig, Adams, & Parsons, 1987). These values, especially the latter, approach the 15-point standard deviation of the Wechsler scales, and therefore produce a distribution of predicted scores that is nearly the same as that of scores on the test.
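Two of the figures in this passage can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions that are not part of the cited studies: IQ is treated as normally distributed with a mean of 100 and an SD of 15, and the SEE is approximated by the textbook relation SEE = SD x sqrt(1 - R^2), ignoring degrees-of-freedom corrections. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_in_range(low, high, mean=100.0, sd=15.0):
    """Proportion of a normal population scoring between low and high."""
    return normal_cdf((high - mean) / sd) - normal_cdf((low - mean) / sd)


def see_from_r_squared(r_squared, sd=15.0):
    """Approximate standard error of estimate implied by a squared multiple R."""
    return sd * math.sqrt(1.0 - r_squared)


# Base rate of the Average range (90 to 110) under the normality assumption.
average_range_rate = proportion_in_range(90, 110)

# SEE implied by the Barona FSIQ squared multiple correlation of .36.
barona_fsiq_see = see_from_r_squared(0.36)
```

Under these assumptions the Average range holds just under half of the population, and an R^2 of .36 implies an SEE of about 12 points, close to the reported 12.14. A formula that explains only about a third of the variance therefore leaves a prediction error nearly as wide as the 15-point spread of the scores themselves.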

Vocabulary and the Hold-Don't-Hold Strategy
Results from early studies suggested that performance on the WIS Comprehension, Information, Picture Completion, and Object Assembly subtests was relatively preserved (i.e., they "hold") in various pathological brain conditions. In contrast, performance on the Digit Span, Digit Symbol, Arithmetic, and Block Design subtests tended to deteriorate (i.e., "don't hold") (Kaufman, 1990; Lezak, 1995; Vogt & Heaton, 1977; Wechsler, 1958). Based on this information, Wechsler (1944) developed a Deterioration Index to aid in the detection of cognitive loss. The index was a ratio based on the "hold" and "don't hold" subtests of the Wechsler-Bellevue Intelligence Test. Wechsler (1958) revised the index for the WAIS and replaced Comprehension with Vocabulary as a "hold" subtest, and Arithmetic with Similarities as a "don't hold" subtest. McFie (1975) subsequently proposed using only the Vocabulary and/or Picture Completion subtests as markers of prior IQ.
Additionally, the method increases the likelihood of over- or underestimation of prior IQ by ignoring chance fluctuations in subtest performance, by failing to account for the effects of regression toward the mean (Reynolds, 1997), and by overlooking age-related changes in the pattern of WAIS subtest performance (Kaufman, 1990).
Lezak's Best Performance Method
Lezak's (1995) best performance method (BPM) takes the patient's highest level of performance as the "standard against which all other aspects of the patient's current performance are compared" (p. 106). According to Lezak, a patient's highest level of prior functioning might be reflected in such variables as their highest current WIS subtest score, their highest occupation level, or other special accomplishments or proficiencies.
The BPM (Lezak, 1995) assumes that among neurologically normal individuals, test scores tend to cluster around a mean level of performance and that for cognitively impaired individuals, their highest obtained test score provides the best marker of their previous level of performance.
This latter assumption ignores error variance and normal variability in test scores.
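The size of the overestimate this assumption produces can be sketched numerically. The following is a minimal illustration rather than an established procedure. It assumes the Wechsler scaled-score metric (subtest mean 10, SD 3; IQ mean 100, SD 15), the average 7-point subtest scatter used in this section's worked example, and the simplification that a person's highest subtest score sits half the scatter range above his or her own mean. The function name and defaults are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def best_performance_estimate(mean_subtest=10.0, scatter_range=7.0,
                              subtest_sd=3.0, iq_mean=100.0, iq_sd=15.0):
    """Return (highest subtest, z-score, BPM IQ estimate, percentile gain).

    The percentile gain is measured relative to the 50th percentile,
    i.e. relative to a person whose actual FSIQ equals the IQ mean.
    """
    highest = mean_subtest + scatter_range / 2.0
    z = (highest - mean_subtest) / subtest_sd
    estimated_iq = iq_mean + z * iq_sd
    percentile_gain = 100.0 * (normal_cdf(z) - 0.5)
    return highest, z, estimated_iq, percentile_gain


highest, z, estimated_iq, percentile_gain = best_performance_estimate()
```

With the default values, a person of exactly average ability receives a best-performance estimate in the High Average range, several tens of percentile points above his or her actual standing, even though nothing but ordinary subtest scatter is at work.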
Normal individuals often show wide discrepancies between their best performance and their more typical or average performances. As a result, the BPM will virtually always overestimate prior IQ. This can be demonstrated by considering a normal individual with a FSIQ of 100 and a mean subtest score of 10. Given the average 7-point range of subtest scatter (Matarazzo, Daniel, Prifitera, & Herman, 1988), this individual's score on any one subtest is likely to be 10 ± 3.5. Because an obtained score reflects both "true" score and measurement error, any individual's highest subtest score will contain a positive error component. Thus, the person's highest subtest score is likely to be 13.5, which is 1.16 SD above his or her mean subtest score of 10. According to the BPM, this individual's prior FSIQ would correspondingly be estimated to be 1.16 SD above the mean, or about 117 (High Average range), nearly 40 percentile points above the person's actual FSIQ of 100 (Average range). Mortensen, Gade, and Reinisch (1991) [...]

The OPIE (Krull, Scott, & Sherer, 1995) consists of three formulae derived from the WAIS-R standardization sample. Each formula combines age, education, occupation, and race with a patient's raw score on either the Vocabulary or Picture Completion subtest, or both subtests. In a subsequent revision (OPIE-R), these authors recommended using the Vocabulary formula when the patient's Vocabulary raw score is 4 or more points higher than their Picture Completion raw score, and using the Picture Completion formula when the reverse is true. If Vocabulary and Picture Completion do not differ by at least 4 points, the formula incorporates both subtests (Williamson, Krull, & Scott, 1996).

Vanderploeg and Schinka (1994) developed 33 regression equations for predicting prior IQ (one for each of the 11 WAIS-R subtests) based on data from the WAIS-R standardization sample. Each equation combined age, gender, race, education, and occupation with a current subtest score.
Vanderploeg, Schinka, and Axelrod (1996) identified the three most robust equations for estimating each of VIQ, PIQ, and FSIQ (3 equations each, 9 total), the approach referred to here as BEST-3. These equations used the Vocabulary, Information, or Picture Completion subtests. For each IQ index, the highest estimate produced by the three equations is taken as the patient's estimated prior IQ.
Relative to the Barona method, the OPIE and BEST-3 methods have smaller SEEs (6.29 to 11.86, across methods). Correspondingly, they have higher squared multiple correlations (.41 to .78, across methods) and higher correlations between estimated and obtained IQ (.74 to .87, across methods) (Ritchie, Lam, & Rankin, 1996; Vanderploeg et al., 1996). Regarding accuracy, the BEST-3 formulae produced overestimates of about 5 points (Vanderploeg et al., 1994). The OPIE formulae correctly classified 63% of a cross-validation sample into the same qualitative IQ category as their obtained FSIQ scores. None of the OPIE IQ estimates were off by more than two categories (Krull et al., 1995). One study, however, found no significant difference in the accuracy of the three methods (Axelrod, Vanderploeg, & Schinka, 1999).

Several researchers have developed methods for estimating prior IQ that use current reading ability. These methods were initially based on observations that reading ability appeared to be relatively preserved in dementia (Nelson & McKenna, 1975).
These methods attempt to capitalize on a "hold" skill, and thus are variants of the Hold-Don't-Hold method. Nelson and O'Connell (1978) introduced the NART. They theorized that the utility of single word-reading for estimating prior IQ depends less on application of grapheme-phoneme conversion rules (reading skills) and more on vocabulary (i.e., previous familiarity with the words); hence the inclusion of only phonemically irregular words on the NART.
Revisions of the NART. Several revisions and modifications of the NART have been undertaken. Crawford (1992) created a revision known as the NART-R UK, or NART-2, for estimating WAIS-R IQ scores. Ryan and Paolo (1992) adapted the NART for use with older Americans. Blair and Spreen (1989) [...] (Grober & Sliwinski, 1991). Grober and Sliwinski (1991) subsequently developed a regression model based on AMNART errors and years of education to predict WAIS-R VIQ. These various revisions of the NART accounted for 59% to 69% of the variance in predicted IQ. The SEEs of the various regression formulae ranged from 7 points (for VIQ) to 12 points (for PIQ).
Cognitive decline and NART performance. The NART and its revisions were developed for differentiating patients with dementia from normals. Therefore, their clinical utility hinges on the resilience of single word reading to dementia. Numerous studies have addressed this issue. Among non-demented elderly, NART performance appears to be stable over periods of up to 6 years (Schmand, Geerlings, Cees, & Lindeboom, 1998).
Use of the NART in disorders other than dementia. Although the NART was developed to identify dementia, this measure and its revisions are often used to estimate prior functioning in other disorders (O'Carroll, 1995). Studies have provided mixed support for this practice. For example, NART performance appears to be relatively stable among patients with closed head injury (Watt & O'Carroll, 1999), early idiopathic Parkinson's disease (Lees & Smith, 1983), and depression (O'Carroll, 1995). In contrast, NART performance was attenuated among patients with Huntington's disease and Korsakoff's syndrome. Moreover, NART performance may well be reduced in the presence of dominant hemisphere damage.
Retrospective accuracy of the word-reading methods. The few studies examining the retrospective (rather than concurrent) accuracy of NART-estimated IQ show that the estimates tend to fall about 2 to 4 points below IQ scores obtained 3 to 5 years earlier. Further, the D-score standard deviations in these studies are rather large, ranging from about 8 to 10 points (Berry et al., 1994; Carswell, 1992), with the IQ of about one in five individuals being underestimated by at least 10 points (Carswell, Graves, Snow, & Tierney, 1997).
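The "one in five" figure is roughly what the reported mean and spread of the D-scores would imply if those scores were normally distributed. The sketch below makes that check explicit. The normality assumption, the midpoint values used as defaults (mean underestimate of 3 points, SD of 9 points), and the function name are all assumptions made for illustration, not values taken from any single study.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_underestimated(threshold=10.0, mean_d=3.0, sd_d=9.0):
    """Proportion of people whose earlier IQ exceeds the estimate by >= threshold.

    D is defined here as (earlier obtained IQ - NART-estimated IQ), so a
    positive D means the estimate fell below the earlier score.
    """
    z = (threshold - mean_d) / sd_d
    return 1.0 - normal_cdf(z)


# Midpoint assumptions, then the two ends of the reported ranges.
typical = proportion_underestimated()
low_end = proportion_underestimated(mean_d=2.0, sd_d=8.0)
high_end = proportion_underestimated(mean_d=4.0, sd_d=10.0)
```

Across the reported ranges, the implied proportion runs from roughly one in six to somewhat more than one in four, which brackets the one-in-five figure and shows how much the result depends on the D-score standard deviation.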
Modifications and extensions of the word-reading method. There have been a number of extensions and modifications of the NART. Beardsall and Brayne (1990) developed an equation for predicting full NART scores from performance on its first 25 words (Short NART). This test showed a high correlation with the full NART (.83 to .93; Crawford, Parker, Allan, Jack, & Morrison, 1991), and produced VIQ estimates that were only minimally less accurate than those based on the full NART (Crawford et al., 1991). Although these results suggested that the Short NART could be used with reasonable confidence, Crawford et al. (1991) argued that it requires administrative and scoring adjustments, and thereby introduces needless potential clerical error into the calculation of NART-estimated IQs. Moreover, the full NART does not take long to administer and is generally well tolerated by patients.
The Spot-the-Word Test (STW) (Baddeley, Emslie, & Nimmo-Smith, 1993) requires patients to identify the real word in each of a series of word/non-word pairs presented aurally or visually. The test was intended, in part, to reduce the frequency and extent of IQ underestimates among patients who may be familiar with a NART item aurally but not orthographically, or vice versa. Research on the STW, however, has yielded generally disappointing results (Beardsall & Huppert, 1997; Law & O'Carroll, 1998; Watt & O'Carroll, 1999).
Beardsall and Huppert developed the Cambridge Contextual Reading Test (CCRT) (Beardsall, 1998; Beardsall & Huppert, 1994, 1997). The CCRT places the NART-2 words within meaningful sentences, which patients read aloud. The intention was to reduce the likelihood of underestimating prior IQ among patients who mispronounce NART-2 words they likely know. The final regression formula combined CCRT score with sex and years of education. It accounted for 68% of the variance in VIQ estimation and had a SEE of 7.80, values similar to those reported for other word-reading methods. Studies on the CCRT show that placing NART-2 items in sentences significantly reduces overall pronunciation errors, and that this effect is strongest for patients with mild to moderate AD or who have poor word reading ability (Beardsall, 1998; Beardsall & Huppert, 1994; Conway & O'Carroll, 1997; Law & O'Carroll, 1998).
Although this result suggests that the CCRT might be superior to the NART for such patients, firm conclusions await retrospective confirmation.
The Reading subtest of the Wide Range Achievement Test-Revised (WRAT-R) is commonly used to screen reading ability and shows moderate to strong correlations with IQ (Cooper & Fabroni, 1988; Kareken, Gur, & Saykin, 1995; Spruill & Beck, 1986). Kareken, Gur, and Saykin (1995) developed regression formulae for estimating WAIS-R IQ scores that combined WRAT-R reading score with demographic variables. The squared multiple correlations were .67, .62, and .72 for VIQ, PIQ, and FSIQ, respectively, with respective SEEs of 10.42, 11.82, and 10.24, slightly larger than the SEEs of other word reading measures.
Other Approaches to Estimating Prior IQ
Schlottman and Johnsen (1991) developed the Intellectual Correlates Scale (ICS) for estimating prior IQ in brain-damaged individuals on the basis of changes in interests and attitudes following brain insult. The ICS, however, has demonstrated poor reliability over time (Raguet et al., 1996). Wrobel and Wrobel (1996) [...] (Shipley, 1930). SILS-estimated/obtained WAIS-R FSIQ correlations were high (i.e., .85 and .87) in a sample of male psychiatric patients. The formulae SEEs were 6.21 and 6.26 (Zachary, Crumpton, & Spiegel, 1985). Even though the multiple-choice format of the SILS may make it a somewhat easier test than the WIS Vocabulary test, which requires generation of a response (Yuspeh, Vanderploeg, & Kershaw, 1998), it remains susceptible to the same limitations as other Hold-Don't-Hold methods.

Methods for Predicting Prior Intellectual Level in Children
Attempts at developing methods for estimating prior IQ in children (e.g., Reynolds & Gutkin, 1979; Sellers, Burns, & Guyrke, 1996; Vanderploeg, Schinka, Baum, Tremont, & Mittenberg, 1998) have generally yielded disappointing results. This is not unexpected given that children appear to have greater variability in IQ over time than adults and that IQ in children can be affected by numerous factors (Franzen, Robbins, & Sawicki, 1989; Sattler, 1992).

Summary
The Average IQ range of 90 to 110 encompasses 50% of the normal distribution.
Accordingly, a clinician should be correct about 50% of the time by always assuming that a patient's prior IQ fell within this range. The challenge is to develop methods that permit greater accuracy (Schinka & Vanderploeg, 2000). A number of methods have been developed to accomplish this aim, but none have been highly successful and some have serious flaws. The BPM systematically overestimates IQ, often by a gross margin.
The Hold-Don't-Hold strategy has a number of weaknesses that have led to its being discredited. Demographic regression approaches, such as the Barona [...]

Intelligence scales were designed to assess global intellectual capacity and not specific cognitive abilities related to brain function (Kaufman, 1990; Putnam et al., 1999; Sattler, 1992; Woodruff-Pak, 1997). [...] Alzheimer's disease (AD), particularly in its earliest stages. PIQ D-scores can be more sensitive to certain types of brain dysfunction than either VIQ or FSIQ D-scores (Gouvier, Bolter, Veneklasen, & Long, 1983) and, therefore, might be better able to detect cognitive decline. However, methods for estimating prior PIQ are less, or considerably less, accurate than those for VIQ or FSIQ, suggesting that PIQ D-scores should be used cautiously.
Dissociation between SCD functioning and IQ performance probably reflects several factors, in particular the relative insensitivity of intelligence scales to some changes in brain state (Damasio, 1994; Schlosser & Ivison, 1989). Dissociation between IQ and SCDT performance was demonstrated in a study involving 35 mild AD patients, who were administered the WAIS-III and the Wechsler Memory Scale-III (WMS-III) (Psychological Corporation, 1997). As expected, the mild AD patients had lower WAIS-III IQ and WMS-III scores than the standardization sample, but the discrepancy was much greater for the WMS-III scores. Additionally, the mild AD group's WMS-III scores were less variable than their IQ scores, suggesting that the group's memory abilities were more consistently depressed than their IQ scores. For example, respective standard deviations for the VIQ and the WMS-III General Memory Index were 13.1 versus 8.6.
The study suggests that, all other things being equal, a method for estimating prior WMS-III versus WAIS-III standing would provide a more accurate and sensitive means of detecting AD. Improved detection of AD (and possibly other disorders) could permit earlier and potentially more effective interventions (McLendon & Doraiswamy, 1999; Woodruff-Pak, 1997). Moreover, early diagnosis affords AD patients and their families more time to plan for long-term care and to develop coping strategies.
The foregoing discussion suggests that the clinical utility of IQ D-scores may be constrained by the magnitude and base rate of IQ impairment in certain disorders. If a disorder minimally affects intellectual level, then IQ D-scores will be of limited value in diagnosing it. Alternatively, if the disorder has a greater effect on a SCD than on IQ, SCD D-scores might provide a more powerful diagnostic aid than IQ D-scores. This implies that for certain disorders, it might be possible to improve on the diagnostic accuracy of IQ D-scores by estimating prior functioning on the SCD (or SCDs) most affected by the disorder.
Unfortunately, formal methods for estimating prior SCDT standing (specific cognitive domain estimation methods, SCDEMs) are generally, although not entirely, lacking (Williams, 1997). As a result, clinicians commonly link expectations about how a patient ought to perform on a variety of neuropsychological tests to estimated prior IQ.
This process involves two far from perfect inferential links: (1) estimating prior IQ, and (2) then using the IQ estimate to formulate expectations for performance on other tests.

Other investigators have attempted to develop [...] There are a number of limitations to Schlosser and Ivison's (1989) study. First, the regression formulae were not cross-validated. Second, the two groups, especially the AD sample, were quite small. Third, the severity of dementia among the 16 AD patients was not clearly specified. However, the subjects were recruited from hospitals and nursing homes, and it seems likely that they had at least mild, and quite possibly moderate, dementia. This raises the question of whether the method would be sensitive enough to differentiate between individuals with mild or early dementia and nondemented patients who present with memory complaints, the situation where it would be most useful. Finally, a partial replication of the study using the WMS-R raised doubt that instruments such as the NART are effective predictors of memory functioning (O'Carroll et al., 1994).

Crawford, Obonsawin, and Allan (1998) examined the NART as a predictor of Paced Auditory Serial Addition Test (PASAT) raw scores in 152 healthy British subjects.
The PASAT is a widely used test for assessing information processing and sustained and divided attention. A NART-based regression formula for predicting PASAT score yielded a multiple R of .52 and a SEE of 34.87. Crawford et al. (1998) suggested that NART-predicted PASAT scores could provide an estimate of prior ability on this task, which in turn could aid in detecting decline in information processing and in sustained and divided attention. In a sample of normal subjects and patients with either schizophrenia or bipolar disorder, Hawkins et al. (1993) found a strong correlation (.83) between raw scores on a measure of single word reading ability (Gates-MacGinitie Reading Vocabulary Test) and on a measure of naming ability (the Boston Naming Test).
There is so little additional work in this area that, beyond some assurance that prediction of prior abilities in SCDs seems potentially feasible, determining which variables serve as the best predictors is presently mostly a matter of educated conjecture.
Further, preliminary to this, one needs to examine whether, or the extent to which, accuracy might be increased if prior functioning in SCDs could be predicted.

The New Approach
Recently, two groups of authors ([...]; Schinka & Vanderploeg, 2000) independently called for the development of SCDEMs as a means of improving diagnostic accuracy. The authors did not, however, offer guidelines for developing such methods. The present project attempts to address this omission. It proposes a potential new approach or strategy for improving diagnostic accuracy through the development of SCDEMs, and then undertakes an initial feasibility analysis of the approach's potential clinical utility. The approach capitalizes on divergent patterns of SCD impairment in different neurocognitive disorders. It involves identifying the SCDs that might optimally differentiate between those with and without a particular neurocognitive disorder. This information could then be used to prioritize and guide the development of SCDEMs, which, in turn, would permit calculation of SCD D-scores that could aid in improving detection of cognitive decline and diagnostic accuracy. If the disorder in question causes more decline in a SCD than in IQ, then, all other things being equal, SCD D-scores ought to detect the decline more effectively than IQ D-scores.
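The "all other things being equal" argument can be made concrete with a small calculation. The sketch below is purely illustrative and every number in it is hypothetical. It assumes that D-scores (estimated prior score minus obtained score) are normally distributed, that estimation error has the same SEE for the IQ and SCD methods, and that a decline is flagged when the D-score exceeds a cutoff chosen so that about 5% of people without decline are flagged. The function name is also an assumption.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def detection_rate(decline_points, see, cutoff_z=1.645):
    """Proportion of declined patients whose D-score exceeds the cutoff.

    Without decline, D is assumed to be N(0, see); with decline, D is
    assumed to be N(decline_points, see). The cutoff is cutoff_z * see.
    """
    cutoff = cutoff_z * see
    return 1.0 - normal_cdf((cutoff - decline_points) / see)


# Hypothetical disorder: 5-point average IQ decline, 20-point SCD decline,
# and an SEE of 10 points for both estimation methods.
iq_detection = detection_rate(decline_points=5.0, see=10.0)
scd_detection = detection_rate(decline_points=20.0, see=10.0)
```

Under these made-up values the SCD D-score flags a clear majority of affected patients while the IQ D-score flags only a small minority, at the same false-positive rate. The advantage shrinks if the SCD estimation method has a larger SEE than the IQ method, which is why both the size of the decline and the accuracy of the SCDEM matter.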
The validity of the proposed strategy rests upon two basic assumptions: (a) that in certain neurocognitive disorders, the base rate and/or magnitude of impairment in some SCDs (as measured by their associated SCDTs) are greater than the base rate and magnitude of impairment in global intellectual functioning; and (b) that it is possible to develop SCDEMs for estimating prior SCD standing; that is, that prior SCD standing is, to some extent, predictable given an optimal set of variables sufficiently correlated with the SCD and sufficiently resistant to disruption by the neurocognitive condition. To the extent that either of these two assumptions fails to hold, the clinical utility of the method would be reduced or completely undermined.

PURPOSE OF THE STUDY
The present study attempted to provide an initial test of the proposed strategy's potential to improve the detection of cognitive decline over IQ in particular neurocognitive disorders. It addressed three related main questions. First, for specific disorders, are certain SCDs particularly effective in differentiating patients from controls? Second, would diagnostic accuracy for those disorders be improved over current, IQ-based methods should it be possible to estimate prior standing on the identified SCDs? Third, would the potential increase in diagnostic accuracy be sufficient to warrant development of SCDEMs? The results of this study were expected to help guide and prioritize the future development of SCDEMs.
The feasibility of the strategy was examined for mild AD, CAA, and mTBI. The method involved use of summary data reported in previously published studies of these disorders. The decision to study mild AD, CAA, and mTBI as opposed to other disorders was based on several factors: (a) these disorders are common reasons for referrals to neuropsychologists, (b) it may be difficult to detect the cognitive decline associated with them, (c) it may be hard to differentiate them from other competing etiologies, (d) the SCDT performance of patients with these disorders may be disproportionately impaired relative to their performance on IQ tests, and (e) these disorders can be associated with very high costs to individuals and to society.
For example, the annual incidence of TBI in the US is approximately 2 million, and the vast majority of cases are mild. Across all categories of TBI, the annual cost for acute care alone is estimated at $25 billion (Williamson, Scott, & Adams, 1996). Alcohol problems affect about 10% of the US adult population, or approximately 30 million individuals, and cost US society an estimated $100 billion annually in treatment and in lost productivity (Drug Abuse USA, 1996; Parsons, 1996; US Census Bureau, 1998).
Alzheimer's disease is a progressive degenerative brain disorder. Estimates of the incidence and prevalence of AD vary widely due to methodological differences across studies (Bondi, Salmon, & Kaszniak, 1996). However, it is estimated that approximately 10% of individuals over the age of 65 years have dementia, and that AD accounts for as many as 50% of those cases (Nixon, 1996). About 4 million Americans currently have AD, at a cost to US society of about $100 billion annually (Alzheimer's Association, 1999; Gilliard & Rabins, 1999). Given current prevalence and incidence rates, approximately 10 million people in the US may suffer from AD by the year 2030 (Woodruff-Pak, 1997). Diagnosis of AD is usually made by exclusion and can be confirmed only at autopsy (McKhann et al., 1984). Differential diagnosis of AD may be difficult at times. For example, research indicates that, of patients with dementia, fewer than 75% are correctly classified by dementia subtype (e.g., AD vs. vascular dementia) (Barrett, Haley, Harrell, & Powers, 1997; Chui & Zhang, 1997; Ryan, 1994). Accurate and early detection of AD is critical to initiating appropriate treatment early in the disease process, when it might be most effective (McLendon & Doraiswamy, 1999). This, in turn, may help to reduce the emotional and financial burden on families and on society.
The results of the study were expected to support the strategy. That is, it was anticipated that for each disorder being studied, analysis would show that one or more SCDs (as measured by their corresponding SCDTs) exceeded the accuracy of IQ measures. Further, it was hypothesized that this difference would be of sufficient magnitude (i.e., 10% or greater) to suggest that the development of SCDEMs is worthwhile, that is, could improve detection of cognitive decline and reduce false-positive errors, thereby increasing diagnostic accuracy. The 10% benchmark is somewhat arbitrary and is intended only as a rough guide.

METHOD
For the sake of clarity, a brief overview of the method will be presented first, followed by a more detailed description. viewed as a marker of the potential sensitivity of the SCD versus IQ to the disorder, and provided an indication of the relative ability of SCD D-scores versus IQ D-scores to detect cognitive decline associated with the disorder. The magnitude of any discrepancy was viewed as a measure of the potential improvement in diagnostic accuracy that might be obtained by developing SCDEMs.
Arguably, the strategy would have been tested more rigorously had actual SCDEMs been developed and examined for clinical utility across several disorders.
However, the effort involved in developing even a single SCDEM and examining its clinical utility in a single clinical sample with a single disorder is substantial. Moreover, it might well be premature to undertake such a laborious and potentially expensive project without first examining the viability of the strategy. Based on these guidelines, 22 studies (8 mild AD, 9 CAA, and 5 TBI) were consecutively selected, starting from an extensive search of approximately 800 PsycINFO and PubMed abstracts, and proceeding, where indicated, to a full review of the studies.
Many studies that initially appeared promising turned out to be unsuitable once the full report was examined. More studies were used than originally planned (22 vs. 15) in an effort to increase the trustworthiness of the findings, especially in light of some of the allowances that were made during study selection. A description of each selected study is provided in Appendix A.
The most common reasons for study exclusion were the absence of reported, or clearly reported, means and/or SDs; use of only one or two WIS subtests; use of experimental rather than standardized neuropsychological tests; use of heterogeneous clinical samples (e.g., a "brain damaged" or a "dementia" group comprising subjects with multiple etiologies), a particularly troublesome problem in identifying mild AD and mTBI studies; and use of summary scores (i.e., scores on several SCDTs collapsed into a single summary score). Other reasons for study exclusion included lack of clearly reported means and SDs (e.g., difficult-to-read bar graphs with error bands), omission of means and SDs for key tests, and very small sample sizes (i.e., < 5).

Selection of IQ and SCDT Distributions for Reconstruction
The IQ measures included VIQ, PIQ, and FSIQ from various versions of the Wechsler Intelligence Scales or, in one case, the Wechsler-Bellevue Intelligence Scale.
The Processing Speed Index was the only factor score selected for distribution reconstruction because of its reported high sensitivity to brain damage in general (Hawkins, 1998 Indexes (IMI, WMI, and GMI, respectively), thereby covering all of the theorized stages of memory processing. Distributions were reconstructed for other WMS-III indexes (e.g., Visual Immediate Index) when the reported means and SDs appeared as sensitive to the disorder as IMI, WMI, and GMI.

Correlations among the Selected SCDTs and IQ
Memory loss is a cardinal feature of Alzheimer's disease (Bondi et al., 1996; Nixon, 1996) and may be affected to a relatively greater degree than IQ in the early stages of the disorder (Bornstein & Chelune, 1988 Patients with AD can exhibit language impairment, particularly in naming and verbal fluency (Bondi et al., 1996; Nixon, 1996). Accordingly, distributions were reconstructed for the Boston Naming Test and for verbal fluency tests. The Boston Naming Test generally shows modest correlations with WAIS-III IQ (.38, .42, and .44 for VIQ, PIQ, and FSIQ, respectively) (Mitrushina et al., 1999; Psychological Corporation, 1997). The verbal fluency measures used here involve generating words starting with a specific letter or belonging to a designated category as rapidly as possible within a short time period. Fluency to letter cue is often considered to be an executive functioning task (i.e., initiation), whereas semantic fluency is more often considered to be a language task.
It was sometimes difficult to determine from the reported studies whether the letter cue or semantic cue task had been administered, and thus verbal fluency was grouped with other language measures. Verbal fluency is moderately correlated with WAIS-III VIQ, PIQ, and FSIQ (.61, .48, and .59, respectively) (Psychological Corporation, 1997).
Attentional abilities can also be impaired in AD (Lezak, 1995; Nixon, 1996). Part A of the Trail Making Test is often described as being sensitive to attention, visual scanning, eye-hand coordination speed, and information processing. In general, the correlations between this test and IQ scores range from modest to moderate (-.27 to -.46) (Psychological Corporation, 1997; Tremont, Hoffman, Scott, & Adams, 1998), although lower correlations have been reported (Yeudall, Reddon, Gill, & Stefanyk, 1987).
Alzheimer's disease is also associated with impairment in executive functioning and abstract reasoning (Bondi et al., 1996). Part B of the Trail Making Test is often viewed as tapping executive functioning; its correlation with IQ scores tends to be moderate (i.e., -.42 to -.66) (Psychological Corporation, 1997; Tremont et al., 1998), although lower correlations have been reported (Yeudall et al., 1987). The Porteus Mazes are considered to be sensitive to executive impairment, particularly planning ability (Lezak, 1995). Studies have suggested modest correlations between performance on the Porteus Mazes and IQ (e.g., Porteus, 1965; Watson & Klett, 1974). The Stroop Interference Task is considered to be a measure of complex attention and executive functioning. Successful performance requires inhibiting a prepotent response in favor of an unusual one. Correlations between the Stroop and IQ vary across studies but appear to be generally modest (Mitrushina et al., 1999; Spreen & Strauss, 1998).
Visuospatial impairment is also commonly seen in AD patients (Nixon, 1996).
The Benton Facial Recognition Test (BFRT) was designed to assess recognition of unfamiliar faces, or a sub-domain of visuospatial ability. Trahan's (1997) However, given their wide use in clinical practice, it was decided to reconstruct their distributions for separate evaluation. The DRS correlates moderately with WAIS-III VIQ, PIQ, and FSIQ (.59, .58, and .61, respectively) (Psychological Corporation, 1997).
As was the case with mild AD, correlations between IQ and the SCDTs selected for distribution reconstruction in the CAA and mTBI studies were generally modest (with some exceptions to be covered later), and it seems unnecessary to go into full detail.
However, a brief description, similar to that provided above for mild AD, of the cognitive dysfunction typically associated with CAA and mTBI helps provide a context for subsequent discussion.
Early in the course of abstinence, chronic alcohol abusers tend to exhibit mild-to-moderate impairment in executive functioning, particularly in abstract reasoning and problem solving. These patients also tend to show impairment in perceptual-motor and perceptual-spatial abilities, as well as in learning and memory (somewhat greater for nonverbal as opposed to verbal information) (Allen & Landis, 1997; Lezak, 1995; Parsons, 1996). Most studies show that cognitive functioning improves with sustained abstinence, but the recovery process may take up to 5 years (Parsons, 1996).
In the case of mTBI, many individuals perform in the impaired range on tasks of complex attention and cognitive set shifting. Deficits may also occur in learning and memory, and in abstraction and problem solving (Dikmen, Temkin, & Armsden, 1989; Kay, 1986; Levin, Benton, & Grossman, 1982; Levin et al., 1987). Decreased information processing efficiency is common in mTBI (Gronwall & Wrightson, 1974; Lezak, 1995). It may contribute to attenuated performance on other cognitive tasks, and might also underlie the memory and concentration complaints of mTBI patients (D. J. . Moreover, improvement in speed of information processing tends to parallel reduction in post-concussion symptoms (Dikmen et al., 1989; Gronwall, 1976). Post-concussion symptoms may include blurred vision, headaches, dizziness, anxiety, depression, and sleep disturbance (Levin et al., 1987; Rutherford, 1989). Results of well-controlled studies indicate that by 1 to 3 months post-injury, most mTBI patients do not differ to a statistically significant degree from matched controls on tests of neuropsychological functioning (e.g., Dikmen et al., 1989; Levin et al., 1987). Given that most mTBI and CAA patients improve or recover over time, the current strategy might be most useful in identifying cognitive decline during the period before recovery occurs, or in the minority of cases with persistent deficits.

Distribution Reconstruction and Analysis Procedures
Distribution Reconstruction Procedures
SCDT and IQ distributions were reconstructed based on reported IQ and SCDT means and SDs. All reconstructed distributions were arbitrarily set to a sample size of 100 hypothetical subjects, thereby permitting consistency and uniformity in the reconstructed distributions. Because the shape of a normal distribution is determined by its mean and standard deviation, increasing or decreasing sample size has little or no effect on the general shape of distribution curves. Few studies reported score ranges. Consequently, reconstructed distributions were set either to a test's entire range of scores, or, for tests with large ranges (e.g., WAIS IQ), were set to scores that fell within approximately ±3.5 to 4.0 SD of the mean. It is understood that this work involves approximations and that here and elsewhere, adjustments to the original data may be necessary, but might also alter, to varying degrees, the true nature of the underlying data. However, such approximations and adjustments are not at all uncommon in initial stages of development or in initial feasibility examinations. Further, in the present case, these assumptions generally do not work in favor of the hypotheses, and they permit a much broader analysis than would otherwise be possible.
The distributions were reconstructed using the STANDARDIZE and NORMDIST commands in Excel 98™ (Microsoft, 1997-98). The STANDARDIZE command returns the z-score for all test scores in a normal distribution with a given mean and SD. The NORMDIST command returns the cumulative area under the curve up to each score in a normal distribution. For example, scores that fall at 1 SD below the mean and 1 SD above the mean receive NORMDIST values of .16 and .84, respectively. Therefore, the area under the curve associated with a particular score (i.e., its score band, the interval between the next lowest and next highest scores) had to be calculated by subtraction as follows: $\text{Area}_{n} = \text{NORMDIST}_{n+1} - \text{NORMDIST}_{n-1}$, where $n$ indexes the score. However, summing across all of the score bands results in twice the area under the curve. This is because the score bands overlap such that 50% of the area of a score band also belongs to the next highest score band and the other 50% belongs to the next lowest score band. Accordingly, the area of each score band was divided by 2 before being multiplied by 100 (the number of hypothetical subjects in each sample). The resulting values provided a reconstructed normal frequency distribution with a mean and SD identical to those reported in the original study. The distributions were then graphed using Excel's graphing functions. For each single SCDT, the graphs of the reconstructed clinical and control distributions were then plotted and juxtaposed to aid visualization of the degree to which they overlapped.
Selected graphs are presented in Appendix B for purposes of illustration.
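The score-band procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the spreadsheet actually used in the study: the function names, the integer score grid, and the IQ-like mean of 100 and SD of 15 are assumptions made for the example, and the normal cumulative area is computed with the error function in place of Excel's NORMDIST.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative area under a normal curve up to x (the role NORMDIST plays in the text)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def band_area(score, mean, sd):
    """Area between the next lowest and next highest integer scores."""
    return normal_cdf(score + 1, mean, sd) - normal_cdf(score - 1, mean, sd)

def reconstruct_distribution(mean, sd, lo, hi, n_subjects=100):
    """Expected frequency of each integer score from lo to hi.

    Neighbouring score bands overlap, so each band area is halved
    before being scaled to the number of hypothetical subjects.
    """
    return {s: n_subjects * band_area(s, mean, sd) / 2.0 for s in range(lo, hi + 1)}

# Hypothetical IQ-like test: mean 100, SD 15, scores within 4 SD of the mean.
iq = reconstruct_distribution(100, 15, 40, 160)
total_subjects = sum(iq.values())                                  # close to 100
unhalved_area = sum(band_area(s, 100, 15) for s in range(40, 161))  # close to 2
```

Under these assumptions the halved bands sum to very nearly 100 hypothetical subjects, whereas the unhalved band areas sum to very nearly 2, which is the double counting described above.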

Calculating Percent Overlap
For each distribution, the test score closest to the point at which the two distributions intersected was identified (Intersect scores). The z-scores for these two Intersect scores were obtained using Excel's STANDARDIZE function. The area under the curve between the mean and the two z-scores was obtained from a published table (Berkowicz, Ewen, & Cohen, 1976,  for the mild AD group and 27.6 ± 7.8 for the control group. Accordingly, the graphs of the reconstructed distributions overlapped such that the mild AD distribution fell to the left of (i.e., was shifted lower than) the control distribution. The reconstructed distributions intersected at a TMT-A score of 21 seconds (Intersect score). For the mild AD distribution, the z-score for the Intersect score was .91. Reference to the table of the area under the normal curve revealed that 31.86% of that area falls between the mean and z = .91. As the mild AD group's Intersect score fell above the mild AD mean, the area of interest was that extending beyond (i.e., higher than) the Intersect score. Accordingly, 18.14% (i.e., 50% - 31.86%) of the mild AD distribution fell above the Intersect score.
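The tail-area step in this example, and the shared area it feeds into, can be checked with a short Python sketch. Two points are assumptions of the sketch rather than the study's procedure: the shared area is obtained by numerically integrating the lower of the two normal curves instead of by table lookup at the Intersect score, and the function names and the means and SDs in the final line are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of a normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def upper_tail(z):
    """Proportion of a normal curve lying above a given z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def percent_overlap(mean_a, sd_a, mean_b, sd_b, steps=20000):
    """Percent of area shared by two normal curves.

    Midpoint-rule integration of whichever curve is lower at each point,
    over a range extending 6 SD beyond both means.
    """
    lo = min(mean_a - 6 * sd_a, mean_b - 6 * sd_b)
    hi = max(mean_a + 6 * sd_a, mean_b + 6 * sd_b)
    width = (hi - lo) / steps
    shared = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        shared += min(normal_pdf(x, mean_a, sd_a), normal_pdf(x, mean_b, sd_b))
    return 100.0 * shared * width

# Tail area beyond z = .91, the value read from the published table in the text.
tail_percent = 100.0 * upper_tail(0.91)   # about 18.14

# Hypothetical groups with equal SDs whose means lie 1 SD apart.
example_overlap = percent_overlap(85, 15, 100, 15)   # about 61.7
```

When the two SDs are equal the curves cross once, midway between the means, and the numerical result agrees with summing the two tail areas at that crossing point. With unequal SDs the curves also cross a second time far out in the tails, so the two approaches can differ slightly.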
Each SCDT was grouped in only one SCD. These average percent overlap values for the SCDs were referred to as SCD indexes. Analogous IQ indexes were calculated. The SCD and IQ indexes provided a measure of the average extent to which clinical and control groups overlapped on IQ tests and on tests measuring the same SCD. As such, the indexes were also considered to provide a marker of the degree of SCD versus IQ sensitivity to the disorder. The smaller the value of the index, the greater the sensitivity.
SCDT and IQ indexes were as follows: (a) The ALL IQ index was the average percent overlap for all reconstructed IQ distributions (i.e., VIQ, PIQ, and FSIQ).

(j) The OMNIBUS index included the Dementia Rating Scale and Mini-Mental State Examination, which were used in some of the mild AD studies.
For each disorder, each SCDT index was subtracted from each of the two IQ indexes (ALL IQ and VIQ/FSIQ). The resultant values were considered to be markers of a SCD's sensitivity to the disorder relative to IQ. Positive values suggested that the disorder attenuated SCD to a greater extent than IQ; negative values suggested that IQ was more sensitive to the disorder. As a secondary analysis, Bonferroni-corrected pairwise t-tests were used to determine the degree to which the difference scores exceeded chance levels. This was intended to enhance confidence in the results but was not meant to replace the previously set 10% criterion, namely, that a difference between the SCDT and IQ indexes of 10 percentage points would be required to provide support for the rationale and for the development of SCDEMs.
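The index and difference-score arithmetic can be illustrated with a brief Python sketch. The percent overlap values below are hypothetical and are not taken from the study's tables; the dictionary layout, function names, and the ATTENTION grouping are likewise assumptions made for the example. The sketch covers only the averaging, the subtraction, and the 10-point benchmark, not the Bonferroni-corrected t-tests.

```python
from statistics import mean

# Hypothetical percent overlap values for each reconstructed distribution pair,
# grouped by index (illustrative numbers only).
overlaps = {
    "ALL IQ": [55.0, 48.0, 50.0],
    "VIQ/FSIQ": [55.0, 50.0],
    "MEMORY": [30.0, 22.0, 32.0],
    "ATTENTION": [47.0, 45.0],
}

CRITERION = 10.0  # percentage points; the rough benchmark described in the text

def index_values(groups):
    """Average the percent overlap values within each index."""
    return {name: mean(values) for name, values in groups.items()}

def sensitivity_differences(indexes, iq_names=("ALL IQ", "VIQ/FSIQ")):
    """IQ index minus SCD index; positive values favour the SCD."""
    return {
        scd: {iq: indexes[iq] - value for iq in iq_names}
        for scd, value in indexes.items()
        if scd not in iq_names
    }

indexes = index_values(overlaps)
diffs = sensitivity_differences(indexes)
meets_criterion = {
    scd: all(d >= CRITERION for d in by_iq.values()) for scd, by_iq in diffs.items()
}
```

With these illustrative inputs, the MEMORY index falls more than 10 percentage points below both IQ indexes and the ATTENTION index does not, which is the kind of pattern the benchmark was intended to flag.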

RESULTS
The percent overlap values for each pair of reconstructed clinical and control group distributions are presented by disorder in Tables 1, 4 Table 3 shows that the greatest difference occurred between MEMORY and the two IQ indexes (ALL IQ and VIQ/FSIQ). That is, across the mild AD studies, the amount of overlap between the reconstructed clinical and control distributions was considerably less for memory SCDTs than it was for IQ. The differences were statistically significant. This suggests that memory is more sensitive to mild AD than IQ is, and thus, that memory SCD D-scores might be more efficacious than IQ D-scores at detecting cognitive decline in mild AD and at improving diagnostic accuracy. Moreover, the magnitude of difference between the MEMORY index and the ALL IQ and VIQ/FSIQ indexes easily exceeds the 10% criterion for supporting the development of memory SCDEMs.
Table 2). Three studies (i.e., Haxby et al., 1990; Kirk & Kertesz, 1991; Petersen et al., 1999) reported results for the Dementia Rating Scale (DRS) and/or the Mini-Mental State Examination (MMSE). These are omnibus tests of cognitive functioning commonly used in the assessment and staging of dementia. Haxby et al. and Petersen et al. used control groups; Kirk and Kertesz did not. DRS norms from Schmidt et al. (reported in Spreen & Strauss, 1998) provided the comparison group for Kirk and Kertesz's sample.

IQ index minus SCDT index). Inspection of
The mean percent overlap value for the reconstructed omnibus distributions was 18%.
This was 33% less than ALL IQ and 36% less than VIQ/FSIQ (both significant at p < .05). This result suggests that omnibus measures may be more efficacious than memory measures in differentiating patients with mild AD from normals; however, the number of comparisons in the OMNIBUS index was quite small, thereby raising questions about the trustworthiness of the result.
The present results raise the question of whether memory SCDTs might also be efficacious in identifying patients with mild cognitive impairment (MCI). These patients are at increased risk for developing AD but do not yet meet diagnostic criteria for it (Jones & Ferris, 1999). This issue was addressed by reconstructing and analyzing memory and IQ distributions for a group of MCI patients and controls (Petersen et al., 1999). For the MCI group, the mean percent overlap was 86% for ALL IQ and 48% for MEMORY. Thus, there was 38% less clinical versus control overlap, on average, for the memory SCDTs than for the IQ measures. This result is not necessarily unexpected since MCI is typically defined by memory impairment in the context of relatively preserved functioning in other SCDs and in daily functioning. Nonetheless, the result supports the clinical utility of memory SCDEMs in MCI and suggests that memory SCD D-scores would be more effective, or much more effective, than IQ D-scores at detecting MCI-related cognitive decline.
Table 4 lists each of the selected CAA studies, summarizes the sample sizes, provides the means ± SD for each test selected for distribution reconstruction, and lists the percentage of shared area for each reconstructed distribution pair. Table 5 lists the mean ± SD percent overlap for each CAA index, and provides the number and range of the individual SCDT percent overlap values comprising them. The CAA OTHER index comprised the Grooved Pegboard Test and a verbal fluency test.
Also, the mean percent overlap for the EXECUTIVE index was 9% and 12% lower than for ALL IQ and VIQ/FSIQ, respectively (see Table 6). These differences approached or met the 10% criterion, but were not statistically significant. This result was, however, in the expected direction given reports of impaired executive functioning in CAA (e.g., Parsons, 1996).
It suggests quite tentatively that executive SCD D-scores might be more efficacious than IQ D-scores at detecting cognitive decline in CAA.      Table 5).
Two of the CAA studies reported scores for the Halstead Impairment Index and for a modified Halstead Impairment Index. Being omnibus indexes of cognitive functioning, they were treated separately from SCDTs in the present analysis. On average, these indexes had 10% less overlap than ALL IQ and 13% less than VIQ/FSIQ, suggesting that they are more sensitive to CAA than IQ. However, the small number of comparisons renders the results rather tenuous.

Mild Traumatic Brain Injury
Attentional SCDTs were somewhat more sensitive to mTBI than IQ measures. The difference approached the 10% benchmark when PIQ was removed from the IQ comparison. This result allows the tentative conjecture that attention SCD D-scores might be more efficacious than IQ D-scores in detecting cognitive decline associated with mTBI. The results also suggest that attentional SCDTs share more variance with PIQ than with VIQ/FSIQ. That SCD D-scores tapping attention might improve detection of cognitive decline in mTBI is consistent with reports in the literature describing impairments in such functions, and in other functions with a co-dependency on attentional capacities, such as working memory and processing speed (e.g., D. J. G. . None of the remaining SCD versus IQ comparisons approached the 10% criterion. Of interest, the relatively small differences that were obtained are not consistent with the marked effects sometimes assumed to occur within these cognitive domains in mTBI in comparison to changes in IQ. Table 2).

DISCUSSION
The present study was an initial feasibility test of a new approach to estimating prior cognitive abilities. The analysis examined the extent to which SCD D-scores might be more efficacious than IQ D-scores at detecting cognitive decline associated with particular disorders. The results could help to guide and prioritize the development of SCDEMs for improving diagnostic accuracy in specific disorders.
Across the mild AD studies, the mean percent overlap was 23% to 26% less for memory SCDTs than for IQ, a result that achieved statistical significance. That is, there was 23% to 26% less overlap, on average, between the reconstructed clinical and control SCDT distributions than there was between the reconstructed clinical and control IQ distributions. Larger differences between the mean SCDT and IQ percent overlap values indicate greater SCDT sensitivity to the disorder. Accordingly, the result suggests that memory D-scores might be more efficacious than IQ D-scores at detecting cognitive decline in mild AD and at identifying individuals with the disorder. This finding is consistent with Schlosser and Ivison's (1989) report that WMS MQ D-scores were superior to IQ D-scores for distinguishing AD patients from normal controls. The current results expand on Schlosser and Ivison's findings by providing the first systematic demonstration across different patient samples of the potential superiority of memory D-scores versus IQ D-scores for identifying cognitive decline in mild AD. Moreover, the result exceeded the 10% benchmark set as a rough guideline for determining whether further development of SCDEMs might be justified.
Some of the percent overlap values for memory SCDTs were quite small, thereby raising the question of whether dementia severity in some samples exceeded mild levels.
If so, the present result might provide an inflated estimate of the potential efficacy of memory D-scores. However, the finding that memory SCDTs had 38% less overlap than IQ measures, even among MCI patients, argues against a more negative interpretation.
Across the mild AD studies, the mean percent overlap was 33% to 36% less for omnibus measures than for IQ, although the limited number of individual comparisons in this analysis raises questions about the trustworthiness of the result. This caveat notwithstanding, the result indicates that omnibus measures were more sensitive than IQ to mild AD. The result probably follows from the wide net these measures cast in screening cognitive functioning. Because these measures tap cognitive functioning broadly, they are likely to capture impairment not only in memory but in other SCDs as well. Poor overall scores on these measures could be achieved via impairment in any SCD. Thus, although D-scores based on MMSE and DRS total scores may be useful for distinguishing patients with AD from normals, they may have less utility for distinguishing between AD and other neurocognitive disorders, especially when the pattern of cognitive impairment is a key differentiating feature.
The potential value of the strategy for identifying mild AD patients more efficiently may be considerable. For example, the present results showed that up to 26% more mild AD cases could be identified through the use of memory D-scores and appropriate cutoffs than might be identified using IQ D-scores. Given reported US incidence and prevalence rates for AD (Hebert et al., 1995), this result translates to annual potential identification of over 20,000 new mild AD cases that might otherwise remain undetected. Although this figure assumes certain optimal conditions, an increase in accuracy of 5% to 10% may well be realistic. Further, the MCI reanalysis demonstrated the possibility of identifying patients in a prodromal phase of AD, thereby providing the opportunity to introduce treatments that might slow or prevent conversion to AD.
Across the CAA studies, executive measures showed 12% less overlap than VIQ and FSIQ combined, tentatively suggesting that executive D-scores might be more efficacious than IQ D-scores at identifying cognitive decline in CAA. Strong correlations (i.e., colinearity) between IQ and some of the selected executive SCDTs, especially the Category Test, might well have constrained this examination of differential effectiveness.
Nonetheless, this result is consistent with research indicating diminished executive and abstract reasoning abilities in CAA patients (e.g., Parsons, 1996). In contrast, the finding that memory SCDTs were less sensitive than IQ to CAA was unexpected, especially in light of research indicating disproportionately decreased memory functioning in CAA (e.g., Parsons, 1996). The reason for this outcome is uncertain, but it might reflect some idiosyncrasies of the selected studies, or baseline IQ differences between the control and clinical subjects in these studies. Results suggesting that the combination of the Grooved Pegboard Test and a verbal fluency test was more sensitive than IQ to CAA are difficult to interpret because the two measures seem to tap different abilities. However, this result could reflect executive dysfunction (e.g., decreased initiation and impaired motor planning and sequencing) and/or generally reduced performance speed. Alternatively, the sensitivity of the combination of Grooved Pegboard and verbal fluency tests could be a reflection of both tests being valid yet non-redundant indicators of cognitive dysfunction in CAA. The reconstructed clinical and control Halstead Impairment Index (HII) distributions showed 10% to 13% less overlap than the reconstructed IQ distributions.
However, because HII is a composite index, it is unlikely to aid in differential diagnosis when the pattern of impairment is a distinguishing feature.
Although not always clearly specified, the severity of TBI across the selected studies appeared mild to moderate. Even so, the mean percent overlap values were generally similar for SCDs versus IQ, with discrepancies of less than 10%. In fact, constructional tests appeared less sensitive than IQ to mild to moderate TBI. For the TBI studies, attentional measures showed 6% to 9% less overlap than IQ. This result is congruent with previous reports of impaired attentional functions in mTBI (e.g., D. J. G. . However, the size of the difference is relatively small and suggests that attention D-scores would not provide much of an advantage over IQ D-scores at detecting cognitive decline in mTBI, at least for the attention SCDs included in the present study. The failure to find a more robust discrepancy between SCDs and IQ across the mTBI studies could partly reflect patient status; that is, many patients eventually recover following mild to moderate head injury, and there simply may have been relatively small overall differences to detect.
Finding that memory D-scores might exceed IQ D-scores at detecting mild AD was expected: memory impairment is the cardinal feature of mild AD, and memory measures are the most sensitive psychometric indicators of the disorder, even in its earliest stages (e.g., Bondi et al., 1996). Still, almost all of the research on estimating prior ability, even for detecting dementia, has focused on estimating prior IQ, thereby suggesting that any such expectancy has not guided research. The current findings suggest that research efforts aimed at improving AD detection may be more fruitful if directed at developing memory SCDEMs, rather than at prior IQ estimation. Indeed, it seems likely that more exact and refined methods than were used here might well improve the sensitivity of memory D-scores.
For example, predictive variables, or more exactly, postdictive variables, could be identified and combined through much more exact means, such as discriminant function analysis and multiple regression.
In contrast to the mild AD results, it was unexpected that memory SCDTs would be found to be less sensitive than IQ to CAA, and that attention SCDTs would turn out to be only very slightly more sensitive than IQ to mTBI. These counterintuitive results indicate that it may be difficult to identify accurately or optimally the differential cognitive effects and magnitude of impairment that best characterize a disorder and differentiate it from others. The method used in the present study provides a viable means of characterizing these patterns. That is, by refining and extending the method used here (i.e., calculating and comparing percent overlap values across multiple studies of a disorder), it might be possible to identify the dimensions and levels of functioning that best characterize disorders. Knowing these dimensions and levels and applying them to the formation of properly developed decision procedures could help considerably in discriminating between those with and without a particular disorder. When combined with prior test results or with estimates of prior SCD standing, such information could provide an important key to accurate diagnosis.

Limitations
The limited support for the strategy obtained in the CAA and mTBI analyses could reflect a number of methodological limitations. First, collinearity between IQ and some SCDT measures indicates that they may be, in large part, measuring the same function(s). Such collinearity likely limits the possibility of achieving or detecting a difference in sensitivity to disorder. Second, the tests used in the selected studies might not have been the most sensitive to CAA and mTBI. For example, the present results provided some very limited evidence that the Grooved Pegboard Test might be particularly sensitive to CAA, but only one study used this test. Also, a more difficult test of complex attention and information processing, such as the PASAT, might have been better able to separate mTBI and control groups. Third, the overall level of overlap in the SCDT and IQ distribution pairs across the mTBI and CAA studies was fairly high (see Tables 5 and 8). Although this may be a unique feature of the selected studies, it may also reflect a lack of strong SCDT and IQ performance decrements in these disorders. Fourth, although studies were screened for the presence of adequately diagnosed clinical samples, the results are nevertheless constrained by the appropriateness of the subject selection procedures used in the original studies. For example, the inclusion of non-demented subjects or subjects with other types of dementia could have affected the outcome of the mild AD analysis. Similarly, the inclusion of mTBI subjects who were nearly recovered from their injuries could have decreased effects.
The lack of more robust support for the strategy in the CAA and mTBI analyses does not discredit the strategy for improving detection of cognitive decline in particular disorders. Indeed, formal selection and combination of the variables that, based on comparison of predicted and obtained performance, optimally discriminate between those with and without a disorder may improve diagnosis considerably. Accordingly, the strategy might have some utility in CAA and in mTBI if variables with more discriminative power, either singly or in combination, were identified. Moreover, the strategy may be useful in diagnosing other disorders that disrupt cognitive functioning.
The CAA and mTBI results do suggest, however, that efforts to improve IQ estimation should not be broadly or completely abandoned in favor of developing SCDEMs.
Normative samples were sometimes used as substitutes for control groups. Given sufficient demographic similarity, normative samples should provide a reasonable representation of the clinical group's prior ability level. However, this procedure is suboptimal because differences may exist between any given selected sample and a normative sample. For example, contamination could have occurred between the demographic variables used to form the normative groups and the outcome variables in the present study. Such effects may be difficult to detect, however, and it seems unlikely they would have consistently altered results in a particular direction, even if present.
All of the reconstructed distributions were normal, and it was not possible to account for skewness, or for ceiling and floor effects, in the original data. When ceiling and floor effects were present in the original data, the reconstructed distributions were truncated accordingly at either end of the score range. This truncation was seen in a minority of cases, suggesting that it probably did not affect the results greatly. Further, ceiling and floor effects generally indicate that the clinical and control groups would have performed even more disparately if the test had contained sufficiently easy or difficult items. Consequently, ceiling and floor effects seem unlikely to have caused systematic underestimation of percent overlap, in turn spuriously supporting the strategy, although firm conclusions are difficult without the original raw test data.
Although the methods followed in this study were imprecise and often involved estimations and approximations, they seem to have been sufficient to achieve the investigative aims. It is not unusual in exploratory work, across both the soft and hard sciences, to set up approximations, make educated guesses, and proceed in the face of considerable ambiguity. Certainly, the proposed strategy could be tested more rigorously by developing and using actual SCDEMs. However, such an endeavor was well beyond the scope of this particular project, especially given the general lack of SCDEMs and the need for a broader initial analysis.
One curious and remarkably consistent result was that, in all but two comparisons, SCDT indexes were 3% more sensitive relative to VIQ/FSIQ than relative to ALL IQ. This implies that SCDTs consistently share slightly greater variance with PIQ than with either VIQ or FSIQ. The result is generally consistent with previous reports that PIQ may be more sensitive to brain damage than VIQ (Gouvier et al., 1983). No particular source of methodological artifact could be identified that could have produced this result.

Summary
The results of this initial feasibility study provide relatively strong support for the proposed strategy for one of the three disorders, specifically, for the development of memory SCDEMs for detecting mild AD. The main result suggests that memory D-scores might be more efficacious than IQ D-scores at detecting cognitive decline in mild AD, and at improving detection of the disorder. The results of the CAA and mTBI analyses provide less, or considerably less, support for the strategy. For these two conditions, SCDEMs, at least in the cognitive domains or via the composites examined here, might not meaningfully improve diagnostic accuracy beyond properly derived IQ D-scores. The lack of robust support for the strategy in CAA and mTBI might reflect several factors, including, but not limited to, collinearity between IQ and some of the SCDTs selected for distribution reconstruction. The CAA and mTBI results do not discredit the strategy, as it may well have utility under different circumstances or with different disorders. Moreover, the strategy and the method used to test its feasibility may help prioritize and guide the development of SCDEMs in other neurocognitive disorders.
Even if it turns out that SCDEMs cannot be viably developed, the current method could help to identify which cognitive variables, in which combination, and with which D-score cutoffs best distinguish among neurocognitive disorders. This could lead to more accurate characterization of disorders and could be key to improving diagnostic accuracy in certain areas of neuropsychology.

Implications and Possible Future Directions
Diagnostic accuracy is often essential to treatment and care planning and may serve to improve patient outcomes, lessen caregiver stress, decrease financial strain on patients and their families, and reduce public health care burden. The strategy developed and tested here provides a potential means of improving diagnostic accuracy, at least for mild AD. This is important because early detection of AD may permit implementation of newly available treatments at a stage of illness where they are likely to be most effective at slowing cognitive decline.
Determining the potential clinical utility of SCD D-scores more accurately requires development and testing of actual SCDEMs. The current results suggest that priority might be placed, at least initially, on the development of memory SCDEMs.
Results of previous studies offer a potential guide for developing viable SCDEMs (e.g., Crawford et al., 1992; Crawford et al., 1998; Hawkins et al., 1993; Schlosser & Ivison, 1989). The first step would involve a search for a set of variables that are strongly related to SCD performance, yet are relatively unaffected by the disorder in question. Once identified, these variables could be combined via multiple regression. The validity of SCDEMs could be initially and tentatively assessed using cross-sectional designs; however, retrospective designs would be necessary to firmly test their validity.
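The regression step outlined above might be sketched as follows. This is a minimal illustration only: the predictors (years of education and a word-reading score), the memory outcome, all data values, and the function names are hypothetical, and the sign convention for the D-score (estimated minus obtained) is an assumption. Ordinary least squares is fit in plain Python so that the example is self-contained; it is not a description of any published SCDEM.

```python
def fit_ols(predictors, outcome):
    """Fit ordinary least squares by solving the normal equations (X'X)b = X'y."""
    X = [[1.0, *row] for row in predictors]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, outcome))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot = A[col][col]
        A[col] = [v / pivot for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]  # [intercept, b1, b2, ...]


def estimate_score(coefs, row):
    """Estimated (prior) SCD score for one person's predictor values."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


def d_score(coefs, row, obtained):
    """Estimated minus obtained score (assumed sign convention)."""
    return estimate_score(coefs, row) - obtained


# Hypothetical normative sample: (years of education, word-reading score).
normative_predictors = [(12, 40), (16, 50), (10, 35), (14, 48),
                        (18, 55), (12, 45), (16, 52), (10, 38)]
normative_memory = [48, 58, 44, 55, 63, 51, 60, 45]

coefs = fit_ols(normative_predictors, normative_memory)
# Hypothetical patient: 14 years of education, reading score 46, memory score 38.
print(round(d_score(coefs, (14, 46), obtained=38), 1))
```

In practice, the fitted equation would need to be cross-validated, and the D-score would be interpreted against the standard error of estimate rather than read as a raw difference.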

APPENDIX A Characteristics of the Selected Studies
Mild Alzheimer's Disease Studies
Bigler, Hubler, Cullum, and Turkheimer (1985) obtained brain CT scans from patients with early AD. Based on these scans, they estimated ventricular volume and calculated an index of cerebral atrophy. They examined the relationship between these measurements and WAIS and WMS performance. There were 42 subjects, 23 men and 19 women, with a mean education level of 13.1 ± 3.5 years and a mean age of 67.9 ± 9.9 years. There was no control group. Although not specified in the report, it appears that subjects were drawn from university medical centers in the western United States. Bigler et al. (1985) reported WAIS IQ indexes and WMS MQ only. In the present study, the subjects were reasonably similar to the WAIS old-age standardization sample in terms of age and education, although they were older than the WMS standardization sample. This is not expected to have affected the results significantly because MQ is age corrected and because Bigler et al. reported MQ scores, not raw scores.
Botwinick, Storandt, and Berg (1986) followed 18 subjects with mild AD for four years. Subjects' annual performance on 16 cognitive tests was compared with that of 30 control subjects matched on age, sex, and socioeconomic status. In the present study, distributions were reconstructed based on the means and SDs reported from the first of the annual evaluations. WAIS FSIQ scores were prorated from reported means and SDs for the WAIS Information, Comprehension, Digit Symbol, and Block Design subtests. Regarding potentially misleading overlap values and possible cut points in the reconstructed (normal) distributions: 78% of the mild AD but only 15% of the control groups fell below a VF score of 22 words; 75% of the mild AD but only 8% of the control groups fell below a score of 58 on the 85-item BNT; and 49% of the mild AD and only 8% of the control groups fell below a Benton Visual Retention Test copy score of 8.
Haxby et al. (1990) conducted a longitudinal study of regional cerebral metabolic rates and neuropsychological functioning in 11 patients with mild AD and 29 controls matched on age, sex, and education. Mean follow-up duration was 26 months. In the present study, distributions were reconstructed based on the reported means and SDs from the initial evaluation. Regarding potentially misleading overlap values and possible cut points in the reconstructed (normal) distributions, 61% of the mild AD group but only 5% of the control group scored above 149 seconds on the TMT-B.
Kirk and Kertesz (1991) compared the spontaneous drawings of 38 patients with probable AD with those of 39 controls. The groups did not differ significantly in terms of age or education. The severity of dementia in the AD group was not specified but, given their capacity to complete the research test battery, the patients were probably mildly to moderately impaired, not severely impaired. The control and AD patients' drawings were compared; however, the control group's IQs and MQ were not reported. Therefore, for the present study, the reconstructed mild AD IQ and MQ distributions were compared to reconstructed distributions based on the WAIS-R and WMS standardization samples. The AD group appeared to be reasonably similar to the standardization samples demographically. Distributions for the drawing scores were not reconstructed because the variables on which they were scored seemed unique to this study. The authors also administered the DRS to patients, and this was treated separately.
The Psychological Corporation (1997) conducted a number of small studies aimed at characterizing the WAIS-III/WMS-III performance of patients with various diagnoses, including mild AD, CAA, and TBI. The WAIS-III/WMS-III was administered to 35 mild AD subjects with a mean age of 72.2 years (SD = 7.8). This group was better educated than the same-aged subgroup of the standardization sample: nearly half (48.6%) of the mild AD subjects had at least 16 years of education, whereas only 14% of the same-aged standardization sample fell in this education range. The higher education level of the mild AD group likely did not unfairly favor the research hypothesis. Lower versus higher education levels are generally associated with higher rates of dementia and with relatively greater impairments at comparable levels of pathological process (Bondi et al., 1996). Consequently, patients with comparable levels of mild AD who have more education would be expected, on average, to score better on cognitive tests than those with average or below average education levels. Given their higher education level, one might expect that the prior IQ level of this particular mild AD sample was somewhat higher than the standardization sample mean of 100 (SD = 15). Accordingly, it seems unlikely that comparing the reconstructed mild AD distributions to the WAIS-III/WMS-III standardization sample's mean and SD would have biased the results toward a spurious conclusion of exaggerated decline in the mild AD group. For the Psychological Corporation (1997) study, distributions were reconstructed for the WAIS-III VIQ, PIQ, and FSIQ indexes. The WAIS-III Working Memory Index (WMI) was omitted in favor of the WMS-III WMI because the latter included a measure of non-verbal attention span that is not included in the WAIS-III WMI. Moreover, examination of the mild AD group's reported WAIS-III WMI and WMS-III WMI means and SDs revealed that the latter was a more sensitive measure. The VCI and POI indexes were omitted because examination of the reported mild AD means and SDs revealed that these indexes were not as sensitive to mild AD as was the PSI index. Inspection of the reported means and SDs for the mild AD group clearly indicated performance below or far below the standardization group on most WMS-III indexes.
Petersen et al. (1999) characterized patients with mild cognitive impairment (MCI) using a combined cross-sectional and longitudinal design. They compared performance on neuropsychological tests for 234 controls, 76 patients with MCI, and 106 patients with either very mild AD or mild AD. Scores on the Clinical Dementia Rating scale for these groups were 0.5, 0.5, and 1.0 for MCI, very mild AD, and mild AD, respectively. Patients were assigned to groups using diagnostic consensus. However, the basis for differentiating the very mild AD patients from the MCI patients was unclear. For the present study, distributions were reconstructed for the 48 subjects with very mild AD, a decision that reflected the concern in the present study with improving methods for early detection of AD. The very mild AD group was younger and less educated than the control group. Although these differences were statistically significant, their magnitude was relatively small (very mild AD group: mean age = 75.6, mean education = 12.5; control group: mean age = 79.8, mean education = 13.3).

Chronic Alcohol Abuse Studies
Barron and Russell (1992) sought to determine whether a common alcoholic WAIS pattern resulted from right hemisphere damage and whether it could be characterized as loss of fluid intelligence. They compared patients with either right hemisphere damage, left hemisphere damage, or alcoholism to normal controls matched on age and education. There were 40 subjects in each group. The alcoholic subjects were inpatients at least 35 years old who had been drinking heavily for at least 20 years. At the time of their participation in the study, all subjects had been abstinent for over 2 weeks following detoxification.
Using neuropsychological tests and PET imaging, Dao-Castellana et al. (1998) investigated the possible presence of frontal dysfunction in neurologically normal, chronic alcohol abusers. Subjects were 17 chronic alcohol abusers and nine same-aged controls. The CAA group had a 13 ± 9-year history of drinking, with an average recent daily alcohol consumption of 243 ± 126 grams. The control group had a significantly higher education level than the CAA group, which the authors thought might reflect the reportedly poor social and professional adaptation of the CAA patients. The lower education level of the CAA group may have biased results in favor of the present study's hypothesis because education is positively correlated (more or less strongly) with several cognitive tests. However, examination of the percent overlap values (Table 4) suggests that this potential biasing effect had minimal impact on the overall results. The only IQ index reported by the authors was WAIS FSIQ, and the only memory measure was WMS MQ. Also reported were VF and Stroop interference and error scores. Of note, 14% of the CAA subjects and 24% of the control subjects fell below a Stroop error score of zero in the reconstructed distributions. Also, about 61% of the CAA subjects and only 1% of the control subjects had Stroop error scores greater than three.
Jones and Parsons (1971) investigated abstraction ability in matched groups of CAA, brain-damaged, and control subjects (n = 40, each group). The alcoholic subjects averaged 7.23 years of heavy drinking. They were tested after withdrawal symptoms had subsided, with an average interval between admission and testing of 42.43 days. The WAIS FSIQ mean and SD were reported for the control group. For the CAA group, WAIS FSIQ was estimated using the Shipley-Hartford test. The Category Test was used to measure abstraction ability.
Long and McLachlan (1974) compared matched controls and alcoholic patients on multiple cognitive tests and on a general index of cerebral dysfunction (Modified Halstead Impairment Index). The alcoholic subjects had been drinking heavily for about 9 years. All had been detoxified at the time of testing. The average duration from admission to time of testing was 11.41 days.
Oscar-Berman, Clancy, and Weber (1993) evaluated discrepancies between IQ and memory performance in alcoholic men. Subjects were 59 men divided into four groups: young and old alcoholics, and young and old normal controls. All alcoholic subjects had been drinking for at least 5 years and had been abstinent for at least 4 weeks prior to testing. All participants were from similar socioeconomic backgrounds and did not differ on level of education. The authors compared the groups on the WAIS and WAIS-R IQ indexes and on the WMS and WMS-R. For the purposes of the present study, only the more contemporary WAIS-R and WMS-R indexes were compared.
The Psychological Corporation (1997) reported on the WAIS-III/WMS-III performances of 28 CAA patients with a mean age of 53.3 ± 10.2 years. The duration and severity of alcohol abuse in the CAA subjects were not specified; all were detoxified prior to testing, although the time between detoxification and testing was not specified. Overall, the CAA group was better educated than the same-aged subgroup of the standardization sample; however, most subjects in both groups had 12 years of education. Distributions were reconstructed for VIQ, PIQ, FSIQ, and the PSI factor score. Distributions were also reconstructed for the WMS-III IMI, GMI, WMI, and for the Visual Immediate and Visual Delay Indexes.
Smith, Burt, and Chapman (1973) sought to determine if CAA patients from middle and upper socioeconomic backgrounds showed the same pattern of neuropsychological impairment found in two previous studies of CAA patients from lower socioeconomic backgrounds. Subjects were males aged 35-55 years admitted to a private hospital for treatment of alcoholism. The sample size could not be determined. The duration and severity of alcohol abuse were not specified. All patients had been detoxified prior to testing, but the duration of abstinence was not specified. Subjects were reported to have above average IQ and education levels. A control group of similar age and education, but with even higher IQs, formed the comparison group.
Wilson, Kolb, Odland, and Whishaw (1987) compared patients with CAA (n = 49) to a control group (n = 60) and to groups of patients with discrete unilateral frontal, temporal, or parietal lesions on a series of neuropsychological tests. The CAA and control groups were comparable in terms of average age (41.4 and 41.9 years for CAA and controls, respectively). The control group had a higher education level (11.8 years vs. 10.2 years for the CAA group), a small but statistically significant difference (based on t-tests conducted by the present author). This difference could have favored the research hypothesis in the present study. The CAA subjects had a 5-to-20-year history of alcohol abuse; all were abstinent for more than 2 months at the time of testing.
Yohman, Parsons, and Leber (1985) compared 37 middle-aged alcoholics to 20 non-alcoholic controls matched on age and education. Neuropsychological tests were given 7 weeks and 13 months after detoxification. For the present study, test results from the first assessment were selected for distribution reconstruction because some subjects reportedly resumed drinking at some point prior to the 13-month follow-up testing. The mean alcohol consumption of the CAA group was 9.0 ounces for 13 years. For this study, IQ scores were prorated from reported WAIS Comprehension, Digit Symbol, Digit Span, and Block Design subtests.
Mild Traumatic Brain Injury Studies
Bassett and Slater (1990) evaluated 19 adults with mTBIs and 10 with severe TBIs. Distributions were reconstructed only for the mild injury group and for matched control groups. The average duration between the time of the injury and testing was 3 weeks. IQ was measured with either the WAIS-R or the WISC-R. The percent overlap values for the reconstructed distributions are presented in Table 7. Of note, 43% of the mild CHI group and 8% of the control group produced a non-perseverative error score of 10 or greater on the WCST.
Corrigan, Agresti, and Hinkeldy (1987) sought to re-examine and extend previous research on the psychometric properties of the Halstead Category Test (CT). Subjects were drawn from a rehabilitation setting that included patients with closed head injury, likely rather severe in nature. The mean age of the CHI group was 27.70 ± 10.03 years. The authors did not use a control group. Therefore, for the present study, the CT and IQ performances of the CHI group were compared to those of normative samples. Dodrill's 1987 norms (reported in Mitrushina et al., 1999) were used for comparison on the CT because this sample was reasonably similar to Corrigan et al.'s in terms of age and education. Corrigan et al. reported results only for the WAIS-R PIQ and VIQ indexes, not for FSIQ.
Johnstone, Hexum, and Ashkanazi (1995) administered the WRAT-R Reading subtest (WRAT-R) as an estimate of prior overall cognitive ability. They then subtracted WRAT-R z-scores from cognitive test z-scores, thereby obtaining difference z-scores that were intended to estimate decline in intelligence and SCDs following TBI. Subjects were 97 TBI patients referred as outpatients for neuropsychological evaluation. Subjects' mean age was 33.24 ± 1.28 years; their mean education level was 12.59 ± 2.39 years. According to the authors, lack of information precluded a grading of injury severity. There was no control group; therefore, reconstructed distributions for the WAIS-R (only VIQ and FSIQ were reported) and WMS-R were compared with the tests' standardization samples. Means and SDs were also reported for the TMT-A and TMT-B. Johnstone et al. used norms published by Fromm-Auch and Yeudall (1983) in their conversion of TMT-A and TMT-B raw scores to z-scores. Accordingly, these norms were used for comparison with the reconstructed TMT distributions of the TBI group. Distributions were not reconstructed for the WRAT-R because it was used to estimate prior IQ. As such, the authors must have expected performance on it to be relatively preserved following TBI. Of note, 59% of the TBI group and 13% of the standardization group fell above a score of 35 seconds on the TMT-A; 82% of the clinical group and 6% of the standardization group fell above a score of 82 seconds on the TMT-B.
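The cut-point percentages reported in this appendix follow from applying the normal cumulative distribution function to each reconstructed distribution. A minimal sketch, assuming normality, is given below; the function name and the means and SDs are hypothetical placeholders rather than values from any selected study.

```python
from statistics import NormalDist


def percent_beyond_cutoff(mean, sd, cutoff, above=True):
    """Percentage of a normal distribution falling above (or below) a cutoff."""
    proportion_below = NormalDist(mean, sd).cdf(cutoff)
    proportion = 1.0 - proportion_below if above else proportion_below
    return 100.0 * proportion


# Hypothetical completion times in seconds on a timed test (higher = worse).
clinical = (40.0, 20.0)    # (mean, SD) of the reconstructed clinical group
comparison = (25.0, 9.0)   # (mean, SD) of the comparison or normative group
cutoff = 35.0

for label, (mean, sd) in (("clinical", clinical), ("comparison", comparison)):
    print(label, round(percent_beyond_cutoff(mean, sd, cutoff), 1))
```

Because the reconstructed curves are unbounded, such percentages can include implausible score regions (for example, negative times or error counts), which is one reason the reported cut points are best read as approximations.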
The Psychological Corporation (1997) published results on the WAIS-III and WMS-III performances of 22 patients with moderate-to-severe TBI (closed head injury). Subjects had a mean age of 26.9 ± 11.5 years. The duration between time of injury and testing was 6-18 months. Reconstructed TBI distributions were compared to the WAIS-III/WMS-III standardization sample. This was an appropriate comparison given reasonable demographic congruity between the clinical and standardization samples. Distributions were reconstructed for the three main IQ indexes and for the PSI. For the WMS-III, distributions were reconstructed for WMI, IMI, and GMI. Additionally, distributions were reconstructed for the Visual Immediate and Visual Delay indices, given results suggesting that these indices might be very slightly more sensitive to TBI than some of the other WMS-III indices.
Uzzell, Dolinskas, and Langfitt (1988) studied the impact of visual field defects (VFDs) as sequelae of head injury on neuropsychological functioning. Subjects were 159 head-injured patients classified into four groups on the basis of the presence or absence of VFDs and according to injury severity (minor-to-moderate or severe). Uzzell et al. did not use a control group. Distributions were reconstructed for the minor-to-moderate head-injured groups only. The percent overlap values for the reconstructed distributions are reported separately for the VFD and non-VFD groups in Table 7. Reconstructed distributions were compared to the WAIS and WMS standardization samples, given reasonable demographic congruity between these samples and Uzzell et al.'s. Also, given reasonable demographic congruity between samples, Dodrill's 1987 norms (reported in Mitrushina et al., 1999) were used as the comparison group for the TMT-A and TMT-B.