Development and Initial Validation of a Structured Interview for the Detection of Malingered Mild Head Injury

Neuropsychologists often assess individuals who are involved in litigation and who may be in a position to obtain significant financial compensation. Malingering measures are often employed in these assessments; however, the accuracy of these measures may be less than satisfactory. Most malingering measures are not targeted to mild head injury, a very common type of claim. The goal of this research was to begin the development and validation of a measure specifically designed to detect malingering of mild head injury symptoms. More accurate assessment can help identify false claims and also establish the merit of genuine claims. A 472-item questionnaire was developed by three writers based on knowledge of head injury sequelae and malingering detection techniques. This questionnaire contained both common, plausible symptoms following injury and implausible symptoms. Malingering individuals may not only overendorse true symptoms, but may also endorse implausible symptoms or deny positive attributes in an attempt to appear impaired. The ability of this questionnaire to discriminate between individuals answering honestly and individuals faking mild brain injury was evaluated in a classic simulation study with a university sample. Participants (N = 330) were instructed either to respond honestly to the items (honest responders) or to respond as if they had sustained a mild brain injury and were attempting to demonstrate deficits in order to receive financial compensation (malingering responders). Malingering responders scored significantly and substantially higher than the honest responders on the questionnaire. Cutoff scores accurately classified approximately 95% of participants, with high rates of sensitivity and specificity. Results of cross-validation analyses indicated that total scores on the questionnaire may have good cross-situational consistency.
In addition, an analysis of within-subject response consistency over a relatively short period of time suggested that individuals answering honestly produced more consistent responses over time than malingering responders. Overall, this measure shows promise in discriminating honest responders from those feigning mild brain injuries. Future research will focus primarily on the questionnaire's ability to distinguish individuals with true head injury sequelae from other populations, such as those involved in litigation and individuals with disorders similar in symptom presentation to mild head injury (e.g., toxin exposure and depression).

Acknowledgment

There are many people who have given me their time and support during this process. My sincere appreciation, respect, and thanks go to Dr. David Faust, my major professor, for all of his guidance over the years. I would also like to thank my committee members, Drs. Dominic Valentino and Hesook Suzie Kim, for their insights and input into this project. Drs. Allan Berman and James Campbell have always been there to support me, and I thank them for being part of my defense committee. Brett Plummer and David Strong deserve special thanks for all of their help; this project would have been much more difficult without them. A very warm thank you goes to the faculty, graduate students, and staff in the Department of Psychology and the Counseling Center for everything they have done for me; I will miss you very much. I would also like to thank my wonderful friends for all of their encouragement and understanding. And last, but never least, I want to thank my parents and family for all of their love and support throughout my life.


Statement of the Problem
In neuropsychology, the problem of malingering has been given much attention in recent research. Malingering, as defined by the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV), is the "intentional production of false or grossly exaggerated physical or psychological symptoms, motivated by external incentives such as avoiding military duty, avoiding work, obtaining financial compensation, evading criminal prosecution, or obtaining drugs" (DSM-IV; American Psychiatric Association [APA], 1994, p. 683). Whether or not the DSM-IV description is deemed acceptable, at least two dimensions seem essential to the definition of malingering: intentionality and misrepresentation of symptoms. A person can be considered to be malingering when he or she is intentionally misrepresenting symptoms of a disorder.
Both intentionality and misrepresentation appear necessary because, for example, information can be misrepresented unintentionally secondary to normal human shortcomings or the presence of disorder, versus the feigning of disorder. For example, someone may have difficulty recalling the specifics of an accident due to a concussion and resultant post-traumatic amnesia. In addition to the two dimensions noted, the definition of malingering is value laden. An individual may pretend to be unconscious in order to avoid being further injured in a robbery attempt. This example does not capture the usual, intended meaning of malingering, or at least the type we are most concerned about societally. Malingering involves deceit for some type of personal gain, often at the expense of some other party. A more appropriate example would be someone feigning symptoms of brain damage to obtain some form of compensation for their injury, be it a case involving worker's compensation, personal injury, or social security disability.
Neuropsychologists often assess individuals who are in litigation and who may be in a position to obtain significant financial compensation. Research suggests that normal individuals simulating brain impairment on neuropsychological instruments can produce profiles that trained professionals judge to reflect genuine abnormality, and that the rate of error in detecting malingering may be considerably greater than one would hope.
Malingering measures are increasingly implemented in forensic assessments; however, the accuracy of these methods may be less than satisfactory, or their status remains experimental. Most present malingering measures are not targeted at mild head injury, a very common type of claim, or they assess cognitive performance exclusively. It is highly unlikely that those malingering mild head injury will limit their "efforts" to cognitive measures; rather, they will also feign symptom reports. In fact, some individuals will malinger exclusively when reporting symptoms, as might be the case when certain cognitive shortcomings are present but are falsely attributed to an accident. Further, much research across wide areas in psychology shows that maximal judgment accuracy is typically achieved by properly combining multiple, non-redundant sources of valid information (although perhaps fewer, or far fewer, such variables than many suppose). For these reasons, it seems likely that an interview format specifically aimed at identifying malingering of mild head injury could prove helpful. The goal of the current research program was to begin the development of such a measure. More accurate identification of veracity can help not only to identify disingenuous claims, but also to establish the merit of genuine claims.
Intended Purpose of the Measure
The current research aims to create a questionnaire that can distinguish between individuals who present feigned versus genuine symptoms of mild head injury. The measure is not presently intended to necessarily differentiate the factors underlying such misrepresentation. As mentioned, there are numerous reasons, both unintentional and deliberate, why information may be misrepresented. For example, as defined in the DSM-IV (APA, 1994), Somatoform Disorders involve the presentation of physical symptoms that lack an adequate, or any, medical basis. Individuals with such disorders are preoccupied with having a disease and may grossly misperceive, unintentionally, their own bodily sensations or level of functioning. Thus, a deviant result on the measure may well signal the likelihood of over-reporting or misrepresentation, but not necessarily distinguish among possible causes. Whether the measure has the potential to make such differentiations, for example, by examining response patterns in conjunction with other sources of information, will require research beyond the scope of this project.
Individuals may use various approaches when attempting to fake disorder. For example, an individual may malinger based on his or her own general knowledge of a cooperation. Many cases of head injury are embroiled in legal claims with large financial stakes, raising the concern that individuals will exaggerate or fake disorder, and do so successfully, given the current state of diagnostic methods. Alternatively, and of at least equal concern, individuals with legitimate injuries may be misclassified as malingerers.

Approaches of Malingerers
It is difficult to measure malingering because, among other things, not all malingerers are likely to deceive in the same way. Individuals who are asymptomatic may fake symptoms in order to deceive. Others may exaggerate symptoms that they have, or pretend that symptoms are still present after these problems have abated. Individuals may misrepresent the true cause of their symptoms. For example, an individual may have true memory deficits, but she may try to convince professionals that the deficits are due to a head injury when they are really due to a long history of substance abuse and were present before the injury. A person may intentionally deceive by claiming a false baseline, for example, reporting that he had an excellent memory pre-injury, when in reality the opposite is true. The intent is to create the impression that a decline in functioning has occurred when it has not. A person may attempt to fake a disorder by denying positive attributes; for example, she may assert that everything in life was ideal before the head injury, and that everything is now problematic. Finally, a person may not put forth adequate effort on testing in order to appear impaired. For example, he or she may not work as fast as possible on timed measures, or may deliberately miss items. It cannot be assumed that these and other approaches are all highly associated with each other, that is, that a person using one such strategy will equally employ various other strategies as well, or that individuals would all use the same strategies.

Approaches to Detecting Malingering
If individuals malinger in different ways, it appears unlikely that narrow approaches to malingering detection that are designed to uncover only one or a few strategies will achieve adequate sensitivity. Multiple methods, when combined or integrated properly (e.g., via actuarial methods), will most often be more accurate than a single method alone. Currently, most malingering detection methods can be roughly classified as falling within one of two groups: a) objective personality tests (which can be considered a type of structured interview method), for which scores are usually tallied and compared to normative standards, and b) performance-based measures. Current interview methods are primarily aimed at detecting individuals malingering psychopathology. On the other hand, most performance measures are used for detecting feigned cognitive deficits.
The Minnesota Multiphasic Personality Inventory (MMPI) is the most frequently used method for detecting individuals faking psychiatric disorder. The F scale of the MMPI, for example, capitalizes on stereotypes regarding psychopathology, the idea being that malingerers will endorse items based on faulty knowledge. The F scale contains items that describe symptoms that laypersons are prone to falsely associate, or over-associate, with psychopathology. A markedly elevated score on the F scale may be indicative of malingering. Meta-analyses looking at MMPI malingering scales identified the F scale as among the best measures of over-reporting, and meta-analyses of the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) have yielded similar results, although there may be some shift in the relative sensitivity of the various malingering scores, and more so in the normative standards that should be applied (Rogers, Sewell, & Salekin, 1994). An earlier study suggests that the MMPI may be useful in identifying individuals feigning neuropsychological deficits, and, in fact, the MMPI is often used as a malingering measure in neuropsychological evaluations.
Subsequent studies, however, have produced mixed results, with a general association usually found between faking neuropsychological disorder and elevations on MMPI scales, but with considerable variation in the strength of these associations (e.g., see Berry & Butcher, 1998). There are probably complex methodological reasons for these varying findings, but they again suggest heterogeneity in malingering strategies, and the fact that a malingering measure primarily relying on overendorsement of the emotional sequelae of head injury (at least as presently designed) will tend to miss malingerers who do not emphasize both direct neurological and affective components.
A second interview approach, which has been constructed specifically to detect malingering, is the Structured Interview of Reported Symptoms (SIRS). The SIRS was constructed for use with psychiatric patients. Unlike the self-administered MMPI, the SIRS is administered in a structured interview format. It consists of eight primary scales and five supplementary scales, which are intended to tap into different deception strategies an individual may use and different approaches to detection. For example, the scales include rare or improbable symptoms, symptom combinations that rarely occur together, and symptoms that untrained individuals are likely or unlikely to associate with psychopathology. High endorsement on these types of symptom scales points toward possible feigning. Like the F scale, the SIRS capitalizes on an individual's possible stereotypes and misinformation. It also takes into account consistency of responding within the measure and atypical changes in the course of the disorder. Thus, construction of the measure took into account the possible heterogeneity of malingering strategies and hence the need for multiple approaches to detection. This measure has been validated on a variety of populations; however, it was not designed to assess malingering of neuropsychological dysfunctions and, therefore, has not been appraised for this purpose.
As noted above, most measures for feigned neuropsychological dysfunction target cognitive performances. One such approach is to evaluate whether the individual is performing as well as she can on a specific measure. Symptom validity testing is based on this approach. Symptom validity testing, sometimes referred to as "forced-choice testing," presents an individual with a forced-choice task, and by one possible criterion, the person's "effort" is considered inadequate if it falls significantly below chance levels.
If a person produces scores significantly below chance levels, then he or she must be consciously answering incorrectly in order to appear impaired; that is, production of below-chance performance requires recognition, and withholding, of correct responses.
These methods have been used for various sensory and cognitive deficits, but are most often used to assess memory impairments. One strength of this method is that a positive result, that is, when an individual's performance falls well below chance levels, strongly suggests a purposeful attempt to fake impairment and is relatively unambiguous. One criticism of this method is its transparency. Individuals who are attempting to deceive may discern the detection strategy and perform normally on the measure, or at least not below chance levels. Research, in fact, tends to show low sensitivity (Rogers, Harrell, & Liff, 1993). Further attempts at refinement, such as varying the number of potential responses to make cognitive tracking of performance more difficult, may well lead to increases in detection rate. Also, attempts have been made to adjust cutoff scores to increase sensitivity without impacting too severely on specificity. For example, even individuals with serious brain injury tend to miss few items, and cutoff scores that identify implausibly poor performance show some promise.
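The below-chance criterion described above can be illustrated with a one-sided binomial test. The following is a minimal sketch, not a procedure taken from any specific published test: it assumes a two-alternative forced-choice task (chance = 0.5) and a conventional significance level of .05, and the function names are hypothetical.

```python
from math import comb

def below_chance_p(correct, trials, chance=0.5):
    """Probability of obtaining `correct` or fewer hits by guessing alone.

    Assumes independent trials with a fixed guessing probability `chance`
    (0.5 for a two-alternative forced-choice task).
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct + 1)
    )

def is_below_chance(correct, trials, chance=0.5, alpha=0.05):
    """Flag performance that is significantly worse than guessing.

    `alpha` is an assumed conventional threshold, not a value from the text.
    """
    return below_chance_p(correct, trials, chance) < alpha
```

Under these assumptions, a score such as 30 correct out of 100 two-choice trials would be flagged, whereas a score near 50 would not. This also shows why the criterion tends to have low sensitivity: a malingerer who simply performs at or near chance is not identified.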
Some researchers have examined whether detection strategies can be designed for, or derived from, current neuropsychological batteries.
Reitan and Wolfson's promising approach measures inconsistencies in responding that are not expected on a neurological basis and can result from an individual's inability to recall the pattern of "dysfunction" he or she feigned previously, secondary to normal limits of human memory. Reitan and Wolfson examined both the consistency of scores and item responses on retesting. Their research suggests that individuals involved in litigation are less likely to produce consistent scores on retesting than those not involved in litigation, mainly due to a tendency to fail items previously passed, whereas true deficits following head injury are more likely to show consistency or improvement over time in line with the expected course of recovery.

The Current Research Project
As noted, it is unlikely that all individuals with intent to deceive will use the same approach to appear impaired. Currently, there are several methods that aid in the detection of malingered psychological disorder (e.g., the MMPI scales and the SIRS).
There are also several methods aimed at detecting malingered cognitive deficits (e.g., symptom validity testing, and examining for performance inconsistencies on neuropsychological tests). Given that individuals may choose to malinger only some types of symptoms, on only certain measures, it is most likely that multiple detection methods, properly combined (e.g., via actuarial judgment), will achieve greater accuracy than a single measure. Currently, there is no method that specifically capitalizes on what is known about the symptoms and course of mild head injury. The goal of the current research is to begin development of an interview-based measure that is similar in format and strategy to the SIRS and the F scale of the MMPI, but that is targeted specifically at the detection of feigned impairment from mild head injury.
The present research included three phases: a) preliminary item generation, b) a community survey to aid in further item generation and refinement, and c) an initial validation study, hereafter referred to respectively as Item Generation, Community Survey, and Validation Study. In the Item Generation phase, items were generated based on literature about the sequelae following mild head injury, malingering detection strategies, and information bearing on public conception and misconception of head injury.
For example, misconceptions about mild head injury might well be endorsed by individuals attempting to feign symptoms, but not by those with genuine symptoms.
The second phase of this research, that is, the Community Survey, was designed to further knowledge regarding lay views about head injury. This information was to be used to further design and refine "expectation-based" items. Past studies have examined laypersons' knowledge of head injury, but have involved impressions about very general symptoms (e.g., headaches) rather than about more specific symptoms that can be used to capitalize on malingerers' faulty beliefs and lack of knowledge. The current Community Survey included common symptoms of mild head injury, and also symptoms which, in actuality, are highly unlikely to occur after a head injury but might be perceived as plausible. The survey also examined beliefs about the course of these symptoms over time, including, in particular, when the symptoms will improve, if ever, again exploring for possible common misconceptions. For example, although symptoms following mild head injury usually improve over time and often remit, malingering individuals may think that various symptoms will worsen over time.
This Community Survey also explored laypersons' ability to differentiate symptoms of mild to moderate head injury from other disorders that can affect neuropsychological functioning, specifically dementia and depression. Symptoms of mild head injury are often ambiguous and nonspecific, as are symptoms of many disorders. For example, attention and concentration problems may be present in cases of brain damage, physical illness, or psychological disorder. Although asking about possible symptoms of head injury, previous surveys with laypersons have not evaluated the ability of individuals to distinguish between disorders. Thus, for example, the seemingly accurate or informed responses sometimes obtained in those surveys may or may not reflect genuine knowledge of head injury as opposed to relatively undifferentiated stereotypes about neurological disorders. As the main purpose of this part of the Community Survey was separate and apart from the dissertation, it will not be described further.
Following the creation and refinement of items, the third phase of the research, the Validation Study, was conducted. Using a college student population in a classic simulation study, half of the participants were instructed to respond to the questionnaire items honestly, and the other half to respond as if they had sustained a mild head injury and were attempting to demonstrate deficits in order to receive financial compensation.
This phase of the research was intended to evaluate the potential discriminatory power and characteristics of the items.
These steps in item generation and validation, which comprise the dissertation, are certainly not viewed as complete development and testing of this measure, although they would seem to be reasonable starting points. Ultimately, the intended purpose of the measure is not to distinguish between normal individuals faking or not faking disorder, but between individuals feigning the symptoms of head injury and those with genuine head injury. It was hoped that as part of this early validation work, a small group of head-injured individuals could be recruited to begin this next stage of research. Given the difficulty of recruiting such a sample, and the intended and realistic scope of a dissertation, it was described as a supplemental study to be conducted if feasible, and was not part of the formal research proposal. Participants for this supplemental study were recruited through undergraduate psychology courses, and were individuals who had experienced genuine head injuries. Participants in this phase of the research were asked either to answer the questionnaire honestly regarding any current symptoms resulting from their head injury, or, alternatively, to exaggerate their symptoms, as if they were pretending to have more severe deficits in order to maximize financial compensation for their injury. The analyses for this study were originally intended to examine whether the items could distinguish between the two response sets, and to investigate the properties of the scale. Unfortunately, not enough individuals with recent head injuries were recruited to complete this supplemental study, and therefore it will not be described further.
Phase I. Item Generation

Method
Procedure. Item generation was initially conducted independently by three individuals including the author, David Faust, Ph.D., and David Strong, Ph.D. (a local psychologist who is working with the research team). Items were based on information regarding possible symptoms of mild to moderate head injury and approaches to identify malingerers. In order to generate a diverse pool of potential questions, items were not examined together until each writer had generated a relatively large number of items.
Items were combined to form an initial item pool of approximately 400 items. It was unclear in advance what questions would be more or less useful, and consequently, at this point, inclusion of possible items was more important than the exclusion of items. All of the items were worded in order to use a dichotomous response format. A dichotomous format reduces administration time, while still providing adequate variability given the large number of items to be initially examined. Whether the response format might need to be changed for some or all of the questions was to be decided as information was gathered about the measure.
Items were grouped by content area in order to facilitate revisions. The initial item pool was reviewed for coverage of the content domain. There was a wide variety of items covering probable and improbable cognitive, emotional, and somatic symptoms that could arise following head injury. Many of the items were designed to appear plausible to laypersons, but, in truth, were improbable or highly improbable judging from scientific knowledge about head injury. An example of such an improbable item is: "When people speak too fast, I get dizzy." Also, a large number of items were designed to be rather extreme, such as, "My coordination is so bad that I can hardly pour a drink without spilling it." These improbable and extreme items were included because they may seem plausible to malingerers, given stereotypes about head injury. Further, because such items are likely to be endorsed rarely by honest responders with true head injury, even modest endorsement rates by malingerers should contribute to accurate identification. Items also addressed behavioral changes, normal human shortcomings, and socially desirable characteristics. Items such as these were included because, along with over-reporting symptoms, malingerers may fail to acknowledge normal human shortcomings in an effort to present themselves as overly positive, moralistic citizens. Several items were added to address detection approaches and content areas that seemed to be underrepresented in the initial sample of items, e.g., items regarding course of symptoms.
The item pool also contained a number of items which, in actuality, are not uncommon symptoms following head injury, such as sensitivity to light and concentration problems. These items may be useful in identifying individuals with true symptoms and tracking their progress over time, as well as identifying malingerers. Although, as noted, these items are not uncommon among those with true head injury, the heterogeneity of symptomatology makes it unlikely that any one individual will experience more than a minority of the overall symptom group. Therefore, malingerers might be detected by an overly broad endorsement of such items.
All items from the Community Survey (see below) were added to the initial item pool. The content of the Community Survey items was preserved; however, the format of these items needed to be changed to parallel the initial item pool questions. For example, the item, "Feeling numbness or tingling in hands," was rewritten as, "I feel numbness or tingling in my hands." The item pool was carefully reviewed by each writer for clarity and redundancy.
As the items were to be used initially with individuals who had not experienced a head injury, it was important to consider the use of the word "injury" in items and the time frame used. It was decided that items referring to the injury should specify a time frame, such as: "My eyesight is the same as it was before my injury," and "Since my injury, my eyes have been very sensitive to light." Instructions given to the non-head-injured participants provided a time frame. Of the items that referred to an injury, 35 did not seem suitable for this change in format, an example being, "I have flashbacks of my head injury." These 35 items were all placed at the end of the questionnaire, and, as necessary, participants were provided with further instruction on how to answer them (see further below). The majority of the items did not refer to an injury, and were written in the present tense, for example, "My vision is often blurry." Two writers (M. A. and D. S.) reviewed and discussed each item and decided which to change, delete, or keep. A large number of items from the Community Survey replaced nearly identical items in the initial pool due to redundancy. By retaining the Community Survey items, the data from the community sample could be compared with the university sample at a later point in time, if desired. Several items which were nearly identical were dropped or rewritten to maintain some degree of uniqueness. However, numerous items that were related or similar to other items were retained, for example, the items, "I have no trouble understanding what people say," and "I can usually follow conversations." This overlapping content created the potential to examine response patterns for consistency in order to identify such possible characteristics as carelessness in completing the inventory, as can be done with the MMPI and MMPI-2.
The third writer then reviewed the items, revising several items and recommending some slight revisions in wording. Also, some items were made more extreme in content, thereby potentially reducing the number of normal or genuinely head-injured individuals endorsing the item. For example, rather than, "At times my eyes lose focus," which is not uncommon among normal individuals, the item was reworded as, "I can't focus my eyes at all," which is uncommon, certainly for the general population and even for individuals with mild head injury. The items were also reviewed for grammatical errors and clarity by two other individuals who did not write any items, and some minor changes were made. The result was a total of 472 items, which comprise the Traumatic Brain Injury Questionnaire (TBIQ).
The order of the TBIQ items was randomized and then reviewed to avoid similar items appearing adjacent to one another. When this occurred, items were moved to another position in the questionnaire. As noted above, 35 particular items referring to an injury were placed at the end of the measure. The order of these 35 items was also randomized.
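The ordering procedure described above can be sketched in code. This is only an illustration under stated assumptions: the source does not say how randomization was carried out or how similarity was judged, so the function names, the fixed seed, and the idea of tagging each item with a content-area label are all hypothetical. The sketch shuffles the general items and the injury-referenced items separately, keeps the injury items at the end, and lists adjacent pairs that share a content area so that a reviewer can move them by hand.

```python
import random

def build_order(general_items, injury_items, seed=0):
    """Shuffle both item sets separately; injury-referenced items go last.

    `seed` is an assumed parameter used only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    general = list(general_items)
    injury = list(injury_items)
    rng.shuffle(general)
    rng.shuffle(injury)
    return general + injury

def flag_adjacent_similar(order, content_area):
    """Return index pairs of neighboring items that share a content area.

    `content_area` is a hypothetical mapping from item to a topic label;
    flagged pairs are candidates for manual repositioning.
    """
    return [
        (i, i + 1)
        for i in range(len(order) - 1)
        if content_area[order[i]] == content_area[order[i + 1]]
    ]
```

With 472 items, of which 35 refer to an injury, `build_order(range(437), range(437, 472))` would return an ordering whose last 35 positions hold the injury-referenced items.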
Answer key. The three item writers developed an a priori answer key for the TBIQ. Two of the individuals reviewed the questionnaire independently and keyed the items in the direction they believed a malingerer would most likely answer. They agreed on 423 of the 472 items, or at a rate of about 90%. The 49 items on which they disagreed were reviewed by the third writer for additional input. The third writer felt confident in keying only 29 of the 49 items. These 29 additional items were then keyed in the direction of two-thirds agreement, resulting in 452 total keyed items. It was decided not to key the other 20 items because of their ambiguity. Many of these items involved unwillingness to admit to normal human shortcomings (i.e., involved social desirability). It was difficult to anticipate how they might be answered by the current research sample, and there is reason to believe that for these particular items, a converse pattern of responding might be obtained with other samples.
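The keying counts reported above can be checked with simple arithmetic. The short sketch below uses only the figures given in the text (472 items, 423 agreements, 29 items resolved by the third writer); the function name and the returned field names are hypothetical.

```python
def keying_summary(total_items, agreed_by_two, resolved_by_third):
    """Summarize the answer-key counts from the two-rater and third-rater steps."""
    keyed = agreed_by_two + resolved_by_third
    return {
        "initial_agreement": agreed_by_two / total_items,  # proportion of items
        "disputed": total_items - agreed_by_two,
        "keyed": keyed,
        "unkeyed": total_items - keyed,
    }

summary = keying_summary(total_items=472, agreed_by_two=423, resolved_by_third=29)
```

Note that this is simple percent agreement between the first two raters; it is not corrected for chance agreement, and the source does not report a chance-corrected index.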

Results
Reading level analyses. The TBIQ was checked for reading level. Given the intended population, the aim was to avoid reader requirements exceeding about an 8th grade level. Using a computer-based reading analysis program, the TBIQ had a Flesch-Kincaid Grade Level (Flesch, 1948) of 5.5, meaning that someone with a 5th to 6th grade education would understand the questionnaire as a whole. This reading score is based on the average number of syllables per word and the average number of words per sentence. The Flesch-Kincaid Reading Ease for the TBIQ was 74.5. This score is also based on the average number of syllables per word and the average number of words per sentence on a 0-100 point scale, with higher scores indicating that more people could easily understand the document. Most standard writing has scores averaging in the 60-70 range on this scale. These analyses indicated, fortunately, that an average person with at least an 8th grade education should be able to easily understand the items.
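For readers unfamiliar with these indices, the sketch below shows how both scores are computed from word, sentence, and syllable counts using the standard published formulas. The source does not identify the program that was used or how it counted syllables, so the function names are hypothetical and any counts passed in are illustrative rather than the actual TBIQ counts.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula (higher scores mean easier text)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula (approximate U.S. grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Both formulas depend on the same two ratios, words per sentence and syllables per word, which is why short first-person items with simple vocabulary yield a low grade level together with a high reading ease score.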

Phase II. Community Survey
A survey of community members (lay individuals) regarding the symptoms of head injury was conducted. The original intent was to aid in item generation and refinement prior to the next phase of research. Community members might well have misconceptions regarding the symptoms of head injury. Items could then be generated based on these misconceptions, with the assumption that individuals who have not directly experienced a head injury, or who are exaggerating the effects of a head injury, i.e., malingerers, will endorse these symptoms, aiding in their identification.

Method
Participants and setting. The sample consisted of 40 community members, an adequate sample size given the exploratory purpose of this work. These participants were recruited from two community settings that were likely to yield a sample resembling the overall population of individuals who sustain mild head injuries. For example, mild head injuries occur disproportionately among those of below-average socioeconomic status and educational levels. However, research samples in related "community" studies have often involved college students. Community members were recruited at the local Department of Motor Vehicles (DMV) and a local shopping mall after approval was obtained from the respective administrative offices. Potential shopping mall participants were offered a $5.00 reimbursement for their time. Unfortunately, the DMV administration did not allow participants to be reimbursed.
It was made clear to all potential participants that participation was voluntary, with no penalty for choosing not to participate, and that the research was being conducted by a graduate student at the University of Rhode Island in order to meet degree requirements.

Informed consent was obtained for all participants.
Participation was open to individuals between the ages of 18 and 65. Recruitment was limited to this age group for the following reasons: a) the participants were old enough to consent to participate without parental consent, and b) individuals over 65 are much less frequently involved in mild head injury litigation and, therefore, are not the population of greatest interest. Before obtaining consent, potential participants were asked about their age if it was not obvious that they were between the ages of 18 and 65.
Several individuals expressed interest in completing the survey, but did not meet the age requirements.
Participants ranged in age from 18 to 60 years old, with a mean of 32.7 years.
Mean education level was 13.4 years, with a range of 10 to 19 years; approximately 53% of the sample had an education level of 12 years or below. Few participants were professionals, and only 22.5% had college or master's degrees. On the last day of data collection at the shopping mall, several women were denied participation in order to equalize the number of male and female participants.
Numerous individuals declined to participate for various reasons, including: the survey would take too long; they did not have the time; they were uninterested; and a few people at the Department of Motor Vehicles declined because they wanted to be paid for their time. Also, several individuals could not participate because they were unable to read English with sufficient understanding to fully comprehend the materials.
Not all participants recruited at the DMV were able to complete the survey before it was their turn to receive service. The large majority of these individuals subsequently returned to complete their surveys, but several of them did not because they could not offer any more of their time. Therefore, there were some missing data; one participant did not complete the second scale, and two participants did not complete the last page of the participant information sheet. Only one survey given at the local mall was not completed, as the participant failed to complete a page, perhaps inadvertently. One individual at the DMV took the survey, saying she would complete it while waiting for her turn, but never returned. Therefore, one additional participant was recruited at the mall and given the same version of the survey that was not returned. Nineteen individuals fully participated at the DMV, and 21 at the mall, for a total of 40 participants.
Although no formal statistics were maintained, it seemed easier to recruit participants at the mall, perhaps due to the reimbursement offered. Also, the mall setting most likely contributed to individuals' willingness to participate in that they were not waiting to be served, as were individuals at the DMV. Therefore, it is possible that the participants recruited at the mall had different motivations to complete the survey than those recruited at the DMV.
Materials. All materials for this phase of the research can be found in Appendix A, including the informed consent form, the instructions used by the researchers, the Community Survey (including case scenarios and scales), and a participant information sheet.
The Community Survey comprised three short case scenarios and two rating scales. One case described an average, healthy individual, one described an individual with a mild brain injury, and one described either an individual with depression or an individual with dementia. A main focus of the Community Survey was potential differences in responses to the average case and the mild brain injury case. Given concerns about possible order effects, the order of presentation for these two cases was alternated within the survey. The final case, which involved either depression or dementia, and which was examined largely for separate, exploratory purposes, was always placed last. This was done to minimize the possibility that rating this third case would influence or contaminate ratings of the other two cases. Half of the surveys had the depression case, and the other half had the dementia case, resulting in four versions of the Community Survey.
The first rating scale of the Community Survey contained items which described common symptoms following head injury, very uncommon or improbable (but seemingly plausible) symptoms following head injury, and common symptoms of dementia and depression. The intent was to determine if individuals could accurately rate the likelihood of these symptoms following a mild head injury. This scale also included a short section containing several questions about symptom course for the mild brain injury case. The second rating scale was comprised of items referring to activities common to everyday living, with which most healthy individuals have no significant difficulty. For each item, participants were to rate how much trouble, if any, an average person, a person with a mild brain injury, and a person with either depression or dementia would have performing the referred to activity.
The participant information sheet contained questions regarding demographic information, a brief head injury screen, and questions regarding participants' fund of knowledge of head injury and depression or dementia.
Procedure. Data were collected by two of the aforementioned researchers (M. A. and D. S.), and a research assistant trained to administer the survey. The researcher identified herself or himself to potential participants and invited them to participate in a research study investigating people's knowledge of head injury. They were informed that the research was part of degree requirements for the Department of Psychology at the University of Rhode Island, and that the research involved a short survey, which would take about 20-30 minutes to complete.
Participants read an informed consent form, which they could keep for future reference. They were informed that names would not be associated in any way with their data, in order to maintain confidentiality and anonymity. Any questions about the study requirements or the use of the data were answered at that time. More specific questions regarding the nature of the study were addressed at the completion of the survey, in order to avoid possible biasing effects.
Participants were told that the study would involve reading three short cases and responding to two rating scales. Each participant was provided with one of the four versions of the Community Survey. These four versions were alternated before distribution, and each participant received the version on the top of the stack. An equal number of the four versions were administered. Participants were invited to ask any questions they had about the definitions of words in the case or study materials. No participant asked for clarification of any words or cases.
Participants were instructed to first read the three cases, and then to answer the rating scales. On the first scale, participants were asked to rate the likelihood of each symptom for an average person, a person with a mild brain injury, and an individual with either depression or dementia, depending upon which case they read. On the second scale, they were instructed to rate how much trouble, if any, an average person, a person with a mild brain injury, and a person with either depression or dementia would have performing the referred-to activity. Several individuals did not fully understand the instructions. In these cases, the instructions were clarified until it was clear that understanding was achieved. Following the completion of the Community Survey, participants were asked to provide demographic variables for descriptive purposes. It was made clear that providing this information was voluntary, but that the researchers would find it helpful. The participants were also screened for a history of head injury, and asked to provide information regarding their fund of knowledge of head injury and depression or dementia.
Any questions the participant had about the study were then answered.

Results
The original purpose of the Community Survey was to generate ideas for potential items or item refinement based on lay misconceptions about head injury. This phase of the research was delayed due to uncontrollable factors, such as the decision of mall operators to delay the time of data collection. Given other practical constraints, it was decided to proceed with the Validation Study before the completion of the Community Survey. It was of course unknown which items in the survey might be most useful prior to collecting the needed data. However, at this stage of research, excluding potentially useful items from the TBIQ was considered more problematic than including less helpful ones.
Therefore, all the items in the Community Survey were added to the item pool as described above. Given that the data collected through the Community Survey was not used to create the TBIQ, it was not analyzed as part of the current research project.

Method
Participants. Data were collected from 330 university students during this phase of the research. These participants were recruited through undergraduate psychology courses. Two participants were not recruited through their courses. One was another university student who came with a friend to a data collection session, and one was an individual visiting a friend who decided to participate. Students recruited through their psychology courses were given class credit for their participation. The amount of credit was determined by the professor for their course. Each participant was given a consent form as proof of their participation. It was made clear to all potential participants that participation was voluntary and there was no penalty for choosing not to participate. No data were collected from any individual without informed consent.
Given population base rates, it seemed highly likely in advance that some potential participants would have suffered a head injury. The main purpose of the Validation Study was to compare non-head-injured individuals responding honestly to those faking or exaggerating symptoms. Individuals with a history of head injury might answer the TBIQ differently than those with no such history, and thus were to be excluded from the study.
Consequently, all potential participants were screened for head injuries, and only individuals with no prior head injuries, or insignificant, uncomplicated injuries (defined as a loss of consciousness or post-traumatic amnesia of only a few seconds) were included in the main study. Of the 330 participants included in the study, 271 individuals indicated no prior head injuries, and 59 indicated injuries with loss of consciousness or post-traumatic amnesia of only a few seconds.
All participants were over the age of 18. Recruitment was limited to this age group so that the participants could consent to participate without parental consent. No one was excluded from this research due to their gender, culture, or socioeconomic status.
Participants ranged in age from 18 to 39, with 70% being either 18 or 19 years of age. There were approximately three times as many females (77%) as males (23%). This is consistent with the gender ratio among the students in the large psychology lecture classes, from which most of the participants were drawn. It would have been preferable to recruit as many, or more, males than females given the sociodemographics of mild head injury. However, such considerations were overridden by the need for a sufficient sample size to conduct the planned analyses and the impracticality of recruiting a sufficient number of males. About 60% of the participants were in their freshman year, 27% in their sophomore year, and about 13% were upperclassmen.
Materials. All materials for this phase of the research can be found in Appendix B, including the informed consent form, a head injury screen, the instructions given to the participants, the TBIQ (including a demographic information sheet), and the postexperimental questionnaire.
The consent forms and head injury screens were distributed together. After completing the screen, participants were given packets of information containing, in order, written instructions, the TBIQ items, and the demographic information sheet. Two versions of this packet were used, one for the honest responders and one for the malingering responders. These packets differed only in the instructions given to participants. As described in the Item Generation section, individuals instructed to answer the questionnaire honestly needed additional instructions for the last 35 items on the questionnaire. Therefore, the packet of information given to these individuals had initial instructions preceding the TBIQ items, and additional instructions inserted before the first of these last 35 items. The full instructions for the malingering respondents preceded the items. The post-experimental questionnaires, which were used to evaluate how closely participants followed instructions, were distributed separately, after participants had completed the TBIQ. Researchers in malingering detection who have used post-experimental questionnaires have found that some people have difficulty faking, or cannot think of a strategy to use, and therefore do not follow instructions to malinger. Those individuals who did not follow instructions were not included in the data analyses. Also, the post-experimental questionnaire asked the participants about the strategies they used to fake their responses.
Procedure. Data collection was conducted in groups. As participants arrived for the study, they were given a copy of the informed consent form and a head injury screen.
They were instructed to read the consent form and, if they chose to participate, to then answer the head injury screen. Any questions they had about the materials were addressed. The consent form indicated that the research project was investigating head injury and would involve answering a questionnaire, under conditions of anonymity and confidentiality. The participants were not asked to provide their names on any of the study materials. The head injury screen required participants to provide their initials and the last four digits of their social security number. This identification code was transferred to all of the participant's research materials. This code was used to match screening information with the rest of the data, and also to match original data to retest data if the participant decided to return for the retest session, as described below. Upon completion of all data analyses, this code was removed from the data to preserve confidentiality and anonymity. After the head injury screen was completed, participants were assigned to either the "NORMAL" group or the "MALINGERING" group. Participants were handed a packet of materials pre-coded with their group identifier and their identification code, which had been transferred from their head injury screen. The packet contained all necessary instructions, the TBIQ items, and the demographic information sheet. They were told that if they had any questions during the course of the study, they should speak with the researcher. Only a few individuals asked for clarification of the materials. All questions were addressed to the extent possible during the study. Some questions could not be answered fully at the time of the study because answering them might have interfered with either the experimental manipulation or the retest session, if participants decided to return. All participants did have a phone number to contact the researcher at a later point in time, if they so chose.
Participants all completed the TBIQ, but under one of two different sets of instructions. The NORMAL group was instructed to answer the questionnaire honestly, describing how they have felt over the last 2 years. As described above, 35 items, which were placed at the end of the questionnaire, did not fit this format. For these items, additional instructions were provided. They were to assume they had sustained a head injury 2 years prior. However, they were further told to assume that the injury caused no problems and they were to answer the items accordingly.
The MALINGERING group was instructed to pretend that they were in a car accident 2 years ago, in which they suffered a "mild brain injury," but from which they had recovered. They were asked to pretend that they were suing the driver who hit them, and to assume that as part of their lawsuit, they were being sent to a health professional to evaluate their case. They were instructed to answer the questionnaire as if they were trying to convince the professional that they were still suffering from problems. They were informed that in cases such as these, financial compensation is greater when impairments are more severe. They were to make their responses look as impaired as possible, but not to the point that they would not be believed.
These two sets of instructions were alternated before distribution, and each participant received the packet on the top of the stack. An equal number of participants were assigned the MALINGERING and NORMAL instructions. There were 165 participants in each group.
A demographic sheet was attached to the questionnaire. Participants were instructed to complete this sheet after they had finished the questionnaire. Upon completion of the questionnaire and demographic sheet, these materials were returned to the researchers, and participants were asked to complete the post-experimental questionnaire.
After they completed the study, participants were given a sheet of information encouraging them to return for a retest session in 14 to 20 days. This sheet provided the days and times in which they could return. Several professors requested a record of their students who participated in the study in order to properly assign extra credit. At the completion of the study, students wrote their names and the names of their instructors on a separate sheet of paper. This list was copied for professors, but was never, in any way, associated with the data. Also, the researcher signed and dated the participant's consent form, to be provided to their instructor for proof of participation.

Retesting
Retesting was conducted to examine response consistency as a potential additional, or adjunct, way to appraise exaggeration.
Participants. A total of 40 participants returned for a retest session 14-20 days from the date of the initial session, 20 from the NORMAL group and 20 from the MALINGERING group. Participants received additional course credit for their second participation. Participants who attended data collection sessions in the last 2 weeks of the semester were not asked to return, given time constraints. However, several participants who wished to arrange a retest session after formal data collection had ended were accommodated when possible. Those who could not be accommodated were given the names of several other researchers to contact if they wished. In this way, they were given an opportunity to still receive extra credit.
Participants ranged in age from 18 to 39, with 73% being either 19 or 20 years of age. About 20% of the participants were in their freshman year, 53% in their sophomore year, and about 27% were upperclassmen.
Materials. All the materials used for the retest were exactly the same as those used for the Validation Study.
Procedure. When participants returned for the retest session, they were asked to provide their identification number (initials and four digits of their social security number).
The researcher checked to see if 14-20 days had passed since the time of their first participation and checked on their original group assignment. Several participants returned too soon, and were asked to return on another day. Returning participants were given an informed consent form and asked to complete the head injury screen a second time. The participants were then given the packet of information with the same instructions they were given the first time. Upon completion of the TBIQ, they were again asked to complete the demographic information sheet and post-experimental questionnaire. They also once again signed in for their professors to receive extra course credit.

Results
This research was an initial step in validating the TBIQ, and hence the analyses were largely exploratory in nature. The purpose of the analyses was to evaluate the capacity of the items to discriminate between the groups, and, if feasible, other item characteristics, such as their interrelationships and factor structure.
Compliance. Compliance with the experimental instructions was determined by responses provided to the post-experimental questionnaire. In order to be considered compliant, the respondent needed to provide sufficient information to indicate that they had correctly comprehended and followed instructions.
One question on the post-experimental questionnaire requested participants to briefly summarize the instructions. The answer needed to be consistent with the actual instructions in order to be determined satisfactory (i.e., in compliance). For example, a MALINGERING responder who indicated that she never had a head injury, so therefore she answered questions as they truly applied to her, would be classified as noncompliant.
Another question asked participants how closely they thought they followed the instructions on a 5-point scale, ranging from "Not Closely At All" to "Very Closely." Participants who indicated that they did not follow instructions closely (i.e., answered "Not Closely At All") were classified as noncompliant. Another question asked participants whether or not they faked their responses. Answers needed to be consistent with the instructions given to the participant (e.g., those in the MALINGERING group needed to indicate they had faked responses), or needed to be clarified to a sufficient extent on the following question. On this subsequent question, participants described their strategy for faking. Their response was not evaluated in terms of its suitability or cunning, but relative to their described approach to the questions, i.e., whether they indicated they were faking or not.
For many participants, responses to the post-experimental questionnaire provided clear and unambiguous indication of compliance or non-compliance. However, responses were sometimes vague or ambiguous, especially when taken together. For example, several NORMAL responders indicated that they had faked their responses. A number of these participants then clarified that they pretended to have had a head injury on the last part of the questionnaire, as they were instructed to do, and therefore they were actually compliant with the instructions. However, many of them did not clarify their answers enough to make this determination. Responses such as these called into question how compliant individuals were with instructions. Therefore the answers to all four of these questions were considered concurrently, and participants' responses were rated as indicating definite compliance, definite noncompliance, or uncertain compliance.
Rates of compliance were significantly different across the NORMAL and MALINGERING groups, χ²(2, N = 330) = 27.36, p < .001. Of the participants who were given the MALINGERING instructions, 84.2% were judged to be definitely compliant with the instructions, 6.1% as definitely noncompliant, and 9.7% as uncertain compliance. However, only 58.2% of the participants who were given the NORMAL instructions were definitely compliant, 15.2% were rated as noncompliant, and 26.7% were rated as uncertain compliance. Only participants whose data were considered to be definitely compliant with instructions were included in analyses. Therefore, 96 NORMAL responders and 139 MALINGERING responders (N = 235) were included in the analyses.
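The compliance comparison can be reproduced approximately with a Pearson chi-square test of independence on the 2 x 3 table of group by compliance rating. In the sketch below, the cell counts are back-calculated from the reported percentages and the 165 participants per group, so they are an assumption rather than reported data; only the 96 and 139 definitely compliant counts appear in the text.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Columns: definitely compliant, definitely noncompliant, uncertain.
# Counts are inferred from the reported percentages (assumption).
counts = [
    [139, 10, 16],  # MALINGERING instructions (n = 165)
    [96, 25, 44],   # NORMAL instructions (n = 165)
]
statistic, df = chi_square_independence(counts)
print(f"chi-square({df}, N = {sum(map(sum, counts))}) = {statistic:.2f}")
```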
Ironically, on a separate post-experimental questionnaire item in which participants indicated ease of following the instructions on a 5-point scale that ranged from "Not Easy/Difficult" to "Very Easy," almost 90% of all participants rated this item in the range of 3 to 5, indicating that the instructions were at least "Somewhat Easy" to follow. Also, as mentioned above, participants indicated how closely they followed the instructions on a 5-point scale that ranged from "Not Closely At All" to "Very Closely." Almost 96% of the participants rated this item in the range of 3 to 5, indicating that they had followed the instructions at least "Somewhat Closely." This suggests that although the large majority of participants believed they understood and followed the instructions, a considerable number of these individuals answered other, more open-ended questions in a way that suggested otherwise.
As an initial step in examining the properties of the TBIQ, a total score was calculated using the a priori answer key discussed above. Total scores were computed for all compliant participants (N = 235) by counting how many of the 452 keyed items were answered in the keyed direction. Such a total score does not maximize the potential discriminatory power of the questions, because all items, whether they turn out to be effective or not, are included. In any case, this seemed like a reasonable starting point and did not run the risk of capitalizing on statistical artifact or chance associations (which are likely to appear given the number of items and sample size involved).
Total scores were prorated for missing items using a percent total score, which was created by dividing participants' total scores (the number of items answered in the keyed direction) by the number of keyed items that were answered (i.e., 452 minus the number of missing items). For example, if a participant did not answer 2 of the 452 keyed items, their percent total score was computed by dividing their total score by 450.
Answers to items were considered missing if they were left blank, or if the participant answered both true and false to the same item. The number of missing items ranged from 0 to 30. Of the compliant participants, 182 answered all the items, 48 did not answer one to six items, and five did not answer 20 to 30 items. The five participants with a considerable number of missing items were not included in the analyses. It was reasoned that if a clinician was administering the TBIQ, as is ultimately intended, the missing data would have been noticed, and the examinee instructed to complete the instrument to the extent possible. In this research, it appeared that the individuals with a large number of missing items had inadvertently missed a page of items when completing the questionnaire.
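The prorating rule can be sketched as follows. The function name, the dictionary layout, and the "T"/"F" coding are illustrative assumptions; the `max_missing` default of six mirrors the inclusion rule stated for the later analyses (six or fewer missing items) and is applied here only as an assumed threshold.

```python
def percent_total_score(responses, key, max_missing=6):
    """Percent of answered keyed items that were answered in the keyed direction.

    responses: dict mapping item number -> "T", "F", or None
               (None = left blank, or marked both true and false).
    key:       dict mapping item number -> keyed direction, keyed items only.
    Returns None when too many keyed items are missing to score the protocol.
    """
    answered = 0
    keyed_hits = 0
    for item, keyed_direction in key.items():
        answer = responses.get(item)
        if answer is None:
            continue              # missing item: excluded from the denominator
        answered += 1
        if answer == keyed_direction:
            keyed_hits += 1
    missing = len(key) - answered
    if answered == 0 or missing > max_missing:
        return None
    return 100.0 * keyed_hits / answered


# Hypothetical 10-item key: 8 items answered, 5 of them in the keyed direction.
key = {item: "T" for item in range(1, 11)}
responses = {1: "T", 2: "T", 3: "T", 4: "T", 5: "T", 6: "F", 7: "F", 8: "F"}
print(percent_total_score(responses, key))
```

Dividing by the number of answered items rather than by the full key length keeps a respondent who skipped a few items on the same scale as one who answered everything.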
The distributions of NORMAL and MALINGERING responders' scores were plotted against each other to determine cutting scores. A total of 44 cases (or about 19% of the sample) overlapped at the tail ends of the distributions. The optimal cutting score correctly classified 95.22% of respondents. Table 1 contains possible cutting scores with their respective sensitivity and specificity rates, positive predictive power, and negative predictive power. These results, which do not capitalize on chance relations and which include all a priori keyed items regardless of their discriminatory power, suggest that the TBIQ has promise in differentiating the groups.
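The quantities tabulated for each candidate cutting score can be computed as in the sketch below. It is a generic illustration with made-up scores, not the procedure or data used in this study: a respondent is classified as malingering when the score is at or above the cutoff, and the "optimal" cutoff is taken to be the one with the highest overall hit rate (an assumption about how optimality was defined).

```python
def classification_stats(normal_scores, malingering_scores, cutoff):
    """Diagnostic statistics when scores >= cutoff are classified as malingering."""
    tp = sum(score >= cutoff for score in malingering_scores)  # malingerers flagged
    fn = len(malingering_scores) - tp                          # malingerers missed
    fp = sum(score >= cutoff for score in normal_scores)       # honest but flagged
    tn = len(normal_scores) - fp                               # honest and cleared

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        # Predictive power reflects the group proportions in the sample at hand.
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "hit_rate": ratio(tp + tn, tp + tn + fp + fn),
    }


def best_cutoff(normal_scores, malingering_scores):
    """Lowest observed score that maximizes the overall hit rate."""
    candidates = sorted(set(normal_scores) | set(malingering_scores))
    return max(
        candidates,
        key=lambda c: classification_stats(normal_scores, malingering_scores, c)["hit_rate"],
    )


# Hypothetical percent total scores for illustration only.
normal = [5, 8, 12, 15, 30]
malingering = [25, 40, 55, 60, 80]
cutoff = best_cutoff(normal, malingering)
print(cutoff, classification_stats(normal, malingering, cutoff))
```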
Cross-validation analyses. The next set of analyses was intended to identify TBIQ items with the greatest discriminatory power, and to evaluate their consistency across samples. Participants whose protocols were rated as definitely compliant were randomly assigned to either a derivation or a validation group. Half of the NORMAL responders and half of the MALINGERING responders were assigned to the derivation group (n = 117), and the other halves assigned to the validation group (n = 118). When these groups were compared on a variety of demographic variables, no significant differences emerged in age, F(1, 232) = 1.11, p = .293; year in college, F(1, 233) = 2.78, p = .097; or gender, χ²(1, N = 235) = .075, p = .784. There were also no differences in the reported frequency of attention deficit disorders or learning disabilities, χ²(1, N = 234) = 2.128, p = .145; medical problems, χ²(1, N = 234) = .618, p = .432; psychological illness, χ²(1, N = 234) = .279, p = .597; or having experienced a brief loss of consciousness, χ²(1, N = 235) = .024, p = .876.
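A random split that halves each instruction group separately, as described above, might look like the following sketch. The function name, the seed, and the use of integer participant identifiers are assumptions made for illustration; the group sizes are the 96 NORMAL and 139 MALINGERING compliant responders reported earlier.

```python
import random


def stratified_halves(ids_by_group, seed=0):
    """Randomly split each group in half; the odd member goes to validation."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    derivation, validation = [], []
    for group, ids in ids_by_group.items():
        shuffled = list(ids)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        derivation += [(group, pid) for pid in shuffled[:half]]
        validation += [(group, pid) for pid in shuffled[half:]]
    return derivation, validation


ids_by_group = {"NORMAL": range(96), "MALINGERING": range(139)}
derivation, validation = stratified_halves(ids_by_group)
print(len(derivation), len(validation))
```

Splitting within each group keeps the proportion of honest and malingering responders nearly identical in the two halves, so a cutting score derived in one half is tested under comparable conditions in the other.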
The discriminatory power of items was examined by first comparing the NORMAL and MALINGERING respondents in the derivation group. Chi-squares were calculated for each individual item. A p value of .01 was chosen as the level of significance. This level seemed sufficiently stringent to avoid too many chance associations, but not so stringent as to eliminate too many potentially useful items. In fact, 377 of the 472 items produced significant differences (χ² ranging from 6.59 to 82.76), 318 of which were significant at p < .001. It was originally proposed that factor analyses would be used to explore the structure of the TBIQ, and that subscales would be created based on the solution; however, given the large number of significant items, such analyses were technically improper and unnecessary.
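An item-level comparison of this kind can be sketched as a 2 x 2 Pearson chi-square on endorsement counts. The version below applies no continuity correction, which is an assumption, since the text does not state which variant was used; the endorsement counts in the example are hypothetical. For one degree of freedom the p value has a closed form through the complementary error function.

```python
import math


def item_chi_square(normal_keyed, normal_n, malingering_keyed, malingering_n):
    """Pearson chi-square (df = 1, uncorrected) and p value for one item.

    *_keyed is the number of respondents answering in the keyed direction;
    *_n is the number of respondents in that group who answered the item.
    """
    table = [
        [normal_keyed, normal_n - normal_keyed],
        [malingering_keyed, malingering_n - malingering_keyed],
    ]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        return 0.0, 1.0             # no variation: the item cannot discriminate
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))  # chi-square survival, df = 1
    return statistic, p_value


# Hypothetical item: 5 of 48 honest vs. 40 of 69 malingering responders keyed.
statistic, p_value = item_chi_square(5, 48, 40, 69)
print(f"chi-square = {statistic:.2f}, significant at .01: {p_value < 0.01}")
```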
The 35 items placed at the end of the TBIQ, which required additional, potentially confusing, instructions for the NORMAL group, were evaluated to determine their effectiveness. The frequency with which these items reached the .01 level of significance was compared to the frequency for the remainder of the items. Of the 35 items, 80% had endorsement frequencies that significantly differed by group, matching almost exactly the 79.86% frequency for the remaining 437 items. Overall, the distribution of these 35 items throughout the range of chi-square values appeared to be the same as that for the other 437 items.
Next, selecting the 377 total items in which endorsement significantly differed by group, a post hoc answer key was created. These answers were keyed in the "deviant" (i.e., malingering) direction. The post hoc key was very similar to the a priori key. Of the 377 post hoc items, approximately 97% were keyed in the same direction on the a priori key, 1% were keyed in the opposite direction, and 2% were not keyed on the a priori key due to their ambiguity.
Total scores were computed using the post hoc key for the participants assigned to the derivation group by summing the number of items answered in the keyed direction.
Only those individuals who were missing six or fewer items were included (N = 114). Total scores were prorated for missing values by computing a percent total using the same method previously described.
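The prorated percent total can be illustrated with a short sketch. The proration method itself is described earlier in the dissertation and is not restated in this section, so the formula below is an assumption: the number of items answered in the keyed direction divided by the number of items actually answered, expressed as a percentage. The function name `percent_total` and the `max_missing` parameter are hypothetical.

```python
def percent_total(responses, key, max_missing=6):
    """Prorated percent total score for one participant.

    responses: the participant's answers, with None for a missing item
    key: the keyed ("deviant") answer for each item, same length
    Returns None when more than max_missing items are unanswered.
    ASSUMPTION: proration divides by the number of answered items.
    """
    answered = [(r, k) for r, k in zip(responses, key) if r is not None]
    missing = len(responses) - len(answered)
    if missing > max_missing or not answered:
        return None
    keyed = sum(1 for r, k in answered if r == k)
    return 100.0 * keyed / len(answered)
```

For example, a participant who answered three of four items and gave the keyed answer on two of them would receive a percent total of about 66.7 rather than 50.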
Percent total scores for NORMAL responders (n = 47) ranged from 1.59% to 34.75%, and for MALINGERING responders (n = 67) from 16.18% to 88.33%. The mean percent total score was significantly and substantially higher for the MALINGERING responders (M = 53.62%, SD = 19.24%) in comparison to the NORMAL responders (M = 11.88%, SD = 6.86%), F(1, 113) = 202.68, p < .001, η² = .644. The distributions of the NORMAL and MALINGERING responders' percent total scores were plotted against each other to determine cutting scores. A total of 24 cases, 21% of the sample, overlapped at the tail ends of the distributions. The optimal cutting score correctly classified 95.61% of respondents.

This cutting score was then applied to the validation group to examine the stability of the results. Percent total scores were computed, as described above, for the NORMAL (n = 47) and MALINGERING (n = 69) participants in the validation group. Again, only those individuals who were missing six or fewer items were included (N = 116). The optimal cutting score from the derivation group correctly classified 96.55% of respondents in the validation group; that is, there was no decrease in discriminatory power. Percent total scores ranged from 2.93% to 27.32% for the NORMAL responders and from 22.02% to 88.59% for the MALINGERING responders. There were only 6 cases, 5.17% of the validation sample, which overlapped at the tail ends of the distributions. Again, the mean percent total score of the MALINGERING responders (M = 57.50%, SD = 16.65%) was significantly and substantially higher than that of the NORMAL responders (M = 12.57%, SD = 5.25%), F(1, 115) = 319.85, p < .001, η² = .737. The sensitivity rates, specificity rates, positive predictive power, and negative predictive power for cutoff scores in both the derivation and validation groups are presented in Table 2.
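The derive-then-validate logic for the cutting score can be sketched as below. This is an illustrative sketch rather than the procedure actually used, which relied on plotting the two distributions: it assumes that a score at or above the cutoff is classified as malingering and that the "optimal" cutoff is the one maximizing overall classification accuracy. The function names and the example scores are hypothetical.

```python
def classification_stats(normal_scores, malingering_scores, cutoff):
    """Classify score >= cutoff as malingering and summarize accuracy."""
    tp = sum(s >= cutoff for s in malingering_scores)  # malingerers flagged
    fn = len(malingering_scores) - tp                  # malingerers missed
    fp = sum(s >= cutoff for s in normal_scores)       # honest flagged
    tn = len(normal_scores) - fp                       # honest cleared

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppp": ratio(tp, tp + fp),  # positive predictive power
        "npp": ratio(tn, tn + fn),  # negative predictive power
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


def best_cutoff(normal_scores, malingering_scores):
    """Return the observed score that maximizes overall accuracy.

    Ties are resolved in favor of the lowest candidate cutoff.
    """
    candidates = sorted(set(normal_scores) | set(malingering_scores))
    return max(
        candidates,
        key=lambda c: classification_stats(
            normal_scores, malingering_scores, c
        )["accuracy"],
    )


# Hypothetical percent total scores, for illustration only.
derivation_normal = [5, 10, 12, 20]
derivation_malingering = [18, 30, 45, 60]
cutoff = best_cutoff(derivation_normal, derivation_malingering)
```

To mirror the cross-validation step, the cutoff obtained from the derivation group would be passed unchanged to `classification_stats` together with the validation group's scores.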
In summary, analysis of the derivation and validation groups suggested stability in the discriminatory power of items, at least when analyzed in combination as total scores.
Retest analyses. Retest analyses were conducted to examine within-subject response consistency over a relatively short period of time. Individuals who returned for the retest were significantly older (M = 20.68, SD = 3.92) than those who did not (M = 19.16, SD = 1.60), F(1, 327) = 19.75, p < .001, η² = .057. The mean year in college for the retest participants (M = 2.10, SD = .74) was significantly higher than that of those who did not return (M = 1.48, SD = .79), F(1, 328) = 21.58, p < .001, η² = .062. These dissimilarities are most likely due to differences in recruitment from particular courses and the amount of extra credit offered by individual professors.
Participants' retest data were matched to their data from their first participation. It was expected that the MALINGERING individuals would show greater response variability over time because of the limits of their memory. Given the length of the scale, people who were faking might well have difficulty remembering their previous answers and, therefore, produce less consistent responses than individuals who answered honestly.
Given the small sample sizes and the inequality of the groups, test-retest correlations were not used to evaluate the stability of the scale over time, as initially proposed. Rather, a consistency score was calculated, based on the total number of items answered identically on both test occasions. There were no missing items for any compliant respondents on the retest questionnaire; however, four individuals were missing one or two answers on their initial questionnaire. Consistency scores were prorated for missing items by dividing the total number of consistent scores by the total number of item pairs answered at both Time 1 and Time 2, creating a percent consistency score. As expected, the mean percent consistency score of the MALINGERERS (M = 74.6%, SD = 9.8%) was significantly lower than the mean of the NORMAL responders (M = 90%, SD = 4%), F(1, 22) = 15.77, p = .001, η² = .429.
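The percent consistency score defined above can be expressed compactly. The sketch below follows the description in the text (identical answers divided by item pairs answered on both occasions); the function name `percent_consistency` and the use of None to mark a missing answer are assumptions made for illustration.

```python
def percent_consistency(time1, time2):
    """Percent of item pairs answered identically at Time 1 and Time 2.

    time1, time2: answers to the same items in the same order,
    with None marking a missing answer. Pairs with a missing answer
    on either occasion are excluded from the denominator.
    """
    pairs = [
        (a, b) for a, b in zip(time1, time2)
        if a is not None and b is not None
    ]
    if not pairs:
        return None
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs)
```

A respondent with one missing Time 1 answer among five items, and three identical answers among the remaining four pairs, would receive a percent consistency score of 75.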
Cutoff scores were determined for the percent consistency scores, two of which correctly classified about 96% of the participants. Using a cutoff score of less than or equal to 84%, only one MALINGERING individual was misclassified as NORMAL, with a specificity rate of 100%. Identifying all MALINGERERS (100% sensitivity) at the risk of misclassifying NORMALS, a cutoff score of less than or equal to 85% was most effective, misclassifying only one NORMAL responder, again with an accuracy rate of about 96%.
Consistency scores were originally constructed in the hopes of increasing the accuracy achieved by item totals or other such indicators alone. The high rate of classification achieved with item totals in the present sample, along with the small sample size of the test-retest sample, made it difficult to fully explore this possibility. These limitations notwithstanding, exploratory analyses were conducted. There was some indication that individuals incorrectly classified using their total scores could be correctly classified using consistency scores. For example, a MALINGERING respondent who was misclassified as a NORMAL respondent by examining his/her total score (using a cutoff with approximately 98% specificity) was accurately classified as a MALINGERER using consistency scores. Additionally, no individuals who were correctly classified by their total score were misclassified by their consistency score.

Discussion
The goal of the current research was to begin the development and validation of a structured interview aimed at distinguishing between individuals with true head injury symptoms and those faking or exaggerating symptoms. More accurate identification of veracity can help not only to identify individuals who are malingering the effects of a mild head injury, but also to establish the merit of genuine cases. Current measures for malingering detection are not specifically targeted to mild head injury, or only measure cognitive performance. It is unlikely that all individuals who attempt to malinger will limit their deception to cognitive measures, or even target these measures. As noted, not all malingerers are likely to deceive in the same way, and it is probable that properly combining (e.g., via actuarial judgment) multiple sound measures will increase the sensitivity and specificity of detection. At this time, there is no method that specifically capitalizes on knowledge about the symptoms and course of mild head injury.
Malingering of mild head injury poses some special challenges because the symptoms of genuine injury may be very similar to those of other common disorders, such as depression and alcohol abuse. Also, there is usually no objective way to validate a mild head injury claim, and there remain uncertainties about such matters as resultant patterns of deficit and the minimal injury required to produce symptoms. For these reasons, it seems likely that an interview format specifically aimed at identifying malingering of mild head injury could prove helpful.

In the initial development stage of the Traumatic Brain Injury Questionnaire (TBIQ), three writers generated items that covered a wide range of probable and improbable cognitive, emotional, and somatic symptoms that could arise following a head injury. Many items were designed to seem plausible to laypersons, when in actuality they contradicted expectations about head injury based on scientific literature. Various items also addressed potential behavioral changes, course of recovery, and socially desirable characteristics. Malingerers may be detected through overendorsement of symptoms that can, but by no means always, occur with mild head injury, and/or by overly denying normal personal shortcomings. Items were eliminated from the initial pool if they were overly redundant, or if it seemed that the content would not be easily understood. Some item redundancy or overlap was considered desirable because inconsistency in responding to like items could help in detecting malingerers. A cautious approach to item elimination was taken because, at this point, it seemed worse to lose potentially useful items than to include ineffective items that could be identified and excluded later. These steps resulted in 472 items. A reading level analysis indicated that this measure could be easily understood by an individual with an 8th grade education.
The capacity of the TBIQ to discriminate between individuals answering honestly and individuals asked to malinger the effects of a mild brain injury was evaluated using a college student population in a classic malingering simulation study. Only those participants who were judged to be compliant with the experimental instructions were included in the analyses. Utilizing compliant participants' total scores on an a priori key developed by the item writers, the results suggest that the scale has considerable promise.
Total scores were much higher for the malingering group than the honest group. Only about 19% of the sample overlapped at the tail ends of the distributions, and the optimal cutting score correctly classified about 95% of respondents. For this cutting score, sensitivity was about 99%, specificity was 90%, positive predictive power was 94%, and negative predictive power was about 98%. This is a very high level of accuracy, especially considering that the a priori key did not maximize the potential discriminatory power of the items. The key was constructed in advance of data indicating the effectiveness of items, and 452 keyed items, whether effective at discriminating the groups or not, were included in these total scores.
Although the level of discrimination achieved with the a priori key rendered certain further analyses of limited practicality or importance, select examination was conducted of other scale characteristics. It seemed worthwhile to examine, for example, whether an analysis to identify the most effective items would demonstrate consistency or stability. Therefore, the sample was split into derivation and validation groups. Chi-square analyses were conducted with the derivation sample to identify effective items. Using a .01 level of significance, 377 items distinguished between the malingering and honest responders and were included in the post hoc answer key. In the derivation sample, the total scores on the post hoc key correctly classified about 96% of the respondents. The cutting score from the derivation group was then applied to the total scores of the validation group. The optimal cutting score for the derivation group correctly classified about 97% of the validation group. There was obviously no decrease in discriminatory power across the groups. Remarkably, only about 5% of the validation group overlapped at the tail ends of the distribution. These results indicate that the discriminatory power shown by the combined TBIQ items may well have good cross-situational consistency, at least across certain samples.
An analysis of within-subject response consistency over a relatively short period of time, which involved a fairly small sample, also yielded promising results. Consistency scores were calculated by summing the number of items that were answered the same way across test administrations. As expected, the honest responders had significantly higher consistency scores than the malingering group. Although caution is called for given the small sample size, the results suggest greater consistency among individuals answering honestly versus those exaggerating or faking disorder. This result aligns with other research on the use of consistency measures to detect malingering. Individuals who feign responses to multiple novel questions and, therefore, more or less have to make up answers as they go along, may have considerable difficulty later on duplicating their answers. In contrast, individuals who answer honestly do not have to recall their initial answers. As this or other consistency measures that can be derived from the TBIQ may have discriminatory power and also may be non-redundant with other dimensions of measurement, they may add to the accuracy achieved through adding up the number of deviant responses on the full scale or sections of the scale.

Overall, it appears that the TBIQ shows promise in discriminating honest responders from those feigning mild brain injuries. Across varying analyses, the average total score of the malingering responders was far higher than that of the honest responders, and cutoff scores correctly classified almost all participants, with high sensitivity and specificity. The optimal cutoff score using the post hoc total scores showed good consistency on cross-validation; that is, classification accuracy did not decrease across the groups. Also, analysis of test-retest response consistency yielded promising results and the possibility of an additional, non-redundant discriminator that could produce incremental validity.
There is good reason to believe that the TBIQ will retain some, if not a substantial amount, of its discriminatory power when applied to malingerers and head-injured individuals responding honestly. In particular, it seems clear that malingerers in this study "bought" various implausible items, that is, they viewed many of the implausible items as likely consequences of mild head injury. This interpretation seems justified when one considers that the questionnaire contained a very large number of implausible items, that the malingering subjects, on average, endorsed many (over 200) of the questionnaire items, and that chi-square analyses showed that the great majority of items discriminated between groups. It also seems likely that the endorsement rate of plausible items was well beyond that expected with mildly head-injured groups, although additional research will be needed to clarify this issue.

Limitations of the Dissertation
Unmet objectives. Several objectives of this study were not met. Originally, it was proposed that information collected from the community sample would be used to aid in item generation. Due to time constraints and difficulties obtaining recruitment sites, the results of the Community Survey were not integrated into the TBIQ. This unmet objective seemingly created no major problems because all items from the Community Survey were added to the scale, thereby ensuring that all potentially useful items were retained. However, it is likely that items which did not have good discriminatory power were also added to the scale, making it unnecessarily longer, and in the long run, potentially decreasing accuracy.

Original plans also called for factor analyses to help determine the structure of the TBIQ and to create potential subscales. However, so many items proved effective that factor analysis was not feasible. The level of discriminatory power demonstrated by a large majority of items is a crude indication of a one factor solution, at least with the current sample. However, the same will probably not hold with other samples.
Potentially, subscales could be used to identify malingering approaches, such as endorsing an unusual number of extreme items, denying positive attributes, or endorsing only certain types of symptoms, such as only physical symptoms. It is possible that subscales may prove useful in not only differentiating between malingerers and honest responders, but potentially between head-injured individuals and other client populations that have similar symptoms. Subscales may also be useful in identifying individuals who are misrepresenting symptoms for other reasons, for example people with Somatoform Disorders. However, all of this remains to be determined through future research involving such subgroups.
Most significantly, this study would have been much stronger if there had been a large enough sample of head-injured individuals to complete the supplemental Head Injury Study. Ultimately, the clinician is interested in distinguishing between individuals who have had a head injury and are answering honestly, and those individuals who are faking or exaggerating the consequences of a mild head injury (e.g., an injury from which they have truly recovered). These are likely much more difficult distinctions to make, and until more research is conducted using true head-injured samples, the ability of the TBIQ to make this distinction is uncertain.

Methodological shortcomings. This study had several limitations which may have impacted the results. One was the lack of compliance, or possible lack of compliance, among a large number of participants. In a considerable percentage of cases, participants were classified as non-compliant with instructions or as exhibiting uncertain compliance.
This raises the question of participants' motivation and investment in the study as a whole.
It is unclear why so many individuals may not have been compliant with instructions. This may have resulted from not reading and/or following the instructions. Alternatively, many participants may not have fully understood the items on the post-experimental questionnaire. It should be noted that very few individuals asked for clarification of any instruction or item in the study. Although the large majority of individuals indicated that they found the instructions at least somewhat easy to follow, perhaps various participants did not fully understand what they were being asked to do. Ratings of the ease of following the instructions, in combination with compliance ratings, suggest that participants either under-recognized difficulties following instructions, or when responding to the post-experimental questionnaire, under-represented their understanding of the experimental instructions. Also, given that participants' open-ended answers to the post-experimental questionnaire were often vague, in many cases it was difficult to determine whether they had been compliant.
The inequality of compliance with instructions across the NORMAL and MALINGERING groups may be due to unevenness in the difficulty of the instructions, or confusion on the part of the honest responders when answering the post-experimental questionnaire. Specifically, the additional instructions provided to the honest responders for the last 35 items directed them to pretend they had experienced a head injury with no subsequent problems. And, on the post-experimental questionnaire, many of the honest responders indicated that they faked the consequences of a head injury. Some of the honest responders then specified that they feigned only the last part of the questionnaire, and were therefore considered to have been compliant with instructions. However, many responses to the post-experimental questionnaire were too vague to make this determination.
This difference in rates of compliance across the groups could have impacted the equality of the groups. For example, perhaps only more intelligent or motivated individuals were able to fully understand the instructions for honest responding, or fully express their understanding of the honest instructions on the post-experimental questionnaire.
It should also be noted that the study sample was comprised of college students required to participate in research for course credit. In order to obtain credit, students needed only to participate in a research study, and there was obviously no stipulation that they needed to fully comply with the research in order to receive credit. Therefore it is likely that some individuals were not motivated to do their best as research participants, but rather to complete the study and obtain course credit. Also, given the length of the questionnaire, perhaps some participants' motivation and interest dwindled over time, and when given the post-experimental questionnaire, they wished to be done with the study and answered the items hastily. If their answers were too vague, their compliance was then called into question.
The post-experimental questionnaire included two open-ended questions relating to compliance. Some level of subjective judgment was involved in evaluating responses to these questions, and only one of the researchers reviewed these materials and classified compliance levels. It is possible that the ratings lacked reliability although, in many cases, these determinations were straightforward. This study could have been strengthened by using more than one rater to determine compliance, or using more detailed, closed-ended questions to clarify participants' level of compliance. For example, multiple-choice alternatives could be provided that covered task instructions.
Another limitation of this study is that although compliance with the instructions was rated, the actual performance of the participants, most importantly the malingering participants, could not be determined. Individuals who chose to intentionally misrepresent symptoms may or may not have been very effective at doing so, although the near uniformity with which exaggerators were detected does somewhat mitigate this concern.
It is possible that the malingering participants who scored very high on the scale were not effective malingerers and would be easily detected on most any malingering measure. The great majority of participants did not describe their faking strategies in enough detail to assess how well their strategy might work. It is not clear if malingering responders who scored lower on the TBIQ were good malingerers in that their scores may have been misclassified as those of honest responders, or if they may not have endorsed a sufficient number of items to look as if they were suffering from symptoms due to their injury. This latter question may be answered with further research examining how genuinely head-injured and symptomatic individuals would answer the questionnaire. Also, the student malingerers in this study may well be dissimilar to those who malinger head injury in real world settings. Unlike student malingerers, true malingering individuals often have powerful incentives to appear impaired and not be detected. Such individuals, for example, may go to considerable lengths to prepare for their role and also may receive cues about symptoms through such experiences as interviews related to their claims.
One might argue that another limitation of this study was that the retest sample was somewhat older and further along in college than the rest of the sample. This was most likely due to participant recruitment and the fact that one professor offered significant extra credit to his students for their participation in this particular study. This course happened to be a higher level course than the course from which the majority of participants were recruited. Although the sample differed somewhat, it seems unlikely that minor differences in age and education would affect consistency scores. Rates of compliance and classification rates did not seem to differ between the retest sample and the rest of the sample, and the total scores of the retest sample appeared to be evenly distributed with those of the other participants.
The procedure used to classify head injury was another potential limitation of this study. The definition of a head injury for this study was a loss of consciousness or post-traumatic amnesia of more than a few seconds. Individuals who indicated loss of consciousness or post-traumatic amnesia of just a few seconds were considered to have suffered non-significant injuries and were recruited for the Validation Study. In this way, individuals who had experienced very minor or trivial head injuries with no meaningful or long-term consequences could be included, thereby increasing sample size. However, there was no way to verify head injury status, or if these injuries were truly minor. Also, head injury status was based on self-report, which certainly may contain some degree of error. In fact, a few individuals who returned for the retest session were not fully consistent in their responses to the head injury screen across participation sessions.

Future Steps in Development and Validation
Future research will focus primarily on the TBIQ's ability to distinguish individuals with true head injury sequelae from other populations. At this time, it is unclear how head-injured individuals will answer this questionnaire. It will be of interest to determine how severity and length of time since injury impact the total scores on the questionnaire.
It is assumed that individuals with these injuries will improve over time, which in turn will affect the consistency of their scores over time. Whether inconsistency resulting from genuine improvement in symptoms can be distinguished from inconsistent responding among malingerers, perhaps by analysis of item patterns, remains to be determined. It will also be very important to examine the potential impact of coaching on questionnaire responses and effectiveness. How easily can a knowledgeable person beat the measure?
What types of information do they need, and how much information do they need? Also, knowing what types of strategies are effective at beating the measure will help in developing counter-strategies for identifying malingerers.
Future research will likely include attempts to develop subscales aimed at identifying potential approaches taken by malingerers to appear impaired and possible detection techniques. For example, subscales may be developed which identify endorsement of implausible symptoms, overendorsement of plausible symptoms, or denial of normal human shortcomings. Subscales may also focus on ways to detect malingering such as identifying inconsistent responding. As mentioned, subscales may also prove helpful in distinguishing between mild head injury and other disorders with similar symptoms, as well as factors underlying misrepresentation of symptoms, such as self-deception versus true malingering.

Studying true malingerers is a challenge. Malingerers do not readily identify themselves to researchers, and the base rate for malingering is not clearly known. Base rates play a large role in determining the usefulness of a measure due to the impact on sensitivity, specificity, and positive and negative predictive power. Therefore, populations that are expected to have differing rates of potential malingerers, such as those in litigation, will be investigated.
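The dependence of predictive power on base rates can be made concrete with Bayes' rule. The sketch below is illustrative only: it takes sensitivity and specificity as fixed properties of a cutoff and shows how positive and negative predictive power would shift as the proportion of malingerers in a population changes. The function name `predictive_power` and the base rates in the usage lines are hypothetical; the sensitivity and specificity values echo the approximate a priori key figures reported in the Discussion.

```python
def predictive_power(sensitivity, specificity, base_rate):
    """Positive and negative predictive power for a given base rate.

    base_rate: proportion of the population that is malingering (0 to 1).
    Returns (ppp, npp); a value is NaN when no cases fall in that cell.
    """
    true_pos = sensitivity * base_rate
    false_neg = (1.0 - sensitivity) * base_rate
    true_neg = specificity * (1.0 - base_rate)
    false_pos = (1.0 - specificity) * (1.0 - base_rate)

    flagged = true_pos + false_pos
    cleared = true_neg + false_neg
    ppp = true_pos / flagged if flagged else float("nan")
    npp = true_neg / cleared if cleared else float("nan")
    return ppp, npp


# Same cutoff properties, two hypothetical base rates of malingering.
ppp_half, npp_half = predictive_power(0.99, 0.90, 0.50)
ppp_tenth, npp_tenth = predictive_power(0.99, 0.90, 0.10)
```

Under these assumed inputs, positive predictive power falls considerably when malingering is rare, even though sensitivity and specificity are unchanged, which is why populations with different expected rates of malingering need to be studied separately.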
It would also be useful to examine how individuals with genuine, symptomatic

What will be done: If you take part in this study, you will complete a questionnaire that will last about 20 minutes. This questionnaire consists of questions about the possible effects of head injury. The questionnaire will also contain questions about other medical or psychological problems. After you complete the questions, the researcher will briefly ask you about any head injuries you may have experienced. The researcher will also briefly ask you how you may have learned about head injury and other medical or psychological problems.

Risks or discomfort:
The possible risks or discomforts of the study are minimal, although you may find some of the questions difficult to answer.
Benefits of the study: You will not benefit directly for taking part in this study. However, the researchers may learn more about people's knowledge of head injury. This information will be used to develop ways to identify people with head injuries and identify people who may be faking an injury.

Confidentiality:
Your part in this study is confidential. None of the information on the questionnaire will identify you by name. No one else can know if you participated in this study, and no one else can find out what your answers are. Reports will be based on group information and will not identify you or any individual in this project.

Jim is a relatively young man. In the past, he has had no serious physical or mental problems, and generally handled the day-to-day requirements of life without too much trouble. More recently, however, he has started to experience more problems in his day-to-day functioning, and decided to see his doctor. He had been in a car accident, hit his head on the windshield, and was knocked out for a few minutes. Based on the exam, his doctor diagnosed him with a mild brain injury.

Case 3 - Depression
Tom is a relatively young man. In the past, he has had no serious physical or mental problems, and generally handled the day-to-day requirements of life without too much trouble. More recently, however, he has started to experience more problems in his day-to-day functioning, and decided to see his doctor. Based on the exam, his doctor diagnosed him with a serious depression.
Bill is a 56-year-old man. In the past, he has had no serious physical or mental problems, and generally handled the day-to-day requirements of life without too much trouble. More recently, however, he has started to experience more problems in his day-to-day functioning, and decided to see his doctor. Based on the exam, his doctor diagnosed him as being in the initial stages of a dementia known as Alzheimer's disease.
COMMUNITY SURVEY - RATING SCALE 1
Each of the following items describes some type of problem. For each item, please rate how likely it is that an average person would have that problem. Next, please rate how likely it is that a person with a mild brain injury would have that problem. Then, please rate how likely it is that a person with a serious depression would have that problem. That is, for each item, you are to do a rating for an average person, a person with a mild brain injury, and a person with serious depression.

We are also very interested in finding out what people know about the course of recovery for people with mild brain injuries. Meaning, what happens to their symptoms over time.
Please circle your answers.

5. What about after a year?
(Response options only partially recoverable from the original layout: "Get Better"; "Some of their symptoms will get better, but others will get worse.")

Each of the following items describes some type of daily task or activity. For each item, please rate how much trouble, if any, an average person would have with that task. Next, please rate how much trouble, if any, a person with a mild brain injury would have with that task. Then, please rate how much trouble, if any, a person with a serious depression would have with that task. That is, for each item, you are to do a rating for an average person, a person with a mild brain injury, and a person with a serious depression.

(If necessary, prompt with the following: never heard a thing about it, a little, a good amount, more than most people) (If they answer NOTHING, then prompt with "you've never heard anything about it?" to see if they've ever heard about it)
11. Follow up by asking: Where did you get your information? (If necessary, prompt with knowing someone, TV, classes, magazine articles, etc.)

How much do you know about depression?
(If necessary, prompt with the following: never heard a thing about it, a little, a good amount, more than most people) (If they answer NOTHING, then prompt with "you've never heard anything about it?" to see if they've ever heard about it)
13. Follow up by asking: Where did you get your information? (If necessary, prompt with knowing someone, TV, classes, magazine articles, etc.)

Appendix B
Materials for the Validation Study

Note. Additional titles have been added to some of the materials to aid in clarity. The overall content of the materials has not changed from that presented to the participants.

Description of the project: This project is one of several studies in the development of a scale intended to distinguish between mild brain injury and people faking a brain injury for financial compensation.
What will be done: If you take part in this study, you will complete a questionnaire that will take about 40-60 minutes. This questionnaire consists of items which may help distinguish between a true brain injury and people faking an injury. You will be asked several questions to help determine if you ever had a significant head injury. You will be asked to either answer our questionnaire honestly, or to exaggerate, or fake your responses to look as if you have had a brain injury. At the end of the questionnaire, you will be asked to complete a short form regarding your understanding and compliance with our instructions, and also to provide some demographic information.
Risks or discomfort: The possible risks or discomforts of the study are minimal, although you may find some of the questions difficult to answer. Also, some of the questions are personal, and you may feel slightly uncomfortable sharing this information.
Benefits of the study: You will not benefit directly for taking part in this study. However, the researchers may learn more about people's knowledge of head injury. This information will be used to develop ways to identify people with brain injuries and identify people who may be faking an injury.
Confidentiality: Your part in this study is confidential. None of the information on the questionnaire will identify you by name. No one can know if you participated in this study, and no one can find out what your answers are. All materials are pre-coded to ensure confidentiality. Although your initials and 4 digits of your social security number will be on your materials, these identifiers will be changed after your responses are entered into our computer. Reports will be based on group information and will not identify you personally in any way.

Decision to quit at any time:
The decision to take part in this study is up to you. You do not have to participate, and you can refuse to answer any question. If you decide to take part in the study, you may quit at any time. Whatever you decide will in no way penalize you. However, you will need to fulfill your research requirement for class in some other way if you do not complete the study. If you wish to quit, you can simply inform the researcher, or you can call Margaret Ackley (874-2193) at any time during this research.

Head Injury Screen
I am studying what people know about mild brain injury and ways people may fake symptoms of a brain injury. Because of this, I need to know whether you have experienced any head injuries. Please answer the following questions. If you are not sure of what the question is asking, please ask the researchers.

1. Have you ever hit your head and: lost consciousness (were knocked out), and/or your memory completely stopped working for some time and you could not remember anything about what was going on? YES NO

IF NO, please go to ITEM 5.

1a. IF YES, how long were you out for, and/or how long was it before your memory started working again? If you had both of these problems for some time, indicate the longer period of time. Please check your best estimate.

__ seconds
__ 1-20 minutes
__ 20-30 minutes
__ 30 minutes-1 hour
__ 1-12 hours
__ 12-24 hours
__ more than a day, but less than a week
__ more than a week, but less than a month
__ over a month

INSTRUCTIONS: (for "NORMAL" group)
You will be completing a questionnaire that describes characteristics, abilities, preferences and problems that may or may not occur after a mild to moderate brain injury.
The following questionnaire is designed for individuals who have experienced a brain injury, but we also want to know how people who have not had a brain injury would respond to the items. Thus, when you answer the questions, please describe yourself as accurately as possible.
Because the questionnaire is designed for individuals who have had head injuries, some questions refer to an injury, for example, "Since my injury, I eat more." Of course, since you have not had a head injury, this wording does not apply directly to you. So, here is what you are to do with these types of items: ANY time you see a reference to an injury, just substitute the phrase "since 1998" and then answer the question as it applies to you. In other words, we want you to read these items as if they were asking you how you have felt SINCE 1998.
For example, "Since my injury, I eat more." We want you to read this type of item as if it was asking how you have felt SINCE 1998 - for example, "Since 1998, I eat more." Or "Before my injury, I had more free time than I do now," should be read as "Before 1998, I had more free time than I do now."

Two Important Reminders:
1. It is very important that you keep this 2 year time frame in mind when answering this questionnaire.
2. When answering questions, please try to describe yourself as accurately as possible.

If the item is FALSE or MOSTLY FALSE, answer FALSE (F).
Please circle your answers.

PLEASE TRY TO ANSWER EVERY ITEM.
If you have any questions, please ask the researcher.

Additional instructions for NORMAL group for the last 35 items (#438 - #472): Because this questionnaire is designed for individuals with head injuries, and you have not had an injury, some of our questions could not be answered by following the instructions given to you at the beginning of the questionnaire.
Therefore, for the following items, please pretend that you had a head injury in 1998 that caused no problems. Please answer these items as they would apply to you, given that you had this "injury" and suffered no problems.
If you have any questions, please ask the researcher.

INSTRUCTIONS: (for MALINGERING group)
You will be completing a questionnaire that describes characteristics, abilities, preferences and problems that may or may not occur after a mild to moderate brain injury.
What the researchers want to know is whether, or how often, people who are faking or exaggerating symptoms of a brain injury will endorse these characteristics and problems.
We want you to pretend that you were involved in a head-on collision 2 years ago, in 1998, caused by a careless driver. Your head hit the windshield and you lost consciousness (blacked out). Also, you remember nothing that went on for the next several hours. You were examined at the hospital, diagnosed with a mild brain injury, and were discharged for follow-up care. In truth, your problems got better quickly, and you are feeling like your old self, but you have not told anyone.
You are suing the driver who hit you. You know you may get a lot of money if you look like you are still suffering from problems. You are sent to a health professional to evaluate your case. As part of the evaluation, you are asked to complete the questionnaire that follows. As you know, professionals realize that people may fake or exaggerate problems. The more impaired you seem, without being detected as faking or exaggerating, the more money you are likely to receive.