MMPI-2 Scores as a Predictor of Outcomes on the Phase II Profile Integrity Inventory

Many personality inventories have been developed and used for clinical assessment as well as for pre-employment screening. Examples include the Woodworth Personal Data Sheet, the Thematic Apperception Test (TAT), the California Psychological Inventory (CPI), the Sixteen Personality Factor Questionnaire (16PF), and the Minnesota Multiphasic Personality Inventory (MMPI, MMPI-2) (Kaplan & Saccuzzo, 1993). Sackett and Wanek (1996) reviewed the use of measures of honesty, integrity, conscientiousness, dependability, trustworthiness, and reliability for personnel selection, and found that criterion-related validity studies are well represented. Using this as a basis, Murray (2000) completed a construct validation study of the Phase II Profile Integrity Inventory, which provided compelling results for its valid use in pre-employment and promotion screening. This study investigated the factorial validity of the Phase II Profile Integrity Inventory by assessing the predictive power of MMPI-2 scores for outcomes on the Phase II Profile Integrity Inventory using structural equation modeling, a confirmatory factor analysis procedure. Several goodness-of-fit indices indicate that the MMPI-2's Anti-social Practices, Cynicism, and Work Interference scales are viable predictors of the Phase II Profile's overall confidence scale score. In addition to the structural equation modeling, a hierarchical cluster analysis was used to examine the underlying relationships among the constructs measured by the Phase II Profile Integrity Inventory, yielding cluster structures similar to the results of a previous principal components analysis. Analysis of variance statistics show gender differences (for this college sample) on the overall confidence scale scores, which are derived from the Phase II Profile.
Findings indicate that the use of the Phase II Profile with this younger, inexperienced age group (mean age = 19.5 years) could be inappropriate. It may be that employers have differing screening needs: while one employer may want a full clinical picture of the applicant, another may want to focus on only a partial picture. If this is true, many employers and human resource specialists may benefit by adding this 117-item inventory to their set of tools.

Acknowledgment
I wish to thank my committee members for graciously sharing their time and knowledge with me. Exposure to their different approaches to research and practice in pursuing practical solutions to real problems has been invaluable to me. A very special thanks goes to both Joe and Sue Rossi for giving me critical insights during critical moments. My goal is to forge collaborative professional relationships with every person who has contributed to my graduate training experience. Thank you.



Statement of the Problem
Assessing a potential employee before hiring him or her is of great importance. One major reason is that the impact of employee theft on business and the consumer is enormous. Internal theft has increased over the prior two decades at an alarming rate (Bales, 1988). The estimated annual loss to American business from employee theft is in excess of 40 billion dollars (Palmiotto, 1983). Zemke (1986) pointed out that when calculated on a per-minute basis, a 40 billion-dollar loss due to employee theft is equivalent to a loss of $7,125 per minute.
More recently, according to Effective Media Inc. (1998), at least 110 billion dollars is lost annually as a result of theft in the workplace. This accounts for money, merchandise, information, and time stolen from employers.
Industries that allow employees access to money and merchandise, such as retail stores, banks, and warehouses, are those having the greatest need for pre-employment screening (Sackett & Harris, 1984).
Previously, employers typically used two methods prior to employment to assess the honesty of employees: written tests and the polygraph. However, in 1988 Congress passed a law prohibiting use of the polygraph by private employers as a pre-employment test (Hartnett, 1991). Today, with the increasing demand by employers for paper-and-pencil measurements, psychologists are developing more reliable pre-employment tests (Jones, Joy, Werner & Orban, 1991; Hartnett & Terranova, 1991).
An example illustrates why these measurements are desired. Most businesses of medium to large size perform a physical inventory once a year. Some types of inventory control systems allow disparities between actual inventory amounts and what is shown on records to exist without coming to the attention of the manager or owner. Even if businesses could afford to perform two physical inventories a year in an attempt to tighten inventory control, potentially dishonest employees would still have plenty of time to abscond with merchandise.
Pre-employment screening tests are widely used in business and industry in an attempt to reduce internal theft (Martin, 1989; Sackett & Harris, 1984). One paper-and-pencil pre-employment screening test, the Phase II Profile Integrity Inventory (Lousig-Nont & Associates, 1982a), has been used for assessing the personality trait of honesty (Lillie-Murray, 1999; Martelli, 1988). Sackett and Wanek (1996) reviewed the use of measures of honesty, integrity, conscientiousness, dependability, trustworthiness, and reliability for personnel selection, and found that criterion-related validity studies are well represented in the literature. In addition, an analysis of employment longevity has shown a significant and measurable relationship between longevity and scores on the Phase II Profile Integrity Inventory (Cotton, 1990).
A previous study (Murray, 2000) added to the body of knowledge on pre-employment screening and testing by investigating the construct validity of the Phase II Profile Integrity Inventory, comparing its results with those of the Minnesota Multiphasic Personality Inventory, 2nd edition (Butcher et al., 1989).
This measure, the MMPI-2, was selected for the construct validation because the MMPI has been the most widely used personality assessment instrument and the most extensively researched of all psychological tests. Its first and only revision, the MMPI-2, was published in 1989 and is now widely accepted in psychological practice. According to Newmark and McCord (1996), the unparalleled success of the MMPI is attributable primarily to three aspects of its ...
As a follow-up procedure to the multiple regression, a secondary factor analysis was performed to determine which hierarchical constructs are measured by the Phase II Profile. This factor analysis was performed using the principal components method of extraction with varimax rotation (George & Mallery, 1999).
It was predicted, building upon the prior research described above (Murray, 2000), that a structural model would adequately describe this predictive relationship. Evidence for this was shown by the resultant goodness-of-fit indices after the optimal path parameters were determined. Further, this research was designed to investigate the usefulness of the Phase II Profile for pre-employment and promotion screening purposes, determining whether it might be able to measure inappropriate employee traits using a much shorter inventory than the MMPI-2.
In addition to the structural equation modeling technique described above, a hierarchical cluster analysis was performed to determine which profiles might emerge. This cluster analysis of the Phase II Profile scales produced the cluster structure (profiles) for 298 participants. It was predicted that the cluster structure (profiles) would coincide with the principal components analysis performed in Murray's (2000) validity study. Additionally, it was hypothesized that gender would not play a significant role in this analysis (Baltes et al., 1986).
Finally, it was hypothesized that the MMPI-2 L scale would not correlate with three test items taken from the Phase II Profile that specifically ask the dollar amounts the person has stolen in the past. It is believed that the MMPI-2 L (Lie) scale is more a measure of how much a test taker is attempting to present himself or herself in a positive light than a measure of direct deceit (Butcher et al., 1990). High L scale scores indicate a stronger attempt by the person to present himself or herself in a positive light. ... (Boomsma, 1983; Tabachnick & Fidell, 1996).

Method
Participants representing minority groups were included in this study in an effort to obtain a sample more representative of the general population. No one was excluded from participation because of race, ethnicity, or socio-economic background. The students self-described as Asian, Black, Caucasian, Hispanic, or Other.
There was no financial compensation, but participants did receive class credit toward their introductory psychology course requirements.
The scales found on the Phase II Profile are:
Validity Scale - There are 10 validity points on the inventory. If a person gets 8 correct, this would indicate that 80% of the time he or she was trying to answer the questions truthfully. If a person has a very low percentile and a validity score of 6 or lower, it might indicate that the person is not a good reader and did not really understand the test. If this were the case, the Profile would be invalid.
Thinking Scale - Thinks about doing something dishonest. Higher scores indicate more frequent thoughts of committing dishonest acts.
Rationalization Scale - Rationalizes acts of dishonesty. Higher scores indicate more rationalization of dishonest acts.
Bad Attitudes Scale - Bad attitudes usually associated with dishonest individuals. Increasing scores on this scale indicate higher levels of bad attitudes.
Minor Admissions Scale - Minor admissions of dishonesty. This scale reflects admissions that are relatively insignificant.
Major Admissions Scale - Major admissions of dishonesty. These are noteworthy admissions of dishonesty.
Good Attitudes Scale - These are attitudes generally associated with honest people. The Phase II Profile includes 48 possible "good attitude" responses.
Confidence Scale - The confidence level indicates how confident the Inventory developer is that a person will be an honest employee. A confidence level of 26% is very low. At times a person may be in a high overall percentile for integrity, for example the 92nd percentile, which is normally good, yet have a low confidence score of 26%. This could be an indication that the person tried to fool the Phase II Profile and may not have answered truthfully. Caution should be exercised when a person has a low confidence score.

The MMPI-2
The MMPI and the MMPI-2 have been used in both clinical and work settings for assessing and screening test takers. The Federal Government has used the MMPI, and now the MMPI-2, extensively for screening employees who are eligible to work in sensitive environments. Conditions that tend to generate deviant patterns of self-report include several test-taking strategies that invalidate the MMPI-2. These patterns are described in the Validity scales below. The MMPI-2 Content scales that follow are described in the form of 'themes' and have become widely used as a valid and useful way of approaching patient problems within the clinical setting (Wiggins, 1966, 1969; Butcher et al., 1990). Please note that these descriptions and 'themes' are interpretations by the originators of the MMPI and MMPI-2 inventories (Butcher et al., 1990).

MMPI-2 Validity Scales
Cannot Say (?) - The instructions to the MMPI-2 encourage the test taker to respond to all of the items. The great majority of the items in the inventory are written in such a way that either a true or a false response would be appropriate and relevant to anyone. When items are not endorsed (or both true and false are marked), particularly a large number of them, the scores on the test will likely be attenuated and result in an inadequate assessment. Some test takers who are insufficiently motivated to be evaluated may answer the items without attending to the content, simply marking answers in a particular pattern. For example, a test taker could mark the items on the answer sheet in the shape of his or her initials. For this reason it is good practice to examine the answer sheet before it is scored.
Variable Response Inconsistency Scale (VRIN) - The best way to appraise inconsistent responding is to determine whether the test taker has endorsed similar items in a consistent manner. Inconsistent responding to personality questionnaire items is relatively easy to detect if the inventory is long enough and has enough items of similar or opposite meaning. The MMPI-2 provides two scales for detecting inconsistent responding to the items: the VRIN and the TRIN scales. The VRIN is a good measure of random responding on the MMPI-2 because it is made up of 67 pairs of items for which one or two out of four possible response configurations represent inconsistent responses. For example, answering true to "I wake up fresh and rested most mornings" and true to "My sleep is fitful and disturbed" represents semantically inconsistent responding.
True Response Inconsistency Scale (TRIN) - This scale was developed to appraise the tendency of some people to give the same response to pairs of items that, to be consistent, should be answered in opposite ways. TRIN is made up of 23 pairs of items to which the same response is semantically inconsistent. For example, answering the items "Most of the time I feel blue" and "I am happy most of the time" both true or both false is inconsistent.
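The pair-based logic behind VRIN and TRIN can be illustrated with a minimal Python sketch. The item numbers and pair definitions below are hypothetical placeholders chosen for illustration only; they are not the actual MMPI-2 scoring keys, and the function names are assumptions rather than part of any published scoring procedure.

```python
# Hypothetical item pairs for illustration; NOT the actual MMPI-2 keys.
# VRIN-style entry: (item_a, item_b, response combinations counted as inconsistent).
VRIN_PAIRS = [
    (1, 2, {(True, True)}),                  # e.g. "wake up rested" vs. "sleep is fitful"
    (3, 4, {(True, False), (False, True)}),  # similar items answered differently
]
# TRIN-style entry: two items of opposite meaning, so the SAME answer is inconsistent.
TRIN_PAIRS = [(5, 6)]


def vrin_raw(responses, pairs=VRIN_PAIRS):
    """Count pairs whose (answer_a, answer_b) combination is keyed as inconsistent."""
    return sum((responses[a], responses[b]) in bad for a, b, bad in pairs)


def trin_counts(responses, pairs=TRIN_PAIRS):
    """Return (number of true-true pairs, number of false-false pairs)."""
    both_true = sum(responses[a] and responses[b] for a, b in pairs)
    both_false = sum((not responses[a]) and (not responses[b]) for a, b in pairs)
    return both_true, both_false


# A made-up answer sheet: item number -> True/False response.
answers = {1: True, 2: True, 3: True, 4: True, 5: False, 6: False}
print(vrin_raw(answers))     # 1 (only the first pair is answered inconsistently)
print(trin_counts(answers))  # (0, 1)
```

Separating true-true from false-false counts mirrors the description above, in which either pattern on an opposite-meaning pair signals inconsistency.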
Lie Scale (L) - Some people have difficulty disclosing personal information and tend to present themselves in an overly favorable light on personality scales. This scale is designed to detect an invalidating pattern in which clients exaggerate their virtues and lay claim to unrealistically higher moral standards than other people.
Defensiveness Scale (K) - Another, somewhat related aspect of presenting a good front on personality inventory items involves problem denial. In this response pattern, the test taker simply checks positive adjustment options and denies his or her problems. The test taker does not exaggerate virtues, but only denies problems.
Superlative Self-Presentation Scale (S) - This is another measure of defensiveness. People who score high on this scale endorse few minor faults and problems, considerably fewer than those who took the test in the MMPI-2 Restandardization Study. High S scorers also tend to be described as extremely "self-controlled" by people who know them.
Infrequency Scale (F) - This invalidating condition has been referred to as faking, exaggerating, or malingering. This response pattern is commonly found in situations in which the test taker feels it is to his or her advantage to appear psychologically disturbed on the test. These test takers exaggerate their complaint pattern and tend to respond to too many of these extreme items in a pathological direction.
Infrequency-Back Scale (F(B)) - This scale uses items similar to those found in the Infrequency scale, but they are placed in the latter part of the test.

MMPI-2 Content Scales
Anxiety Scale (ANX) - This scale is composed of items that center on feelings of tension and anxiety. High scorers on this general anxiety scale (T > 65) acknowledge that they experience symptoms of anxiety, including tension, somatic problems, sleep difficulties, worries, and poor concentration. High-scoring patients report a fear of losing their mind and difficulty making decisions. They acknowledge that life is very difficult for them, and they find life a strain. They also seem to have insight into their problems; they are aware of the symptoms and problems they are experiencing and are willing to discuss them with others.
Fears Scale (FRS) - This scale contains items that focus on specific fears. A high score on FRS is obtained when the patient acknowledges many specific fears.
... They report feeling uncertain about their future and are uninterested in their lives. They are likely to brood, be unhappy, cry easily, and feel hopeless and empty. Very high scorers acknowledge thoughts of suicide or wish that they were dead. They acknowledge that they feel as though they are condemned or may have committed unpardonable sins. They tend to feel that other people do not provide them with enough emotional support.
Health Concerns Scale (HEA) - The HEA contains items that deal with somatic complaints and health concerns. Individuals with high scores on the HEA scale acknowledge many physical symptoms concerning several bodily systems, including gastro-intestinal symptoms (e.g., constipation, nausea and vomiting, stomach trouble), neurological problems (e.g., convulsions, dizziness and fainting spells, paralysis), sensory problems (e.g., poor hearing or eyesight), cardiovascular symptoms (e.g., heart or chest pains), skin problems, pain (e.g., headaches, neck pains), and respiratory troubles (e.g., coughs, hay fever or asthma). Patients who score high on HEA worry about their health and indicate that they feel sick a lot.
Bizarre Mentation Scale (BIZ) - The item content on this scale involves extreme psychotic symptoms. All of the items are symptoms of severe mental disorder. Psychotic thinking characterizes people who score high on this scale. These items suggest auditory, visual, or olfactory hallucinations. People who score high on this scale appear to be aware that their thoughts are strange and peculiar. Paranoid ideation (e.g., the belief that they are being plotted against or that someone is trying to poison them) is reported. People who score high on this set of items appear to feel that they have a special mission or power in life.
Anger Scale (ANG) - This scale contains items that reflect anger control problems. They center on loss of emotional control and hotheadedness. People who score high on this scale acknowledge anger control problems. They report being irritable, grouchy, impatient, hotheaded, annoyed, and stubborn; they acknowledge that they sometimes feel like swearing or smashing things. They tend to lose self-control and report personal incidents of physical abuse toward other people and objects.
... sometimes enjoy the antics of criminals and like to see "clever crooks" get away with crimes. They tend to believe that it is appropriate to get around the law as long as it is not broken.
Type A Scale (TPA) - This scale is composed of items that assess a pattern of behavior that includes hostility, driven behavior, and compulsive schedule orientation. People who score high on this scale tend to be hard-driving, fast-moving, and work-oriented individuals who frequently become impatient, irritable, and annoyed. It bothers them to have to wait or be interrupted at a task. There is never enough time in a day for them to complete the tasks they have planned. They tend to be very direct in interpersonal situations and are likely to be overbearing in their relationships with others.
Low Self-Esteem Scale (LSE) - This scale is made up of items that reflect negative self-views and strong feelings of inadequacy. People who score high on LSE present themselves as having low opinions of themselves. They believe they are not well liked by others and feel unimportant. They hold many negative attitudes about themselves, including perceptions that they are unattractive, awkward, clumsy, and useless. They often feel as though they are a burden to others and lack self-confidence. They find it hard to accept compliments from others and, at times, feel overwhelmed by all the faults they see in themselves.
Social Discomfort Scale (SOD) - This scale was designed to assess personality characteristics related to the experience of social discomfort and distress. People who score high on this scale are very uneasy around others. They prefer to be by themselves; when they are in social situations, they are likely to sit alone and avoid joining in a group. They tend to see themselves as shy and dislike parties and social events.
Family Problems Scale (FAM) - The items on this scale focus on family and relationship problems. Those who score high on this scale report substantial family discord.
... (Murray, 2000) (Appendices A-1, A-2), together with the results of the principal components analysis of the Phase II Profile scales (Appendix C). Please note that the Phase II Profile Bad Attitudes scale was not included in the present study. This is due in part to the fact that the Bad Attitudes scale was significantly correlated with many of the content scales on the MMPI-2. It was decided to leave this variable out of the model because of its overlapping variance with many other variables; the resulting multicollinearity would cause statistical difficulties and a poorly specified model (Tabachnick & Fidell, 1996).
The results of the confirmatory factor analysis were used to determine the factor loadings and measurement errors. Estimation of this model included an analysis of goodness of fit. Model fit was assessed with a variety of indices selected to represent different conceptual approaches, including $\chi^2$ (chi-square), $\chi^2/df$ (the normed chi-square, or the chi-square divided by the degrees of freedom), the Comparative Fit Index (CFI), and the Root Mean Square Error of Approximation (RMSEA). All of these indicators except $\chi^2$ have been shown to be accurate, robust, and reasonably unbiased estimators of model fit under a variety of circumstances (Anderson & Gerbing, 1984; Bentler, 1990; Steiger, 1990). $\chi^2$ was reported since it frequently serves as the basis for computing many other goodness-of-fit indices.
Reporting a wide range of fit indices protects against the possibility of sampling error and model misspecification (Marsh, Balla, & McDonald, 1988).
... to evaluate the 2-factor model that was derived from the exploratory factor analysis of the Phase II Profile subscale totals. Descriptive statistics of the scale scores used in the structural model (Figure 2), computed from the entire dataset, are shown in Table 4. ... (Tabachnick & Fidell, 1996).
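As a sketch of how the fit indices named above relate to the chi-square statistic, the following Python functions apply the commonly used textbook formulas for the normed chi-square, RMSEA, and CFI. The input values are hypothetical (the fit statistics for this model are not reproduced in the text), the function names are illustrative, and the use of N - 1 in the RMSEA denominator is an assumption, since some programs use N instead.

```python
import math


def normed_chi_square(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1))); N - 1 is an assumption."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """CFI compares the model's noncentrality with that of a baseline (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


# Hypothetical inputs for illustration only (not results from this study).
chi2, df, n = 16.0, 8, 298
chi2_null, df_null = 500.0, 15

print(normed_chi_square(chi2, df))
print(rmsea(chi2, df, n))
print(cfi(chi2, df, chi2_null, df_null))
```

Because all three indices are functions of $\chi^2$ and its degrees of freedom, reporting $\chi^2$ lets a reader recompute them even where $\chi^2$ itself is sensitive to sample size.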

Demographic information was collected for all participants and is shown in
A better 2-factor model fitting these data was derived and is depicted in the EQS results diagram shown in Figure 3. Table 5 depicts the standardized covariance matrix, from the subset of the data (N = 198) used to build the structural model to be tested. Table 6 depicts the standardized covariance matrix used in calculating the factor loadings for the entire data set (N = 298). Within the modeling paradigm, an initial subset of the entire data was used to derive the structural model.
Subsequent to this analysis, the model was tested against the entire data set, which provides the goodness-of-fit indices that describe the accuracy of the structural model derived in the first step. The confirmatory factor analysis provides support for the validity of three MMPI-2 scales as predictors of performance on the Phase II Profile Inventory. These are the Anti-social Practices, Cynicism, and Work Interference scales.
Goodness-of-fit measures that were used to assess the fit of the model to the entire dataset included chi-square, $\chi^2(8)$ ...
In an effort to further describe how the two measures contribute to the confidence/integrity score presented by the PIIP, a composite of the MMPI-2 scale scores and a composite of the PIIP scale scores were reduced to high/low scores and then analyzed. That is, the Anti-social Practices, Cynicism, and Work Interference scale scores were summed and averaged for each participant to establish a mean score. These mean scores were then regrouped into high and low score groups, using the median as the splitting point between groups. The same procedure was completed for the PIIP scale scores, using the Major Admissions, Rationalization, and Thinking scales. Median splits were employed to maximize the number of individuals in each group and so avoid the low statistical power that results from very small group sizes.
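The composite-and-median-split procedure described above can be sketched in a few lines of Python. The scores below are made-up values for illustration, the function names are hypothetical, and assigning scores exactly at the median to the "low" group is an assumption, since the text does not say how ties were handled.

```python
from statistics import mean, median


def composite(scale_scores):
    """Average one participant's scores across the selected scales."""
    return mean(scale_scores)


def median_split(values):
    """Label each value 'high' if above the sample median, otherwise 'low'.

    Values equal to the median go to 'low' (an assumption; the source is silent).
    """
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


# Made-up (Anti-social Practices, Cynicism, Work Interference) scores per participant.
mmpi_scores = [(50, 55, 60), (40, 45, 50), (70, 65, 60), (55, 50, 45)]

composites = [composite(s) for s in mmpi_scores]
groups = median_split(composites)
print(composites)  # composite means: 55, 45, 65, 50
print(groups)      # ['high', 'low', 'high', 'low']
```

The same two functions would be applied to the Major Admissions, Rationalization, and Thinking scores to form the PIIP grouping factor.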
A 2 x 2 analysis of variance (ANOVA) was conducted using group status (high/low) on the MMPI-2 and group status (high/low) on the PIIP as the independent variables and the PIIP Confidence scale score as the dependent variable. Results of this analysis of variance are shown in Table 7. ... for the high PIIP group. The interaction of the two factors was also significant, $F(1, 294) = 3.88$, $p = .05$, $\eta^2 = .012$. Means for all groups, illustrating the interaction, are shown in Figure 4.
... that was previously performed (Murray, 2000) (see Appendix C). As predicted, the cluster structures coincide with the principal components analysis groupings, as can be seen in Figure 5. Lastly, the Minor Admissions scale stands apart from the other scales, both in the principal components analysis plot and in Figure 5, which shows the results of the hierarchical cluster analysis.

Gender Differences
The hypothesis that gender would not play a role in this study was not supported; gender differences were indeed found. An independent-samples, two-tailed t-test was performed comparing overall confidence scale scores by gender, yielding a significant difference, $t(296) = 3.964$, $p < .001$, $\eta^2 = .050$, as shown in Tables 8 and 9. Confidence scale scores were lower for men than for women. Although this difference is statistically significant, it must be interpreted cautiously because the $\eta^2$ value is only .05; that is, gender alone accounts for only 5% of the variance in Confidence scale scores. Another finding of interest is that the MMPI-2's L scale was not significantly related to the amount of money the test taker reported having stolen. This affirms the definition of what the L scale is intended to measure on the MMPI-2: the L scale is more a measure of one's effort to appear in a positive light than an indicator of dishonesty and theft.
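The effect size reported for the gender comparison can be checked from the t statistic alone, using the standard conversion $\eta^2 = t^2 / (t^2 + df)$. The short Python sketch below applies that formula to the reported values; the function name is illustrative.

```python
def eta_squared_from_t(t, df):
    """Proportion of variance explained, computed from a t statistic and its df."""
    return t ** 2 / (t ** 2 + df)


# Values reported in the text: t(296) = 3.964.
print(round(eta_squared_from_t(3.964, 296), 3))  # 0.05
```

The result agrees with the reported value of .050, that is, about 5% of the variance in Confidence scale scores.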
The statistical nature of the original components analysis was explored in this study by comparing it with a hierarchical cluster analysis. This comparison was made to better understand the relationships between the scales of the Phase II Profile Integrity Inventory. The clustering structure was built from an across-sample covariance matrix, which is presented in Figure 4. The pattern of hierarchical clusters was then compared with the pattern of component groupings from the principal components analysis that was performed using a data subset of the inventory results from the 298 participants. These two separate statistical techniques yielded very similar grouping patterns, showing the relative salience of the relationships between constructs as measured by the Phase II Profile.
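To illustrate how a hierarchical cluster analysis groups scales from their intercorrelations, the following Python sketch performs simple agglomerative clustering. Everything specific here is an assumption made for illustration: the correlation values are invented, the distance is taken as 1 - r, and average linkage is used because the text does not state which linkage method or distance measure was applied.

```python
def average_linkage(names, dist):
    """Agglomerative clustering with average linkage.

    dist[i][j] is the distance between items i and j.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    index = {name: i for i, name in enumerate(names)}
    clusters = [(name,) for name in names]

    def between(ca, cb):
        # Average pairwise distance between the members of two clusters.
        total = sum(dist[index[a]][index[b]] for a in ca for b in cb)
        return total / (len(ca) * len(cb))

    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        distance, i, j = min(
            (between(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Invented correlations among four Phase II Profile scales (illustration only).
scales = ["Thinking", "Rationalization", "Major Admissions", "Minor Admissions"]
corr = [
    [1.0, 0.7, 0.5, 0.1],
    [0.7, 1.0, 0.6, 0.2],
    [0.5, 0.6, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]
dist = [[1.0 - r for r in row] for row in corr]

for a, b, d in average_linkage(scales, dist):
    print(a, "+", b, "at distance", round(d, 3))
```

With these invented values, the weakly correlated Minor Admissions scale joins the other scales only at the final merge, which is the kind of pattern a dendrogram makes visible.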
One unexpected outcome was the statistically significant gender difference found in the Phase II Profile confidence scale scores. The confidence scale score is the numerical value that most employers will look at first when evaluating the results of the inventory. As shown in Tables 5 and 6, males scored significantly lower on this scale than females. Although the difference was detected using a large sample of participants, the effect size was relatively small: $\eta^2 = .05$, or 5% of the variance, could be attributed to gender. Also of note is the large discrepancy in sample sizes for females (n = 219) and males (n = 79).
The confidence scale reflects the test makers' confidence in the applicant's level of integrity or honesty on the job. One may want to explain this gender difference in terms of work-related roles, in that, generally speaking, certain work positions have traditionally been held by males while others have traditionally been held by females (Thoma & Rest, 1999; Bates et al., 1986). A further breakdown of the profiles by gender was not obtained in this study, but would be interesting to include in future research. ... (Kaplan & Saccuzzo, 1993).
One reason pre-employment screening tests are so readily available and widely used today is the dramatic increase in internal theft from businesses. It is important to keep in mind that as the need for measures of honesty, integrity, conscientiousness, dependability, trustworthiness, and reliability for personnel selection increases, the need for scientific studies that provide evidence of their validity will also increase.
As paper-and-pencil tests are sold, utilized, and validated for use in the pre-employment and human resource development arena, researchers should continue to emphasize who should be given these screening inventories in the first place. Researchers should evaluate the tools that currently exist for the human resource specialist and attempt to increase the quality of what they are supposed to measure, whether it is a pre-disposition towards negative attitudes, or even the rationalization of low-integrity acts (theft from the workplace ...

This study will investigate how two different tests are similarly constructed. This is completely anonymous, so please do not write in either handbook. For this first test, please answer the following questions, then proceed to question 1 of the blue handbook. You will answer questions 1 through 117. After you have answered all questions on the first test, please write the time below, then turn over your answer sheet and blue handbook. After a short break, you will then proceed to question 1 of the next handbook. Do not spend too much time on any one question. Do not change any of your answers. You are to answer every question honestly.

This study will investigate how two different tests are similarly constructed. This is completely anonymous, so please do not write in either handbook. For this first test, please answer the following questions. After you have answered all the questions on the first test, turn over your answer sheet and test booklet. After a short break, you will then proceed to question 1 of the blue handbook. You will answer questions 1 through 117. Do not spend too much time on any one question. Do not change any of your answers. You are to answer every question honestly. Please write down the time when you have finished with the second test.

If you decide to participate, you will be asked to complete two inventories designed to measure personality traits.
Because of the testing situation, you may experience some minor feelings of anxiety or stress. Possible benefits to you include the knowledge that you participated in a study which aims to increase the knowledge of test construction, and the partial fulfillment of the experimental participation requirement called for by your Psychology 113 course. We cannot guarantee, however, that you will receive any benefits other than research credit from this study.
All information that is obtained from this study will be coded and will contain no personal information capable of individually identifying you. If you give us your permission by signing this document, we plan to report the results of this study in professional psychology journals. At no time will your identity be revealed.
You will receive 20 points for your participation. If you have any questions regarding your rights as a human subject and participant in this study, you may contact the office of the Vice Provost for Graduate Studies, Research and Outreach, 70 Lower College Road, Suite 2, University of Rhode Island, Kingstown, Rhode Island, telephone: (401) 874-2635.
If you have any questions, please ask us. If you have any additional questions later, Dr. Rossi can be reached at (401) 874-5983 and will be happy to answer them. You will be given a copy of this form to keep.

Signature & Date
Signature of Investigator