VALIDITY OF THE DIMENSIONAL MODEL OF ADVERSITY AND PSYCHOPATHOLOGY ACROSS RACE/ETHNICITY AND GENDER

Background: Experiences of childhood adversity predict adult health, although the mechanisms underlying this association remain unknown. The Dimensional Model of Adversity & Psychopathology (DMAP; Duffy et al., 2018) postulates that experiences of adversity can be conceptually and statistically “unpacked” into separate factors that have distinct outcomes. Experiences characterized by an absence of expected inputs from the environment (e.g., physical and emotional neglect) have been construed as a deprivation dimension, whereas experiences characterized by physical harm or threat of harm (e.g., physical, emotional, and sexual abuse) have been hypothesized to fall along a threat dimension. DMAP suggests that experiences of deprivation shape capacity for cognitive control, whereas experiences of threat alter emotional reactivity. Despite the popularity of DMAP, there have been critiques of its statistical utility due to lack of divergent validity and multicollinearity (Smith & Pollak, 2021). Methods: We sought to test propositions of DMAP using a cross-sectional sample of emerging adults (N = 1,187; mean age = 22; 49.5% female; 33% Black, 35% Latinx, 32% White) recruited via an online research panel. Deprivation and threat were measured using the Childhood Trauma Questionnaire (CTQ; Pennebaker & Susman, 1988), and individual differences in cognitive control and emotional reactivity were measured using the Adult Temperament Questionnaire (ATQ; Evans & Rothbart, 2007), which consists of the traits Effortful Control (capacity for cognitive control), Negative Affectivity (intensity and frequency of negative emotional states), and Surgency (intensity and frequency of positive emotional states). This thesis sought to test 1) the convergent and divergent validity of McLaughlin and colleagues' DMAP, as well as 2) the cultural validity of the model across the three largest ethno/racial groups in the U.S. (Black, Latinx, White).
Results: Differential item functioning was demonstrated across race/ethnicity and gender in various scales of the CTQ and the ATQ. Between-group analyses were conducted only for scales that showed comparable psychometric functioning across race/ethnicity and gender, which included the sexual abuse, emotional abuse, and emotional neglect scales from the CTQ, and the Discomfort, Attentional Control, Activation, and Sociability subscales from the ATQ. After statistically accounting for differential item functioning, not all DMAP hypotheses were supported. There was evidence that select groups had significant associations aligned with 1) deprivation predicting attentional and 2) threat predicting surgency and negative affect. However, our divergent hypotheses were not all supported. There was evidence that select groups had significant associations aligned with 1) deprivation predicting surgency and negative affect and 2) threat predicting Attentional Control and Activation. Conclusions: Measurement invariance was not established for all scales across race/ethnicity and gender, but for those scales with comparable psychometric properties across race/ethnicity and gender, DMAP hypotheses were only partially supported.


PUBLICATION STATUS
This thesis has been prepared in manuscript format for submission to the peer-reviewed journal Assessment. The manuscript has not been published.

INTRODUCTION
Childhood adversity refers to negative environmental experiences that are likely to represent a deviation from the anticipated environment, as well as those that are likely to require substantial adaptation by the average child (McLaughlin, 2016). Deviations from the anticipated environment reflect the presence of an unexpected input that threatens the well-being of the child (e.g., physical abuse) or an absence of an expected input (Humphreys & Zeanah, 2015; McLaughlin, Sheridan, & Lambert, 2014). Research has shown that childhood adversity is common, as approximately half of children (18 and younger) in the U.S. will experience one form of adversity by the time they reach adulthood (Green et al., 2010).
Although a plethora of research suggests that children who have experienced adversity are more likely to develop psychopathology and have social difficulties (McLaughlin et al., 2012), there is a lack of consensus on how to conceptualize and measure the effects of adverse childhood experiences. Earlier research focused on individual types of adversity, such as sexual abuse or poverty. However, research has shown that individuals who experience one form of adversity are likely to have experienced various types of adversity (Fuller-Thomson & Sawyer, 2014). The clustering of adverse experiences within individuals has been a limitation of research focusing on individual types of adversity, as this approach does not consider children who have experienced different kinds of adversity. To account for the frequent co-occurrence of adverse experiences, a cumulative risk approach has been applied.
The cumulative risk approach counts the number of adversities experienced to create a single risk score (Evans, Li, & Whipple, 2013). The cumulative risk approach focuses on the number of adverse events experienced and treats all types of experiences as equal contributors to a single count variable. Although this approach has its strengths (it can test variability in the "dose" or number of adversities experienced), it has been critiqued for assuming that all adverse experiences have equal effects.

The Dimensional Model of Adversity and Psychopathology (DMAP)
DMAP attempts to quantify individual differences in the amount of exposure to distinct "types" of adversity. These efforts to quantify heterogeneity in childhood experiences of adversity attempt to identify distinct mechanisms affected by childhood adversity, which may be essential for intervention.
DMAP differentiates between experiences of threat (involving harm or threat of harm) and deprivation (absence of expected inputs from the environment). Instead of creating a single count variable of adverse experiences, DMAP creates a deprivation dimension and a threat dimension. In other words, similarly to the cumulative risk approach, DMAP can "count" the occurrence of various adverse childhood experiences; however, unlike the cumulative risk approach, DMAP classifies events as falling under the dimensions of deprivation or threat.
Previous research supports DMAP propositions that experiences of threat and deprivation influence emotional and cognitive traits. Several studies have demonstrated that exposure to abuse is associated with elevated emotional reactivity (Weissman et al., 2019) and disruptions in emotion regulation. In both of these studies, markers of deprivation were found to be unrelated to emotional reactivity and emotion regulation. Additionally, a meta-analysis found that experiences of both threat and deprivation were associated with reduced executive functioning; however, consistent with DMAP, the results suggest that this association is stronger for experiences of deprivation than for experiences of threat (Johnson et al., 2021). Specifically, the effect size for inhibitory control was larger for deprivation (Hedges' g = .43) than for threat (Hedges' g = .27), and the effect size for working memory was likewise larger for deprivation (Hedges' g = .54) than for threat (Hedges' g = .28). These values show that executive functioning and deprivation yielded a medium to large effect size, whereas executive functioning and threat yielded a small to medium effect size.
Moreover, multiple studies did not observe associations between threat exposure and cognitive function in models that included both deprivation and threat, even when exposure to threat was more significant and severe than exposure to deprivation (Lambert et al., 2016; Machlin et al., 2019; Sheridan et al., 2017). In sum, there is support for the convergent and divergent validity hypotheses of DMAP, as previous research suggests that experiences of threat and deprivation influence different traits of emotion and cognition. Despite support for the model, there have also been critiques of its utility. Smith and Pollak (2021) critique DMAP because it relies on a splitting perspective, a type of specificity model that postulates that different types of adversity confer specific effects. They state that a consistent, replicable, and mechanistic model that may account for specific patterns of developmental change after adversity has not yet been developed. In other words, Smith and Pollak (2021) argue that childhood adversity is not clean cut and hence is unlikely to empirically demonstrate clear patterns of divergent validity. Smith and Pollak use the example of deprivation experiences that also involve components of threat: although deprivation experiences are associated with an absence of expected inputs, they also involve components that are likely to be perceived by individuals as threats to their survival (Fareri & Tottenham, 2016; Hein & Monk, 2017). To further illustrate their point, they discuss how experiences of threat can also involve components of deprivation. For example, per DMAP, physical abuse is categorized as a threat, but when environmental contexts are examined, these children are also deprived of basic supports such as consistent positive parental feedback and access to material resources.
In sum, Smith and Pollak (2021) critique DMAP because they do not believe its divergent validity hypotheses will be supported, and because they believe the dimensions of deprivation and threat will be so highly correlated that they will introduce multicollinearity into statistical regression models (i.e., the independent variables are highly correlated with each other, resulting in less reliable inferences).

Critique of DMAP: Divergent Validity and Multicollinearity
In response to Smith and Pollak's (2021) critique, McLaughlin et al. (2021) reiterate the accumulating evidence from their laboratory supporting DMAP hypotheses. McLaughlin et al. (2021) suggest that further empirical tests should be conducted on the convergent and divergent validity of DMAP. If both convergent and divergent evidence supports DMAP, the utility of the model would be demonstrated. If DMAP hypotheses fail to show convergent and/or divergent validity, then Smith and Pollak's (2021) critique of DMAP will be supported.

Critique of DMAP: Generalizability to Minoritized Populations
There is a dearth of research devoted to testing the measurement invariance of adversity measures and models (Holden et al., 2020). A measure shows invariance when individuals from different populations who have the same standing on the latent construct being measured receive the same observed score on the measure (Schmitt & Kuljanin, 2008). A measure violates invariance when individuals from different populations who are identical on the latent construct receive divergent observed scores. Lack of measurement invariance violates statistical assumptions needed to make valid between-group inferences about means and covariances (Vandenberg & Lance, 2000); hence, establishing invariance may be necessary to increase confidence in between-group similarities/differences. For example, data from the National Survey of Children's Health (NSCH) measure of adversity suggest that Black and Hispanic children are exposed to greater adversity than White children, though a search of the literature shows that the measure has not been evaluated for measurement invariance across race or ethnicity (Slopen et al., 2016). Holden et al. (2020) state that if adversity measures do not display measurement invariance, it is impossible to know if group similarities and/or differences are valid or a spurious manifestation of systematic measurement error. Accurate assessment of adversity is vital in medical and mental health settings, as well as in examining outcomes of such experiences (e.g., psychopathology). In response, studies have focused on tests of measurement invariance to facilitate adopting a culturally informed perspective (Rodriguez et al., 2019; Thombs et al., 2007). For example, Bernard et al. (2020) present a culturally informed Adverse Childhood Experiences model to understand the impact of racism on Black youth.
This model supports the advancement of adversity research by acknowledging the possibility of measures "not working the same" (i.e., displaying different psychometric properties) across race/ethnicity; however, measurement invariance in this model has not yet been examined. In sum, additional research is needed to test whether adversity models (e.g., DMAP) show measurement invariance across race/ethnicity and gender before drawing further conclusions about their utility.

Conclusions
In sum, theoretical disagreements between prominent scholars in the field inspired the rationale of this thesis proposal. Specifically, Smith and Pollak (2021)

Current Study
We aim to test the convergent and divergent validity of DMAP hypotheses as applied to phenotypic presentation of adult temperament. We will use Rothbart's model of temperament (1981), which postulates the structure of temperament as consisting of capacity for cognitive control (trait effortful control) and the emotional-motivational traits of negative affectivity (encompassing dimensions of Fear, Sadness, Discomfort, and Frustration) and surgency (comprised of high intensity pleasure, reduced fear, and novelty seeking tendencies). Per convergent validity hypotheses, we expect that 1) lower trait effortful control will be correlated with higher levels of deprivation experiences, 2) higher trait negative affect will be correlated with higher levels of threat experiences, and 3) higher trait surgency will be correlated with higher levels of threat experiences. Per divergent validity hypotheses, we 1) do not expect deprivation to correlate with either negative affectivity or surgency, and 2) do not expect the threat dimension to correlate with trait effortful control.
Before testing the convergent and divergent validity of DMAP hypotheses, we will capitalize on the diversity of our sample to test whether the psychometric properties of the measures are equivalent across race/ethnicity and gender. We will follow recent recommendations to test for the psychometric equivalence of instrument functioning (Lopez-Vergara et al., 2021) across the groups being compared before making between-group inferences. Without testing for measurement invariance, it is not possible to know if group similarities and/or differences represent actual similarities/differences or if they are measurement artifacts related to differential instrument functioning across race/ethnicity and gender (Lopez-Vergara et al., 2020). We will test for three forms of measurement invariance of instruments: 1) configural invariance, or equal configuration of factor loadings across race/ethnicity and gender (support for configural invariance provides evidence that we are likely to be measuring the same construct across groups); 2) metric invariance, or equality in the size of factor loadings across race/ethnicity and gender (which is a necessary assumption for comparing covariances across race/ethnicity and gender); and 3) scalar invariance, or equality of item intercepts after accounting for the latent factor (which is a necessary assumption for comparing means across race/ethnicity and gender).
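The three nested forms of invariance described above can be written compactly in conventional multi-group factor-analytic notation. The symbols used here (x for an observed item response, τ for an item intercept, λ for a factor loading, η for the latent factor, ε for a residual, with i indexing items and g indexing groups) are assumptions introduced for illustration; this is a sketch of the standard formulation, not necessarily the exact model specification used in this thesis.

```latex
% Measurement model for item i in group g (notation assumed for illustration)
x_{ig} = \tau_{ig} + \lambda_{ig}\,\eta_{g} + \varepsilon_{ig}

% Configural invariance: the same pattern of free and fixed loadings holds in every group
% Metric invariance:     \lambda_{ig} = \lambda_{i} \quad \text{for all groups } g
% Scalar invariance:     \lambda_{ig} = \lambda_{i} \ \text{and} \ \tau_{ig} = \tau_{i} \quad \text{for all groups } g
```

Under this formulation, partial metric or partial scalar invariance corresponds to relaxing the equality constraint for a small number of items while retaining it for the rest.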

Participants
Participants were recruited online via X&Y Analytics with an N of 1,187 individuals balanced across gender (50.5% Male, 49.5% Female) and race/ethnicity (32% White, 33% Black, 35% Latinx). Individuals were eligible if they resided in the U.S. and were between 18 and 26 years of age. They were ineligible if they were not able to complete all measures online. Table 1 presents demographics of this sample.

Procedures
Participants were invited to respond to survey questions upon indicating their consent to participate in the study. The target sample was 1,200 young adults recruited from Prolific and CloudResearch, which are two online panels of survey respondents often used by researchers. Prolific advertises over 300,000 available participants, and CloudResearch has access to over 50,000,000 available participants. Both platforms represent a wide variety of global locations and demographics.
Due to the length of the survey instrument, the survey was split into two segments. Participants were first invited to complete the first segment, and those who participated were then invited to participate in the second segment. Survey items with the most sensitive topics (e.g., sexual history, drug use, and trauma) were placed in the second segment of the survey to protect against biases in re-participation.
To ensure that participants were attentive to the survey questions and providing genuine responses, the following data cleaning steps were followed. First, 174 cases from the total 2,100 participants in segment one were removed for incorrectly answering the attention check measure. An additional 101 cases were removed because their responses for age, sex, and/or race and ethnicity did not match the target demographics. A total of 1,825 participants were invited to participate in segment two. Of those, 1,375 participated. The same data cleaning procedures were applied to this dataset. Seventy-five cases were removed for incorrectly answering the attention check measure.
Participant user ID numbers were used to match cases when combining the two segments. Fifty-four cases were removed because their segment one and segment two data were not able to be matched. After these data cleaning steps, the total final sample was N = 1,246. Of this final sample, we are conducting analyses on participants whose assigned sex matches their gender identity, which yielded N = 1,187.
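The participant flow described above can be checked with simple arithmetic. The following is a minimal sketch; the function name and step labels are hypothetical, and only the counts come from the text. The final difference between the matched sample and the analytic sample is derived here rather than reported directly.

```python
# Participant flow reconstructed from the counts reported in the text.
# Function name and step labels are illustrative assumptions.

def apply_exclusions(start, exclusions):
    """Subtract each exclusion count in order and return the running totals."""
    totals = []
    remaining = start
    for label, removed in exclusions:
        remaining -= removed
        totals.append((label, removed, remaining))
    return totals

# Segment one: 2,100 respondents before cleaning.
segment_one = apply_exclusions(2100, [
    ("failed attention check", 174),
    ("outside target demographics", 101),
])
invited_to_segment_two = segment_one[-1][2]

# Segment two: 1,375 of those invited took part.
segment_two = apply_exclusions(1375, [
    ("failed attention check", 75),
    ("segments could not be matched", 54),
])
matched_sample = segment_two[-1][2]

# Analytic sample reported in the text; the difference is implied, not reported.
analytic_sample = 1187
not_retained_for_analysis = matched_sample - analytic_sample

for label, removed, remaining in segment_one + segment_two:
    print(f"{label}: -{removed} -> {remaining}")
print("invited to segment two:", invited_to_segment_two)
print("matched sample:", matched_sample)
print("not retained for analysis:", not_retained_for_analysis)
```

The running totals reproduce the 1,825 invitations and the matched sample of 1,246 stated in the text.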

Childhood Adversity
Different types of adverse childhood events were measured with the Childhood Trauma Questionnaire (CTQ; Bernstein et al., 2003). The CTQ is a 28-item screening questionnaire intended to quantify self-reported childhood trauma history. It includes scales of physical abuse (CPA), sexual abuse (CSA), emotional abuse (CEA), physical neglect (CPN), and emotional neglect (CEN). Items are rated according to frequency on a 5-point Likert-type scale from 1 (Never True) to 5 (Very Often True). The CTQ scale scores have shown test-retest reliability coefficients ranging from .79 to .86, and internal consistency coefficients ranging from .66 to .92 across initial validation samples (Bernstein et al., 2003). The CTQ has shown good reliability when administered to youth (Bernstein et al., 1997; Forde et al., 2012). The scales overlapping with the threat dimension are physical, emotional, and sexual abuse, while those overlapping with the deprivation dimension are physical and emotional neglect.

Adult Temperament
Different dimensions of temperament were measured with the Adult Temperament Questionnaire (ATQ; Evans & Rothbart, 2007), which is a 75-item questionnaire. The ATQ comprises effortful control, trait negative affect, and surgency scales. Effortful control consists of the following subscales: Activation, Attentional Control, and Inhibition. Trait negative affect consists of the following subscales: Fear, Frustration, Sadness, and Discomfort. Surgency consists of the following subscales: Sociability, High Intensity Pleasure, and Positive Affect. The ATQ scales have shown evidence of reliability in previous studies as indicated by factor analyses of the scales (Evans & Rothbart, 2007).

RESULTS
Descriptive statistics showed most variables approximated a normal distribution. Childhood adversity variables ranged in skew between .54 and 2.5 and kurtosis ranged from -.12 to 2.9. Temperament variables ranged in skew between -.27 and 1.16 and kurtosis ranged from -.5 to 1.23.

Test of Measurement Invariance
Summaries of tests for measurement invariance are presented in Tables 2-16. For childhood adversity, select scales showed configural invariance. The scales for CEA and CEN displayed configural, metric, and scalar invariance, meaning that forcing factor loadings and intercepts to be equal across groups did not influence model fit, while the scales for CPA and CPN did not display configural invariance. Metric invariance for the CSA scale was rejected, meaning that forcing factor loadings to be equal across groups led to a significant drop in model fit. The factor loading for "Someone threatened to hurt me or tell lies about me unless I did something sexual with them" was found to account for the source of misfit and was released to be freely estimated across groups (i.e., to attempt to statistically "account" for differential item functioning). Partial metric invariance was established after allowing this item to differ among groups, suggesting that inferences about covariances can be made after "accounting" for bias in measurement (Lopez-Vergara et al., 2021). Building on this partial metric invariance model, scalar invariance was rejected, meaning that forcing item intercepts to be equal across groups led to a substantial decrement in model fit. The intercept for the same item was the source of misfit and was released to be freely estimated across groups. This partial scalar invariance model fit the data well, as indicated by CFI, RMSEA, and SRMR (see Table 2), suggesting that inferences about means can be made after "accounting" for bias in measurement.
For temperament, select scales showed configural invariance. The subscales for Discomfort, Attentional Control, and Sociability displayed configural, metric, and scalar invariance, while the subscales for Fear, Frustration, Sadness, Inhibitory Control, Pleasure, and Positive Affect did not display configural invariance. Metric invariance for the Activation subscale was rejected. The factor loadings for "I am often late for appointments" and "I often make plans that I do not follow through with" were found to be the source of the misfit and were allowed to be freely estimated across groups. Scalar invariance was also rejected; the intercepts for the same two items were the source of the misfit and were allowed to be freely estimated across groups. This partial scalar invariance model fit the data well (see Table 11).

Latent Variable Mean Differences by Gender and Race/Ethnicity
Once we tested and modeled (i.e., "accounted for") bias in measurement, we tested if there were differences in the mean values of the latent variables between groups.
Relative to the level of CSA reported by Black women, there were no differences for

Multi-group Structural Equation Model
The structural relations among the latent constructs (i.e., after "accounting" for bias in measurement) are presented in Figure 1. Results showed effects that are similar and different across race/ethnicity and gender. Higher levels of CEN predicted 1) higher levels of Activation in all groups except Latina women, 2) lower levels of Sociability in all groups except White women and Black men, and 3) lower levels of Discomfort only in White men. CEN did not predict Attentional Control. Higher levels of CSA predicted 1) lower levels of Activation only in Black women and Black men, 2) lower levels of Attentional Control only in Black men, and 3) lower levels of Discomfort in Black women and Black men. CSA was positively associated with Sociability in Black men, but it was negatively associated in Black women. CEA predicted 1) higher levels of Activation in Black men, 2) lower levels of Attentional Control in Latina women, White women, and Latino men, 3) lower levels of Sociability in Black men, and 4) higher levels of Discomfort in all groups except Black women and White women.

Manifest vs Latent Analyses
We re-ran the analyses using manifest level data to examine the contributions made by measurement invariance analyses. Relations among the manifest level data are presented in Figure 2. Out of the 12 effects shown, only two had similar interpretations across gender and race/ethnicity. Eighty-three percent of inferences were distinct in the model that accounted for measurement bias compared to the model utilizing classical test theory. At the manifest level, the following differences were observed related to CEN: 1) CEN no longer predicted Activation for Black men, 2) CEN did not predict Sociability in any group, 3) CEN predicted Discomfort in White women but not White men. Per CSA, manifest level analyses showed the following differences: 1) CSA only predicted Activation in White men and not Black men or White women, 2) CSA no longer predicted Attentional Control in Black women, 3) CSA no longer predicted Sociability in Black women, and 4) CSA predicted Discomfort only in White men and no longer Black men. Per CEA, the following differences were observed at the manifest level: 1) CEA only predicted Activation in White men, 2) CEA did not predict Attentional Control in any group, and 3) CEA now predicted Discomfort in Black women.
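The 83% figure follows directly from the counts reported above (12 effects, of which two had similar interpretations at both levels). A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def share_of_changed_inferences(total_effects, unchanged_effects):
    """Proportion of effects whose interpretation differed between the two models."""
    if total_effects <= 0 or not 0 <= unchanged_effects <= total_effects:
        raise ValueError("counts must satisfy 0 <= unchanged <= total and total > 0")
    return (total_effects - unchanged_effects) / total_effects

# Counts taken from the text: 12 effects, 2 with similar interpretations.
share = share_of_changed_inferences(12, 2)
print(f"{share:.1%}")  # prints 83.3%
```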

Multicollinearity
We tested the associations between CSA/CEA and CEN to determine if multicollinearity was present. We found that CSA was associated with CEN (r=.70, p<.001) and that CEA was associated with CEN (r=.78, p<.001). These findings show that multicollinearity was present to some degree among threat and deprivation scales, which is what Smith & Pollak predicted would occur.
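To give a sense of scale for these correlations, the sketch below converts each reported r into shared variance and into the variance inflation factor (VIF) that would apply if only those two scales were entered together as predictors. This is an illustrative assumption: the thesis reports the correlations only, the models above include more than two predictors, and the function names are hypothetical.

```python
def shared_variance(r):
    """Proportion of variance two scales share, given their correlation r."""
    return r ** 2

def vif_two_predictors(r):
    """VIF for either predictor in a regression with exactly two predictors
    that correlate at r (VIF = 1 / (1 - r^2))."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r ** 2)

# Correlations reported in the text between threat scales and CEN.
reported = {"CSA with CEN": 0.70, "CEA with CEN": 0.78}

for pair, r in reported.items():
    print(f"{pair}: r = {r:.2f}, "
          f"shared variance = {shared_variance(r):.0%}, "
          f"two-predictor VIF = {vif_two_predictors(r):.2f}")
```

Conventions for how large a VIF must be before it is treated as problematic vary across sources, so these values are best read as descriptive rather than as a pass/fail test.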

DISCUSSION
This study investigated the associations of childhood adversity with cognitive and affective dimensions of temperament. Additionally, we examined the performance of measures widely used for these constructs. Findings on psychometrics and on the associations between childhood adversity and temperament are described in the following sections.

Psychometric Findings
Not all of our latent factors displayed configural invariance, or equality in the pattern of factor loadings. Per childhood adversity, only three of five latent factors established configural invariance. Per temperament, only four of 10 latent factors established configural invariance. For the factors that did display configural invariance, it is implied that the same latent constructs are likely being measured across the groups in our sample. Some of our factors demonstrated metric and scalar invariance, meaning that their factor loadings and item intercepts did not differ across groups after accounting for the influence of the latent factor. Of the latent factors displaying configural invariance, CEA, CEN, Discomfort, Attentional Control, and Sociability established full metric and scalar invariance, which are necessary for comparing covariances and means between groups.
In contrast, CSA and Activation displayed partial metric and scalar invariance, meaning that some of the items vary in how well they index the latent construct across groups. Partial measurement invariance suggests that some degree of measurement bias was detected, and that the models were able to "correct" for differential item functioning prior to making between-group inferences.
Select scales from the CTQ and ATQ demonstrated non-comparable psychometric properties across race/ethnicity and gender. These included CPA, CPN, Fear, Frustration, Sadness, Inhibition, Pleasure, and Positive Affect. In other words, configural invariance was rejected for these scales, suggesting that either our measure is functioning differently across groups and/or that the construct functions differently across groups (e.g., Lopez-Vergara et al., 2020).
Finding bias in the CTQ and ATQ scales is mostly consistent with the few previous studies testing for measurement invariance. For example, one study examined the CTQ across gender and race in a sample of drug-using adults (Thombs et al., 2007). Results showed that measurement invariance was not established for the total score of the measure. The study examined item-level invariance but did not look at the individual scales. Additionally, Rodriguez et al. (2018) found that the CTQ demonstrated bias across gender. Conversely, Cruz (2023) found that the CTQ showed measurement invariance across race/ethnicity and gender, but the study only included two racial/ethnic groups (i.e., Black and White). Per the ATQ, there is a dearth of research examining measurement invariance. One study investigated measurement invariance using the Early Adolescent Temperament Questionnaire (EATQ; Capaldi & Rothbart, 1992; Kim et al., 2003) and showed no bias in measurement across gender, though this study did not use the version of the measure (i.e., ATQ) that the current study used.
Finally, it is important to consider the different inferences that were made at the manifest vs latent level. Eighty-three percent of inferences were distinct at the latent level when correcting for bias in measurement. Changes in inferences between the two levels (i.e., testing and correcting for bias in measurement vs assuming equivalent instrument functioning) support psychometric critiques of cross-cultural research (e.g., Lopez-Vergara et al., 2021). Our findings are consistent with previous studies that have found that not accounting for bias in measurement can result in a substantial increase in incorrect inferences (Beuckelaer & Swinnen, 2018). These findings have implications for cross-cultural research. Our measures demonstrated substantial bias across groups in this sample (e.g., 40% of the CTQ scales and 60% of the ATQ scales could not be used for between-group comparisons).

DMAP Associations with Temperament
After accounting for bias in measurement, we found that the associations between DMAP constructs and temperament showed some similarities and differences across race/ethnicity and gender. Although there were more latent factors in measures of DMAP constructs and temperament, we only used the factors that displayed measurement invariance to test our convergent and divergent hypotheses (i.e., only used scales that had comparable psychometric properties across groups in between-group comparisons). As shown in Figure 1, deprivation consisted of the CEN factor, threat consisted of the CSA and CEA factors, effortful control consisted of Activation and Attentional Control, surgency consisted of Sociability, and negative affect consisted of Discomfort.
We discuss the results of our hypotheses in this section. Per hypotheses related to deprivation, CEN 1) predicted Activation in all groups except Latina women, 2) did not predict Attentional Control in any group, 3) predicted Sociability in all groups except White women and Black men, and 4) predicted Discomfort in only White men. Per threat hypotheses, CSA predicted 1) Activation in Black women and Black men, 2) Attentional Control in only Black women, 3) Sociability in Black women and Black men, and 4) Discomfort in Black women and Black men. CEA predicted 1) Activation in only Black men, 2) Attentional Control in Latina women, White women, and Latino men, 3) Sociability in only Black men, and 4) Discomfort in all groups except Latina women and White women.
These results suggest that the assumptions of DMAP are not all supported. The association between CEN and Activation is consistent with our convergent validity hypotheses. However, the finding that CEN did not predict Attentional Control (another form of effortful control) was inconsistent with our convergent validity hypotheses. The associations of CEN with Sociability and Discomfort were inconsistent with our divergent validity hypotheses, as we did not expect deprivation to be correlated with surgency or negative affect. The associations between threat factors and Sociability and Discomfort are consistent with our convergent validity hypotheses, as we expected experiences of threat to be correlated with surgency and negative affect. However, it is important to note that only one group showed a significant relation with Sociability (a form of surgency). Additionally, our divergent validity hypotheses were not supported, as shown by the significant association of threat with effortful control across some groups.
These findings suggest that the differential effects of DMAP constructs on temperament are inconsistent across race/ethnicity and gender in our sample. This could mean that this model of adversity may not be valid as it is currently postulated, although we do not know whether inferences would differ with different methods (e.g., performance-based measures of temperament as opposed to self-report) and/or samples (e.g., samples selected for exposure to trauma). Replication of these findings is needed to assess the validity of DMAP in some groups, namely 1) Black men, who were the group that most consistently showed an association between threat and both surgency and negative affect, and 2) White men, who were the group that most consistently showed an association between deprivation and effortful control while also rejecting the association between deprivation and surgency and negative affect.
These results support Smith & Pollak's critique that DMAP is not likely to show a consistent pattern of convergent and divergent validity. Smith & Pollak's critique of multicollinearity is also supported, as the threat and deprivation scales were significantly correlated (r's ranging from .70 to .78, all p's < .001). Smith & Pollak state that a model that is consistent, replicable, and mechanistic has not yet been developed, and is not likely to be developed, because adversity may not be as "clean cut" as hypothesized by DMAP. For example, it is possible that threat experiences may also involve components of deprivation (e.g., a physically abusive parent may, by the very definition of being physically abusive, also be emotionally neglectful). An alternative hypothesis proposed by Smith and Pollak is that the effects of adversity may be mediated by how people think about or conceptualize the event, rather than by the "type" of adversity experienced. For example, Mansueto et al. (2019) showed that exposure to childhood adversity may lead to metacognitive beliefs about the adversity, which may in turn lead to emotional and cognitive consequences.
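To give a sense of scale for the correlations reported above, the following is a minimal sketch that converts a correlation into shared variance (r squared) and into a variance inflation factor (VIF). It is an illustration only and not an analysis performed in this study: the function names are hypothetical, and the VIF formula 1 / (1 - r^2) assumes a simplified regression with exactly two correlated predictors, whereas the models tested here included more than two adversity factors.

```python
# Illustrative sketch only; these functions are hypothetical helpers and
# were not part of the analyses reported in this thesis.

def shared_variance(r: float) -> float:
    """Proportion of variance two scales share, given their correlation r."""
    return r ** 2


def vif_two_predictors(r: float) -> float:
    """Variance inflation factor for each of exactly two predictors
    that correlate at r (assumption: no other predictors in the model)."""
    return 1.0 / (1.0 - r ** 2)


if __name__ == "__main__":
    # Lower and upper bounds of the correlations reported in the text.
    for r in (0.70, 0.78):
        print(
            f"r = {r:.2f}: shared variance = {shared_variance(r):.2f}, "
            f"VIF = {vif_two_predictors(r):.2f}"
        )
```

Under these assumptions, correlations of .70 to .78 correspond to roughly half to three-fifths of the variance being shared between the threat and deprivation scales, which is the overlap that the multicollinearity critique refers to.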
In sum, findings from the current study support some DMAP hypotheses but contradict others. To advance our understanding of how adversity leads to negative outcomes, it may be beneficial to compare and contrast different theoretical views of the underlying mechanisms (e.g., examining thoughts about the adversity).

Clinical implications
If, as Smith & Pollak suggested, the DMAP dimensions of deprivation and threat are too highly correlated to be treated as distinct predictor variables, then it may not be useful to hypothesize them as having differential effects or to tailor interventions to "match" such exposures. However, the validity of DMAP may be "hidden" in untested moderators. For example, it may be important to consider individual differences in the timing of the adversity, how individuals think about the event, the socio-economic context of the child, support from others, and/or other resilience factors. It is possible that to adequately "unpack" the effects of threat and deprivation, it is also necessary to quantify individual differences in exposure to resilience factors. For example, the Intergenerational and Cumulative Adverse and Resilient Experiences (ICARE) model postulates that the effect of adversity depends on the "net balance" between experiences of adversity and protective or resilience-promoting experiences (Hays-Grudo et al., 2021). This model postulates that negative consequences of adversity result from biological and behavioral adaptations that alter cognitive, social, and emotional development, but that such processes are also affected by resilience factors (e.g., nurturing relationships and access to resources). Testing features of distinct adversity models can help the field differentiate the appropriateness or utility of models conceptualizing the effects of adversity and threat on developmental outcomes.

CHAPTER 5 LIMITATIONS, FUTURE DIRECTIONS, & CONCLUSIONS
While this study has its strengths, it is not without limitations. First, the data were cross-sectional, which does not allow us to make inferences regarding the directionality of effects. Longitudinal studies are needed to explore directionality further, especially given that childhood adversity was reported retrospectively. Additionally, future research may benefit from replicating these analyses in other types of samples, such as clinical samples or others exposed to high rates of trauma. Finally, we may have oversimplified our categorization of individuals into racial/ethnic and gender groups in order to examine the intersectionality of race/ethnicity and gender, and we acknowledge that other intersecting identities were neglected.
In conclusion, the current study supports psychometric critiques of cross-cultural research and partially supports and refutes various DMAP hypotheses. Per psychometric findings, both measures of childhood adversity and temperament violated assumptions of equal functioning across race/ethnicity and gender. There were mixed findings on DMAP hypotheses with some groups displaying associations consistent with our convergent and divergent validity hypotheses. These results suggest it may be important to test for measurement bias in adversity models (prior to making between-group comparisons), otherwise there is the possibility that findings are a consequence of measurement error.
Future attempts to support/refute DMAP may benefit from incorporating alternative hypotheses for how "adversity gets under the skin" (e.g., how people think about adversity).
Note: Decision "Accept" = accepting the assumption of invariance, decision "Reject" = rejecting the assumption of invariance. Models 2b and 3b have an unconstrained factor loading and unconstrained item intercept for variable "Someone threatened to hurt me or tell lies about me unless I did something sexual with them".
Note: Decision "Accept" = accepting the assumption of invariance, decision "Reject" = rejecting the assumption of invariance. Models 2b and 3b have unconstrained factor loadings and unconstrained item intercepts for variables "I am often late for appointments", "I often make plans that I do not follow through with", and "When I am afraid of how a situation might turn out, I usually avoid dealing with it".