EVALUATING THE EFFECTS OF ATTITUDES ON HEALTH-SEEKING BEHAVIOR AMONG A NETWORK OF PEOPLE WHO INJECT DRUGS

The transmission of HIV/AIDS remains a major concern among people who inject drugs (PWIDs) in the United States. PWIDs are often embedded in a distinctive HIV/AIDS risk network through the shared use of drug equipment and risky sexual behavior. However, the characteristics of PWIDs in risk networks make network data difficult to collect, and studies of these networks are therefore limited. Our study applied causal inference methods for observational data with dissemination to assess attitudes toward HIV/AIDS risk among PWIDs and their effects on health-seeking behaviors. We used data from the Social Factors and HIV Risk Study (SFHR), a sociometric network study conducted between 1991 and 1993 in Bushwick, Brooklyn, New York, that investigated how HIV/AIDS infection spread among PWIDs through shared sexual and injection risk behaviors. We evaluated the effects of locus of control (internal vs. external) and blame (self vs. others) attitudes separately on PWIDs' own health-seeking behavior and on that of other members of their risk communities. Taking dissemination of attitudes into account, we estimated four causal parameters: direct, indirect, total, and overall effects. Communities were defined to include members who were closely connected via HIV risk behavior and had sparser connections with individuals outside the community. As health-seeking behavior outcomes, we considered receipt of the study-based HIV testing result and a medical encounter within the past year. The direct effect measures the effect of an individual's own exposure on the outcome by comparing exposed and unexposed PWIDs in communities with the same coverage, whereas the indirect effect quantifies dissemination by comparing the outcomes of unexposed PWIDs in communities with different coverage. The total effect is the sum of the direct and indirect effects and measures the maximum impact of the exposure of interest. Finally, the overall effect measures the marginal effect of exposure by comparing the average potential outcomes of communities with different coverage, regardless of individual exposure status. First, we applied a modularity-based community detection algorithm to determine communities within the SFHR network. We then employed a network-based causal inference methodology for clustered observational data. Coverage is defined as the proportion of people with the internal locus of control or self-blame attribute in a community. For the direct effect, PWIDs who believed that uncontrollable factors determine whether or not one gets HIV/AIDS (i.e., those with external locus) were 16% less likely to receive their HIV testing result than those with internal locus in both 50% and 70% coverage communities (95% confidence interval (CI): -0.27, -0.06 for both). When the coverage of people who believed that controllable factors determine whether or not one gets HIV/AIDS (i.e., those with internal locus) decreased from 70% to 50%, the likelihood of receiving the HIV testing result decreased by 3% among those with external locus (95% CI: -0.05, -0.01), demonstrating a significant disseminated effect. As another significant disseminated effect, when the coverage of people with self-blame decreased from 99% to 50%, the likelihood of having a recent medical encounter increased by 27% among those with external locus (95% CI: 0.07, 0.47). Because the SFHR study was conducted in the early 1990s, the health-seeking behavior of PWIDs may have changed over time. However, our results may contribute to understanding how PWIDs' attitudes and behaviors have changed over a few decades if a similar analysis is conducted in more contemporary studies. The results of this study support the existence of dissemination of locus of control and blame attitudes in PWID networks. This indicates that appropriate network-targeted interventions can bolster positive change in health-seeking behavior among PWIDs by leveraging disseminated effects.

days before the interview) and outer periphery (i.e., people who do not depend on core members to obtain information about drugs and places where they inject drugs). Among these participants, risk behavior differed significantly between the outer periphery and the other two categories: members of the outer periphery used injection drugs and engaged in risky behaviors less frequently than the other two types of injection drug users (3). HIV/AIDS prevention that takes into account the network structure in which PWIDs are embedded, particularly prevention that targets core network members and those on the inner periphery, could be more effective than the conventional approach, which typically encourages all network members to modify their risk behavior related to injection drug use.
The network structure in which PWIDs are embedded could slow or even prevent them from attaining long-term behavioral change, or vice versa, improve and sustain behavioral change (3). Introducing appropriate network-targeted interventions can bolster positive behavioral change among PWIDs (4). The purpose of this study is to investigate how personal health attitudes impact health-seeking behavior among PWIDs and their HIV risk networks. This new information can then be used to develop more effective interventions for HIV/AIDS treatment and prevention among injection drug users, such as an educational program to empower individuals to engage in health behavior targeted at the most influential members in communities of PWIDs.
One challenge in evaluating the effect of PWIDs' attitudes on their health-seeking behaviors is dissemination, or interference, within the PWID network. That is, the health beliefs and/or blame attributes of an individual can affect the health-related behaviors of their network members. In the potential outcome framework for causal inference, one typically assumes no dissemination of the treatment or exposure; that is, an individual's outcome is affected only by their own treatment/exposure and not by the treatment/exposure received by other individuals in the study. This assumption is part of the "stable unit-treatment value assumption" (SUTVA) (5).
However, in some settings, dissemination is itself of interest for understanding causal relationships and the full impact of an intervention. For example, interference must be considered when evaluating the effect of vaccination on the prevention of infectious disease (6,7). In our setting, where we evaluate the effect of PWIDs' attitudes toward HIV/AIDS risk, one person's attitude can change another person's health-seeking behavior, especially when the two are closely connected. To estimate causal effects in the presence of interference, we use an inverse probability weighting (IPW) method developed by Tchetgen Tchetgen and VanderWeele (8), which is designed to replicate an idealized two-stage randomized design. In this design, investigators randomly assign a treatment allocation strategy to each community and then assign the actual treatment to participants in each community according to the assigned strategy. When this method is applied to observational studies, a group-level propensity score is computed for each observed community in the population of interest. The inverse of this propensity score is then used as the weight in the IPW estimator, which is a contrast of group-level potential outcomes.
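The idealized two-stage randomized design described above can be illustrated with a minimal sketch. The function name, the community sizes, and the candidate coverages are hypothetical choices made for illustration; the second stage assumes independent Bernoulli assignment within each community.

```python
import random

def two_stage_assignment(community_sizes, coverages, seed=0):
    """Simulate an idealized two-stage randomized design (illustrative only).

    Stage 1: each community is randomly assigned a coverage (allocation strategy).
    Stage 2: each member is exposed independently with that probability.
    """
    rng = random.Random(seed)
    design = []
    for n in community_sizes:
        alpha = rng.choice(coverages)  # stage 1: community-level strategy
        exposures = [1 if rng.random() < alpha else 0 for _ in range(n)]  # stage 2
        design.append((alpha, exposures))
    return design

# Hypothetical example: three communities, two candidate coverages.
sizes = [2, 5, 8]
coverages = [0.5, 0.7]
design = two_stage_assignment(sizes, coverages, seed=1)
for alpha, exposures in design:
    print(alpha, exposures)
```

In an observational study neither stage is controlled by the investigator, which is why the group-level propensity score has to be estimated and used as a weight.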
An individual's health attitudes can be defined based on two distinct concepts: locus of control and blame attribute. The concept of locus of control was developed by Rotter in the field of personality psychology (9). Locus of control is defined as the degree to which people believe they have control over what happens to them.
Locus of control is classified into two different types: internal and external (9). People with internal locus attribute the events they experience to factors within their control while those with external locus attribute events to factors beyond their own influence.
The rationale for distinguishing between locus of control and blame attribute has precedent in the field of psychology. Blanchard-Fields et al. (2012) investigated how one's beliefs regarding traditional social schemas affect the blame attributed to others who violate such schemas (10). Grimes et al. (2004) investigated how students' locus of control affects their evaluation of their teacher (11). They found that students with internal locus of control tended to give their teachers high evaluations, while students with external locus gave lower evaluations. Locus of control and blame attribute have thus been considered distinct concepts in psychology. Among PWIDs, there have been few studies on the relationships of locus of control and blame attribute with health-seeking behavior. We considered locus of control and blame attribute as distinct exposures and evaluated their effects on the health-seeking behavior of each individual and their networks.

Study data
In this study, we evaluated how PWIDs' individual health beliefs affect their own health-seeking behavior and that of their risk network members in the Social Factors and HIV Risk Study (SFHR). The SFHR study was conducted in Brooklyn and other parts of New York, New Jersey and Connecticut between July 1991 and January 1993 (1). Data were collected from street-recruited injection drug users in the Bushwick neighborhood, a low-income area of approximately 100,000 residents with high rates of poverty, injection drug use, and HIV/STI prevalence. The original study enrolled a total of 767 participants and recorded information on 3,162 dyadic relationships, each a connection between two individuals. HIV risk connections were defined by shared risk behaviors (i.e., using drugs together or having sexual intercourse).

Data preparation
Our primary objective was to understand how injection drug users' individual attitudes affect the health-seeking behaviors of those individuals and of others with shared HIV risk connections in the network. Our analysis therefore focused on individuals who had at least one shared risk connection with another participant in the SFHR study. We defined participants who did not share a risk link with any other enrolled participant as isolated participants. The sample selection process is summarized in a flowchart (Figure 1). After removing 82 participants who were missing outcome, exposure, or covariate information and 283 isolated participants, the SFHR PWIDs network for this analysis included 402 subjects with 403 risk connections (Figure 2).

The integer-scoring approach requires the assumption of equal distances between categories (2).
Therefore, by rescaling the responses to -1, 0, and 1, we could satisfy this assumption and create a score for each participant. By summing the values assigned to the responses to the health belief questions, we obtained an individual health belief score (BLF score) ranging from -7 to 7. The distribution of BLF scores is shown in the left panel of Figure 3. This integer-scoring procedure is a common approach in psychological studies for dealing with categorical responses (2,3). If a participant's BLF score was greater than or equal to three, we classified that person's overall health belief status as internal locus of control; otherwise, we classified it as external locus (i.e., $A = 1$ for internal locus; $A = 0$ for external locus). A similar procedure was used to create a binary variable representing the individual blame attribute. By summation, we obtained blame (BLM) scores ranging from -3 to 3 for each participant. The distribution of BLM scores is shown in the right panel of Figure 3. If a participant's blame score was equal to three, the attribute was "self-blame"; otherwise, the attribute was "blame others" (i.e., $A = 1$ for self-blame; $A = 0$ for blame others). We chose a threshold of three for defining individual locus of control as a binary variable because the log-likelihood of the propensity score model was higher with a threshold of three than with other thresholds such as five or seven (log-likelihoods with thresholds of 3, 5, and 7: -171, -242, and -261, respectively).
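The scoring and dichotomization steps above can be sketched as follows. The response labels ("disagree", "unsure", "agree") and the assumption that every item is coded so that "agree" points toward internal locus or self-blame are hypothetical; the number of items (seven and three) is inferred from the reported score ranges, and the threshold of three follows the text.

```python
# Hypothetical three-level coding; the actual SFHR response labels may differ.
LEVELS = {"disagree": -1, "unsure": 0, "agree": 1}

def attitude_score(responses):
    """Sum the rescaled (-1, 0, 1) item responses into an integer score."""
    return sum(LEVELS[r] for r in responses)

def dichotomize(score, threshold=3):
    """Return 1 (internal locus / self-blame) if score >= threshold, else 0."""
    return int(score >= threshold)

# Seven health-belief items give a BLF score between -7 and 7.
blf_items = ["agree", "agree", "unsure", "agree", "disagree", "agree", "agree"]
blf_score = attitude_score(blf_items)
internal_locus = dichotomize(blf_score)

# Three blame items give a BLM score between -3 and 3; only a score of 3
# (the maximum) is classified as self-blame, so ">= 3" and "== 3" coincide.
blm_items = ["agree", "agree", "unsure"]
blm_score = attitude_score(blm_items)
self_blame = dichotomize(blm_score)

print(blf_score, internal_locus, blm_score, self_blame)
```

Changing `threshold` to 5 or 7 reproduces the alternative cut-offs that were compared by log-likelihood.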
We evaluated two outcomes related to PWIDs' health-seeking behaviors: receipt of the study-based HIV testing result and a medical encounter within the past year.

Figure 4 is a schematic diagram of the causal parameters when dissemination exists in a study, based on the diagram introduced by Halloran and Struchiner (4). Coverage is the proportion of people who received the treatment/exposure in a group.

Causal inference framework under the presence of interference
In a two-stage randomized design, different coverages of exposure ($\alpha$ and $\alpha'$, where $\alpha < \alpha'$) are first randomly assigned to different groups or communities; people in each group are then assigned to the exposure according to the coverage strategy pre-assigned to that group. As a result, groups with different coverage each contain both exposed and unexposed members. Given this particular design, Halloran and

Assumptions and notation
An important feature of our study is that we need to allow the exposure (i.e., individuals' attitudes) to influence the behavioral outcomes of other individuals in the same community; however, we assumed that this effect does not extend beyond that community. This assumption is known as partial interference (5,6,7). Under this condition, the no-interference component of SUTVA is relaxed within clusters. We also made the following three assumptions, which are required for the internal validity of causal estimation with observational data: i) a set of pre-treatment covariates is sufficient to control for confounding; that is, conditional on these covariates, the potential outcomes of those who were exposed and those who were not exposed are the same on average (conditional exchangeability); ii) there is a positive probability of exposure within each level of the covariates (positivity); and iii) the exposure is well defined and there are no other versions of the exposure in the study (consistency). To define the potential outcomes, we assumed a Bernoulli individual group assignment strategy, under which each individual is exposed (i.e., has the internal locus of control or self-blame attribute) at random with probability $\alpha$ (6). We also assumed that there is no misclassification of attitudes; that is, every participant reported his/her attitudes truthfully in the study. We discuss the reliability and validity of the health belief questions in SFHR and of our definition of exposures in the Discussion section. In addition, we assumed that the exposure status we defined captures the underlying traits of internal vs. external locus of control and self-blame vs. blaming others. Finally, we assumed that the weight models are correctly specified (e.g., correct functional forms of covariates).
We also assumed that there is no homophily; that is, there are no latent variables that lead an individual to form a tie with another individual who has similar characteristics. Under homophily, similar individuals are more likely to be connected in the first place, so that similarity between connected individuals does not reflect the intervention or exposure influencing them (8). Our no-homophily assumption presumes that the individual covariates we controlled for in our study (sex, race, education level, and age) and the attitudes toward HIV/AIDS risk (locus of control and blame attributes) are enough to explain the existence of a tie between a pair of PWIDs, and that no other unobserved characteristic is a source of homophily.
The notation used to describe the structure of the PWIDs HIV/AIDS risk network follows references (9,10), and the notation used for causal inference in the presence of interference follows references (5,11). A network (or graph) is mathematically defined as a collection of vertices $V$ and edges $E$, $G = (V, E)$. In our context, $G$ is the SFHR PWIDs network, in which a vertex represents a subject and an edge represents a shared risk behavior between a pair of subjects. There are $K$ clusters, and cluster $i$ has $n_i$ individuals for $i = 1, \dots, K$. The exposures of the members of cluster $i$ are denoted by the vector $A_i = (A_{i1}, \dots, A_{in_i})$. If there are $n_i$ individuals within cluster $i$ and the number exposed is $k$ (i.e., $k$ individuals out of $n_i$ receive the active treatment), the possible exposure allocations are denoted as $\mathcal{A}(n_i, k)$, where the length of each vector is $n_i$ and there are $\binom{n_i}{k}$ patterns. Every element $a_i$ of $\mathcal{A}(n_i, k)$ satisfies $\sum_{j=1}^{n_i} a_{ij} = k$; that is, exactly $k$ people are exposed. Based on this, we can write the potential outcome for individual $j$ in cluster $i$ as $Y_{ij}(a_i)$ if the cluster was exposed to $a_i$. Here, the individual potential outcome depends not only on person $j$'s exposure but also on the vector of exposures received by everyone else in cluster $i$.

Community detection
A community (or cluster) is defined as a group of densely connected vertices with only sparser connections to other groups of vertices (10). Hierarchical clustering is one of the most common methods for community detection (9,10). In this method, the closest or most similar vertices are combined to form communities, using a measure of similarity or connection strength between vertices based on the network structure (10).
Among such measures, the most popular is modularity (9).
Let $G$ be an observed network and assume there is a candidate partition $C = \{C_1, \dots, C_K\}$ of the network into $K$ clusters. We define $f_{ij} = f_{ij}(C)$ as the fraction of edges in the original network that connect vertices in community $C_i$ with vertices in community $C_j$. Given this, the modularity of $C$ is defined by

\[ \operatorname{mod}(C) = \sum_{k=1}^{K} \left[ f_{kk}(C) - f_{kk}^{*} \right], \qquad (1) \]

where $f_{kk}(C)$ is the fraction of edges that connect vertices within the same community $C_k$, and $f_{kk}^{*}$ is the expected value of $f_{kk}$ under a random edge assignment.
The community structure is obtained by maximizing Eq. (1), that is, by finding the partition for which the observed fraction of within-community edges differs most from the fraction expected if edges were formed by a random process. A large modularity value indicates that some vertices are substantially more connected than expected, which suggests the presence of a nontrivial community structure in the network. In practice, community detection in our PWIDs network was conducted with the "fastgreedy.community" algorithm in the "igraph" package in R.
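To make the modularity criterion concrete, the following minimal sketch evaluates the modularity of a given partition on a toy network. It assumes the common choice in which the expected within-community fraction under random edge assignment is the squared fraction of edge ends attached to the community; the toy edge list and function name are hypothetical, and the sketch only scores a partition rather than performing the greedy agglomeration that igraph implements.

```python
from collections import defaultdict

def modularity(edges, membership):
    """Modularity of a partition: sum over communities of
    (observed within-community edge fraction) minus
    (squared fraction of edge ends attached to the community)."""
    m = len(edges)
    within = defaultdict(int)    # edges with both ends in community k
    end_count = defaultdict(int)  # edge ends attached to community k
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        end_count[cu] += 1
        end_count[cv] += 1
        if cu == cv:
            within[cu] += 1
    return sum(within[k] / m - (end_count[k] / (2 * m)) ** 2 for k in end_count)

# Toy network: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {v: 0 for v in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(q_two, q_one)
```

Splitting the toy network into its two triangles yields a positive modularity (5/14), whereas placing all vertices in a single community yields zero, so a modularity-maximizing algorithm would prefer the two-community partition.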

Identification of causal effects with IPW estimator
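With the notation introduced above, we describe four causal estimands of interest in the presence of interference (5,7). As mentioned above, to define the potential outcomes, we assume a Bernoulli individual group assignment strategy, under which individuals within community $i$ are assigned treatment at random with probability $\alpha$ (6). Then, the probability of community $i$'s exposure vector $a_i$ is

\[ \pi(a_i; \alpha) = \prod_{j=1}^{n_i} \alpha^{a_{ij}} (1 - \alpha)^{1 - a_{ij}}, \qquad (2) \]

and the probability of community $i$'s exposure vector excluding the $j$th individual, $a_{i,-j}$, is

\[ \pi(a_{i,-j}; \alpha) = \prod_{l \neq j} \alpha^{a_{il}} (1 - \alpha)^{1 - a_{il}}. \qquad (3) \]

When individual $j$ in community $i$ has exposure $a \in \{0, 1\}$ and the other members are exposed with probability $\alpha$, the individual average potential outcome is denoted by

\[ \bar{Y}_{ij}(a; \alpha) = \sum_{a_{i,-j}} Y_{ij}(a_{ij} = a, a_{i,-j}) \, \pi(a_{i,-j}; \alpha). \qquad (4) \]

Also, the marginal individual average potential outcome is defined by

\[ \bar{Y}_{ij}(\alpha) = \sum_{a_i} Y_{ij}(a_i) \, \pi(a_i; \alpha). \qquad (5) \]

With this notation, the community-level average potential outcome is

\[ \bar{Y}_{i}(a; \alpha) = \frac{1}{n_i} \sum_{j=1}^{n_i} \bar{Y}_{ij}(a; \alpha). \qquad (6) \]

Then, the population-level average potential outcome with a certain coverage $\alpha$ is

\[ \bar{Y}(a; \alpha) = \frac{1}{K} \sum_{i=1}^{K} \bar{Y}_{i}(a; \alpha). \qquad (7) \]

As in the case of the marginal individual average potential outcome, we can express the marginal population average potential outcome as

\[ \bar{Y}(\alpha) = \frac{1}{K} \sum_{i=1}^{K} \frac{1}{n_i} \sum_{j=1}^{n_i} \bar{Y}_{ij}(\alpha). \qquad (8) \]

Given these, the following notation represents the four causal effects proposed in Hudgens and Halloran (5), for coverages $\alpha < \alpha'$. The direct (or individual) effect is defined as

\[ DE(\alpha) = \bar{Y}(0; \alpha) - \bar{Y}(1; \alpha), \qquad (9) \]

the indirect effect is defined as

\[ IE(\alpha, \alpha') = \bar{Y}(0; \alpha) - \bar{Y}(0; \alpha'), \qquad (10) \]

the total effect is defined as

\[ TE(\alpha, \alpha') = \bar{Y}(0; \alpha) - \bar{Y}(1; \alpha'), \qquad (11) \]

and the overall effect is defined as

\[ OE(\alpha, \alpha') = \bar{Y}(\alpha) - \bar{Y}(\alpha'), \qquad (12) \]

where these causal effects are described as population averages. Tchetgen Tchetgen and VanderWeele proposed an inverse probability weighting (IPW) method to estimate these four causal effects in the presence of interference in observational studies (6). The IPW estimators are unbiased when the group-level propensity scores are known and the following assumptions hold, where $X_i$ denotes the covariates of the members of community $i$: (1) conditional independence: $\Pr(A_i = a_i \mid X_i, Y_i(\cdot)) = \Pr(A_i = a_i \mid X_i)$; (2) positivity: $\Pr(A_i = a_i \mid X_i) > 0$ for all $a_i$.
In practice, however, the true propensity scores are often unknown, and we need to estimate them with

\[ \hat{f}(A_i \mid X_i) = \int \prod_{j=1}^{n_i} p_{ij}(b_i)^{A_{ij}} \{1 - p_{ij}(b_i)\}^{1 - A_{ij}} f(b_i; \sigma^2) \, db_i, \qquad (13) \]

where $p_{ij}(b_i) = \Pr(A_{ij} = 1 \mid X_{ij}, b_i)$ is a propensity score for individual $j$ in community $i$ and $f(b_i; \sigma^2)$ is the density of the community-specific random effect $b_i$, which follows a normal distribution with mean 0 and variance $\sigma^2$.
With the estimated cluster-level propensity score, the IPW estimator for the community-level average potential outcome is calculated by

\[ \hat{Y}_i^{ipw}(a; \alpha) = \frac{\sum_{j=1}^{n_i} \pi(A_{i,-j}; \alpha) \, 1(A_{ij} = a) \, Y_{ij}}{n_i \, \hat{f}(A_i \mid X_i)}, \qquad (14) \]

and the estimator for the marginal potential outcome is

\[ \hat{Y}_i^{ipw}(\alpha) = \frac{\sum_{j=1}^{n_i} \pi(A_i; \alpha) \, Y_{ij}}{n_i \, \hat{f}(A_i \mid X_i)}. \qquad (15) \]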
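A minimal sketch of the community-level IPW estimator may help fix ideas. It assumes the Tchetgen Tchetgen and VanderWeele form of the estimator under a Bernoulli allocation strategy and takes the group-level propensity score as a given number (in the study it would come from a fitted mixed-effects model); the function names and the toy community are hypothetical.

```python
from math import prod

def bernoulli_prob(exposures, alpha):
    """Probability of an exposure vector under independent Bernoulli(alpha) assignment."""
    return prod(alpha if a == 1 else 1 - alpha for a in exposures)

def ipw_community_mean(y, a, group_ps, alpha, level=None):
    """IPW estimate of a community-level average potential outcome.

    level=0 or 1: outcome under individual exposure `level` and coverage alpha.
    level=None:   marginal outcome under coverage alpha.
    group_ps:     group-level propensity score, i.e. the (estimated)
                  probability of the observed exposure vector given covariates.
    """
    n = len(y)
    total = 0.0
    for j in range(n):
        if level is None:
            weight = bernoulli_prob(a, alpha)
        elif a[j] == level:
            # Probability of the other members' exposures, leaving out person j.
            weight = bernoulli_prob(a[:j] + a[j + 1:], alpha)
        else:
            weight = 0.0
        total += weight * y[j]
    return total / (n * group_ps)

# Hypothetical community of two people: one exposed with outcome 1, one unexposed with outcome 0.
y, a = [1, 0], [1, 0]
group_ps = 0.25  # assumed known here for illustration
alpha = 0.7

y1 = ipw_community_mean(y, a, group_ps, alpha, level=1)
y0 = ipw_community_mean(y, a, group_ps, alpha, level=0)
ym = ipw_community_mean(y, a, group_ps, alpha)
print(y1, y0, ym)
```

Averaging these community-level estimates over all communities gives the population-level quantities, and the four effects are then simple contrasts of those averages, for example the direct effect as the estimate for `level=0` minus the estimate for `level=1` at the same coverage.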

Community detection
In the entire SFHR network (Figure 2

Causal effects estimation
Causal effects in the presence of dissemination were defined by comparing community-level potential outcomes under different coverages of exposure, where coverage is the proportion of people who have the internal locus of control or self-blame attribute in a community. For the 96 communities of PWIDs identified by community detection, the observed distributions of coverage are shown in Figure 5. The left panel shows the frequency distribution of the observed coverage of internal locus, and the right panel shows the frequency distribution of the observed coverage of self-blame, across the ninety-six communities. The coverage of the self-blame attribute varied more widely than that of internal locus. Because the SFHR PWIDs network included 66 communities with only two members, we observed many communities with coverage of 0%, 50%, and 100%. Because a sufficient number of communities was observed to estimate the group-level propensity score, we focused on coverages of 50%, 70%, and 99% for causal estimation. We fitted four models because we considered two exposures (locus of control and blame attitude) and two health-seeking behaviors (receipt of the HIV testing result and a medical visit within the past year). Sex, race (White, Others), education level (less than high school, high school or more), age (young adult (18-40 years), middle-aged (>40 years)), and all pairwise interaction terms were used as individual-level covariates in the models.

Figure 7 shows the estimates of the four causal effects in this model. Coverage is defined as the proportion of people with internal locus of control in a community, with $\alpha < \alpha'$.
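Observed coverage, as defined above, can be computed per community with a short sketch like the one below. The participant identifiers, community labels, and exposure values are hypothetical; the last lines illustrate why two-member communities can only show coverage of 0%, 50%, or 100%.

```python
from collections import defaultdict
from fractions import Fraction

def community_coverage(membership, exposure):
    """Proportion of exposed members (e.g., internal locus) in each community."""
    counts = defaultdict(lambda: [0, 0])  # community -> [number exposed, number of members]
    for person, community in membership.items():
        counts[community][0] += exposure[person]
        counts[community][1] += 1
    return {c: Fraction(k, n) for c, (k, n) in counts.items()}

# Hypothetical data: community "A" is a dyad, community "B" has three members.
membership = {"p1": "A", "p2": "A", "p3": "B", "p4": "B", "p5": "B"}
exposure = {"p1": 1, "p2": 0, "p3": 1, "p4": 1, "p5": 0}

coverage = community_coverage(membership, exposure)
dyad_values = sorted({Fraction(k, 2) for k in range(3)})  # possible coverage in a dyad
print(coverage, dyad_values)
```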
For both the 50% and 70% coverage groups, the direct effects (adjusted with interactions) were statistically significant: those with external locus of control were 16% less likely to receive their HIV testing result than those with internal locus of control in the same coverage group (95% confidence intervals (CIs): -0.265, -0.055 for 50% coverage; -0.268, -0.055 for 70% coverage). The indirect effect was significant only when comparing the 50% and 70% coverage groups: when the coverage of people with internal locus decreased from 70% to 50%, the likelihood of receiving the HIV testing result decreased by about 3% for those with external locus (95% CI: -0.054, -0.008). The total effects, that is, the maximal impact of locus of control, were statistically significant in all comparisons of coverage groups. The likelihood of receiving the HIV test result was about 19% lower for those with external locus in the 50% coverage group than for those with internal locus in the 70% or 99% coverage groups (95% CIs: -0.286, -0.100 for 50% vs. 70%; -0.291, -0.093 for 50% vs. 99%). A 16% difference in total effect was observed between those with external locus in the 70% coverage group and those with internal locus in the 99% coverage group (95% CI: -0.272, -0.049). The estimated overall effects indicate that the marginal likelihood of receiving the HIV testing result is significantly lower in groups with low coverage of internal locus than in groups with high coverage. For example, the likelihood of receipt is about 7% lower in the 50% coverage group than in the 70% coverage group, and about 11% lower in the 50% coverage group than in the 99% coverage group (95% CI: -0.089, -0.041 for 50% vs. 70%; -0.190, -0.032 for 50% vs. 99%).

Figure 8 shows the estimates of the four causal effects in this model.
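As a rough consistency check on the locus-of-control estimates for receipt of the HIV testing result, the reported point estimates can be combined using the definition of the total effect as the sum of the indirect and direct effects. The symbols $\widehat{DE}$, $\widehat{IE}$, and $\widehat{TE}$ are shorthand introduced here for the estimated effects, and the values are the rounded figures quoted in the text, so the identity holds only approximately:

```latex
\[
\widehat{TE}(0.5, 0.7)
  = \widehat{IE}(0.5, 0.7) + \widehat{DE}(0.7)
  \approx (-0.03) + (-0.16)
  = -0.19 .
\]
```

This matches the reported reduction of about 19% for those with external locus in the 50% coverage group relative to those with internal locus in the 70% coverage group.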
Coverage is defined as the proportion of people with the self-blame attribute in a community, with $\alpha < \alpha'$. As shown in Table 7, none of the estimates was statistically significant. Figure 9 also shows that the 95% CIs for all estimates include zero in this model. Therefore, we could not determine the direction of the impact of PWIDs' individual blame attribute on their receipt of HIV testing results from the estimates of the four causal effects.

Therefore, our approach of using a three-level scale would be reasonable; however, future studies are needed to evaluate the validity of our scoring system. We selected the threshold based on the likelihood to create binary exposure variables. Because existing causal inference methods under dissemination are applicable only to a binary exposure, our approach would be the best available way to evaluate the effects of PWIDs' attitudes on their health-seeking behavior given dissemination. In future work, to better establish the reliability and validity of individual attitudes in the SFHR network, we will apply item response models as discussed elsewhere (2,3) and develop methodology for dissemination of a categorical exposure.
Third, some assumptions in this study could be relaxed in future work.
Although we assumed no homophily, it is not possible to measure homophily in this setting because the SFHR PWIDs network was observed at a single time point. One feasible sensitivity analysis for the existence of homophily would be to compare our estimates with results obtained from simulated random networks with the same number of nodes, links, and node attributes as the original SFHR PWIDs network. This comparison would allow us to assess the no-homophily assumption. That is, if there is no significant difference between the causal estimates obtained from our study and those from the random networks, this would support the assumption that no unobserved factors other than the covariates and exposures used in our study explain the relationship between an exposure and the outcome behaviors.
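The random-network comparison proposed above could start from a generator like the following minimal sketch. It assumes that "random network" means drawing the same number of links uniformly at random among the same nodes while leaving each node's attributes untouched; the function name and the small example are hypothetical, and community detection and effect estimation would then be repeated on each simulated network.

```python
import random
from itertools import combinations

def random_network(nodes, n_edges, seed=0):
    """Draw n_edges distinct undirected links uniformly at random among the nodes.

    Node attributes (covariates, exposures, outcomes) stay attached to the
    nodes; only the links are redrawn.
    """
    rng = random.Random(seed)
    possible = list(combinations(sorted(nodes), 2))  # all unordered pairs, no self-loops
    return rng.sample(possible, n_edges)

# Hypothetical small example; the analyzed network had 402 nodes and 403 links.
nodes = range(10)
null_edges = random_network(nodes, n_edges=12, seed=42)
print(null_edges)
```

Repeating the draw with different seeds gives a reference distribution of estimates against which the observed estimates can be compared.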
We may also apply treatment allocation strategies that are more realistic than the Bernoulli individual group assignment strategy. For example, Barkley et al. (2017) and others (4,5) introduced a treatment allocation strategy that allows the treatment assignments of individuals in the same cluster to be correlated, which generalizes the Bernoulli individual group assignment strategy.
Considering this correlation seems more realistic in dealing with observational studies because, in our setting, individual attitudes may not be independent if one has a close connection with another person.
Finally, because the SFHR study was conducted in the early 1990s, the health-seeking behaviors of PWIDs may have changed since then.
However, our results may contribute to understanding how PWIDs' attitudes and behaviors have changed over a few decades if a similar analysis is conducted in more contemporary studies, and they additionally provide insights into health-seeking behaviors during an emerging infectious disease epidemic.
By applying a causal inference method in the presence of dissemination, we evaluated how personal health attitudes impact health-seeking behavior among PWIDs and their HIV risk networks in the Social Factors and HIV Risk Study (SFHR). Our finding that PWIDs' attitudes affect the health-seeking behaviors of other members of the same community indicates that interventions that take into account the network structure among PWIDs and the existence of dissemination can be a more effective and powerful approach to preventing HIV/AIDS transmission among injection drug users. As a more concrete suggestion, a segmentation approach, a type of network intervention in which certain groups of people are targeted to bring about behavioral change, may be effective in our setting (6). For example, if we could conduct an educational program that enables people to develop an internal locus of control among the members of a 50% coverage group, their likelihood of receiving HIV testing results would increase by about 20% in terms of the total effect. Thus, our results not only support the effectiveness of network interventions but also provide important information for improving HIV prevention interventions among PWIDs.

All in all, there is no considerable difference in structure between the analyzed SFHR PWIDs network and the full SFHR PWIDs network. However, it should be kept in mind that the individual attributes of unenrolled subjects may still differ substantially from those of the participants included in this study.