INTERPERSONAL PHYSIOLOGY: ASSESSING INTERPERSONAL RELATIONSHIPS THROUGH PHYSIOLOGY

Interpersonal physiology is the study of relationships between people's physiological activity during social interactions. Converging evidence indicates that interdependencies develop between people's autonomic systems, and can be indicative of psychosocial constructs such as empathy and attachment. These interdependencies, often referred to as physiological linkage, are theorized to be key components of social processes. Research in the area is limited, however, and there is little consensus on best practices. The mechanisms involved in the emergence of linkage, terminology, and methodology and statistics have not been adequately addressed. This dissertation aimed to systematically address these issues through four manuscripts. The first addresses potential generating mechanisms using a controlled, laboratory-based study. Results indicate that matched activity and dialog are not necessary for physiological interactions to emerge between romantic couples during passive activity. In the second manuscript, analytical issues are addressed through the application of cointegration, an advanced time series modeling procedure designed to handle multivariate, nonstationary data. However, results suggested that the analysis is not well suited to these data. The third manuscript addresses the informational divide through a systematic literature review designed both to create a centralized resource and to offer recommendations for the field at large. In the final manuscript, the inconsistent timescales on which physiological relationships appear to occur are addressed through the use of a novel method of data decomposition in the time domain. The method is applied to an idiographic example of data collected in vivo from a student with autism spectrum disorder and his teacher. Findings suggest that running analyses on different time windows of data can significantly impact results.


INTRODUCTION
Interpersonal physiology is the study of relationships between people's physiological activity (e.g., heart rate, breathing rate) during social interactions.
Converging evidence indicates that interdependencies can develop between people's autonomic systems, during which the activities of one person are partially dependent on another. Interpersonal measures of physiology have been used to show that a couple is locked into a heated argument, a therapist is empathizing with her patient, and that one individual is leading the behaviors of his teammates. Whether it is family dynamics or group behaviors, psychotherapy or team leadership, a better understanding of the influence of physiology on social relationships can lead to important new insights.
Though interpersonal physiological interactions are currently underexplored, the field is undergoing a rapid expansion, and nearly all research to date suggests that these are critical processes underlying all social interactions (see manuscript 3).
Despite increased interest, inconsistencies in the field have led to a number of issues. Varied terminology and methods have caused an information divide, as few researchers appear to be aware of the extent of the current literature. Due to the complexity involved in the statistical analysis of nonstationary multivariate time series of physiology, analytical procedures applied to these data are often inappropriate or misinterpreted. Most importantly, the lack of studies addressing the basic generating mechanisms has left questions of how and when physiological relationships emerge unanswered, hindering all other interpretations. The combination of these systemic issues has inhibited the progress of research in this field, and will continue to do so unless resolved.
The following dissertation aims to address these issues in a systematic way. The first manuscript undertakes the question of generating mechanisms. This is done through a controlled, laboratory-based study that assessed whether conditions such as matched activity or dialog were necessary for physiological interactions to emerge. In the second manuscript, analytical issues are addressed through the application of an advanced time series modeling procedure designed to handle multivariate, nonstationary data. The third manuscript addresses the informational divide through a systematic literature review designed both to create a centralized resource and to offer recommendations for the field at large. The final manuscript addresses a general analytical problem, namely the inconsistent timescale on which physiological relationships appear to occur, through the use of a novel method of data decomposition in the time domain. The method is applied to an idiographic example of data collected in vivo from a student with autism spectrum disorder and his teacher.

Conditions
In recent years, there has been an increase in research on interpersonal physiology, the study of physiological activities as interpersonal, rather than intrapersonal, processes. Dyadic studies of physiology have shown that relationships develop between people's autonomic activity, suggesting that physiological components underlie social dynamics. Across published examples, whether in different populations or conditions, and using various physiological measures and statistical analyses, relationships have been found in the autonomic activities of dyads. Whereas some theories suggest that this physiological linkage (PL) is a coregulatory process emerging from specific conditions such as secure attachments or empathy (Diamond, 2008; Sbarra & Hazan, 2008), other findings indicate that these processes operate independent of higher order constructs, and are an underlying component of social interactions (Ferrer & Helm, 2012).

Previous Findings
Research on interpersonal physiology began over half a century ago, when a series of studies found correlations in the skin conductance (SC) of therapists and patients during therapy. Observed patterns included concordance, when SC moved together, and discordance, when SC moved in opposition. The authors discussed the possibility of using these measures as a reflection of therapeutic rapport, or as a physiological marker of empathy. Despite this early cluster of research, the field of social psychophysiology trended towards intrapersonal responses to social situations, rather than interpersonal interactions. Beyond a few scattered reports (e.g., Stanek, Hahn, & Mayer, 1973), it was over 25 years until the next advancements in interpersonal physiological research.
In their seminal work on the topic,  applied a bivariate time series analysis to an index of cardiac, electrodermal, and somatic measures from married couples. When couples discussed high conflict topics, the resultant synchronizations in physiology accounted for 60% of the variance in marital satisfaction, an accuracy beyond any other measures of the time. The authors concluded that negative valence led to PL, a finding replicated elsewhere in the literature (Messina et al., 2012).
Alternative conclusions about the role of valence were advanced in a study of attachment in depressed and non-depressed mothers with their infants (Field et al., 1989).
Using a cross spectral analysis, PL was found in the heart rates (HR) of mothers and infants, regardless of the emotional state of the mother. Other studies of mothers and infants support these findings, suggesting that basic components of social interactions, such as shared gazing, lead to PL rather than emotional states (Feldman et al., 2011; Ham & Tronick, 2008). For example, Ham and Tronick (2008) analyzed the correlations of slopes in the SC of mothers and infants engaged in the face-to-face still-face paradigm. In this three-phase procedure, mothers interact normally with infants, then sit quietly with a stilled facial expression, then reengage in normal interaction. Analysis showed that PL occurred across conditions, but correlated with different social engagements. In the still-face period, video showed that PL in SC was significantly related to negative infant behaviors, such as fussing or protesting. During reengagement, PL was significantly correlated with behavioral synchrony between the mother and infant, but not with negative behaviors.
Attention was assessed more directly in a study of perceived empathy. Interviews between a therapist and participants were monitored in two conditions. In the first condition, the therapist read scripted questions in a neutral tone and attended to participants in a clinically accurate and typical manner, including eye contact and head nodding. In the second condition, the therapist read the same questions in a similar manner, but made conscious attempts to distract himself with breathing techniques and decreased eye contact during participant responses. Correlations in SC slopes indicated that decreased therapist attention was associated with a significant decrease in PL. Through the conscious act of averting his gaze and attention, the therapist was able to disrupt the physiological relationship as well as to decrease the experience of empathy reported by participants.
A recent series of studies supports findings that attention contributes to PL (Ferrer & Helm, 2012; Helm et al., 2012; McAssey et al., 2011). Combining advanced statistics and a simple design, trials were run to assess PL in romantic couples. Couples completed three conditions in which they sat next to each other while quiet and still: a 5 minute baseline, where couples were blindfolded; a 3 minute gazing task, where they were asked to maintain eye contact; and a 3 minute in-sync task, in which they were asked to attempt to synchronize their physiologies. Measures included respiration, thoracic impedance, and HR. Overall, analyses found PL in all conditions. Across measures and analyses, the in-sync task produced significantly greater linkage than the other two conditions. The gazing task resulted in significantly more PL than the baseline in all physiological measures across all 4 analyses. Analyses showed significantly more PL in baseline HR than in randomly paired control dyads, whereas respiration was no more related than chance. These findings indicate that PL occurs across contexts in nonverbal interactions, and suggest that proximity may be a sufficient condition for relationships in HR to develop.

Interpreting Interpersonal Physiological Research
Though interpersonal physiological research holds a great deal of promise, there are some important limitations that should be addressed. First, physiological recordings reflect arousal, but not valence. Profiles of physiological activity are not mood specific, so PL is not indicative of shared emotional states (Cacioppo, Tassinary, & Berntson, 2007). A second consideration is that physiological measures are complementary rather than redundant (Cacioppo et al., 2007). Each measure reflects uniquely innervated systems, so linkage in one measure does not denote similar relationships in other systems.
For example, studies have found that under some conditions, HR but not breathing rate synchronizes (Ferrer & Helm, 2012). Linkage in certain measures might therefore indicate distinct components of an interaction, but this has not yet been explored. Finally, statistical procedures assess specific parameters of PL, so effectively become the basis of its operational definition. Whereas some techniques test for shared long term, linear trends (e.g., correlations), others evaluate momentary synchronizations in high frequency activity (e.g., coherence). Therefore, the operational definition of PL is inconsistent across studies, and analyses may be addressing different types of processes.
Despite such limitations, evidence from previous work suggests that across dyads, measures, and analyses, PL emerges in the absence of coordinated behaviors such as shared activities or dialog. The aim of the present study was to assess whether proximal conditions were sufficient for PL to develop. For purposes of cross study comparisons, the most commonly used measure (i.e., SC) and analysis (i.e., windowed correlation of slope) were used to assess PL in romantically involved couples. The combination of skin conductance, considered a reflection of sympathetic nervous system activity (Dawson, Schell, & Filion, 2007), and windowed correlation of slope has been successfully interpreted in a number of studies (Ham & Tronick, 2008;Messina et al., 2012). Similar to recent trials (Ferrer & Helm, 2012;Helm, Sbarra, & Ferrer, 2012;McAssey et al., 2011), the current study assessed dyads during inactive, nonverbal conditions in which visual cues were available in one condition but not the other. Based on previous research, it was hypothesized that PL in the SC of couples in both conditions would be greater than chance, and that visual cues would significantly increase measures of linkage.

Method

Participants
Participants included 18 romantically involved heterosexual couples. One member from each dyad was an undergraduate psychology student, and received class credit for participation. Due to technical issues, data from 16 dyads were available for analysis. Recruitment and procedures were approved by the University of Rhode Island's institutional review board for the protection of human subjects.

Procedure
Each couple was brought into a quiet room, seated, and fitted with surface electrodes on the distal phalanges of the third and fourth fingers of the nondominant hand, as well as the left and right forearm. A respiration sensor was placed over the diaphragm.
Participants were asked to sit still and remain quiet for thirty-two minutes, during which physiological measurements were taken. For seventeen minutes, participants were seated back to back in separate chairs. The first two minutes were considered an acclimation phase, followed by the back to back phase (BB). At the fifteen minute mark, a tone sounded alerting participants to turn their chairs to face each other. The face to face period (FF) continued for the remaining 15 minutes of the trial. Following the trial, participants were debriefed, and all electrodes were removed. Individuals were asked to complete a survey assessing age, gender, length of relationship, mood, and intensity of mood during the trial, though these measures were not analyzed due to inadequate sample size.

Measurement Tools
A J+J Engineering I-330-C2+, 12 channel biofeedback unit was used to take simultaneous physiological measures at a sampling rate of 10 measures per second. Gel free surface electrodes were used to take measurements of SC, HR, respiration rate, and skin temperature, though only SC was analyzed for this report.

Statistical Analysis
The most commonly reported analysis of dyadic relationships in SC is a windowed correlation of slope, first developed by . The technique was designed to assess incremental shifts in slope, as change in level is considered a better indicator of sympathetic activity than mean level. This approach has been successfully applied in a number of studies (Ham & Tronick, 2008; Messina et al., 2012), suggesting its viability as a measure of PL.
For each series of SC, slope is calculated in a continuous, running 5-second window. Thus, the slope of the first 50 data points is calculated (t = 1:50) using a least squares regression. The window is then shifted forward by 1 data point, and slope is calculated again (t = 2:51). The continuation of this procedure results in a vector of slope parameters. Following this step, lag-0 Pearson correlations of the SC slopes are calculated for each dyad using a continuous, running 15-second window. Here, the correlation for the first 15-second segment is calculated (t = 1:150), the window is shifted forward by 1 data point, and the correlation is calculated again (t = 2:151). The continuation of this step results in a vector of correlations of slope for the dyad ($R = r_1, \ldots, r_n$). For aggregation, an index of overall linkage for a session is calculated by dividing the sum of positive correlations by the absolute value of the sum of negative correlations, and standardizing with a natural log transform (see equation 1).
$$\mathrm{LI} = \ln\left(\frac{\sum_{r_i > 0} r_i}{\left|\sum_{r_i < 0} r_i\right|}\right) \qquad [1]$$
This linkage index (LI) is considered a reflection of the synchrony in SC during the trial.
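The windowed procedure described above can be sketched in a few functions. This is a minimal illustration using only the Python standard library, not the code used in the study: the function names, the treatment of a flat window as zero correlation, and the error raised when the index is undefined are all assumptions made for the example. The default widths of 50 and 150 samples correspond to the 5-second and 15-second windows at 10 samples per second.

```python
import math


def ols_slope(window):
    """Least-squares slope of the values regressed on their sample index."""
    n = len(window)
    x_mean = (n - 1) / 2.0
    y_mean = sum(window) / n
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(window))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx


def running_slopes(series, width=50):
    """Slope in every width-sample window, shifted one sample at a time."""
    return [ols_slope(series[i:i + width])
            for i in range(len(series) - width + 1)]


def pearson_r(a, b):
    """Lag-0 Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    if saa == 0 or sbb == 0:
        return 0.0  # assumption: a flat window contributes no correlation
    return sab / math.sqrt(saa * sbb)


def running_correlations(slopes_a, slopes_b, width=150):
    """Correlation of the two slope vectors in every running window."""
    n = min(len(slopes_a), len(slopes_b))
    return [pearson_r(slopes_a[i:i + width], slopes_b[i:i + width])
            for i in range(n - width + 1)]


def linkage_index(correlations):
    """Natural log of (sum of positive r) / |sum of negative r|."""
    positive = sum(r for r in correlations if r > 0)
    negative = abs(sum(r for r in correlations if r < 0))
    if positive == 0 or negative == 0:
        # assumption: the index is treated as undefined in this case
        raise ValueError("index needs both positive and negative correlations")
    return math.log(positive / negative)
```

For a dyad, `running_slopes` would be applied to each partner's SC series, the two slope vectors passed to `running_correlations`, and the result passed to `linkage_index`. An index above zero means positive correlations outweigh negative ones over the session.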
To test for linkage beyond chance, control data was created by following the above procedures with 16 randomly matched pairs from the total data. Indexes were calculated for time matched periods to reflect the BB and FF phases, giving random BB indexes (RBB), and random FF indexes (RFF). Statistical significance was tested using independent samples t-tests comparing the indexes from BB to RBB, FF to RFF, and FF to BB. Confidence intervals and effect sizes using Cohen's d (Cohen, 1988) are reported for each comparison.
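The comparison of true and randomly matched dyads might be sketched as follows. This is an illustration only: the helper names are hypothetical, the re-pairing rule (each participant matched with a partner from a different dyad) is one assumed reading of "randomly matched pairs", and the t statistic shown is the pooled-variance form, which would still need to be compared against a t distribution to obtain a p-value.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / (na + nb - 2)
    return math.sqrt(pooled)


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def independent_t(a, b):
    """Pooled-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    t = (mean(a) - mean(b)) / (pooled_sd(a, b) * math.sqrt(1 / na + 1 / nb))
    return t, na + nb - 2


def random_partner_order(n_dyads, seed=0):
    """Assign each dyad's first member a partner from a different dyad.

    Assumption: control pairs never reuse a true partner. Requires n_dyads >= 2.
    """
    rng = random.Random(seed)
    order = list(range(n_dyads))
    while any(i == j for i, j in enumerate(order)):
        rng.shuffle(order)
    return order
```

With the linkage indexes of the true dyads in one list and those of the random pairs in another, `independent_t` and `cohens_d` give the test statistic and effect size for a comparison such as FF against RFF.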

Results
The hypothesis that the index of linkage during the FF condition (m = .65; SD = .50) would be significantly greater than the RFF control data (m = .06; SD = .25) was [...]
For the purpose of illustration, the raw SC for couple 1 are displayed in Figure 1.1.
The linkage index for this couple was relatively high at .47 for the BB phase and .26 for the FF, whereas the mean index for RBB was .12, and the mean index for RFF was .06.

Discussion
Results suggest that when couples are quietly facing each other, PL is detectable in sympathetic activity. Despite a number of limitations, the significant results and moderate effect size support previous findings (e.g., Ferrer & Helm, 2012), suggesting that visual proximity is sufficient for PL to develop.
There are, however, significant limitations to this study. To begin, a small convenience sample of undergraduates was used, so results should be considered trends rather than generalizable evidence. Additionally, the serial dependence in the data was not accounted for, violating the assumption of independence required for correlation analysis. This increases the potential for Type I errors. In the original paper,  cite previous work by [...]
[...] is known about the processes involved. Additionally, the complexities involved in analyzing these data prohibit the use of most statistics, and viable methods are needed.
To address the need for basic data, the current study assessed physiological relationships in the skin conductance of romantically involved partners during passive, nonverbal conditions. Physiological interactions were assessed using cointegration analysis, a well validated, multivariate time series analysis that tests for shared stochastic trends between data sets. However, due to constraints of the analysis, less than half of the data was analyzable. Additionally, results indicated that randomly matched skin conductance data exhibited cointegration, suggesting that the analysis is not well suited to these processes.

Assessing Physiological Linkage Through Cointegration Analysis
Interpersonal physiology refers to the study of interpersonal dynamics through physiological activity. The approach requires the joint assessment of simultaneously collected time series of physiological data from multiple people. This method has revealed complex bi-directional processes in the physiological activities of dyads and groups, known as physiological linkage (PL). PL has been observed across relationships (e.g., couples; teammates) and conditions (e.g., play; Ham & Tronick, 2008; therapy), and found to correlate with psychosocial constructs including empathy and attachment (Field et al., 1989). [...] can minimize confounds to more accurately assess its dynamics. Second, cointegration analysis will be evaluated as a measure of PL. Cointegration is a multivariate time series analysis that can show coregulatory relationships. It is a validated technique used most often in econometrics to assess shared stochastic trends in nonstationary data, and appears well suited for the analysis of PL.

Methodological Issues in Interpersonal Physiological Research
Interpersonal physiological research began over half a century ago, when a series of studies included physiological measurements from both clients and therapists during counseling. Results suggested that there were periods of synchronization in the heart rates (HR) and skin conductance (SC) levels of the dyads. Contextual data indicated that sessions with higher levels of PL were experienced as more empathic, prompting the researchers to conclude that there was a physiological component of empathy. Most research to date has followed these early works, using PL as a means to assess broader psychosocial constructs. For example,  assessed an index of PL as a marker of marital satisfaction. Creaven et al. (2014) and  have used it as an indicator of relationship type in mother-child dyads, whereas  and others (e.g., Järvelä, Kivikangas, Kätsyri, & Ravaja, 2013) are leading the way using PL to explore components of teamwork. Nearly all studies to date have resulted in findings of PL, and it is generally considered to be a useful tool indicative of a range of constructs.
One problem stemming from these findings is the prevalence of contrary conclusions. Though some studies connect PL to negative contexts only, many observe it during positive valence (e.g., Ham & Tronick, 2008). Some conclude that it is limited to attachment relationships (Sbarra & Hazan, 2008), whereas others have observed it in strangers (Silver & Parente, 2004).
Greater linkage has been associated with better teamwork, but also with arguments and dissatisfaction. It is often assumed to be a result of behavioral coordination (Feldman et al., 2011), though it has been observed in dyads participating in unmatched activities. Most studies make no attempt to explore PL as an independent process, instead relying on a given measure of linkage as an indicator of some other construct. This approach inherently assumes that the mechanisms driving PL are related to the mechanisms of a given construct. However, PL has been found to underlie a wide range of constructs, so it cannot be assumed to be caused by the given conditions. Research designs that assess PL as an independent dynamic process with unknown mechanisms are needed to better understand how these interactions develop.

Statistical Issues in Physiological Linkage
An important caveat in interpersonal physiological research is that the analysis [...] (Ferrer & Helm, 2012; Helm, Sbarra, & Ferrer, 2012; McAssey et al., 2011); however, few have been well validated, and there are currently no clear solutions for how PL should be assessed.

Cointegration
One potential solution is to adapt techniques that have been validated using similar data, such as cointegration. Well established in econometrics, cointegration is designed to identify shared stochastic trends in nonstationary time series. A well-known theoretical example of a cointegrated relationship is the shared path of a drunk walking his dog (Murray, 1994). In this example, the steps taken on the walk by both the man and the dog are random, so the path of each is stochastic and individually unpredictable.
However, there is a shared trend between them, as both the man and the dog regulate their movements based on the position of the other. The man is never too far from his dog, and the dog never moves too far away from his owner. Similarly, cointegration has been described as a way to determine that two distant ships, each with their own unique movements, are drifting on the same current. These shared movements despite random positions create a linear trend, or cointegration, which can be calculated using a vector error correction model (VECM).
To be eligible for cointegration analysis, each time series must be nonstationary and integrated of the same order. Nonstationary data has an inconsistent mean and variance, and is integrated (I) if it becomes stationary after differencing to a given order, d. Differencing is a simple transformation of a time series (X), reflecting the change in scores between consecutive measurements ($\Delta x_t = x_t - x_{t-1}$). The number of times data must be differenced for the resultant series (ΔX) to become stationary is the order to which it is integrated. This is denoted as I(d), where I indicates that the integrated data (ΔX) is stationary after being differenced (d) times. Stationary data can be denoted as I(0). Most commonly, integrated data becomes stationary after first order differencing, meaning it is integrated to the first order, denoted as I(1). For time series to be tested for cointegration, each series must be nonstationary and integrated of the same order with normally distributed residuals (ϵ ~ I(0)). Due to these constraints, some data may not be appropriate for cointegration analysis.
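As a small illustration of differencing, the sketch below builds a random walk (a textbook I(1) series) from unit steps and shows that first-order differencing recovers the stationary steps. The `difference` helper and the simulated walk are assumptions made for the example, not part of the study's analysis.

```python
import random


def difference(series, d=1):
    """Apply first differencing d times: x_t - x_(t-1)."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


# A random walk is the running sum of stationary steps, so it is I(1).
rng = random.Random(42)
steps = [rng.choice([-1, 1]) for _ in range(500)]
walk = []
level = 0
for step in steps:
    level += step
    walk.append(level)

# Differencing once removes the stochastic trend and returns the steps.
recovered = difference(walk)
print(recovered == steps[1:])
```

A series that needed two passes through `difference` before becoming stationary would be I(2), and so could not be tested for cointegration against an I(1) partner.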
If time series vectors are integrated of the same order (d), they can then be tested for cointegration by determining whether they share a common stochastic trend (Engle & Granger, 1987). If analyses indicate that the series share a common trend (i.e., are cointegrated), then a VECM can be used to calculate the parameters of their relationship.
The VECM, as defined by Stroe-Kunold and colleagues (2012), first assumes that the common trend (CT) has a unique influence on each variable, λ, and each variable has a white noise error (ϵ) around the trend. So, the series X can be represented as:
$$X_t = \lambda_1 CT_t + \epsilon_{1t}$$
and the series Z can be represented as:
$$Z_t = \lambda_2 CT_t + \epsilon_{2t}$$
where ϵ is a white noise error ($\epsilon_{1t}, \epsilon_{2t} \sim I(0)$), CT is the shared stochastic trend, and λ is the weighted influence of the CT on each original series (X, Z). If the shared stochastic trend, CT, is removed, and there is a stationary, linear combination of the remaining terms (e.g., $\lambda_2 \epsilon_{1t} - \lambda_1 \epsilon_{2t} \sim I(0)$), then the two series are cointegrated. The general VECM is written as:
$$\Delta Y_t = \Pi Y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta Y_{t-i} + U_t$$
where $Y_t$ is a K-variate process (e.g., X, Z) with r shared stochastic trends (multivariate systems can have multiple cointegrating trends, though r = 1 in a bivariate cointegrated system); $\Pi = \alpha\beta'$, where α is the K x r error correction mechanism, and β is the K x r matrix of error correction weights, with $\beta' Y_{t-1}$ describing the equilibrium of the system.
Combined, these terms represent the shared common trend. $\Gamma_i$ = the K x K loading matrix of lagged weights ($\lambda_i$), and represents the autocorrelation structure. p = the number of lags in the model; and $U_t$ = the error matrix ($\epsilon_{it}$). When solved, terms such as β and α can be used to interpret the dynamics of the system. For example, if $\alpha_i > 0$, then deviations from the trend in the previous period are enhanced, whereas if $\alpha_i < 0$, then deviations are reduced.
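The common-trend representation can be illustrated with a short simulation. The sketch below is an assumption-laden toy, not the study's analysis: the weights (λ1 = 1.0, λ2 = 0.5), the noise level, and the helper names are invented for the example. It generates two series that load on one random-walk trend and shows that the combination λ2·X − λ1·Z, which cancels the trend, has a small and stable variance, while each series on its own wanders.

```python
import random


def simulate_pair(n=2000, lam1=1.0, lam2=0.5, noise_sd=0.1, seed=1):
    """Two series driven by one shared random-walk trend plus white noise."""
    rng = random.Random(seed)
    trend = 0.0
    x, z = [], []
    for _ in range(n):
        trend += rng.gauss(0.0, 1.0)  # shared stochastic trend (CT)
        x.append(lam1 * trend + rng.gauss(0.0, noise_sd))
        z.append(lam2 * trend + rng.gauss(0.0, noise_sd))
    return x, z


def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def equilibrium_error(x, z, lam1=1.0, lam2=0.5):
    """lam2 * X - lam1 * Z: the trend cancels, leaving only noise terms."""
    return [lam2 * xi - lam1 * zi for xi, zi in zip(x, z)]


x, z = simulate_pair()
print(variance(x), variance(z), variance(equilibrium_error(x, z)))
```

In practice the weights are unknown and must be estimated, which is the role of β in the VECM. The simulation only shows why a stationary linear combination signals a shared trend.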

Overview Of The Current Study
It has been theorized that physiological level interactions are ubiquitous, and can therefore be explored regardless of the contextual environment. If this is the case, then the inclusion of multifaceted conditions and interactions may obscure patterns in the already complex data. In the current study, the first aim will be addressed by simplifying the conditions under which PL is assessed so that confounding variables that may inhibit the measurement of physiological interactions can be reduced. The second aim will be addressed by using cointegration analysis to assess PL.

Participants
Participants included 18 romantically involved heterosexual couples. One member from each dyad was an undergraduate psychology student, and received class credit for participation. Due to technical issues, data from 16 dyads were available for analysis. Recruitment and procedures were approved by the University of Rhode Island's institutional review board for the protection of human subjects.

Procedure
Each couple was brought into a quiet room, seated, and fitted with surface electrodes on the distal phalanges of the third and fourth fingers of the nondominant hand, as well as the left and right forearm. A respiration sensor was placed over the diaphragm.
Participants were asked to sit still and remain quiet for thirty-two minutes, during which physiological measurements were taken. For seventeen minutes, participants were seated back to back in separate chairs. The first two minutes were considered an acclimation phase (AC), followed by the back to back phase (BB). At the fifteen minute mark, a tone sounded alerting participants to turn their chairs to face each other. The face to face period continued for the remaining 15 minutes of the trial, with the first five minutes (FF1) separated from the final ten minutes (FF2) due to movement artifact. Following the trial, participants were debriefed, and all electrodes were removed. Individuals were asked to complete a survey assessing age, gender, length of relationship, mood, and intensity of mood during the trial, though these measures were not analyzed due to inadequate sample size.

Measurement Tools
A J+J Engineering I-330-C2+, 12 channel biofeedback unit was used to take simultaneous physiological measures at a sampling rate of 10 measures per second. Gel free surface electrodes were used to take measurements of SC, HR, respiration rate, and skin temperature, though only SC was analyzed for this report.

Statistical Analysis
Prior to testing for cointegration, data were log10 transformed to meet the assumption of normally distributed residuals. All data were then reduced using a one second moving average, so that each data point represented one second. Each time series was then split into four segments as defined above: AC, BB, FF1, and FF2. Unit root tests were performed on each segment of each time series using the Augmented Dickey-Fuller test (ADF; Dickey & Fuller, 1979) with an alpha of 0.025, so that an alpha of 0.05 was maintained for each dyad. Initial test lag was set at Max Lag = $(t - 1)^{1/3}$, then rerun with lags derived from the Akaike information criterion (AIC) and the Schwarz criterion (SWC). Each unit root test was first run without a trend (tau-3 and phi-3). If a unit root was detected, it was rerun with a trend (tau-2, phi-2), and again with a drift (tau-1).
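The preprocessing steps can be sketched as below. Several details are assumptions rather than reported facts: the one second moving average is read as non-overlapping means of 10 samples, the segment boundaries are derived from the phase durations given in the Procedure (2, 15, 5, and 10 minutes), and the maximum lag is truncated to an integer. The unit root tests themselves are not reproduced here.

```python
import math

# Segment boundaries in seconds, derived from the stated phase durations
# (assumption: phases are contiguous and start at time zero).
SEGMENTS = {"AC": (0, 120), "BB": (120, 1020), "FF1": (1020, 1320), "FF2": (1320, 1920)}


def log10_transform(series):
    """Base-10 log of each value; skin conductance must be positive."""
    return [math.log10(v) for v in series]


def reduce_to_seconds(series, rate=10):
    """Average each block of `rate` samples so one point represents one second."""
    n_seconds = len(series) // rate  # any trailing partial second is dropped
    return [sum(series[s * rate:(s + 1) * rate]) / rate for s in range(n_seconds)]


def split_segments(per_second):
    """Cut a per-second series into the AC, BB, FF1, and FF2 segments."""
    return {name: per_second[start:end] for name, (start, end) in SEGMENTS.items()}


def max_lag(n_obs):
    """Initial ADF lag, (t - 1)^(1/3), truncated to an integer (assumption)."""
    # The small offset guards against floating point error on perfect cubes.
    return math.floor((n_obs - 1) ** (1.0 / 3.0) + 1e-9)
```

Under these assumptions the 900-point BB segment would start from a maximum lag of 9, and the 300-point FF1 segment from a maximum lag of 6.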
A unit root indicates nonstationary data. If ADF tests indicated that a segment was nonstationary and integrated to the same order (d) for both series from a dyad, then procedures continued. Otherwise, the two series could not be cointegrated so no further tests were done on the given segment for that dyad.
Cointegration was then tested using the Johansen trace test (Johansen, 1995), with the alpha of the likelihood ratio, r, set to 0.05. For these tests, trends in data were first assessed using a procedure for statistical testing of deterministic trends described in Pfaff (2008) and were carried out using the statistical software R (R Development Core Team, 2012).
If neither time series had a trend, the Johansen trace test included a constant. If one series had a trend and the other did not, then the test included a constant and a trend.
If both series had trends with a similar slope, then an orthogonal trend was used. If both series had trends and the slopes were not the same, then a constant and a trend were included. The Johansen trace test was rerun accordingly using each lag indicated by the AIC and SWC.
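The rules above for choosing the deterministic terms can be written as a small decision function. The function name and the returned labels are hypothetical, and how "similar slope" is judged is left to the caller, since the text does not specify a criterion.

```python
def johansen_deterministic_terms(trend_x, trend_z, similar_slope=False):
    """Pick the deterministic terms for the Johansen trace test.

    trend_x, trend_z: whether each series showed a deterministic trend.
    similar_slope: whether two trending series have a similar slope
    (assumption: decided beforehand by the analyst).
    """
    if not trend_x and not trend_z:
        return "constant"
    if trend_x != trend_z:
        return "constant and trend"
    return "orthogonal trend" if similar_slope else "constant and trend"
```

For example, `johansen_deterministic_terms(True, True, similar_slope=True)` returns `"orthogonal trend"`, matching the third rule.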
If Johansen trace tests indicated that the null hypotheses $H_0: r \le 0$ (i.e., no cointegration) or $H_0: r > 1$ (i.e., both time series are stationary, and have no unit root) were rejected, and that the null hypothesis $H_0: r \le 1$ (i.e., cointegration) was not, then the series X and Z were considered cointegrated, and were eligible to be fit by a VECM.
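One way to read this decision rule is as the usual sequential trace test for a bivariate system, sketched below. This is an interpretation, not the study's code: the function names are hypothetical, and the statistics and critical values in the usage note are made-up numbers for illustration only.

```python
def bivariate_rank(trace_r0, crit_r0, trace_r1, crit_r1):
    """Estimate the cointegration rank (0, 1, or 2) from two trace statistics.

    trace_r0 / crit_r0: statistic and critical value for H0: r = 0.
    trace_r1 / crit_r1: statistic and critical value for H0: r <= 1.
    """
    if trace_r0 <= crit_r0:
        return 0  # cannot reject "no cointegration"
    if trace_r1 <= crit_r1:
        return 1  # one cointegrating relation
    return 2      # full rank: both series behave as stationary


def is_cointegrated(trace_r0, crit_r0, trace_r1, crit_r1):
    """A bivariate pair is treated as cointegrated only when the rank is 1."""
    return bivariate_rank(trace_r0, crit_r0, trace_r1, crit_r1) == 1
```

With illustrative values, `is_cointegrated(25.0, 20.0, 3.0, 9.0)` is true, since the first null is rejected and the second is not. `is_cointegrated(12.0, 20.0, 3.0, 9.0)` is false.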
To validate the results, all analyses were run using random pairs created from all eligible individuals (i.e., I(1)) from FF1.

Results
Of the 64 segments assessed for cointegration, 31 had unit roots of the same order. All were I(1). Twelve dyads were integrated during AC [...]
Random pairs were then generated from all individuals in FF1. Of those random pairings, 7 dyads were I(1), and eligible for cointegration tests. Five of the random dyads were cointegrated (Table 2.9), equal to the number of cointegrated dyads from the nonrandomized data. Therefore, none of the planned VECMs were run, as interpretation of coefficients would be speculative at best.

Discussion
The application of cointegration analyses to physiological data for the assessment of PL appeared to be a good match. Cointegration is designed to handle nonstationary multivariate data, and assesses shared long term trends while capturing momentary system dynamics. It has been validated and used extensively in econometrics, and has been recommended as a viable tool for analyzing psychological processes (Stroe-Kunold et al., 2012). However, due to the strict requirement that all series are integrated of the same order, fewer than half of the current data segments could be analyzed. Of those testable, only 6 were cointegrated, less than 10% of the original segments. Additionally, due to nonstationary error variance (i.e., U ≠ I(0)), data needed to be log10 transformed and split into segments to meet the assumption of normal residuals (U ~ I(0)). More active conditions would likely amplify this problem, further reducing the analyzable data.
More importantly, cointegration tests using randomly matched eligible dyads from the FF1 condition resulted in the same number of cointegration relations as with the true dyads. This suggests that findings of cointegration are most likely due to shared context or statistical artifact, rather than direct interpersonal influences. As model parameters cannot be considered reflective of an interpersonal relationship, the VECM parameters would not be interpretable as descriptions of the interaction. Due to these issues, VECMs were not run. One potential cause of these issues is the complexity of the interactions, which may not be captured by a static model even under controlled conditions. Cointegration assumes, as most models do, that the relationships being tested are fixed, so parameters such as the alpha weights at given lags are constants. If this is not the case, then a fixed model is being fit to a heterogeneous set of processes. In that situation, the estimated model will be as much a misrepresentation of the data as nomothetic models are of an individual participant. For example, if one partner repeatedly laughs, and the other has a lagged and measured response (i.e., laughing a few seconds later to a lesser degree), the interaction in SC may show a good fit using a constant model. However, if the interaction morphs into both partners laughing at similar levels simultaneously (i.e., synchronized SC), then different model parameters would be needed to fit this new relationship. If a single model is fit to the total interaction, the aggregated estimate of the two dynamics will not be a good representation of either interaction, even if the model is accurate enough to fit the data as a whole.
A significant limitation is that this study applied an analysis that has not been validated with physiological data, to test a hypothesis (i.e., PL) that has not been confirmed under these conditions. Therefore, it is unclear whether the analysis is capable of finding meaningful relationships in these data, or whether there were relationships for it to find. Given the small sample size, it is difficult to draw general conclusions about the viability of this approach with these data. However, few segments met the required assumptions for the analysis (i.e., matched unit roots and I(0) residuals), and more active trial conditions would likely produce less usable data. Unless such issues can be resolved, it seems unlikely that cointegration is a viable analysis for these complex data.

Abstract
This systematic review concerns research on interpersonal physiology, the study of relationships between people's physiological activities during social interactions.
Converging findings from this methodology indicate that interdependencies emerge between the physiological activities of people during interactions, often referred to as physiological linkage. Physiological linkage has been found to correlate with psychosocial constructs including empathy, attachment, and dissatisfaction, and has been observed in both new and established relationships. Due to such findings, interpersonal physiological interactions are theorized to be ubiquitous social processes underlying observable behavior. The literature on interpersonal physiology, however, is highly fragmented, with different researchers using idiosyncratic terminology, methods, and analyses. This disconnect has complicated cross-discipline collaboration. The following systematic review therefore aimed to generate a centralized resource of the existing work, and offer recommendations for future research. We first define terminology, followed by explanations of the review methods. Results of the systematic review are then detailed including key themes and findings from the literature. Finally, we discuss pros and cons of methodological and analytical approaches, review current limitations, and propose guidelines for best practices.
Keywords: interpersonal physiology, physiological linkage, physiological synchrony, physiological coherence, dyadic interactions, social psychophysiology

Interpersonal Physiology: A Systematic Review of the Literature
The following report is a systematic review of the research on interpersonal physiology, the study of relationships between people's physiological activities during social interactions. Converging evidence indicates that people's autonomic system activities can be interdependent with the autonomic systems of the people around them.
Interpersonal analyses of physiology have been used to show that a couple is locked into a heated argument, that a therapist is empathizing with a patient, and that one individual is leading the behaviors of others. Whether it is family dynamics or group behaviors, psychotherapy or team leadership, a better understanding of the influence of physiology on social relationships can lead to important new insights and interventions.
Though interpersonal physiological interactions are currently underexplored, nearly all research to date indicates that these are critical processes underlying all social interactions. Advancements in wireless telemetrics and dynamic multivariate time series analysis allow complex questions about interpersonal dynamics to be addressed.
Ambulatory data collection and reliable analyses have generated a new opportunity to explore mechanisms of social relationships underlying observable behaviors.
Despite a recent increase in the use of interpersonal physiological methods, this small field is currently fragmented as research groups use idiosyncratic terminologies, measures, and analyses, complicating cross-discipline collaboration. Lack of awareness of previous work has led to replications of known procedural issues, as well as uninformed conclusions. Without a general format for researchers to communicate, these issues will continue to hinder progress. The following literature review is therefore intended to be a reference source, both compiling previous research and highlighting issues deemed to be critical to future work. This review is organized as follows. First, we operationally define basic terminology, followed by the details of our methods for search and retrieval, and eligibility criteria. Second, we review key themes identified in the literature, including general findings, in the Results section. Lastly, we discuss pros and cons of methodological and analytical approaches, review current limitations, and propose guidelines for best practices.

Operational Definition of Key Terms
The general methodology of studying temporal interactions in the physiological processes of multiple people is viewed herein as "interpersonal physiology". At minimum, these techniques require a bivariate analysis of physiological measures simultaneously collected from two individuals over time. Though distinct from other social process research such as behavioral (e.g., linguistics), biological (e.g., cortisol) or neurological (e.g., electroencephalograph [EEG]), these fields are only separable in concept. As co-occurring intrapersonal processes are inherently symbiotic, it is assumed that there are associations between all of these research areas. For example, affect and emotional contagion, described by some as the transference of emotional states (e.g., Hatfield, Cacioppo, & Rapson, 1994; Waters, West, & Mendes, 2014), is typically assessed through self-report or behavioral assessment, and rarely includes measures of physiology. Still, differentiating characteristics including rapidity of response, interpretability of measures, and the potential for continuous passive data to be collected in-vivo make physiology uniquely adaptable to social research.
A common observation resulting from interpersonal physiological research is the development of different types of interdependencies between partners' autonomic activities. References to these interdependencies are nonspecific and idiosyncratic, making cross-study comparisons difficult. For the purposes of this review, we generalize the term physiological linkage (PL) to refer to any type of identified interaction in the physiological processes of individuals. Linkage is therefore imposed as a general categorization, under which more specifically defined patterns are included.

Search and Retrieval
We conducted a systematic literature review according to the guidelines presented by Okoli and Schabram (2010). All researchers underwent protocol training to search and identify relevant articles. Our goal was to identify and retrieve all interpersonal physiological research published in peer-reviewed journals. Several search terms were chosen based on previously identified research. These terms were: physiological synchrony; interpersonal physiology; physiological linkage; physiological coherence; and physiological covariation. Following the initial search, the following search terms were added based on relevant articles that used alternate language: physiology & contagion; social psychophysiology; attunement & physiology; and attunement & physiological. Keywords were entered into four bibliographic databases: PsycINFO, PsycARTICLES, MEDLINE, and ScienceDirect. Reverse citation was performed on each relevant paper obtained using Google Scholar (i.e., a search for studies that cite the obtained article). Relevant articles referenced in the text of identified research were also obtained. Searches were performed between January and March, 2014.

Eligibility Criteria
Studies were selected for the review based on the following criteria:
1) The study was published in English.
2) The study was published in a peer-reviewed journal.
3) The study simultaneously and continuously collected physiological measures (e.g., heart rate [HR]; skin conductance [SC]; respiration rate [RR]) from two or more proximal individuals.
b. Studies which only assessed physiological interactions between individuals who were not simultaneously proximal (e.g., watching a tape of a previous interaction) were excluded.
4) The study quantitatively assessed temporal relationships in physiological measures simultaneously collected from two or more people (e.g., bivariate correlations).
a. Studies assessing only intrapersonal physiological activity, without an assessment of interpersonal physiological interactions, were excluded.
b. Studies that did not assess a temporal relationship (e.g., heart rate measured at or aggregated to 1 occasion) were excluded.
c. Studies assessing mother-fetal relationships were excluded.
5) The sample included human subjects.

Results
A total of 35 studies were identified that met the defined eligibility criteria for interpersonal physiological research (see table 3.1). In order to establish a centralized reference highlighting the research to date, as well as identify critical issues for future work, the following characteristics of included studies are presented: terminology, physiological measures, statistical assessment of PL, methodological approach, and study findings.

Terminology
Over a dozen different terms were used to describe research on interpersonal physiology (see table 3.1). Most studies identified an observed phenomenon such as synchrony (e.g., McAssey, Helm, Hsieh, Sbarra, & Ferrer, 2013), though some used terms such as sociophysiology (Di Mascio, Boyd, Greenblatt, & Solomon, 1955) to describe a general methodological approach. Others did not give a clear definition or term in reference to the method or a phenomenon. Terminology largely varied by the population being studied. For example, 75% of studies using the term physiological concordance (n=8) addressed therapist-client dyads, and 100% using physiological compliance (n=6) examined teammates. However, the operational definition assigned to a given term was inconsistent across studies. For example, Henning and colleagues coined the term physiological compliance in reference to coherence and correlations in cardiac, respiratory, and electrodermal measures. More recently, Järvelä and colleagues (Järvelä, Kivikangas, Kätsyri, & Ravaja, 2013) used the same statistical approaches and operational definitions as Henning (2001), but instead used the term physiological linkage. Similarly, Gottman and colleagues developed an index of physiological and motor activity, and referred to synchronizations between people's indexes as physiological linkage. Reed and colleagues (Reed, Randall, Post, & Butler, 2013) used physiological linkage in reference to lagged interdependencies in specific cardiac and electrodermal measures. The broad use of the term led us to generalize the operational definition of physiological linkage to include any observed interdependence in physiology.
More specific terms were also used to define types of PL. In the first study of interpersonal physiology, synchronized patterns in physiology were described as concordance, in reference to observations of matched HR. Findings of synchrony, however, are dependent on the analysis used to define it. For example, whereas some researchers used coherence to test for the presence of similar frequency bands, others used bivariate time series analyses to test for shared linear trends.
Though physiological synchrony has been measured using a number of analyses, and defined by a number of terms, moving forward we use the term concordance as a more general indication of matched states.
Another pattern observed in the original studies was co-occurring changes in opposite directions (i.e., a negative correlation), which the authors defined as discordance. As with concordance, other researchers have observed similar patterns, but used different analyses and terminology (e.g., Helm, Sbarra, & Ferrer, 2012; Reed et al., 2013). For example, Reed and colleagues found discordance using multilevel models, but defined the patterns as anti-phase synchrony. Moving forward, we refer to any measure of negative relationships as discordance.
The third pattern of physiological relationship described in the literature is a lagged concordance. This is a distinctly different type of PL that can only be assessed when using time as a variable. Lagged-concordance indicates that a change in one person is followed by a similar change in the other, and has been used to test for leadership roles.
Müller and Lindenberger (2011) used wavelet analysis, a time-frequency procedure, to show that changes in the respiration of a conductor were followed by similar changes in the respiration of individuals in the choir.
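As a rough illustration of how concordance, discordance, and lagged concordance differ operationally, the sketch below labels a pair of series from their lagged Pearson correlations. This is a deliberately simple stand-in written for this review's terminology; it is not the wavelet or multilevel procedures used in the cited studies. The function names and the 0.5 threshold are assumptions, and both series are assumed to be equally long and evenly sampled.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def lagged_correlations(x, y, max_lag):
    """Correlate x[t] with y[t + lag]; a positive lag means y follows x."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        result[lag] = pearson(a, b)
    return result


def classify_linkage(x, y, max_lag, threshold=0.5):
    """Label the strongest lagged relationship using this review's terms.

    The threshold is an arbitrary illustrative cutoff, not a published value.
    """
    cors = lagged_correlations(x, y, max_lag)
    best_lag = max(cors, key=lambda lag: abs(cors[lag]))
    r = cors[best_lag]
    if abs(r) < threshold:
        label = "asynchrony"
    elif r < 0:
        label = "discordance"
    elif best_lag == 0:
        label = "concordance"
    else:
        label = "lagged concordance"
    return label, best_lag, r
```

Under these assumptions, a follower whose signal repeats a leader's signal a few samples later would be labeled as lagged concordance with a positive lag, which is the pattern used in the literature to infer leadership.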
A relationship theorized to develop out of lagged concordance is physiological coregulation. Coregulation is defined as the interdependence between partners' physiological activities, leading to a maintained stable state.
Whereas lagged concordance may occur as a momentary, unidirectional influence, coregulation refers to a bidirectional interaction that leads to a stable state over time. Ferrer and colleagues have used statistical models capable of identifying this form of interdependence, and found that coupled oscillations between romantic partners' HR and RR maintain stable patterns (Helm et al., 2012).
A final term in the literature is "asynchrony", used to describe a lack of observable PL (Reed et al., 2013). Though difficult to substantiate without the use of multiple models to test for PL, the concept of asynchrony is an important one, as it describes periods that do not exhibit signs of physiological interactions between people.
Asynchrony has been found to be predictive of specific relationship types (Reed et al., 2013), suggesting that the identification of periods that lack PL can also be informative of an interaction.

Physiological Measures Used
Physiological measures used to detect PL included cardiac and electrodermal activity, respiratory rates, and skin temperature (see table 3.1). The majority of studies used multiple physiological measures in their research, running separate analyses on each (n=13). For example, one study used three techniques to test for PL: cross-correlations in SC; weighted cross-coherence in HR; and weighted cross-coherence in RR. A total of 12 studies relied on a single physiological measure to test for PL, and 2 studies created indexes that incorporated numerous measures into a single analysis. Indexes consisted of summations of multiple physiological measures, and were analyzed as a more general indication of autonomic state. The most well-known study in the field used a bivariate time series analysis to assess PL in an index of normalized scores of HR, pulse transmission time, SC level, and somatic movement.

Statistical Analysis of Physiological Linkage
As PL is largely a mathematical construct, its observation is dependent on the analytical procedures used to measure it. To help elucidate the approach used to identify PL, these procedures were separated into two components: statistical category and analytical approach (see table 3.1).
Six basic categories of statistical analyses of PL were performed: correlational (n=23); frequency based (n=7); time-series analysis (n=4); nonlinear modeling (n=2); dynamic systems (n=2); and multilevel modeling (n=1). Note that a number of studies assessed PL using multiple approaches. These strategies can be further differentiated as static (n=35) or dynamic (n=4) analytical approaches. Static approaches result in a single measure or model of PL for each trial, and describe the general state of a relationship over a trial (e.g., a correlation). These results are typically aggregated across participants to represent PL at the group (i.e., experimental or control group) or condition level. In contrast, dynamic approaches track changes in PL in a single unit (i.e., a dyad or team) over time, leading to detailed observations of temporal patterns. For example, Ferrer and Helm (2013) used coupled differential equation models to assess PL in HR, RR and thoracic impedance. They were therefore able to assess conditional differences within dyads over time, rather than depending on aggregates of the trials.
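The static versus dynamic distinction can be made concrete with a minimal sketch. Below, a single whole-trial correlation (static) is compared with a series of windowed correlations (a simple dynamic measure, chosen here for illustration and much simpler than the differential equation models cited above). The function names, window length, and step size are assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def static_linkage(x, y):
    """One number summarizing the whole trial."""
    return pearson(x, y)


def dynamic_linkage(x, y, window, step=1):
    """One correlation per window, tracking change within the dyad over time."""
    starts = range(0, len(x) - window + 1, step)
    return [pearson(x[s:s + window], y[s:s + window]) for s in starts]
```

If a dyad is concordant for the first half of a trial and discordant for the second, the static value can fall near zero while the windowed values move from strongly positive to strongly negative, which is the kind of temporal pattern that aggregation hides.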

Methodological Approaches
Both idiographic and nomothetic methodologies have been used in interpersonal physiological research (see table 3.1). Idiographic designs focus on the individual unit over time (i.e., a dyad or team), whereas nomothetic techniques combine the data to assess group level trends. The large majority of studies reviewed report nomothetic findings (n=32), despite only 2 assessing PL using purely nomothetic analyses (i.e., multilevel models). For example, one study measured a running correlation in the slope of SC in dyads. This led to a vector showing changes in the correlations of slope over time for each dyad. However, this dynamic, idiographic measure was then aggregated into a 'linkage index' score, followed by a nomothetic comparison of mean differences in linkage by condition, effectively collapsing the temporal resolution.
Nomothetic studies assessed differences in PL between groups (n=9), or across conditions (n=21). Group differences were found in 7 studies, and 17 studies found differences between conditions. Four studies reported idiographic results, which allowed a more detailed assessment of patterns and trends present in the data. For example, Müller and Lindenberger (2011) used a combination of advanced statistical analyses (e.g., wavelet analysis and Granger causality) to assess physiological interdependences in a choir. They were able to determine group dynamics including leadership (i.e., that a physiological change in one person is followed by the same physiological change in the group) and subgrouping (i.e., individuals whose physiological activity is significantly more related to each other than to the rest of the group), as well as track the changes in those roles over time (i.e., when the leader becomes a follower).
Findings of PL, regardless of the analytical procedures applied, are difficult to interpret without a null hypothesis for comparison. Seemingly high correlations in physiology may occur due to random or contextually based conditions, rather than coordinated interpersonal interactions. For example, McFarland (2001) noted that statistically significant correlations as high as .80 could be found in the RR of randomly matched data from the sample. To account for this, many studies developed a null hypothesis test to determine whether results significantly differed from random (n = 8).
This was done by creating random dyad pairings, and then rerunning analyses on data from unmatched participants (i.e., random pairs). Comparative analyses such as t-tests were then used to determine whether PL in the true pairs significantly differed from the random pairs. All eight studies that assessed PL in comparison to a null hypothesis found PL to be significantly greater in actual dyads compared to random dyads. Though impressive on the surface, these findings may be the result of publication bias (e.g., studies with nonsignificant findings having difficulty being published), or Type I errors.
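The random-pair comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: PL is measured with a simple zero-lag Pearson correlation, each dyad is stored as a pair of equal-length series, and only a Welch t statistic is computed (a p value would require the t distribution, which is not in the Python standard library). All function names and the seed parameter are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def random_partner_order(n, rng):
    """Shuffle partner indices until nobody is matched with their true partner."""
    if n < 2:
        raise ValueError("at least two dyads are needed to form random pairs")
    indices = list(range(n))
    while True:
        shuffled = indices[:]
        rng.shuffle(shuffled)
        if all(i != j for i, j in zip(indices, shuffled)):
            return shuffled


def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def compare_true_vs_random(dyads, seed=0):
    """dyads: list of (series_a, series_b). Returns true PL, random-pair PL, and t."""
    rng = random.Random(seed)
    true_pl = [pearson(a, b) for a, b in dyads]
    order = random_partner_order(len(dyads), rng)
    random_pl = [pearson(dyads[i][0], dyads[j][1]) for i, j in enumerate(order)]
    return true_pl, random_pl, welch_t(true_pl, random_pl)
```

A large positive t statistic would indicate stronger linkage in true dyads than in random pairs. A value near zero would be consistent with linkage arising from shared context or statistical artifact rather than from the interaction itself.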

Findings by Population
Five distinct populations have been studied to date using interpersonal physiological methods (see table 3.1): therapist-clients (n=8), couples (n=5), mother-child (n=7), teammates (n=8), and friends-strangers (n=7). This categorization emerged as a key factor under which other categories were grouped. For example, the terminology and statistical procedures used to define and identify PL were largely restricted by population. The following sections therefore organize results by population.
Therapist-Client Dyads. Research on interpersonal physiology began over half a century ago, when a series of studies found evidence of PL in the skin conductance (SC) and heart rates (HR) of therapists and patients during therapy. Moments of positive and negative correlations in the SC of therapists and patients were observed, respectively defined as concordance and discordance. The authors concluded that these relationships were potential indicators of therapeutic rapport and empathy. Further analysis showed that therapist notes from sessions with high concordance had fewer references to being distracted from therapy than sessions with low concordance. Additionally, the authors noted that clients showed reduced HR with one particular therapist. Though results were limited by small sample sizes and rudimentary statistical procedures, these works introduced and defined 'interpersonal physiology' as a research methodology decades before most others would find its utility.
In an interesting early advancement of these procedures, Robinson and colleagues assessed the relationship between empathy and PL in SC and finger skin temperature between therapists and clients during therapy. The researchers found that large amplitude SC responses occurring in both the clients and therapists within a short lag (< 7 seconds apart) were significantly correlated with empathy, but that measures of PL in finger skin temperature were not. Robinson and colleagues concluded that the affective matching process that is related to empathy is evident in short lagged SC responses, but not long lagged or tonic affective activity.
Building off this earlier work, two publications by Meara (2009, 2012) used a dynamic linkage index to identify neural activity during 'high empathy' moments between therapists and patients. During periods of peak PL during therapy sessions, electroencephalograph (EEG) data were assessed in an attempt to trace the neurological correlates of the therapeutic alliance. Extensive relationships between neural activity and PL were reported, such as high alpha and beta activity in the temporal region. However, periods of high PL were labeled as empathy with no additional measures, even though extensive work suggests that PL is only contextually bound to positive affect. Though this procedure limits the conclusions that can be drawn from their results, findings suggest that neurological states may accompany PL.
In summary, therapist-client relationships have received only minimal investigation through these techniques. Most studies have used measures of SC, and all recent studies employed the linkage index to assess PL. Findings consistently show that transient periods of PL develop during therapeutic relationships, and that these periods are significantly correlated to empathy.
Couples. In seminal work exploring PL in couples, researchers devised a unique index combining cardiac, electrodermal, and somatic measures of couples discussing neutral and conflictual topics. A bivariate time series analysis showed that couples' PL during arguments could account for 60% of the variance in marital satisfaction, but did not detect PL between couples discussing neutral topics.
They concluded that PL only developed during negative interactions, postulating that dissatisfied couples could not disengage from the arousal of a conflict, whereas satisfied couples were able to 'step back' and listen. Likely due to results that indicated that PL was only marginally predictive of future relationship status (i.e., divorce), most later work by these researchers focused on intraindividual processes rather than interpersonal physiology.
A recent series of papers involving romantic couples assessed the mechanisms of PL and developed advanced procedures for the analysis of PL (Helm et al., 2012; McAssey et al., 2013). Trials for these studies consisted of three conditions in which romantic partners sat next to each other while quiet and still: a 5 minute baseline, in which couples were blindfolded; a 3 minute gazing task, where they were asked to stare at each other; and a 3 minute in-sync task, in which they were asked to attempt to synchronize their physiologies. In their first paper on the subject, McAssey et al. (2013) applied idiographic methods and two novel statistics to data from four couples. The first analysis, an empirical mode decomposition followed by a windowed cross-correlation, was used to assess PL in respiration and thoracic impedance. In the second approach, a structural heteroscedastic measurement-error model was adapted to detect linear associations between dyads' HR. Across measures and analyses, results suggested that PL increased from baseline to trials. No significant effects were found when analyses were run using randomly paired individuals from the trials.
In the second and third reports, the group applied dynamic systems models to the HR and RR from 32 couples (Helm et al., 2012). An important advancement with these approaches is their capability of tracking bidirectional patterns of interdependence within a dyad over time, which allows continuous assessment of changes in the interdependencies between physiological processes. This is a significant improvement over techniques that give an aggregate measure of the relationship for a given time period. Overall, these analyses revealed PL in both measures for each face to face condition, as well as a number of specific patterns of interaction. Unexpectedly, analyses also showed significant PL in the cardiac activity of dyads during the baseline phase. Baseline procedures were therefore unsuccessful at eliminating physiological interactions, prompting the authors to recommend that future studies use alternative approaches, such as pairing data from unmatched individuals, keeping participants separate, or simulating data. Results from the analysis of randomly paired data suggested that none of the findings were due to methodological artifact.
Reed et al. (2013) explored the influence of negative partner interactions on PL between romantic partners. In the trials, romantically involved couples discussed healthy lifestyle issues while video, cardiac, and electrodermal measures were continuously collected. Following the sessions, participants used the video to code their own affect. Observers coded the dyads for demand and withdraw behaviors (i.e., when partners were demanding of the other, or withdrawing from the interaction) and negative partner influence tactics (e.g., using guilt or ridicule). Results suggested that negative partner influence moderated PL in blood pressure, as low influence was associated with discordance and high levels were associated with asynchrony. Demand and withdraw behaviors also appeared to moderate PL, as their presence coincided with concordance, and their absence with discordance. The authors suggested that discordance may therefore result from turn taking during dialog, and could be a key component in any conversation.
At first glance, the existing literature addressing PL in couples appears contradictory.
Some findings suggest that PL, and concordance in particular, only develops during negative interactions, whereas other results suggest that it develops in neutral conditions as well. Potential reasons for these inconsistencies include the differences in physiological measures and statistical approaches used, as well as the variations in methodology.

Mother and Child. Mother-infant.
The first study to assess mother-child dyads was completed by Field, Healy and LeBlanc (1989), who conducted a study of depressed and non-depressed mothers. They assessed coherence and cross-coherence between behavioral states, HR, and behavioral states and HR of mothers and their infants during 3-minute sessions of normal play. Results revealed coherence across behaviors for both depressed and non-depressed dyads. Concordant heart rates were found in more than half of the dyads, with no significant differences across depressed and non-depressed dyads.
Ham and Tronick (2009) examined physiological and behavioral linkage between mothers and their 5-month-old male infants. The SC of dyads was recorded while they participated in the face-to-face still-face paradigm. This procedure included three successive two-minute episodes of regular interaction, a perturbation episode where mothers could not respond, and a soothing episode. PL was assessed via a linkage index. Concordance in SC was observed during the still face paradigm when infants displayed negative behaviors. Additionally, when mothers engaged in subsequent soothing of infants, greater concordance occurred in relation to behavioral synchrony. The authors concluded that mothers calm themselves to calm their infants, and that concordance may be more likely to occur when at least one partner is attending to the other partner.
Feldman and colleagues (Feldman, Magori-Cohen, Galili, Singer, & Louzoun, 2011) examined the effects of face-to-face interactions on PL of HR between mothers and 3-month-old infants. Micro assessments of gaze, affect, and vocal synchrony were conducted on mother and infant dyads during two minutes of baseline and three minutes of free play. PL of maternal and infant interbeat interval (IBI) was measured using autoregressive integrated moving average (ARIMA) models and cross correlation functions. Statistically significant levels of PL were found during face-to-face interactions. Time periods involving vocal synchrony, affect synchrony, or the co-occurrence of vocal and affect synchrony were significantly related to increased concordance in IBI between mother and infant compared to periods without behavioral synchrony.
Most recently, Waters, West, and Mendes (2014) assessed affect contagion between mothers and infants by assigning mothers to one of three conditions: a social evaluation with positive or negative feedback, or a neutral condition. PL was found between infant HR and maternal ventricular contractility. Greater PL, along with an increasing trend, was observed in dyads in the negative feedback condition, but not in the neutral or positive conditions. The researchers therefore concluded that stressful affect is contagious across mothers and infants.

Mother-child.
Creaven and colleagues (Creaven, Skowron, Hughes, Howard, & Loken, 2014) explored the effect of child maltreatment on the PL of mother-child HR and RSA. HR and RSA were collected while the pairs watched a video. Zero-order correlations of mother and child resting HR and RSA were used to measure PL. Results revealed PL in the HRs of non-maltreating mothers and their children, and discordant PL in the HR and RSA of both groups. Additionally, mothers' resting HR was found to moderate PL, as higher average resting HR was associated with lower PL.
Two studies recently assessed PL in facial skin temperature between mothers and children. Procedures involved women watching their own or another child participating in a series of play and stress phases through a one-way mirror. Ebisch et al. (2012) assessed stress conditions, and found correlations in skin temperatures of mothers and their children using both idiographic and nomothetic methods. Manini and colleagues (2013) expanded this work by comparing the PL of thermal signals of mother-child dyads to other woman-child dyads during stress conditions. Results indicated that PL occurred between women and children regardless of parenting status. However, correlations were significantly higher, and cross correlation lags were shorter between mothers and their own versus other child dyads. The authors concluded that these findings demonstrate that a child's distress evokes a spontaneous autonomic response in women, but that maternal bonds may modulate the timing of response.

Mother-adolescent.
To date, only one study has examined PL in mother-adolescent dyads. Ghafar-Tabrizi (2008) examined PL of HR and finger pulse amplitude in low-conflict and high-conflict mother-adolescent daughter conversations. PL was analyzed via a bivariate time series analysis. Close assessment of the interactions revealed a number of specific patterns in PL over the course of the trials. For example, levels of felt arousal were associated with the strength of PL during dyadic interaction, suggesting an experiential component was associated with these periods. When daughters led the conversation, their HR predicted the response pattern of mothers better than mothers' HR predicted that of daughters, and vice versa. In the high-conflict group, however, when daughters led the conversation, the HR of daughters predicted the HR of mothers significantly better than when mothers led the conversation. Equivalent levels of PL were demonstrated across varied conversation topics. Finally, high-conflict dyads did not demonstrate higher levels of PL than low-conflict dyads. PL, however, was stronger during conflictual conversation than pleasant conversation for the high-conflict group only.
In summary, preliminary research examining PL among mothers and children suggests that PL is likely to develop during an interaction. It appears equally across depressed and non-depressed mother-child dyads (Field et al., 1989), but is more pronounced when mothers are under stress (Waters et al., 2014).
Multiple studies indicate that individual physiological profiles moderate the development of PL (Ebisch et al., 2012; Manini et al., 2013; Waters et al., 2014), suggesting that a better understanding of group dynamics may be achieved by assessments of intrapersonal patterns associated with interpersonal dynamics.
Additionally, there may be an experiential component of PL, suggesting the possibility that dyads could report when they are more or less linked.
Teammates. Video games. In a series of interpersonal physiological studies examining teammates, researchers tested whether PL is a determinant of team performance. Pairs of gender-matched undergraduate students participated in variations of a jointly controlled video game, with and without visual or verbal contact with their partner. Team performance and coordination were measured, along with continuous recordings of interbeat interval, breathing rate, and SC. A weighted cross-coherence, as well as cross-correlations (lag-0), were used to assess PL. Results suggested that socio-visual contact was not a significant predictor of team performance or coordination, indicating that direct contact with partners was not required for teams to do well. PL in SC and interbeat interval significantly predicted task completion time.
Multiple assessments of PL were found to be predictive of team performance scores, but not of team coordination. These findings suggest that PL could play a significant role in how well teams perform, but is not dependent on coordinated behaviors.
In a follow-up study, Henning and Korbelak (2005) adapted the earlier procedures by randomly changing joystick controls (e.g., left/right became up/down). Teammates were again seated adjacent, but could not see each other's joystick movements. Interbeat interval was continuously recorded from team members while scores were kept on team performance. Weighted coherence scores in interbeat interval were used as the measure of PL. Results suggested that PL prior to controller change negatively predicted post-change tracking error, explaining 3.8 percent of performance variance across all teams and conditions. The authors concluded that there was enough empirical evidence to suggest that PL can be used to predict future team performance.
Chanel, Kivikangas, and Ravaja (2012) also measured team performance during video game play. Measures of electrodermal activity, RR, and interbeat interval were continuously recorded while teams of friends played a video game. Games were set to either cooperative or competitive mode, and replayed in the lab and the home, followed by a questionnaire on gaming experience. Assessment of PL followed Henning, Armstead, and Ferris's (2009) approach of cross-correlations and weighted coherence.
Results indicated that PL increased with players' self-reported involvement in the social interaction, suggesting that it could be used as an objective measure of social presence.
For most measures, PL was higher for competitive versus cooperative play.
In another assessment of PL in teams playing video games, Jarvela et al. (2013) investigated whether social interaction and PL are affected by (a) competition/cooperation, and (b) computer opponents in games. Volunteer dyads of friends played a turn-based artillery game while cardiac and electrodermal activity were continuously recorded. Each team completed four conditions that varied by cooperative and competitive modes, both with and without a computer player. Henning et al.'s (2009) techniques of weighted cross coherence scores and cross-correlations were again used to assess PL. Results from a series of analyses suggested that on average, PL was present in all cardiac and electrodermal measures of teammates playing video games. The presence of a computer controlled character in the game was associated with significantly less PL, suggesting that players were not as focused on each other during those periods. Increased empathy and understanding between players was associated with significantly greater cardiac PL, though changes in conditions and self-reports were not associated with differences in PL of electrodermal activity.
Walker and colleagues (Walker, Muth, Switzer, & Rosopa, 2013) looked at PL between teams working on computer-based problems to determine whether PL is an index of cognitive readiness. Two-person teams were tasked with maintaining safe levels of operation in a simulated chemical plant across a variety of conditions. Performance was based on multiple calculations of team errors. Cardiac, electrodermal, and respiratory measures were continuously collected during the trials. Measures of PL were calculated using regressions and correlations. Though results did not yield a significant relationship between PL and team errors, no assessments were made to determine whether PL was present during the tasks.
In-vivo teamwork. Henning et al. (2009) assessed whether PL in heart rate variability (HRV) among team members could be used as a measure of teamwork. Speech and HRV were monitored in a preexisting 4-person graduate research group during regular meetings over 6 months. Following each meeting, team members completed a 7-item questionnaire on teamwork. Cross-correlations (lag-0) were used to assess a number of indices of PL between group members. Results indicated that PL negatively predicted team ratings of their ability to work together, suggesting that in some contexts, increased PL can inhibit group cohesion.
In the most physically demanding study of PL to date, interbeat intervals were collected from soldiers training to clear buildings of enemy combatants. Ten teams of four soldiers completed six trials in which they moved through a building, identifying live actors as combatants or non-combatants, and eliminating combatants using simulated firearms. Team performance was measured using a number of indices related to task success. Only one pair from each team was analyzed. Four measures derived from participants' interbeat intervals during the trials were assessed using four different measures of PL (a total of 16 measure-analysis combinations). Results from each measure-analysis combination were compared to visual inspection of the data. Six combinations were able to discriminate between visually categorized incidents of high PL and asynchrony, suggesting that these measure-analysis pairs were sufficient tests of existing PL. One measure-analysis pair identified significant differences between high and low performing teams (i.e., correlations of the log of participants' respiratory sinus arrhythmia), though no other significant differences in performance were observed through PL.
In the most in-depth study of PL to date, Müller and Lindenberger (2011) applied a series of advanced statistical procedures to assess group interactions in an eleven-person, conductor-led choir. The choir participated in 12 singing conditions that were video-recorded while HRV and respiration were continuously recorded. Tasks included singing in unison and in parts, and singing a canon in unison or in parts while participants' eyes were open and closed, both with and without the conductor singing. Physiological linkage was assessed by calculating difference in the coefficients of wavelets from each possible pair in the group. These differences in coefficients were then assessed using multiple techniques to create a set of 6 PL scores. A graph-theoretical network analysis was also run to determine group and sub-group relationships. Results showed that PL was greater in singing versus the rest periods, and when singing in unison versus singing in parts.
When the choir was singing in parts, network analyses detected subgroups with greater PL that corresponded to the sections being sung. Additionally, the analysis indicated that physiological changes in the conductor predicted similar changes in choir members.
These results were relatively consistent across multiple measures of PL.
Though multiple studies suggest a positive relationship between PL and task performance, others indicate an inverse one (Henning & Korbelak, 2005), or none at all (Henning et al., 2009).
Further, whereas some works suggest that greater PL is associated with significant improvements in empathy and social interaction (Jarvela et al., 2013), others indicate the opposite (Henning et al., 2009). Contradictions may be caused by differences in methodological and statistical approaches as well as differences in physiological measurements. Regardless, more work is needed to determine how PL relates to teamwork.

Friends and Strangers. The subgroup of friends and strangers is a general categorization of participant relationships that do not fall under other sample types. Therefore, it is not necessarily independent of other categorizations, as teammates could also be friends, and therapist and clients may be meeting for the first time during a trial.
The first study to assess PL in casual relationships was completed by Kaplan and colleagues in 1963. They analyzed the conversations of medical students in a group setting, using a priori reports of affective relationships between group members. They found significantly greater correlations in SC responses when dyads reported strong affective ties (i.e., liked or disliked each other) than when they reported a neutral relationship. Field and colleagues (1992) compared PL in the HRs of children playing, using autoregressive integrated moving average (ARIMA) models and correlations. They failed to find significant differences between friend and acquaintance dyads. However, they did not assess whether the levels of PL were significantly greater than zero, only whether the levels detected differed by group. Similarly, Shearn and colleagues (Shearn, Spellman, Straley, Meirick, & Stryker, 1999) assessed differences in PL in SC and facial blushing between friends and strangers. Groups of three participants (two friends with a stranger) watched a video in which one individual from the group was singing. Results from the analysis of blushing were not clear-cut. However, significant PL in SC was only observed between friends.
McFarland (2001) assessed PL as the cross-correlation of respiration of friends during conversations. Results indicated that the relationships in the breathing patterns of these dyads were significantly greater than chance. However, claims were not well supported with quantitative results. In a similarly small study, Silver and Parente (2004) assessed PL as the correlation of SC during conversations between male and female strangers as a test of first impressions. They found significant correlations across all dyads, but few quantitative results were reported.
In a methodologically focused study of PL, Guastello and colleagues (2006) compared the capabilities of linear and nonlinear models to detect concordance in the SC of friends during conversations. Physiological linkage was detected during all conversation conditions, with no statistically significant difference between high-conflict and neutral topics. Nonlinear analyses identified considerably more evidence of PL between partners, prompting the authors to conclude that physiological interdependencies are multilevel processes with both linear and nonlinear characteristics. Konvalinka et al. (2011) examined the PL of HR between fire-walkers and familial versus non-familial spectators. PL was measured via phase space modeling (i.e., cross-recurrence quantification analysis), and was found between related pairs but not between unrelated pairs, indicating that familiarity may mediate PL during the collective ritual experience.
Nearly all research on friend and stranger dyads has resulted in findings of PL, though comparisons between these types of relationships have led to mixed results. There is some indication that the level of PL is significantly greater between friends and family as compared to strangers (e.g., Konvalinka et al., 2011; Shearn et al., 1999). However, PL has also been detected in conversations between strangers (e.g., Silver & Parente, 2004).
Findings suggest that PL between friends may be moderated by arousal level, and involve both linear and nonlinear patterns. Nonlinear models that include arousal level as a moderator may help to clarify such discrepancies in future research.

Discussion
Based on the results of this systematic review, a number of important findings can be extracted. First, the development of PL does not appear to be dependent on valence, preexisting relationships, or specific sensory cues. Mounting evidence indicates that physiological interactions can be observed between individuals meeting for the first time, as well as in dyads or groups with established relationships. PL has also been observed in positive (e.g., empathy) and negative (e.g., conflictual relationships) contexts, as well as in relatively neutral conditions (e.g., couples sitting together quietly). Additionally, PL has been observed in studies that limit physical, visual, and auditory cues, indicating that multiple pathways can lead to the development of these interactions. For example, some studies have observed PL when participants were separated by a one-way mirror (e.g., Ebisch et al., 2012; Manini et al., 2013). Physical contact has also been indicated as an isolatable mode of transmission (Creaven et al., 2014), though less work has been done in this area. These results suggest that physiological interactions between people can be generated through multiple sensory systems, but are not dependent on any one. At this point, more work is needed to determine the importance of each.
Second, the present findings suggest that PL is a transient state. Studies showing differences in PL across contexts and conditions indicate that physiological relationships change over time. This is evident in studies showing that, during a given time period, measures of concordance and lagged concordance are not static. This is an important consideration, as attempts to apply statistical models that assume a constant state may be problematic.
For example, if a dyad shifts between periods of concordance and discordance during a trial, but the entire interaction is assessed using a single linear model, then results will be an aggregate of two heterogeneous processes and will misrepresent the patterns of both. Previous authors have addressed this issue well, highlighting the need for flexible statistical models capable of identifying multiple types of physiological relationships occurring during a single interaction.
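The aggregation problem can be illustrated with a minimal sketch. The signals below are hypothetical and deterministic (a slow oscillation standing in for one partner's physiology), and the 100-sample switch point is an assumption chosen only for illustration. One partner mirrors the other during the first half of the trial (concordance) and moves in the opposite direction during the second half (discordance).

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical signal for partner A: a slow oscillation, 200 samples.
partner_a = [math.sin(0.3 * t) for t in range(200)]
# Partner B matches A for the first 100 samples, then mirrors A.
partner_b = [a if t < 100 else -a for t, a in enumerate(partner_a)]

r_whole = pearson(partner_a, partner_b)               # single model, whole trial
r_first = pearson(partner_a[:100], partner_b[:100])   # concordant period
r_second = pearson(partner_a[100:], partner_b[100:])  # discordant period

print(f"whole trial: {r_whole:.3f}")
print(f"first half:  {r_first:.3f}")
print(f"second half: {r_second:.3f}")
```

In this constructed case the two halves are perfectly concordant and perfectly discordant, yet the whole-trial correlation is close to zero, so a single linear summary would suggest that no relationship is present.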
A third implication in the literature is that autonomic activation may moderate PL.
Findings indicate that differences in arousal can influence the levels of PL (Creaven et al., 2014; Ebisch et al., 2012; Manini et al., 2013; Reed et al., 2013; Waters et al., 2014). For example, multiple studies suggest that average resting HR can moderate PL (Creaven et al., 2014). Future studies should be designed to explore whether combinations of partners' physiological levels, and states such as stability and lability, influence PL (e.g., does the combination of a high-arousal and a low-arousal partner increase the probability of a given type of PL?).
Finally, and perhaps most importantly, interpersonal physiological processes have been found to be predictive of other variables. However, results appear to be dependent on the combination of the type of PL and the context in which it occurs. For example, physiological concordance during conflict was found to be predictive of dissatisfaction in marriages, whereas concordance during psychotherapy and gaming was found to correspond to greater empathy and improved team performance. This type of synchrony has been interpreted as a feeling of being 'locked into' a negative conflict, but as a feeling of being connected and understood during positive interactions. In another context, discordance was associated with positive interactions during partner conflict, which was interpreted as coordinated turn-taking leading to more balanced communication (Reed et al., 2013). The type of PL observed during a given context can therefore be predictive of the outcome, though extensive work is needed to further explore the typologies of PL, and their relationships with context and valence.

Critical Issues for Future Research
Beyond these findings, a number of issues critical to future work were identified. The following sections highlight some of these issues, including terminology, physiological variables measured, idiographic versus nomothetic methods, laboratory versus in-vivo designs, and statistical analyses.
Terminology. The review of the literature identified terminological variation across the field, including inconsistent operational definitions. This issue is more than mere semantics, as the methodological and statistical approaches used in a study are dependent on the definition of the phenomenon they aim to identify. A number of authors have highlighted this issue and made attempts to resolve terminological ambiguities by operationally defining specific types of physiological relationships (e.g., Field, 2012). Two examples of this are morphogenic and morphostatic interactions. The first refers to continual shifts in arousal levels away from an optimal set point. An example is an escalating argument in which the increase in arousal of each partner extends beyond the current state of the other, persistently increasing the arousal level of both. The latter term indicates a stable coregulatory process, in which each partner's arousal level serves to maintain the state of the other, creating a mutually maintained homeostasis. For example, during a stressful period, each partner works to calm the other, and as a pair they remain more stable than either would alone. The need for clearly operationalized definitions is in part due to the number of interpersonal patterns that have been theorized and observed (e.g., concordance, discordance, morphogenic, morphostatic), and the assumption that many others are possible. Quantitatively assessable definitions of distinct interpersonal patterns will help ensure that heterogeneous processes are not inappropriately aggregated.
Physiological Variables. Different measures of physiological relationships have been found to reflect unique components of interactions. For example, concordance has been found to occur in both RR and HR under some conditions, whereas in another condition, HR, but not RR, is synchronized (Helm et al., 2012).
Such findings suggest that PL is systemically differentiated, in that each internal system reflects unique variance related to social encounters. Whereas some measures reflect specific autonomic systems, such as sympathetic (e.g., SC) or parasympathetic activities (e.g., HRV), other measures cannot discriminate between generating causes, and are therefore less specific (e.g., HR; Cacioppo, Tassinary, & Berntson, 2007).
Collecting data from multiple measures can lead to greater specificity of the processes involved in social contexts. An example of the successful use of multiple measures can be seen in Creaven et al. (2014). They found that the PL between mothers' HR and their children's respiratory sinus arrhythmia differed by group assignment, indicating that the physiological systems involved in linkage may differ across contexts.
In addition to the specific measures used, there are a number of complications in collecting, analyzing and interpreting physiology. The interested reader is therefore referred to other resources (e.g., Cacioppo et al., 2007;Goodwin, 2012) for more details.
Idiographic Versus Nomothetic Methods. When designing or interpreting interpersonal physiological research, it is important to consider the difference between idiographic and nomothetic designs. Results from the two approaches only correspond when all conditions of the ergodic theorems are met (e.g., multivariate normal data with equal autocorrelation and trends across the data; Molenaar, 2004a). Because nomothetic techniques model the data as a whole, results indicate the trend of the group, but obscure the unique patterns of the individuals. Nomothetic generalizations can therefore be interpreted as the tendency of the sample as a whole, and can be used to answer a variety of population-level research questions. For example, nomothetic designs are well suited to determining whether a certain type of video game increases PL between players. As the game will be played by a specified population and cannot be tailored to the individual gamer, nomothetic results are appropriate. Alternatively, if the researcher is interested in the processes that lead to PL during gaming, then detailed temporal results from idiographic methods are needed. This discrepancy has been reviewed in discussions of the heterogeneity of results from idiographic models of dyads, which note that had a single model been fit to pooled data from all the dyads, it would represent an aggregated pattern, and not accurately represent the characteristics of dyads in the sample. Manini et al. (2013) observed this issue more directly by comparing findings from idiographic and nomothetic analyses completed on the same data. They noted that idiographic results indicated statistically significant levels of PL were present in dyads at varying lags, but that nomothetic results were non-significant. The authors discussed the heterogeneity of time lags in PL, which could not be differentiated when data were pooled.
Though some nomothetic techniques (e.g., multilevel modeling) attempt to correct for these differences, they remain group-level aggregates and are not able to represent idiographic trends (Molenaar, 2004b). Therefore, generalizations from idiographic results require different goals. For example, detailed analyses of dyads can be used to identify patterns of PL, then assess whether those patterns are recurrent across time, contexts, and dyads. A simplified example of this approach can be seen in studies that assessed PL at the dyadic level, but presented results as the percentage of dyads observed with given characteristics. More quantitative generalization techniques, such as cluster analysis, are also available. For a more detailed discussion of the ergodic theorems and related issues, see Molenaar (2004a) or Velicer et al. (2014).

Controlled laboratory designs:
The current lack of research identifying mechanisms and processes involved in PL is problematic. Controlled laboratory designs aimed at discovering the building blocks that lead to these interactions are therefore needed. The goal of finding mechanisms of PL requires distinctly different procedures than those designed to utilize PL as an indicator of other constructs. For example, most studies to date use PL as a measure of group differences, with the aim of observing variations in interpersonal characteristics depending on a given condition. This methodology can be seen in studies in which the goal was to determine whether different levels of therapeutic training were associated with different levels of PL between therapists and clients. Alternatively, research can be designed to identify how PL changes over time. For example, Müller and Lindenberger (2011) showed that the direction of dependence between people's RRs changed as activities and roles changed. Similar approaches can be used to explore whether specified components are necessary for PL to develop, and whether specific patterns result from given conditions. Controlled, systematic research protocols designed to address components theorized to contribute to these interactions are therefore required. Accurate interpretations of results will be difficult until such research is completed (Sbarra & Hazan, 2008).

In-vivo Studies: In addition to laboratory experiments, in-vivo designs that incorporate ambulatory assessments of participants in daily life may expose patterns that can only be assessed over longer periods of time. Longitudinal assessments may reveal ecologically valid processes that would not be obtainable through laboratory-based research. Tracking individual and interpersonal patterns over time may be the only way to establish the ecological validity of conclusions about processes such as coregulation, and may reveal a more complete picture of the emergence and consequences of PL. Though a few studies have taken longitudinal data in-vivo, none to date have analyzed longitudinal trends, or taken advantage of the noninvasive telemetric measures that are now available.

Statistical Analyses. Another critical issue for interpersonal physiological research is statistical analysis. The analysis of multivariate, nonstationary, intensive time series of physiology is fraught with complexities, as these data violate a number of assumptions of parametric statistics. Though many viable analytic approaches are available, no 'best practices' have been established in this emerging field.

Stationarity and autocorrelation.
Two data conditions commonly overlooked are stationarity and autocorrelation. Data that are stationary maintain a relatively consistent mean and standard deviation over time. This is a rare condition in most physiological measures, but is a critical assumption for many analyses. Autocorrelation is the degree to which data are dependent on previous measurements, an unavoidable result of intensive sampling. This serial dependence violates the assumption of independence of measurements required by most parametric statistics. Though some researchers maintain that data transformations used to account for autocorrelation (e.g., ARIMA modeling) remove important information (e.g., Henning & Korbelak, 2005; Maric & Orr, 2007), serial dependence can distort variance estimates, leading to spurious results in any analysis that depends on variance or covariance structures.
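The link between intensive sampling and autocorrelation can be sketched in a few lines. The example below is an illustration under stated assumptions, not a procedure taken from the reviewed studies: the breathing-like oscillation (0.25 cycles per second), the two sampling rates, and the function names are all hypothetical. It computes the standard lag-1 sample autocorrelation for the same signal sampled densely and sparsely.

```python
import math


def autocorrelation(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag))
    return num / denom


def sample_oscillation(rate_hz, seconds=20, cycles_per_second=0.25):
    """Hypothetical breathing-like sine wave sampled at rate_hz."""
    n = int(rate_hz * seconds)
    return [math.sin(2 * math.pi * cycles_per_second * t / rate_hz)
            for t in range(n)]


r1_dense = autocorrelation(sample_oscillation(32))   # 32 samples per second
r1_sparse = autocorrelation(sample_oscillation(1))   # 1 sample per second

print(f"lag-1 autocorrelation at 32 Hz: {r1_dense:.3f}")
print(f"lag-1 autocorrelation at 1 Hz:  {r1_sparse:.3f}")
```

At the dense rate each sample is almost fully determined by the one before it, so the lag-1 autocorrelation approaches 1, whereas the sparsely sampled version of the same signal shows essentially none at lag 1.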
The problem with correlation. When serial dependence and stationarity are ignored, results can be significantly affected (Chatfield, 2004). This is especially the case when analyses are either dependent on accurate estimates of variance, or assume that nonstationary data can be represented by a linear model. These critical issues are apparent in the most commonly employed measure of PL, correlation. This bivariate linear analysis requires both stationarity and independence of measurements (Chatfield, 2004; Levenson & Gottman, 1983). Ignoring these violations significantly increases the chances of spurious results. Though studies that compare results to a null hypothesis (e.g., results from random pairs) may indicate that a real effect is present, the size of that effect is not likely to be accurate.
This problem is amplified with the addition of moving windows. These popular procedures first designate a subset (or window) of the data, then calculate the correlation (or other bivariate analysis) of the subset, rather than for the entire data set. A running correlation results from iteratively shifting the window forward in time and rerunning the analysis. The technique is designed to generate a greater temporal resolution of the bivariate relationship (e.g., a low correlation in one window, and a high correlation in another). However, the same issues of stationarity and autocorrelation apply to any given window, regardless of its length. As a result, the findings in each window may be inaccurate. This problem is exacerbated when specific segments are extracted for further analysis (e.g., choosing the period with the 'highest correlation'), as the potential for type I error is compounded each time the analysis is calculated on a window (there may be hundreds of windows).
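The windowing procedure can be sketched as follows. This is a minimal illustration rather than any reviewed study's implementation: the two drifting signals, the window length, the step size, and the nominal alpha of .05 are hypothetical choices. A step smaller than the window produces overlapping windows, and the final line uses the simplifying assumption of independent tests to show how quickly the chance of at least one false positive grows with the number of windows.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rolling_correlation(x, y, window, step):
    """Return (start index, r) for each window; step < window means overlap."""
    starts = range(0, len(x) - window + 1, step)
    return [(s, pearson(x[s:s + window], y[s:s + window])) for s in starts]


# Hypothetical signals that share nothing except that both drift upward.
n = 300
sig_a = [0.05 * t + math.sin(1.7 * t) for t in range(n)]
sig_b = [0.03 * t + math.cos(2.9 * t) for t in range(n)]

window, step = 60, 30
results = rolling_correlation(sig_a, sig_b, window, step)
for start, r in results:
    print(f"samples {start:3d}-{start + window - 1:3d}: r = {r:+.2f}")

# Fraction of each window shared with its neighbor.
shared_fraction = (window - step) / window
# Chance of one or more false positives if the tests were independent
# (a simplification; overlapping windows are not independent).
familywise = 1 - (1 - 0.05) ** len(results)
print(f"windows: {len(results)}, shared data: {shared_fraction:.0%}, "
      f"family-wise error if independent: {familywise:.2f}")
```

Even with only nine windows, the nominal family-wise error rate under independence is already well above .05, and selecting the window with the largest correlation after the fact compounds the problem.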
An extension of the windowing technique is the use of overlapping windows, in which some percentage of data is shared between adjacent windows. Overlaps are designed to further increase the temporal resolution of results by highlighting where changes occur in time. Unfortunately, this procedure has the potential to greatly increase serial dependence. Correlations calculated from serially dependent data in overlapping windows are themselves serially dependent, since adjacent windows assess much of the same data, further complicating findings. Though all of these issues can be dealt with statistically (e.g., removing autocorrelation through ARIMA models), a number of alternative statistical procedures are available.
One simple method to control for these issues is to use the first derivative (i.e., differenced data), rather than the raw data. This method is typically an effective way to deal with nonstationarity.
Simulation studies indicate that when correlations of differenced data are analyzed nomothetically (e.g., assessing group differences in correlations), effects from autocorrelation are negligible (Kettunen & Ravaja, 2000). Autocorrelation remained an issue for idiographic assessments, so alternative analyses are necessary.
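As a minimal sketch of the differencing step, assuming equally spaced observations, the first difference replaces each value with its change from the previous observation. The function name and the toy series are illustrative assumptions.

```python
def difference(series):
    """First difference: the change between each observation and the one before it."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical series with a steady upward trend (a nonstationary mean).
trend = [10, 12, 14, 16, 18, 20]
diffed = difference(trend)  # the linear trend becomes a constant series
```

The differenced series is one observation shorter than the original, and a linear trend in the mean is removed. Differencing does not by itself remove autocorrelation.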
Alternative statistics. The number of viable statistical procedures applicable to interpersonal physiological research is rapidly increasing. Many studies have made attempts to develop strategies tailored to these data. Some examples of methods that have been applied include dynamic systems models, cross-lagged panel models, state-space modeling, Granger causality, and wavelet analysis. It is important that the researcher match the statistical approach to the research question, as the interpretation of results can differ substantially. Though there is not currently a clear and proven approach for the assessment of any form of PL, consideration of the previously discussed issues is necessary for findings to adequately answer the given research questions.

Theoretical Explanations
A number of theorists have described potential mechanisms and implications of interpersonal physiological relationships. These processes have been considered evidence of empathy, attachment (Diamond, 2008), and emotional regulation (Field, 2012; Sbarra & Hazan, 2008), though there is not currently enough evidence to fully support any conclusions.
Empathy is the most commonly considered explanation of physiological interactions. From the original studies to the most recent reports (Stratford et al., 2012), researchers (Messina et al., 2012) and theorists (Adler, 2007; Sbarra & Hazan, 2008) have considered the possibility that the experiential connections that define emotional empathy (Hatfield et al., 1994; Preston & de Waal, 2002) are mirrored in physiology. These ideas suggest that the autonomic system is at the root of shared experience, and that a critical component of empathy is physiological concordance. Adler (2007)  empathy, other results indicate that these constructs are independent. Empathy may be dependent on a type of PL (e.g., concordance), but the same physiological relationships are observable in other contexts as well. Future research should help to disentangle this association, for example by exploring whether a subtype of PL is specific to empathy.
Feldman (2012) has considered PL to be a component of a multisystemic biobehavioral synchronization that begins in gestation and continues throughout life. In a recent review incorporating her extensive work assessing biological, psychological, and behavioral synchrony, mainly between mothers and infants, she considered any synchrony as a regulating process. This research indicates that interpersonal biobehavioral synchronization is required for healthy interaction, and this has been found to be an integral component of coregulation, empathy, and attachment. Feldman (2012) concluded that physiological concordance results from facial cues, and that if such behavioral synchronizations do not develop between mothers and infants, children will have lasting issues with attachment and self-regulation. Though a number of studies have contradicted the assertion that PL is dependent on facial cues (e.g., Chatel-Goldman et al., 2014; Helm et al., 2012), the importance of synchronistic relationships remains.
Sbarra and Hazan (2008) consider physiological concordance to be a coregulatory process unique to attachment relationships. They argue that each individual is the primary physiological regulator for their partner, resulting in an interpersonal maintenance of emotional homeostasis. They cite evidence from a series of animal studies by Hofer (e.g., Hofer, 1995; Polan & Hofer, 1999) that showed that the removal of an attachment figure creates dysregulation in physiology and behavior. As this implies that autonomic functioning is synergetic rather than independent, they recommend modeling physiology as a bivariate system in which physiological processes are dependent on previous physiological measures of a partner. The authors recommend experimental procedures that systematically remove certain components of an attachment relationship during stress-inducing tasks, such as controlling visual or olfactory cues. The presumption is that dysregulation in physiological concordance is most likely to occur during stress, and that by systematically interrupting the channels on which the synchrony may be based, the mechanisms of the system could be discovered. Though their presumptions that concordance will only occur in secure attachment relationships and will be disrupted during stress have been contradicted, their recommendations for systematic exploration of interpersonal physiology are well founded. Sbarra and Hazan's (2008) theories are mirrored by Field (2012), who considers synchronization to be a psychobiological attunement in attachment relationships, assumed to increase in coordination over time. This model addresses the regulatory role of relationships, and proposes that explorations of interactions should assess what is missing when attachment figures are removed and synchronization is no longer evident.
Butler (2011) discusses physiological concordance as an underexplored anomaly in her theoretical paper on temporal interpersonal emotion systems (TIES). In her review, she considers human interactions as multimodal self-organizing dynamic systems.
Within that model, PL is considered an integral aspect of attachment. In a later work, Butler and Randall (2013) defined physiological coregulation as the bidirectional linkage of oscillating signals within optimal bounds, and discussed the potential of numerous additional types of PL. Though coregulatory interactions are descriptive of important processes, they are defined by long-term patterns of PL, rather than conditions necessary for its development.
Another potential result of PL is interpersonal understanding. A study by Levenson and Ruef (1992) suggested that higher PL led to greater recognition of another's emotional state. A component of affective awareness may therefore be dependent on interoceptive awareness (i.e., recognition of one's own state as an indicator of the state of another). As an extension of this concept, techniques used to engage, influence, and even ignore others may be physiologically based social strategies. This idea has been discussed in relation to findings indicating that mothers calm their children by first calming themselves. If this hypothesis is more generally accurate, in that individuals strategically adjust their own physiology in an effort to influence others, then a typology of social strategies may be operating at the physiological level.
Methodologies designed to observe and define this level of interaction could therefore shed new light on all social encounters.

Conclusions
Results from this review of the interpersonal physiological literature indicate that social processes are operating at the physiological level. The research to date has shown that physiological interactions are not limited to instances of synchrony, and the presence or absence of specific types of PL can be informative of the state of a relationship.
Sensory and contextual information has been shown to influence the level and presence of PL, but additional work is needed to identify the conditions that lead to these changes.
Controlled experiments designed to explore the components that generate PL are therefore needed. In addition, in-vivo designs are needed to explore these processes under natural conditions, and to add external validity to lab-based research.
The application of an inductive strategy is recommended to identify and define a typology of PL, followed by systematic replication of studies across contexts and time, both within and across people. Though converging evidence suggests that physiological interdependencies are robust enough to be detected using correlational analyses and nomothetic methods, results from these strategies may be too general to identify the mechanisms that lead to PL. Combining idiographic designs with dynamic time series analyses offers the greatest potential to explore these processes. Although physiological relationships have far reaching implications concerning the nature of human interactions, interpersonal physiology is a highly underexplored area, and extensive systematic research is required for these interactions to be well understood.
Though a small number of viable techniques have been used to analyze these data, the statistical methods most commonly applied are problematic. This is in part due to the autocorrelation and nonstationarity inherent to physiological data. Autocorrelated data is serially dependent, violating the statistical assumption of independence required by most parametric procedures. Nonstationary data has an inconsistent mean and variance, so it is not well represented by procedures that require these parameters to remain constant (i.e., stationary). Analyses that assume stationarity will typically model the data as a constant, and obscure the stochasticity and heterogeneity of social interactions. Therefore, dynamic idiographic techniques are needed to describe the temporal patterns at both univariate and multivariate levels.
As a first step in the assessment of any data, descriptive statistics and data visualization procedures are invaluable. However, descriptive statistics and many data visualizations are static aggregates that obscure temporal patterns. If the timescale represented by a statistic or a graph is not well matched to the temporal phase of a given process, patterns of interest may not be apparent. For example, if trying to study whether an intervention increases heart rate, a visual or statistical analysis of 10 milliseconds of heart rate (i.e., shorter than 1 heart beat) is unlikely to reveal the answer because the time scale is too short. An aggregate of 1 week of heart rate (i.e., thousands of heart beats aggregated together) will be equally uninformative, because the time scale is too long.
The appropriate timescale of the process of interest is hidden somewhere in between.
Though such a conclusion may be apparent in this example, the timescale in which many processes occur is unknown. To address this issue, the following paper offers a simple solution through a method of data decomposition in the time domain, defined as time series descriptive statistics (TSDs). The technique is designed to identify the timescale at which dynamic shifts occur in parameters of univariate and multivariate time series data.
The paper is organized in the following way. First, the problem with analysis of interpersonal physiology is reviewed in the literature. Second, details of TSDs are defined and discussed. Third, the procedure is applied to an example using data collected from a student on the autism spectrum and his teacher in-vivo. Fourth, limitations are discussed along with potential advancements and applications.

Review of the Problem
A key finding from the interpersonal physiological research is the random, transient nature of PL. Multiple studies show that lead-lag relationships and synchrony measures vary over time, indicating that the timescale of physiological interactions is inconsistent. This inhibits the utility of fixed statistical procedures that assume a stable state over a given period of time. For example, bivariate time series analysis has been used to assess PL during two 15-minute conversations (one neutral and one negative). However, it is likely that during each condition, the social dynamics involved in the conversations generated stochastic changes in the underlying physiological relationships. When an analysis assumes that the pattern in each condition remains consistent (i.e., a linear regression coefficient), thereby treating variation in time as error, the temporal dynamics of PL are effectively ignored. Prior work addresses this problem well, highlighting the need for flexible, dynamic analyses capable of identifying multiple types of physiological states, each with a unique timescale, occurring throughout a single interaction.

Time Series Descriptive Statistics
In order to explore temporal patterns in dynamic, nonstationary, multivariate data, an interpretable quantitative method is needed. One solution is to decompose the data in the time domain using two standard statistical procedures: descriptive statistics and moving windows. Moving windows procedures are a method of reassessing a statistic in forward shifting, equally sized subsets of data. For example, rather than assessing the variance of an entire time-series, these procedures first segment the series into fixed sized windows (e.g., 5-second subsets), then assess the variance of each segment. The result is a vector of temporal changes in the stability of the data, identifying the variability in the variance. This type of windowing is designed to generate greater temporal resolution in a measure, and has been used to transform physiological data into the frequency domain.
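The fixed-window example above (segmenting a series, then assessing the variance of each segment) can be sketched as follows. The function name and toy series are assumptions, and the population variance is an assumed choice so that very short windows remain defined.

```python
from statistics import pvariance

def windowed_variance(series, window):
    """Variance of each consecutive, non-overlapping window of `window` samples."""
    return [pvariance(series[i:i + window])
            for i in range(0, len(series) - window + 1, window)]

# Hypothetical series: a flat segment followed by an erratic one.
signal = [1, 1, 1, 1, 1, 0, 4, 0, 4, 2]
v = windowed_variance(signal, window=5)  # one variance per 5-sample segment
```

The resulting vector shows where the stability of the series changes, which a single variance for the whole series would obscure.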
For example, short time Fourier transforms are frequency decompositions calculated in moving windows that have been used to acquire univariate estimates of the cyclical patterns of respiratory sinus arrhythmias as they vary over time (e.g., Blain, Meste, & Bermon, 2005;Pichon et al., 2004).
A critical decision when implementing a fixed windowing procedure is determining the appropriate length of the window. Window size refers to the length of the data in each subset, and determines the temporal resolution of results. Longer windows aggregate more data, so they give less temporal resolution and may obscure short-term dynamics. Shorter windows increase temporal resolution, but may unnecessarily segment homogeneous trends into multiple parts. A solution to this problem is to use a range of window lengths rather than choosing a single fixed window length. With this approach, the same windowing procedure is performed as previously described, but it is iteratively repeated, each time increasing the window length. The result is a matrix (W) of a given statistic (e.g., variance) being calculated on windows of increasing length. The first row of W therefore begins with the given statistic (e.g., the variance) being calculated on subsets of the smallest appropriate window size (Wmin). This row might show how variance, as calculated on 5-second windows, changes over the length of the data set. In the second row, the window length increases by a given length (I), and the same statistic is calculated on the larger subsets. This procedure is iteratively repeated up to a given maximum window length (Wmax). Whereas window length increases by row, window origination remains constant by column. This means that looking down the columns of W, the first data point that the window assesses, regardless of the given row (i.e., window length), is always the same. The windows from every row in the first column begin with the first data point of the original time-series. The first row assesses the shortest windows (e.g., observations 1 through 5), and the last row assesses the longest window length (e.g., observations 1 through n).
This leads to a triangular shape in W, as shorter windows lead to longer vectors (more windows are needed to capture the data), and longer windows lead to shorter vectors (fewer windows are needed to capture the data).
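A minimal sketch of the construction of W is given below, assuming windows that shift forward one observation at a time so that column j of every row begins at observation j. The function name `tsd_matrix`, its parameter names, and the toy data are illustrative assumptions rather than the original implementation.

```python
from statistics import pvariance

def tsd_matrix(series, stat, w_min=1, w_max=None, increment=1):
    """Build W: each row uses one window length (w_min, w_min + increment, ..., w_max).

    Column j of every row holds `stat` for the window that starts at observation j,
    so longer windows yield shorter rows.
    """
    n = len(series)
    w_max = n if w_max is None else w_max
    return [[stat(series[start:start + w]) for start in range(n - w + 1)]
            for w in range(w_min, w_max + 1, increment)]

data = [2, 4, 4, 6, 10]                      # hypothetical series
W = tsd_matrix(data, pvariance)              # variance at every window length
row_lengths = [len(row) for row in W]        # decreasing lengths: the triangular shape
```

Because rows have different lengths, W is stored here as a list of lists rather than a rectangular array. The last row contains a single value, the statistic for the entire series.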
A similar triangular pattern is applied in wavelet analysis, a time-frequency decomposition. In wavelet analysis, shorter windows are associated with higher frequencies, which can be captured in full in a short time. Longer windows are used to assess lower frequencies, as they take a longer time to complete a cycle. Wavelet analyses are useful procedures when the generating function of the data is cyclical, and consists of relatively few frequency bands. However, stochastic data is less well represented in the frequency domain, as results can be difficult to interpret when the modeled process is not cyclical.
By decomposing the data in the time domain, a number of effects of time can be observed using statistics more suited to stochastic time series. For example, if a process remains constant for a long period of time, then the variance, mean, and slope of shorter and longer windows will be relatively equal. However, if a process is changing, there will be fewer similarities between shorter and longer windows. Similarly, processes that occur in short time periods will be obscured when assessing longer windows, whereas longer processes may appear as random noise in short windows.

Descriptive Statistics in Windows
The analyses conducted within each window can be nearly any descriptive statistical analysis. As each descriptive statistic represents specific mathematical parameters, applying multiple techniques can lead to a better understanding of the data.
Univariate Descriptive Statistics. Univariate descriptive statistics used in windows can include any combination of measures determined to be useful aggregates of a time-series. For example, three standard descriptive statistics used to analyze time series are the mean, variance, and slope. Whereas each measure is informative on its own, the combination provides a more robust understanding of where the data is located, its stability, and the direction in which it is changing.
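The three statistics named above can be computed for a single window as in the following sketch. The slope is taken here as the least-squares slope of the values against their sample index, and a one-point window is assigned a slope of zero; both are assumptions made for illustration, as are the function names.

```python
from statistics import mean, pvariance

def slope(window):
    """Least-squares slope of the values against their sample index (0, 1, 2, ...)."""
    n = len(window)
    if n < 2:
        return 0.0  # assumed convention: a single point has no trend
    t_bar, y_bar = (n - 1) / 2, mean(window)
    num = sum((t - t_bar) * (y - y_bar) for t, y in enumerate(window))
    den = sum((t - t_bar) ** 2 for t in range(n))
    return num / den

def describe(window):
    """Location, stability, and direction of change for one window."""
    return {"mean": mean(window),
            "variance": pvariance(window),
            "slope": slope(window)}

stats = describe([1.0, 2.0, 3.0, 4.0])  # hypothetical window rising by 1 per sample
```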

Multivariate Descriptive Statistics.
To explore multivariate data, bivariate measures of distance can be used to assess similarities between univariate TSDs. One technique commonly used to assess similarities in time series data is Euclidean distance.
Typically used in procedures such as dynamic cluster analysis, Euclidean distance measures the length of the line connecting two data points. This can be assessed as the distance between corresponding elements of two matrices $X$ and $Y$ (i.e., $\sqrt{(X - Y)^2}$). These can then be plotted using procedures such as heat maps. Heat maps are graphical techniques where the values in a matrix are hierarchically represented by colors, allowing simple visual inspection of the change in value throughout the data.
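A sketch of the distance computation between two TSD matrices is given below. It assumes that the distance is taken element by element between matrices of the same shape, in which case the Euclidean distance between two scalar entries reduces to their absolute difference. The function name and the toy matrices are hypothetical.

```python
from math import sqrt

def distance_matrix(a, b):
    """Element-wise Euclidean distance between two equally shaped (possibly ragged) matrices."""
    return [[sqrt((x - y) ** 2) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

# Hypothetical TSD matrices for two people (rows = window lengths).
A = [[1.0, 2.0, 3.0], [1.5, 2.5], [2.0]]
B = [[4.0, 2.0, 1.0], [3.0, 1.5], [2.0]]
E = distance_matrix(A, B)  # same triangular shape as A and B
```

Small values in the result mark windows in which the two series have a similar value of the statistic, and the matrix can be passed to any heat-map routine for inspection.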

Analyzing Multivariate Time Series Descriptive Statistics
Visualization. Matrices of TSDs can be analyzed using a variety of approaches.
Most simply, each matrix can be plotted for visual inspection using heat-map procedures.
These plots may help to identify distinct changes in a given parameter, and the timescales at which they occur. For example, if a process maintains a constant mean for a long period of time, then shorter and longer windows will be relatively equal. This leads to 'fields' of the same color, as the results remain stable over time. However, if the mean is changing, there will be fewer similarities between shorter and longer windows, leading to shifts in the color from left to right, and top to bottom. Similarly, short-term changes in slope may give the impression of stochasticity when assessing shorter windows (i.e., the first rows). However, if a longer trend is present, more pronounced fields of consistent color may emerge in longer windows (i.e., later rows). Whereas stable periods show as consistent colors, short-term perturbations leave a 'pointing effect' in the color scheme, as highly localized events are isolated in short windows, but 'spread' as the event is aggregated into longer windows.

Statistical Analyses.
After decomposing a data set in the time domain, the resultant data can be analyzed using a wide range of statistical procedures. Interrupted time series analysis is one viable approach, as it enables assessment of changes in time series data while accounting for autocorrelation. It can therefore be used to determine whether segments from any vector of data are significantly different from each other. For example, if an event is theorized to induce a significant change in mean skin conductance, interrupted time series analysis can be used to test whether the skin conductance is significantly different before and after the event. If the event is believed to induce a short-term change in the variance of skin conductance, the analysis may be run on a short timescale (e.g., 5-second windows). If it is hypothesized to induce long-term changes, a longer timescale (e.g., 50-minute windows) may be assessed instead.

An Application
One area of research that serves to benefit from assessment of physiological individual's behavior, could be associated with an increased probability of problem behaviors. Similarly, the degree to which a student is following the ebb and flow of social activities in the classroom may not be apparent through their actions, but may be identifiable through PL with others in the room.
A useful measure of physiological arousal is skin conductance. Reflective of sympathetic nervous system activity, skin conductance measures changes in eccrine sweat gland activity by tracking the electrical conductivity of the skin (Dawson, Schell, & Filion, 2000). Ambulatory measures of skin conductance are well suited to in-vivo study of autism spectrum disorder, as they are unobtrusive devices that have been found to be tolerable by most individuals in this population (Goodwin, Intille, Albinali, & Velicer, 2011). In the following sections, TSDs are applied to skin conductance measures collected from a student on the autism spectrum and his teacher during class in a specialized school for students with severe developmental disabilities.
The aim of this application is to explore whether intra and interpersonal patterns in skin conductance are informative of the student's behavior, and social engagement with his teacher. Towards that end, the following hypotheses are explored: First, that patterns of intra and interpersonal physiological interactions can be described using TSDs. Second, that a significant change in the variance, slope, or mean of skin conductance will be associated with increased behavioral problems in the student. Third, that student behavioral problems are associated with significant changes in the similarities between the student and teacher's skin conductance mean, variance, and slope.

Population
The following idiographic procedures address one student on the autism spectrum and his teacher during one 22-minute class. The school in which the class was held is involved in a larger, ongoing study of physiology and challenging behaviors in autism spectrum disorder. The dyad was selectively chosen because their skin conductance data was adequately clean (e.g., minimal noise and missingness) and the student presented a sufficient number of behavioral incidents. For the larger study, the student, a seventeen-year-old male, was required to meet standard classroom selection criteria based on intellectual ability, communication ability, behavioral characteristics, and tolerance of wearing physiological monitors. His teacher was a thirty-five-year-old white male with a college degree. The teacher was employed at the school, and consented to be recorded with video, audio, and physiological sensors during classroom and standardized assessment activities. The two had been working together for approximately five years.

Procedures
Data were collected at the school as part of a larger study designed to evaluate the physiological, behavioral, and learning responses of children on the autism spectrum to intervention and instruction. Novel procedures utilizing advanced telemetrics were employed in the school setting, including discreetly mounted video cameras and microphones, and wireless sensors to record physiological states. Additionally, direct observations, psychosocial coding of behavior and functioning, and student records were collected.
Synchronized recordings of physiology, physical activity, video, and audio of classroom activities were collected from the student and staff during standard classroom protocols.
Wireless physiological and physical activity recording devices were fitted to the wrist, ankle, and/or around the chest of the student and teacher prior to classroom activities, and left on for the duration of the school day.

Measurement Tools
Multiple technologies were used to collect video, audio, physical activity, and physiological data. For the current study, physiological data was taken exclusively from The Q Sensor, manufactured by Affectiva. This sensor wirelessly records electrodermal activity, motor movements, and skin temperature. Data analysis was run using multiple statistical packages, including SAS, R, Matlab, and Excel.

Data Management of Skin Conductance
Each skin conductance time series was first assessed using visual inspection to determine its potential validity. Data were chosen for analyses if skin conductance from both the student and teacher showed appropriate levels and variability, had minimal missingness and artifacts, and occurred during videoed time periods. All selected data then underwent a series of data cleaning procedures. First, all data below a minimum threshold of 0.05 µS were removed. Data were then smoothed with a 1-second (30-sample) Gaussian window (i.e., a low-pass filter) using Ledalab in Matlab. Next, data were subsampled to 1 Hz (i.e., 1 data point per second). Visually identified artifacts (e.g., extreme peaks and drops) were then manually removed, and missing data were imputed using spline-type interpolation.
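The smoothing and subsampling steps can be illustrated with the sketch below. It is not Ledalab's implementation: the kernel shape (a window spanning roughly plus or minus three standard deviations), the edge handling, the function names, and the toy signal are all assumptions. Threshold removal, manual artifact removal, and spline interpolation are omitted.

```python
from math import exp

def gaussian_kernel(width):
    """Normalized Gaussian weights spanning `width` samples."""
    centre, sigma = (width - 1) / 2, width / 6  # assumed: window spans about +/- 3 sigma
    weights = [exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(width)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(series, width):
    """Low-pass filter: Gaussian-weighted average around each sample (edges renormalized)."""
    kernel, half, n = gaussian_kernel(width), width // 2, len(series)
    out = []
    for i in range(n):
        pairs = [(kernel[j], series[i - half + j])
                 for j in range(width) if 0 <= i - half + j < n]
        weight = sum(w for w, _ in pairs)
        out.append(sum(w * v for w, v in pairs) / weight)
    return out

def downsample(series, factor):
    """Keep every `factor`-th sample (e.g., 30 samples per second to 1 Hz when factor=30)."""
    return series[::factor]

raw = [0.5] * 60  # hypothetical 2 s of constant signal at 30 samples per second
clean = downsample(smooth(raw, width=30), factor=30)
```

Smoothing before subsampling matters: the low-pass filter removes fast fluctuations that would otherwise be aliased into the 1-Hz series.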

Video Coding
A set of operationally defined problem behaviors unique to the student was identified as a part of the larger study. Behaviors included jumping in his seat, elopement (i.e., leaving the area), holding his hands on his ears, being out of seat when not instructed to be, and biting his own hand. Two observers with master's degrees were then trained to identify these behaviors, and independently coded the video that accompanied the skin conductance data.

Analyses
Univariate Time Series Descriptive Statistics. Univariate TSDs were computed on the student and teacher's skin conductance. Three analyses were used: the mean (u), variance (var), and slope (s). This resulted in a matrix of each statistic for both the student and the teacher. Student matrices are written as $S_u$, $S_{var}$, and $S_s$. Teacher matrices are denoted $T_u$, $T_{var}$, and $T_s$. A minimum window size of 1 second was used, with a step of 1 (i.e., the window moved 1 data point forward, and recalculated the given statistic). This procedure results in the first row of $S_u$ and $T_u$ being equal to the original skin conductance data ($S$, $T$), and the first row of $S_{var}$, $T_{var}$, $S_s$, and $T_s$ being equal to zero.
Since the window increase was 1, the window length in each row is 1 data point longer than the previous. The maximum window length was the length of the original series, resulting in the last row being an aggregate of the entire series.

Analysis of Student Problem Behaviors and Skin Conductance.
To determine the relationship between univariate TSDs and the student's behaviors, an interrupted time series analysis was run. Interrupted time series analysis is a class of autoregressive integrated moving average (ARIMA) models designed to remove serial dependence in the data, then compare pre- and post-interruption (Glass, Willson, & Gottman, 2008). Pre-interruption was defined as all periods without a behavior problem, and post-interruption was defined as all periods during a behavior problem. Interrupted time series analyses were used to assess whether there was a change in $S_u$, $S_{var}$, and $S_s$ when behavioral problems occurred. Based on visual analysis of the plots of $S$, all analyses were run using 5-second windows (i.e., row 5 of each $S$ matrix). Though the TSDs were run using a step of 1, for the time series analysis, a step of 5 was used (i.e., 0-overlap between windows) to ensure that the same data was not assessed both before and after the interruption.
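The difference between the step of 1 used for the TSDs and the step of 5 used for the time series analysis can be sketched as follows, using the windowed mean as the statistic. The function name and the toy 1-Hz series are assumptions made for illustration.

```python
from statistics import mean

def windowed(series, window, step):
    """Mean of windows of `window` samples whose start points are `step` samples apart."""
    return [mean(series[i:i + window])
            for i in range(0, len(series) - window + 1, step)]

data = list(range(1, 16))                        # hypothetical 15 s of 1-Hz data
overlapping = windowed(data, window=5, step=1)   # a TSD row: adjacent windows share 4 samples
independent = windowed(data, window=5, step=5)   # step equals window length: no shared samples
```

Under these assumptions, the non-overlapping series is simply every fifth value of the corresponding TSD row, so no observation contributes to more than one window.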
A second set of interrupted time series analyses was then run to determine whether behavioral problems influenced the distance between student and teacher skin conductance. This analysis assessed whether the means of $E_u$, $E_{var}$, and $E_s$ were significantly different when problem behaviors occurred. The same timescales were used as in the previous analysis. In addition, cross-correlations between student and teacher skin conductance, and cross-correlations between student and teacher TSDs, were computed using the same timescales.

Video Coding
Following the data processing of skin conductance and video coding, only one video of the student and teacher fit the necessary criteria (i.e., adequate skin conductance from both student and teacher, during a class in which the student presented operationally defined behavior problems on video). Coding of the video reached high inter-rater reliability (kappa = 0.94), and a high number of behaviors were coded during a 22 minute class period (n = 23). Due to the infrequency of most operationally defined behaviors, the codes were combined to create a single variable for behavioral problems.
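For readers unfamiliar with the reliability statistic, the sketch below computes Cohen's kappa for two coders, that is, agreement corrected for the agreement expected by chance. That the reported kappa is Cohen's kappa is an assumption, and the codes shown are hypothetical toy values, not the study data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical codes for ten intervals (1 = problem behavior, 0 = none).
coder_1 = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
coder_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(coder_1, coder_2)
```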

Univariate Time Series Descriptive Statistics
Univariate time series descriptive statistics (u, var, s) were computed for both the student's skin conductance (i.e., $S_u$, $S_{var}$, $S_s$; Figure 4.1) and the teacher's skin conductance (i.e., $T_u$, $T_{var}$, $T_s$; Figure 4.2). A plot of the student's (mean = .63, SD = .52) and teacher's (mean = 5.32, SD = .46) skin conductance can be seen in Figure 4.3. The student's raw skin conductance shows relatively minor variance in the first third, a large spike in activity in the middle third, and an abrupt return to lower levels in the last third.
This pattern is reflected in $S_u$, $S_{var}$, and $S_s$. It is clear from all of these representations of the student's skin conductance that the middle period involves the most change, and the period before is slightly more erratic than the period after. This indicates that the student's arousal level was labile, then spiked, and following a rapid recovery, was notably more stable.
Plots of the student's TSDs indicate significant differences in the representation of student arousal level depending on the timescale used. For example, in $S_s$, the first rows (< 1 minute window length), representative of short windows, show high variability in the slope (i.e., inconsistent coloring), suggesting frequent shifts in the direction and steepness of change in physiological arousal. However, when assessed at window lengths of approximately two to five minutes, a more stable trend emerges. This more general trend indicates a slight slope in the beginning, a steep pitch towards the middle, and near zero slope (i.e., no change) at the end. The consistency of this pattern through most window lengths indicates that it was relatively stable over time. This suggests that the student's experience during this time underwent three distinct regime shifts.
The TSDs of the teacher's skin conductance show that the mean, variance, and slope are more stable than the student's throughout the class period. The more pronounced shift in color from top to bottom rather than from left to right indicates greater stationarity. Though more subtle than the student's, the horizontal shifts in color scheme suggest change over time. However, there are some signs of similarity with the student. For example, in $T_s$, the first rows (< 1 minute) again indicate more variability in the speed and direction of change in slope. When assessed as slightly longer trends (e.g., window lengths of approximately 2-6 minutes), three shifts are also apparent. In the first segment, there is evidence of inconsistent periods of decreasing slope, as observable in the shifts to colors in the negative scale. Towards the middle of the plot, a near zero slope is maintained, followed by a consistent negative trend.

Multivariate Time Series Descriptive Statistics
Multivariate TSDs were computed for each univariate TSD to assess the Euclidean distance between the student and teacher (i.e., E u, E var, E s; Figure 4.4). In these plots, there is a relatively stable, large distance in the means of the two series, indicating that the physiological arousal levels are consistently dissimilar. It is important to note here that the skin conductance was not standardized, and there may be limited interpretability when comparing mean levels across participants (Dawson et al., 2000).
Of more relevance, the variances of the two series remain closer during the first quarter and second half of the data, though these descriptive differences may not be meaningful.
Interestingly, the more general patterns of distance in slope (see Figure 4.4, E s) indicate that the slopes of the two are markedly similar when assessed over the entire length of the class, suggesting that despite short term fluctuations, the speed and direction in which the student's and teacher's physiological arousal levels change are similar over time.
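
As an illustration of the distance computation, the sketch below compares two TSDs cell by cell. It assumes that both TSDs are stored as dictionaries mapping a window length to a list of windowed statistics, and that the distance is taken separately for each matching cell; the function name cellwise_distance is hypothetical, and the exact computation behind Figure 4.4 may differ.

```python
import math


def cellwise_distance(tsd_a, tsd_b):
    """Euclidean distance between matching cells of two TSDs.

    Both inputs map window length -> list of windowed statistics and are
    assumed to share the same layout. For a single statistic per cell, the
    Euclidean distance reduces to the absolute difference.
    """
    return {
        w: [math.dist([a], [b]) for a, b in zip(tsd_a[w], tsd_b[w])]
        for w in tsd_a
    }


# Hypothetical windowed slopes for a student and a teacher (one row each).
student_slopes = {2: [1.0, 4.0]}
teacher_slopes = {2: [4.0, 0.0]}
distance = cellwise_distance(student_slopes, teacher_slopes)
```

Small values in the resulting matrix mark windows in which the two series have similar statistics. Because the raw signals here were not standardized, distances in the mean would be dominated by between-person differences in level, which is consistent with the caution noted above.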

Time Series Analysis
Interrupted time series analyses were calculated to determine whether the student's behavioral problems led to significant changes in the variance, mean, or slope of the student's skin conductance. Figure 4.5 includes plots of the 5-second windowed variance, mean, and slope of the student's skin conductance. Overlaid in red are periods when a behavioral problem was occurring. Two models were calculated on each of these series.
The first was a general transformation ARIMA model (5,0,0), which is designed to account for the autocorrelation regardless of the specific model in the data (Velicer & McDonald, 1991). Due to nonstationarity in the windowed means, a differenced model was called for (i.e., 5,1,0). A second ARIMA model was determined through assessment of the autocorrelation and partial autocorrelation of each series. The variance and slope called for the same model (1,0,0), whereas a differenced model (1,1,0) was needed for the mean. All results were non-significant using both models, indicating that the 5-second windowed variance, slope, and mean of the student's skin conductance were not significantly different during behavior problems.
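
The logic of an interrupted time series test can be sketched with a deliberately simplified stand-in for the models above: a first-order autoregression with an event indicator, fitted by ordinary least squares. This is not the general transformation procedure of Velicer and McDonald (1991), nor a full ARIMA estimator; the function names and the regression form are assumptions made for illustration. A differenced (1,1,0) analogue could be approximated by passing first differences of the series.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ar1_intervention(y, event):
    """OLS fit of y[t] = c + phi * y[t-1] + beta * event[t] + error.

    event[t] is 1 while a behavioral problem is occurring and 0 otherwise.
    Returns ([c, phi, beta], t statistic for beta).
    """
    rows = [[1.0, y[t - 1], float(event[t])] for t in range(1, len(y))]
    target = y[1:]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    resid = [v - sum(c * x for c, x in zip(coef, r)) for r, v in zip(rows, target)]
    sigma2 = sum(e * e for e in resid) / (len(rows) - k)
    # Last column of the inverse of X'X gives the variance factor for beta.
    inv_col = solve_linear(xtx, [0.0, 0.0, 1.0])
    t_beta = coef[2] / math.sqrt(sigma2 * inv_col[2])
    return coef, t_beta
```

Under this sketch, a t statistic for beta that is large in absolute value would suggest a shift in the windowed statistic during behavior problems after accounting for lag-1 autocorrelation. The indicator must take both values in the analyzed span, otherwise the model is not identified.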
A second set of analyses, cross-correlations (lag-25), was then computed on the 5-second windowed variance, mean, and slope of the student's and teacher's skin conductance. Both the general transformation model (5,0,0) and a fitted (1,0,0) model were used. All cross-correlations were again non-significant.
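
For readers unfamiliar with lagged cross-correlation, a minimal sketch follows. It computes the correlation between one series and a shifted copy of the other for every lag up to a maximum. The function name is hypothetical, and the sketch omits the prewhitening step (removal of autocorrelation with the ARIMA models described above) that would precede the computation in practice.

```python
from statistics import fmean


def cross_correlation(x, y, max_lag):
    """Correlation of x[t] with y[t + lag] for lag in -max_lag..max_lag.

    A positive lag means that changes in x precede similar changes in y.
    Both series are assumed to have equal length and nonzero variance.
    """
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sx = sum((v - mx) ** 2 for v in x) ** 0.5
    sy = sum((v - my) ** 2 for v in y) ** 0.5
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        pairs = [(x[t], y[t + lag]) for t in range(n) if 0 <= t + lag < n]
        out[lag] = sum((a - mx) * (b - my) for a, b in pairs) / (sx * sy)
    return out


# Hypothetical toy series in which y repeats x two samples later.
x = [0, 1, 0, 0, 0, 0]
y = [0, 0, 0, 1, 0, 0]
cc = cross_correlation(x, y, 3)
peak_lag = max(cc, key=cc.get)
```

With 5-second windows, a maximum lag of 25 would correspond to shifts of up to 125 seconds in either direction, assuming the lag is counted in windows.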

Discussion
Interpersonal physiological interactions have been found to occur between partners under a variety of conditions, and have been shown to be indicative of specific relationship types (Palumbo, 2014). Autism spectrum disorder, often accompanied by alexithymia, has not been assessed using these methods. The hypothesis that interpersonal relationships between a student and teacher would be observable in their skin conductance activities was suggested in graphs of the Euclidean distance in skin conductance slope (i.e., E s), but was not supported through statistical analysis.
Additionally, the hypothesis that the student's behavioral problems would be accompanied by significant changes in his skin conductance was not supported.
Despite the lack of findings in this idiographic example, the novel approach used to visualize the temporal scale of both univariate and multivariate data proved to be a useful technique. Through TSDs, the temporal scale of the data is presented in a form that allows visual identification of stability, lability, and regime shifts. This is a simple procedure that reduces a complex problem to an interpretable format. Beyond visual analysis, the matrices that result from these procedures are available for statistical analysis. Though often poorly understood, time is an important variable in a wide range of processes. By exploring the effect of changes in timescale, the period in which a process occurs can be assessed. Though similar to methods such as frequency decomposition (e.g., Fourier transform and wavelet analysis), such procedures are less suited to stochastic data as they address the range of cyclical patterns, rather than descriptions of the time components. With TSDs, the effect of time is decomposed, allowing visual and statistical assessment of the resulting data. Future studies may adapt additional statistical procedures. For example, recurrence analysis and cross recurrence analysis, dynamic systems analyses designed to find stable periods within and across data sets, are well suited to TSDs. These techniques can be used to determine the percentage of time that the same variance recurs from window to window and row to row, thereby testing for periods of stationarity. Adaptations to TSDs are also possible, such as the inclusion of lags to assess temporally distant relationships. For example, where the current study only assessed time-synchronized relationships between the student and teacher, lags could also be incorporated to test whether a change in one led to a similar change in the other across different timescales. Modeling procedures can also be used with TSDs. 
By mathematically defining theorized or observed patterns (e.g., a mean and variance above a given threshold lasting for a period of time), fit statistics could be used to test whether the data match the model. Here, a researcher may theorize that an increase in the student's slope above a given threshold, lasting for longer than a given period of time, would lead to more behavioral incidents. Fit statistics (e.g., the Akaike information criterion) could then be used to test whether the given parameters were observable in the TSDs.
Despite the potential applications of TSDs, a number of limitations exist in the method, and in this application. First, this is an idiographic example of the relationship between a student, his teacher, and behavioral problems during a single class. There is therefore limited data and low power, so results can only be interpreted as descriptions of this specific data set. As such, despite statistical evidence indicating no significant relationships between these variables, there are no internally or externally generalizable results, only descriptions of one interaction. Second, though TSDs appear useful, systematic simulation studies will be necessary to test and develop the approach. Third, due to the number of calculations inherent in the iterative procedure, it is computationally expensive, and so may not be appropriate with big data (e.g., skin conductance from a week, rather than 20 minutes). Finally, due to the nature of visual analysis, plots can be deceptive.
Simply changing the scale of the data or the color scheme of a heat-map plot can lead to substantial differences in the appearance of the plots. Therefore, plots of TSDs must be well understood and appropriately presented to be interpretatively informative, and at best are only descriptions of the data they represent. Still, these are adaptive procedures that can be computed in a variety of ways, and the temporally decomposed data are available for statistical testing.
The research described herein defines an idiographic procedure designed to decompose both univariate and multivariate data in the time domain. The procedure was used to analyze continuously collected skin conductance from a student on the autism spectrum and his teacher during routine classroom activities in a specialized school.
Though no significant relationships were found between the student's skin conductance and his own behavior problems or the teacher's skin conductance, the methodology applied proved to be informative. Additional work is needed to further develop these procedures, but their flexibility, simplicity, and interpretability potentiate their future utility.

CONCLUSIONS FROM STUDIES
The findings presented here indicate that interpersonal physiological research has the potential to lead to significant insights in social psychology. As the presence or absence of PL may be informative, autonomic activities can be useful measures of any social interaction. Recent advances in telemetrics have enabled intensively sampled longitudinal data to be unobtrusively collected, making extensive research of interpersonal physiological processes possible in nearly any setting.

Chapter 1
Results from the first chapter show that PL is not dependent on dialog. Similar findings have been observed in other dyads (e.g., mothers and infants) and under different conditions, suggesting that proximity is sufficient for PL to develop. This implies that complex interactions are not necessary for social interactions to be observable in physiological processes. Future work aimed at identifying the fundamental components of PL can therefore continue to study simple social encounters to reduce confounding variables.

Chapter 2
In the second chapter, results showed that nonstationarity, though problematic for most analyses, is not a consistent condition across all skin conductance data. This prohibits the standard use of analyses that assume nonstationarity (e.g., cointegration). These results also point to the problem of the statistical constant required by most analyses. The general assumption that a specific model is able to define a process may not hold with these complex data. Social interactions are dynamic and unpredictable, and the physiological processes underlying them are inherently more complex. Statistical modeling approaches may therefore be attempting to define a heterogeneous set of processes as a single, constant condition. Taking this into account, future work has two potential options. The first is to continue using defined models but, rather than attempting to fit them to data from arbitrary time periods, to use procedures that determine when a process begins and ends, and then to test whether a given model fits that defined period. For example, the time series descriptive statistics presented in chapter four may be used to identify when a constant state begins and ends, followed by standard modeling procedures to statistically define those states. Alternatively, algorithmic search procedures can be employed to test whether a predefined condition is present, and to label a given section of data as an example of that definition. For example, if a pattern of interest was defined (e.g., synchronized slopes of skin conductance of two people for 10 seconds or more), an algorithm could be used to test whether that pattern occurred in the data. Once a given pattern is located in the data, analyses could be run to assess the probability that other variables co-occur. Such an approach could lead to the identification of more specific patterns, along with covarying variables.
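
The second option can be illustrated with a short sketch of such a search. Here "synchronized" is operationalized, as an assumption made only for this example, as both slopes sharing the same nonzero sign; the function name and the sampling rate are likewise hypothetical.

```python
def synchronized_runs(slopes_a, slopes_b, min_length):
    """Find stretches where two slope series move in the same direction.

    Returns (start, end) index pairs, end exclusive, for every run of at
    least min_length consecutive samples in which both slopes share the same
    nonzero sign. With one slope per second, min_length=10 corresponds to
    the 10-second example in the text.
    """
    runs, start = [], None
    for i, (a, b) in enumerate(zip(slopes_a, slopes_b)):
        if a * b > 0:  # same direction, neither slope is zero
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    n = min(len(slopes_a), len(slopes_b))
    if start is not None and n - start >= min_length:
        runs.append((start, n))  # a run that lasts until the end of the data
    return runs


# Hypothetical slopes for two people, one value per sample.
person_a = [1, 1, 1, -1, 1, 1]
person_b = [2, 2, 2, 2, -1, 3]
matches = synchronized_runs(person_a, person_b, min_length=3)
```

The returned index pairs label the sections of data that meet the definition. Other variables (e.g., coded behavior) could then be compared inside and outside those sections.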

Chapter 3
The systematic review of the literature presented in the third chapter showed that there is significant variability in the methods, analyses, and terminology used in studies of interpersonal physiology. Despite these circumstances, the field as a whole is moving toward more advanced analyses, and is beginning to generate convergent results. Results indicate that PL can be identified across populations and conditions, making interpersonal physiological methods an important addition to any social research.
In addition, the identification of specific patterns, such as concordance and discordance, suggests that there are generalizable types of PL that can be quantitatively defined and explored. Though PL appears to be a heterogeneous set of complex and potentially randomly occurring states, there are likely to be specific interaction types that recur within and across dyads and groups.

Chapter 4
The fourth study returns to the problem of analyses, but this time addresses the inconsistent timescale at which these processes appear to emerge and devolve. The method presented is a general technique designed to decompose multivariate time series data in the time domain. The approach was applied to measures of skin conductance taken from a student with autism spectrum disorder and his teacher during classroom activities. Though cross-correlations were not significant, plots suggested that periods of PL emerged. These exploratory findings suggest that individuals with autism spectrum disorder may experience a degree of social engagement, despite apparent alexithymia.
This study also indicates that physiological data can be collected in-vivo, and externally valid data can be generated and explored in challenging conditions.

Implications
Four important implications can be derived from the research to date. First, PL does not appear to be dependent on a specific context or relationship type. This implies that conditions can be met for PL to occur between any dyad or group under a wide range of contexts. This extends the utility of these processes, as they may be indicators of consequential dynamics underlying all social encounters. Second, there is a distinct typology of PL, with each definable pattern or set of patterns carrying unique implications. By exploring the different ways in which people interact at the physiological level, we can significantly enhance our understanding of social relationships. Third, findings that relate PL to constructs such as empathy suggest that there is a concurrent experiential component. Individuals may both recognize when they experience PL, and depend on it as a reference for interpersonal understanding. A component of intersubjectivity may therefore depend on mutual experience (i.e., PL), paired with accurate interoception. Finally, there is evidence that some social strategies are dependent on PL. Studies have shown that when a partner intends to influence the state of another, they first make a change in themselves (e.g., Ham et al., 2006; Muller et al., 2011). These results imply that individuals intuitively adapt their own physiological processes as a driver of social interactions. Such findings potentiate the utility of PL as a technique to improve therapeutic intervention, and as an end goal of treatment (Grove, 2006). Additional research is required to determine whether individuals already employ such strategies, and whether adaptations to them are clinically beneficial. However, should such techniques prove effective, they could lead to significant insights and advances in interpersonal understanding and influence.
At this point, it is clear that interpersonal physiology is an important research area with significant potential to enhance the field of social psychology. However, this is an emerging area, in need of advancements in research methods and analysis. Still, these are worthwhile endeavors with profound implications regarding the underlying nature of social behavior.