Infidelity

PREDICTING INFIDELITY WITH MACHINE LEARNING

Is Infidelity Predictable? Using Interpretable Machine Learning to Identify the Most
Important Predictors of Infidelity
Laura M. Vowels
School of Psychology, University of Southampton
Matthew J. Vowels
Centre for Computer Vision, Speech and Signal Processing, University of Surrey
Kristen P. Mark
Department of Family Medicine and Community Health, University of Minnesota

Abstract
Infidelity is a common occurrence in relationships and can have a devastating impact on both
partners’ well-being. A large body of literature has attempted to identify factors that can explain or
predict infidelity but has been unable to estimate the relative importance of each predictor.
We used a machine learning algorithm, random forest (an interpretable, highly non-linear
ensemble of decision trees), to predict in-person and online infidelity and intentions toward future
infidelity across three samples (two dyadic samples; N = 1846). We also used a game
theoretic explanation technique, Shapley values, which allowed us to estimate the effect size
of each predictor variable on infidelity. The present study showed that infidelity was
somewhat predictable overall with interpersonal factors (relationship satisfaction, love,
desire, relationship length) being the most predictive. The results suggest that addressing
relationship difficulties early in the relationship can help prevent future infidelity.
Keywords: Infidelity; Interpersonal Relationships; Machine learning; Random forest; Shapley
values
Is Infidelity Predictable? Using Interpretable Machine Learning to Identify the Most
Important Predictors of Infidelity
Infidelity is the most commonly reported cause of divorce in the United States (Amato
& Previti, 2004; Mark et al., 2011) and across at least 160 cultures (Betzig, 1989). The fallout
from infidelity can have devastating consequences for both members of the couple,
including feelings of discontent, depression, blame, and frustration (Thompson
& O’Sullivan, 2016). In fact, infidelity is considered the third most difficult problem to work
with in therapy and second only to abuse in having the most damaging impact on
relationships (Whisman et al., 1997). Prevalence estimates for lifetime infidelity range
from 20% to 52% depending on the way infidelity is defined and measured (Mark et al., 2011;
Mark & Haus, 2019; Thompson & O’Sullivan, 2016). Definitions of infidelity vary
widely across studies, but infidelity can broadly be defined as engaging in emotional or sexual
relations outside of the agreed-upon bounds of the relationship (Mark & Haus, 2019), and
may include behaviors such as flirting with someone, having an emotional connection, sexual
intercourse, or using pornography (Blow & Hartnett, 2005b). With the emergence of the
internet and smartphones, computer-mediated behaviors (e.g., sexting, sending explicit
photos, or watching live webcam porn) have also become more commonplace as forms of
infidelity (Albright, 2008).
Because of its potentially devastating impact on individuals and relationships, many
studies to date have attempted to understand factors that may explain and predict infidelity,
including demographic, intraindividual, and interindividual variables (Mark & Haus,
2019). From an evolutionary perspective, men should be more motivated to engage in sexual
infidelity to maximize their reproductive success. Indeed, many studies have found that men
are more likely to engage in sex outside of a relationship (Labrecque & Whisman, 2017;
Petersen & Hyde, 2010) whereas women may be more likely to engage in emotional
infidelity (Selterman et al., 2019). Other studies have found similar levels of infidelity
between men and women especially when both sexual and emotional forms of infidelity are
considered (Allen et al., 2006; Fincham & May, 2017; Mark et al., 2011; Treas & Giesen,
2000).
Other demographic variables that have been previously associated with infidelity
include relationship status, education, and religion. Some studies have found that more
committed individuals are less likely to engage in infidelity (Amato & Previti, 2004; Fincham
& May, 2017) and highly educated individuals are more likely to engage in infidelity (Atkins
et al., 2001; Martins et al., 2016; Treas & Giesen, 2000), whereas other studies have found
the opposite pattern or no difference for education (Allen et al., 2006; Fincham & May,
2017). Finally, individuals with no religious affiliation have been reported to be more likely
to engage in infidelity in some studies (Burdette et al., 2007; Fincham & May, 2017;
Mattingly et al., 2010) but others have not found religious affiliation to be a significant
predictor (Haseli et al., 2019; Mark et al., 2011).
In addition to demographic variables, there are other intraindividual factors that have
been linked to infidelity in previous studies. For example, individuals with more permissive
sexual attitudes have been shown to be more likely to engage in infidelity (Fincham & May,
2017; Haseli et al., 2019; Martins et al., 2016). Similarly, higher sexual interest in both men
and women has been associated with a higher likelihood of engaging in sexual infidelity
(Fincham & May, 2017; Treas & Giesen, 2000). Several studies have found that individual
differences in attachment predict infidelity. Specifically, more anxious (i.e., individuals who
feel unlovable and unworthy and thus seek excessive reassurance and support in
relationships) and avoidant (i.e., individuals who do not trust in others’ capacity to be there
for them and thus focus on independence and self-reliance) individuals are more likely to
engage in infidelity compared to more secure individuals (i.e., individuals who feel lovable
and trust others; Fincham & May, 2017; Haseli et al., 2019; McDaniel et al., 2017).
There are also interpersonal factors that are associated with greater likelihood of
infidelity in relationships. Although not consistent across all studies, most studies have found
relationship satisfaction to be a significant predictor of infidelity (Atkins et al., 2001;
Fincham & May, 2017; Glass & Wright, 1985; Haseli et al., 2019; Owen et al., 2013; Spanier
& Margolis, 1983). Dissatisfaction with one’s sexual relationship, especially related to a
decline in frequency of sex as relationship length increases, has also been associated with
greater likelihood of infidelity for men (Liu, 2000). Furthermore, incompatibility between
partners in terms of sexual attitudes has been associated with infidelity, at least for women
(Haseli et al., 2019; Mark et al., 2011).
While a number of predictors have been found to be associated with infidelity, the
findings are often inconsistent and the studies suffer from poor methodologies (Blow &
Hartnett, 2005a, 2005b). Previous research has also exclusively utilized traditional linear
models, which are ill-equipped to handle a large number of predictors simultaneously, are
unable to estimate non-linear associations or complex interactions, and tend to produce
unreliable estimates that leave models completely uninterpretable (Breiman, 2001a; Lundberg
et al., 2020; Yarkoni & Westfall, 2017). A small number of studies in relationship science to
date have used machine learning to overcome issues with linear models (Großmann et al.,
2019; Joel et al., 2017, 2020). However, to date none of these studies have been able to
estimate the size or the direction of the effect of each individual predictor variable on the
model outcome.
Recent developments in machine learning have provided tools that allow
interpretation of the results through explanations of machine learning models (Lundberg et
al., 2017, 2019). This work is particularly interesting because it enables researchers to
combine the use of powerful machine learning algorithms and state-of-the-art model
explainability tools that can provide not only accurate predictions but also increase our
understanding of which factors are the most important in predicting the outcome. The latter is
of particular importance because one of the principal aims of psychology is to develop
understanding (Grosz et al., 2020). In the present study, we take advantage of this new
development in machine learning by using random forests (Breiman, 2001b) with Shapley
values (Lundberg et al., 2017, 2019) to estimate the effect size and direction of the effect of
each variable predicting past infidelity. A random forest is an ensemble of decision
trees that can handle highly non-linear relationships and complex interactions while
limiting overfitting to the data. It can also estimate a large number of predictors simultaneously, enabling us
to compare the effect sizes across different variables.
The main aims of the present study were to determine whether we could predict
sexual and online infidelity as well as future intentions toward infidelity and estimate which
variables contribute the most to predicting the outcome. Because the study was exploratory in
nature and machine learning is more suitable for exploratory research (Yarkoni & Westfall,
2017), we did not make any a priori hypotheses. However, we used k-fold cross-validation,
in which the model is repeatedly trained on one part of the data and tested on the held-out remainder. This
technique therefore evaluates how well the model generalizes to unseen test data, effectively providing a
confirmatory analysis. We used data from three different studies to further aid
generalizability of our results: one in which data were collected from individuals (Sample 1)
and two datasets in which data were collected from both members of the couple (Samples 2
and 3). Because many previous studies have found differences between men and women, we
analyzed each dataset together for all participants and separately for men and women. In the
latter two samples we also estimated the models including both dyad members’ variables as
predictors in order to explore whether partner variables are also associated with one’s own
outcome.
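The cross-validation logic described above can be illustrated with a minimal Python sketch. It shows only how rows are partitioned into training and test folds. The helper name `k_fold_indices`, the choice of k = 5, and the toy sample size are assumptions made for illustration and are not taken from the study's analysis code.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Return k (train, test) index pairs that partition range(n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

# Hypothetical toy setting: 10 participants, k = 5 folds.
splits = k_fold_indices(n_samples=10, k=5)
for fold_number, (train, test) in enumerate(splits, start=1):
    # A model would be fit on the `train` rows and scored only on the `test` rows.
    print(fold_number, sorted(test))
```

Because every participant appears in exactly one test fold, each prediction that enters the performance metrics comes from a model that never saw that participant during training.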
Results
Prevalence of Infidelity. Most participants in Sample 1 were currently in a
relationship but only one member of the couple responded to the survey. They were asked
about infidelity in their current or most recent relationship: 32.0% of a total of 891
participants (43.4% of men; 25.7% of women) had engaged in in-person infidelity compared
to 26.6% in online infidelity (41.6% of men; 18.5% of women). In Sample 2, both members
of the couple responded to the surveys and reported on engagement in sexual infidelity in
person or online in their current relationship: 17.4% of a total of 404 participants (18.8% of
men; 15.9% of women) had engaged in in-person infidelity compared to 14.1% in online
infidelity (16.8% of men; 11.4% of women).
Finally, in Sample 3, bisexual individuals who were currently in a romantic mixed-sex
relationship were invited to participate and their partners were also invited to complete the
survey. Most participants’ partners also completed the survey but the data also included some
bisexual individuals whose partners did not complete the survey. Because over a quarter of
the sample were consensually non-monogamous, we did not use the questions regarding
engagement in sexual activity with someone other than their partner as a measure of infidelity.
Instead, we used whether participants had engaged in sexual activity that could have hurt
their partner’s feelings. We also measured participants’ intentions toward future infidelity. In
Sample 3, 16.5% of a total of 552 participants (12.3% of men; 18.1% of women) had engaged
in a sexual behavior with someone other than their partner that could hurt their partner’s feelings. On
average, participants reported low intentions toward engaging in infidelity (M = -1.55, SD = 1.30,
range -3 to 3; men: M = -1.36, SD = 1.42,
range -3 to 2; women: M = -1.67, SD = 1.17, range -3 to 2.43).

Prediction Accuracy. We estimated models for all participants as well as for men and
women separately. In Samples 2 and 3, we also estimated the models with and without
partner effects for men and women. We also estimated the models for each outcome. This
resulted in a total of 26 models. The results for the overall model performances can be found
in Table 1. We report precision, recall, and F1 scores for each class (0 = no infidelity, 1 =
infidelity) as well as an overall measure of the model performance using Matthews
correlation coefficient (MCC). The MCC can be interpreted as an overall effect
size for the model using established effect size guidelines for Pearson’s correlation: .1 =
small, .3 = medium, and .5 = large effect (Cohen, 1992).
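As a rough illustration of how the MCC combines all four cells of a binary confusion matrix, the following Python sketch computes it from raw counts and attaches the Cohen (1992) label. The function names, the "negligible" label for values below .1, and the confusion-matrix counts are hypothetical and are not taken from Table 1.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from the four cells of a binary confusion matrix."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0

def cohen_label(r):
    """Label |r| using Cohen's (1992) guidelines for correlations."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumed label; the guidelines start at .1

# Hypothetical counts: 30 true positives, 120 true negatives,
# 30 false positives, 20 false negatives.
mcc = matthews_corrcoef(tp=30, tn=120, fp=30, fn=20)
print(round(mcc, 3), cohen_label(mcc))
```

Unlike accuracy, the MCC stays near zero for a model that always predicts the majority class, which matters here because the infidelity class is the smaller one in every sample.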
Overall, the effect size for in-person infidelity for all participants was between .28 and
.36 indicating a medium effect size. The effect size for men was between .15 and .32 when
only actor effects were included in the models and between .08 (Sample 3) and .42 (Sample
2) when partner effects were also included. The effect size for women was between .25
and .35 when only actor effects were included in the models and between .23 and .35 when
both actor and partner effects were included in the models. Overall, including partner effects
in the models only improved the model performance for men in Sample 2 (.32 compared
to .42). The prediction effect size for online infidelity was medium to large for all participants
(.36 to .38). The effect size for men was between .28 and .33 and for women between .18 and
.49. When both actor and partner effects were included in the models, the overall effect size
decreased from .33 to .24 for men and from .49 to .40 for women suggesting that partner
effects did not add any information and may even detract from the model performance.
Finally, in addition to predicting infidelity as a class, we also used the model to
predict intention toward infidelity in Sample 3. Overall, we could predict 42.0% of the
variance for all participants. The model was better at predicting men’s (58.0%) intention
toward engaging in infidelity compared to women’s (31.6%). Adding partner effects into the
model did not change the model performance for men (58.0% compared to 58.8%) but
improved it for women (31.6% to 40.5%).
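Assuming the percentages of variance reported above are coefficients of determination computed on held-out data (the text does not name the statistic), the calculation can be sketched as follows. The observed and predicted scores are invented for illustration.

```python
def r_squared(observed, predicted):
    """Proportion of outcome variance accounted for by the predictions."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total

# Hypothetical held-out intention scores on the -3 to 3 scale and model predictions.
observed = [-3.0, -2.0, -1.0, 0.0, 1.0]
predicted = [-2.5, -2.0, -1.5, 0.5, 0.5]
score = r_squared(observed, predicted)
print(f"{score:.1%} of the variance predicted")
```

A model that always predicts the sample mean scores 0 on this measure, so values such as 42.0% describe the improvement over that naive baseline.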
The Most Important Predictors of Infidelity. In addition to using the models to
predict infidelity, we also estimated each predictor variable’s contribution to the model
performance using Shapley values. We include the top-10 most important predictors for each
model in Figures 1-6. Due to space limitations, we only provide results for the models
without partner effects given that partner effects did not generally improve the models’
predictive ability. However, for interested readers, all results can be found on the OSF project
page (https://osf.io/ehzkm/?view_only=f9232534d9f84541a38a2fec228fc72d) including the
importances for the top-20 variables. The left side of each figure provides the mean effect of
each variable on the model outcome for each class. The right side of the figure provides the
estimates for each individual participant. Red indicates a higher value of the predictor
variable and blue indicates a lower value. For example, red is equal to 1 and blue is equal to 0
for binary variables. For the outcome variable, points on the right side of the figure show an
increase in the likelihood of engaging in infidelity whereas points to the left of the midpoint show a
decreased likelihood of engaging in infidelity. It is important to note that the three samples
differed somewhat in the predictor and outcome variables that were available and therefore
the results for the most important predictors vary somewhat across the samples. For the sake
of brevity, we have not discussed each predictor variable in the top-10 in detail as all of the
results can be seen in the figures. We have provided examples of interpretation and discussed
the most interesting and/or consistent predictors below.
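To make the idea behind Shapley values concrete, the following Python sketch computes exact Shapley values for a single hypothetical participant by averaging each predictor's marginal contribution over all coalitions of the other predictors. It is a simplified illustration under stated assumptions: the toy `predict` function, its coefficients, the predictor names, and the use of a single baseline row (rather than an average over a background dataset, as tree-based explainers typically use) are not taken from the study.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance relative to a single baseline row."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Predictors in the coalition take the instance's values; the rest stay at baseline.
        row = {name: (instance[name] if name in coalition else baseline[name])
               for name in names}
        return predict(row)

    phi = {}
    for name in names:
        others = [other for other in names if other != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi

def predict(row):
    # Hypothetical risk score with made-up coefficients (not estimates from the study).
    return 0.5 - 0.08 * row["satisfaction"] + 0.05 * row["desire"] + 0.01 * row["length"]

instance = {"satisfaction": 2, "desire": 4, "length": 10}  # hypothetical participant
baseline = {"satisfaction": 5, "desire": 3, "length": 5}   # hypothetical reference row
phi = shapley_values(predict, instance, baseline)
for name, contribution in phi.items():
    print(name, round(contribution, 3))
```

The contributions sum to the difference between the participant's prediction and the baseline prediction, which is what allows the per-participant points in the figures to be read as additive shifts in the predicted outcome.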
There were several variables that were included in the top-10 most predictive
variables across all three samples (Figures 1-3) and across most of the analyses (all, men,
women): relationship satisfaction, solitary desire, dyadic desire, relationship length, and some
sexual activities (had anal sex, oral sex, or vaginal sex). Overall, higher scores on
relationship satisfaction predicted a decreased likelihood of having engaged in infidelity and
lower satisfaction an increased likelihood of engaging in infidelity. However, some highly
satisfied individuals were also more likely to have engaged in infidelity suggesting a more
complex relationship between relationship satisfaction and infidelity. Higher solitary and
dyadic desire as well as longer relationship length predicted an increase in likelihood of
having engaged in infidelity across the samples. Higher sexual satisfaction and romantic love
in Samples 2 and 3 also predicted a decreased likelihood of having engaged in infidelity.
More liberal attitudes toward sexuality in Sample 1 also predicted a higher likelihood of
having engaged in infidelity.
Online infidelity was only measured in Samples 1 (Figure 4) and 2 (Figure 5). Across
the two samples, having never had anal sex with the current partner decreased the likelihood
of having engaged in online infidelity, and longer relationship length and higher sexual desire increased
the likelihood of having engaged in online infidelity. Relationship and sexual satisfaction were only
in the top-10 predictors in Sample 2. Romantic love was also predictive of online infidelity in
Sample 2. Use of hormonal contraceptives decreased the likelihood of men having engaged in
online infidelity in Sample 1 whereas it increased the likelihood of both men and women
having engaged in online infidelity in Sample 2.
Finally, we also measured intentions toward infidelity in Sample 3 (see Figure 6).
Higher relationship and sexual satisfaction as well as romantic love predicted a decrease in
intentions to engage in infidelity for both men and women. Higher dyadic and solitary
desire both predicted an increase in intentions to engage in infidelity. Desire discrepancy,
however, was not in the top-10 predictors. Participants who attended weekly religious
services also had higher intentions toward engaging in infidelity compared to participants
who did not attend religious services weekly. Individuals who had been in their relationship
for longer also had higher intentions toward engaging in infidelity compared to individuals
whose relationship was newer.
Moderator Variables. We also examined which interactions may have contributed to
the overall prediction. Due to space limitations, we have only provided figures for the
interactions for Sample 3 as examples because the sample had reports of both past infidelity
as well as intentions toward future infidelity (see Figure 7). Figures with all possible
interactions and simple interaction plots can be found on the OSF project page for each
analysis. In the OSF figures of interaction matrices, purple indicates no interaction and
yellow indicates the strongest interaction.
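One way to quantify how much a pair of predictors contributes jointly, beyond their separate effects, is a pairwise Shapley interaction index. The Python sketch below computes it for a toy model. Whether this matches the exact quantity plotted in the OSF interaction matrices is an assumption, and the `predict` function, its coefficients, and the predictor names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_interaction(predict, instance, baseline, first, second):
    """Pairwise Shapley interaction index for two predictors of one instance."""
    names = list(instance)
    n = len(names)
    others = [name for name in names if name not in (first, second)]

    def value(coalition):
        # Predictors in the coalition take the instance's values; the rest stay at baseline.
        row = {name: (instance[name] if name in coalition else baseline[name])
               for name in names}
        return predict(row)

    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(n - size - 2) / (2 * factorial(n - 1))
        for subset in combinations(others, size):
            s = set(subset)
            # Joint effect of the pair minus the two separate effects.
            joint = (value(s | {first, second}) - value(s | {first})
                     - value(s | {second}) + value(s))
            total += weight * joint
    return total

def predict(row):
    # Hypothetical score with one product term (made-up coefficients).
    return (0.2 - 0.05 * row["satisfaction"] + 0.1 * row["no_religion"]
            + 0.03 * row["satisfaction"] * row["no_religion"] + 0.01 * row["length"])

instance = {"satisfaction": 4, "no_religion": 1, "length": 10}
baseline = {"satisfaction": 2, "no_religion": 0, "length": 5}
with_product = shapley_interaction(predict, instance, baseline, "satisfaction", "no_religion")
without_product = shapley_interaction(predict, instance, baseline, "satisfaction", "length")
print(round(with_product, 3), round(without_product, 3))
```

In this toy model only the pair that shares a product term receives a non-zero value, which mirrors how an interaction matrix separates interacting pairs from purely additive ones.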
Interactions between intentions toward engaging in infidelity and being a college
graduate (all), attending weekly religious services (women), and having ever had anal sex (men) also
contributed to the prediction of having engaged in infidelity. For example, participants who
had graduated college and had more conservative sexual attitudes had the highest likelihood
of having engaged in infidelity whereas participants who had graduated college and had more
liberal attitudes were the least likely to have engaged in infidelity. Of participants who had
not graduated college, more conservative participants were less likely to have engaged in
infidelity compared to more liberal participants. Interactions between relationship
satisfaction and romantic love (all) and between relationship satisfaction and no religion
(women and men) also contributed to the
prediction of intentions toward infidelity. For men, those who were not religious and were
satisfied in their relationship were more likely to have intentions toward engaging in
infidelity compared to less satisfied participants. In contrast, men with religious affiliation
were less likely to have intentions to engage in infidelity if they were more satisfied in the
relationship whereas when they were less satisfied, they were more likely to have higher
intentions toward engaging in infidelity. The pattern of the interaction between relationship
satisfaction and no religion was opposite for women.

Discussion
Infidelity is relatively common with up to half the people in relationships having
engaged in infidelity (Mark et al., 2011; Mark & Haus, 2019; Thompson & O’Sullivan, 2016)
with potentially devastating consequences for relationships causing distress (Thompson &
O’Sullivan, 2016) and often divorce (Amato & Previti, 2004). Infidelity is likely to affect not
only the couple members but also their children, extended family, and friends. It is important
to identify potential risk factors for infidelity in order to target interventions that could
prevent infidelity from occurring in the first place. The purpose of the present study was to
identify potential factors associated with infidelity and to quantify and compare different
factors to better understand which variables are the most strongly associated with infidelity.
A large body of literature has attempted to identify which factors contribute to
infidelity but has suffered from methodological and conceptual inconsistencies making the
results difficult to interpret (Blow & Hartnett, 2005a). Furthermore, the studies have relied
exclusively on linear models, which are often completely uninterpretable due to problems
such as incorrect specification of the underlying causal structure, multicollinearity,
unattainable parametric assumptions, and inability to examine complex associations
(Breiman, 2001a; Lundberg et al., 2020; Yarkoni & Westfall, 2017). The present study is the
first of its kind to examine predictors of infidelity using interpretable predictive models:
random forests (Breiman, 2001b) with Shapley values (Lundberg et al., 2017, 2019). Based
on our findings, the short answer to the question posed in the title, “is infidelity predictable?”,
is somewhat. The effect sizes that take into account the true and false positives and negatives
of both classes ranged from small (.08) to large (.49) across analyses and samples,
suggesting that even though we were able to predict infidelity generally well above chance
level, there are also other factors that we had not accounted for.
While we examined the predictive accuracy of our models, our main aim was to
compare a range of different factors in their ability to predict infidelity. A recent systematic
review found that while demographics and individual characteristics are inconsistently
associated with infidelity, relationship variables tend to be more consistent across studies
(Haseli et al., 2019). We also found that relationship characteristics (relationship satisfaction,
relationship length, dyadic desire, sexual satisfaction, romantic love, and some sexual
activities within the relationship) were consistently in the top-10 most important predictors
across different samples. These findings suggest that addressing relationship issues early on
in the relationship may buffer against the likelihood of one partner going outside the
relationship to seek fulfilment. However, it is also important to note that while individuals
who were more satisfied in their relationship were generally less likely to engage in
infidelity, a subsample of highly satisfied individuals had engaged in infidelity in the past.
This may either reflect the idea that infidelity does also occur in happy relationships (Perel,
2017) or perhaps couples have worked through the infidelity and by the time they responded
to the survey were satisfied in their relationship (Atwater, 1982; Olson et al., 2002).
Furthermore, online infidelity has become more commonplace given the technological
advances in recent years (Albright, 2008). Therefore, we also examined predictors of online
infidelity in two of the three samples. Interestingly, one of the strongest predictors of a
decreased likelihood of having engaged in infidelity online was never having had anal sex in the
present relationship. This may reflect more restrictive attitudes toward sexuality overall.
Indeed, attitudes toward sexuality were measured in Sample 1 and ranked among the top-10
predictors of online infidelity. However, the relationship was more complex with the most
liberal sexual attitudes predicting an increase in likelihood of having engaged in infidelity
whereas more moderate and conservative attitudes predicted a decrease. These results are in
line with other studies that have found that more permissive sexual attitudes have been
associated with an increased likelihood of having engaged in infidelity (Fincham & May,
2017; Haseli et al., 2019; Martins et al., 2016). Higher relationship length and sexual desire
also increased the likelihood of having engaged in online infidelity. However, sexual and
relationship satisfaction were only among the top predictors in one of the two samples.
Because the studies were all cross-sectional in nature and some characteristics (e.g.,
relationship quality) may have changed since having engaged in infidelity, we also examined
future intentions toward infidelity in Sample 3. The results showed that higher relationship
and sexual satisfaction as well as romantic love predicted a decrease in intentions to engage
in infidelity whereas previous infidelity, dyadic and solitary desire, as well as longer
relationship length predicted increased intentions to engage in infidelity. Attending religious
services weekly also increased intentions to engage in infidelity for both men and women
whereas having no religion was associated with less intention toward infidelity in men.
Although not consistent, religion generally predicted a decreased likelihood of having
engaged in infidelity which is in line with previous research (Burdette et al., 2007; Fincham
& May, 2017; Mattingly et al., 2010). Therefore, it is possible that individuals who are more
religious fantasize about engaging in infidelity but are also less likely to actually act on those
fantasies. This is a potentially interesting proposition and warrants further investigation.
The results of the present study corroborate many of the existing studies and,
akin to a recent systematic review (Haseli et al., 2019), show that the most robust predictors of
infidelity lie within the relationship: individuals who are more satisfied and in love in their
relationship are less likely to have engaged in infidelity and have lower intentions to engage in
infidelity in the future. However, there are also a number of factors that have previously been
associated with infidelity that were not among the most important predictors in the present
study: education (Atkins et al., 2001; Martins et al., 2016; Treas & Giesen, 2000),
relationship status (Amato & Previti, 2004; Fincham & May, 2017), and attachment
(Fincham & May, 2017; Haseli et al., 2019; McDaniel et al., 2017). We only examined
attachment in Sample 1 and higher attachment avoidance did predict an increased likelihood
of having engaged in infidelity in the total sample but was not among the top-10 predictors
for men or women. Attachment anxiety was not predictive of past infidelity. Furthermore,
many previous studies suggest that men are more likely to engage in sexual infidelity than
women (Labrecque & Whisman, 2017; Petersen & Hyde, 2010). In the present study, being a
man was only an important predictor of past online infidelity in one sample supporting
studies that have found that the gender gap in infidelity is decreasing (Allen et al., 2006;
Fincham & May, 2017; Mark et al., 2011; Treas & Giesen, 2000).
The present study adds to our understanding of the most important predictors for
infidelity across three samples. We used a powerful interpretable machine learning technique
that allowed us to produce reliable estimates of the effect sizes of each variable both for the
mean effect as well as the spread of the individual effects (Lundberg et al., 2017, 2019).
Using this method, we were also able to compare a large number of predictors simultaneously
and estimate any non-linear associations and complex interactions. We also examined both
in-person and online infidelity as well as intention toward future infidelity.
However, the study also had a number of limitations that should be considered. First,
we used single-item measures of in-person and online infidelity and only used a validated
measure for intention toward future infidelity. We were thus unable to account for specific
infidelity behaviors and did not examine emotional infidelity. Future research is needed to
examine a wider range of infidelity behaviors to better understand whether the same
predictors generalize across multiple forms of infidelity or whether these are predicted by
different variables. The results from the present study suggest that these may be somewhat
different given that the most important predictors of in-person and online infidelity also
varied. Second, while we examined infidelity across three large samples, two of which
included data from both members of the couple, the studies were all cross-sectional and it is
not clear how recently the infidelity occurred. Therefore, some of the factors may have
changed from when the infidelity occurred to when the participants completed the survey.
This is a difficulty across most other studies on infidelity, but future research should examine
infidelity over time or conduct surveys with individuals who have just engaged in infidelity.
Third, over 30% of the participants in Sample 1 reported past infidelity. However, the
number of participants who had engaged in infidelity in the dyadic samples was much lower.
This made it more difficult for the algorithm to accurately predict infidelity which is reflected
in lower precision and recall for the infidelity class compared to no infidelity. We used
balanced random forests in order to mitigate this issue but we still had less data available from
people with past infidelity. Finally, while random forests are a powerful tool that will take
advantage of any correlations and interactions in the data, no matter how non-linear, they cannot
be used to estimate causality. However, in the absence of a means to reliably estimate
causality when examining factors relating to infidelity (after all we cannot create experiments
in which we make people engage in infidelity), we believe that using a predictive model is
perhaps the best option.
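The class imbalance described above is commonly handled in balanced random forests by growing each tree on a bootstrap sample that contains equally many cases from each class. The Python sketch below illustrates only that resampling step. The function name and the class counts are hypothetical, and the exact balancing scheme used in the present analyses is not specified in this section.

```python
import random
from collections import Counter

def balanced_bootstrap(labels, rng):
    """Draw row indices with replacement so every class is equally represented."""
    rows_by_class = {}
    for index, label in enumerate(labels):
        rows_by_class.setdefault(label, []).append(index)
    minority_size = min(len(rows) for rows in rows_by_class.values())
    sample = []
    for rows in rows_by_class.values():
        # Each class contributes as many draws as the smallest class has rows.
        sample.extend(rng.choices(rows, k=minority_size))
    return sample

# Hypothetical outcome vector: 83 "no infidelity" (0) and 17 "infidelity" (1) cases.
labels = [0] * 83 + [1] * 17
sample = balanced_bootstrap(labels, random.Random(0))
class_counts = Counter(labels[i] for i in sample)
print(class_counts)
```

Balancing each tree's training sample prevents the forest from defaulting to the majority class, but it does not create new information about the minority class, which is consistent with the lower precision and recall noted for the infidelity class.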
In conclusion, the present study provides the most robust and reliable evidence of
factors associated with past in-person and online infidelity as well as intentions toward future
infidelity. The results showed that relationship variables were the most robust predictors of
both past and future infidelity whereas demographics and individual differences variables
were not consistently associated with infidelity. These results suggest that intervening early
on in relationships when difficulties first arise may be the best way to prevent future
infidelity. Furthermore, because sexual desire was one of the most robust predictors of
infidelity, discussing sexual needs and desires and finding ways to meet those needs in
relationships is also likely to decrease the risk of infidelity.

Methods
Sample 1
Participants and Procedure
The data were collected as part of a larger cross-sectional study. Participants were
recruited through mTurk, were asked to complete an online survey, and were paid 30 cents
for the task. Recruitment was also conducted through social networking sites (e.g., Facebook,
Twitter), email listservs, and targeted recruitment for sexual minority participants on online
forums. Participants recruited through these channels were entered into a draw to win one of
four $40 Amazon gift cards. Participants were eligible for the study if they were over 18
years of age and had experience with at least one romantic relationship. Ethical approval was
obtained from the [blinded for peer-review] institutional review board and all participants
received a written informed consent form at the start of the baseline survey. Details of the
procedure can be found in [blinded for peer review].
A total of 1,097 participants consented to participate. Participants who had not
completed the study, had a large amount of missing data, or were missing the outcome
variable were removed from the analyses. Therefore, the final sample consisted of 891
participants; 557 (62.5%) cis-gender women, 279 (31.3%) cis-gender men, and 25 (2.8%)
genderqueer. Most of the participants were straight (n = 483; 53.9%), 189 (21.2%) identified
as bisexual, 101 (11.3%) as gay, and 60 (6.7%) as lesbian. The majority of the participants were White
(88.4%), were married or cohabiting (62.7%), had at least some level of college (95.8%), and
did not identify with any religion (54.5%); 24.5% had at least one child. The average age of
the participants was 32.7 years (SD = 9.63) and the average relationship length for those who
were in a relationship was 6.21 years (SD = 7.12).
Measures
We included all measures that were collected in the study as predictor variables,
which gave a total of 95 variables after recoding all categorical variables into dummy
variables. These included demographic questions on age, race/ethnicity, gender, sexual
orientation, relationship status, children, and education. Participants also completed questions
around their contraceptive use, sexual behaviors, whether they wanted sex or communication
more or less than they were currently engaging in, and mental and physical health. The
outcome, infidelity, was measured using a single-item question for in-person infidelity (“I had
sex (e.g., vaginal sex, anal sex, oral sex) with someone other than my current partner”) and
online infidelity (“I interacted sexually with someone other than my current partner on the
Internet (had chat room sex, web cam sex, etc.)”). Both questions were dichotomized with
yes = 1 and no = 0. The following constructs were assessed using previously validated
questionnaires:
Sexual desire was assessed using the Sexual Desire Inventory (SDI; Spector et al.,
1996), which assesses an individual’s interest in sexual activity over the past month, with
higher scores indicating higher sexual desire. The SDI was used both as a single scale (13
items) and divided into dyadic desire (nine items) and solitary desire (four items).
Sexual desire was also assessed using the Halbert Index of Sexual Desire (HISD; Yousefi et
al., 2014), which measures sexual desire using 25 items, with higher scores being indicative of
higher sexual desire. Sexual satisfaction was assessed using the General Measure of Sexual
Satisfaction Scale (GMSEX; Lawrance & Byers, 1992). The GMSEX is a 5-item measure
used to assess satisfaction with the sexual relationship. Relationship satisfaction was assessed
using the General Measure of Relationship Satisfaction (GMREL; Lawrance & Byers, 1992).
Both GMREL and GMSEX are scored on a 7-point semantic differential scale and higher
scores are indicative of greater satisfaction. Dispositional mindfulness was measured using
the Five Facet Mindfulness Questionnaire – short form (FFMQ-SF; Bohlmeijer et al., 2011).
The scale comprises a total of 24 items that are divided into five subscales: being
non-reactive, observant, acting with awareness, describing feelings, and non-judgmental attitude.
The items are scored on a 5-point Likert scale with higher scores indicating participants’
agreement with the statement. The Attitudes Toward Sexuality Scale (ATSS; Fisher & Hall, 1988)
was used to assess participants’ attitudes toward sexuality. The scale comprises 13 items
that are measured on a 5-point Likert scale, with higher scores indicating more liberal
attitudes and lower scores more conservative attitudes. The Perception of Love and Sex Scale (PLSS;
Hendrick & Hendrick, 2002) measures one’s perception of love and sex and comprises four
subscales: love is most important (six items), sex demonstrates love (four items), love comes
before sex (four items), and sex is declining (three items). The items are measured on a 5-
point Likert scale with higher scores indicating higher agreement. Attachment style was
assessed using the Experience in Close Relationships Scale – Short form (ECR-S; Wei et al.,
2007). The ECR-S consists of two 6-item Likert scales: one for anxiety and one for
avoidance. Higher scores indicate higher levels of insecure attachment.
Sample 2
We used baseline data from a longitudinal study of couples. The couples were
recruited through various listservs, websites, and social media (e.g., Facebook, Twitter).
Participants who were 18 years of age or older, in a mixed sex relationship for a minimum of
three years, currently living with that partner, with no children under the age of one, and not
pregnant (or with a pregnant partner) at the time, met the inclusion criteria and were directed
to provide their partner’s email address. Partners were then emailed the same information that
the initial potential participant was provided and asked the same eligibility criteria questions.
If the partner also met eligibility criteria and agreed to participate, they were both sent
individual unique links to the baseline survey. Participants who completed the baseline were
provided with a $10 gift card ($20/couple). Ethical approval was obtained from the [blinded
for peer-review] institutional review board and all participants received a written informed
consent form at the start of the baseline survey. Details of the procedure can be found in
[blinded for peer review].
The sample consisted of 202 mixed-sex couples (404 individuals). The majority of
participants (89%) were from the United States, with a minority of the participants from
Canada (11%). The mean age of the sample was 32.5 years (SD = 8.90), and the mean relationship
length of the couples was 9.19 years (SD = 6.85). The majority of the sample identified as heterosexual
(93%), with a minority identifying as bisexual (5%), questioning or uncertain (1%), and other
(1%). The majority of participants were White (89%) and this was a fairly educated sample,
with 96% indicating they had attended at least some college.
Measures
The study used many of the same measures as Sample 1 and had a total of 66
variables.¹ The following questionnaires were not available in the sample: attachment styles
(ECR-S), attitudes toward sexuality (ATSS), Halbert Index of Sexual Desire (HISD), trait
mindfulness (FFMQ-SF), and perception of love and sex (PLSS). The study had an additional
scale measuring romantic love, the Romantic Love Scale (Rubin, 1970). The scale consists of
13 items that are meant to measure affiliative and dependent need, a predisposition to help,
and orientation of exclusiveness and absorption. The scale is scored on a 9-point scale with
higher scores indicating higher romantic love. For dyadic analyses, both dyad members’
scores were included as predictors. The outcome measures were the same as in Sample 1.
¹ The lower number of variables in the dataset is mainly due to the sample consisting of dyadic mixed-sex couples; many of the variables therefore had fewer categories and thus fewer dummy-coded variables (e.g., relationship status, sexual orientation).
Sample 3
The final sample consists of couples in which at least one member of the dyad
identified as bisexual. Participants were recruited for the current study utilizing targeted
recruitment in bisexual spaces primarily online (e.g., bisexual-focused websites, Facebook,
Twitter, and Reddit). The recruitment messaging explicitly stated that the study aimed to
recruit bisexual individuals and their partners in mixed-sex relationships. A participant met
eligibility criteria if they were over the age of 18, identified as bisexual, were in a romantic
mixed-sex relationship at the time of the survey, and were willing to provide the email
address of their partner to also participate. The respondent first completed the online survey
in which they provided an email address for their partner who was then contacted to complete
the survey. Ethical approval was obtained from the [blinded for peer-review] institutional
review board and all participants received a written informed consent form at the start of the
baseline survey. Details of the procedure can be found in [blinded for peer review].
A total of 552 participants completed the baseline survey. Of those, there were 354
individuals who contributed to a dyad (177 couples) and 198 individuals whose partner did
not complete the survey. There were a total of 203 (37%) men, 337 (61%) women, and 12
(2%) transgender/non-binary; 153 (28%) were straight and 380 (69%) were bisexual.
Participants were 29 years old on average (SD = 6.95; range 18–50). The majority of the
participants were White (n = 447; 81%), married (n = 299; 54%), and had completed at least
some college (n = 480; 87%). Many participants did not identify with a specific religious
identity (n = 309; 56%) or were Christian (n = 170; 31%). On average, participants had been
in their current relationship for 6.10 years (SD = 5.36); 400 (72%) of those relationships were
monogamous and 152 (28%) were consensually non-monogamous.
Measures
The sample included all the same measures as Sample 2 along with some additional
measures: self-esteem (Rosenberg, 1965), measured with a 10-item, 5-point Likert
scale with higher scores indicating higher self-esteem, and the Satisfaction With Life Scale (Diener
et al., 1985), a 5-item, 7-point Likert scale with higher scores indicating better life
satisfaction. The outcome measures used were also different because a quarter of the sample
was consensually non-monogamous and therefore likely to be regularly engaging in sexual
activity with someone other than their primary partner. For this sample, we measured
infidelity using the question “Have you done something sexual with another person that could
hurt the relationship?” We also used a measure of intentions toward infidelity. The Intentions
Toward Infidelity Scale (ITIS; Jones et al., 2010) assesses the likelihood of someone being
unfaithful to their partner. The scale consists of seven items and the response options range
from -3 (Not at all likely) to +3 (Extremely likely).
Data Analysis
Data Preparation. All categorical variables were dummy coded (0 and 1) with each
option included in the models. Any variables that were essentially the same as the outcome
variable were removed from the analyses. Any missing values were imputed using
random forest multiple imputation. Less than 0.1% of the data were missing, and any
missing data points were imputed using the scikit-learn IterativeImputer (Pedregosa
et al., 2011) with a Bayesian ridge estimator.
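The dummy coding described above, in which every option of a categorical variable receives its own 0/1 indicator and no reference category is dropped, can be sketched as follows. This is an illustrative sketch only: the function `dummy_code`, the variable names, and the toy records are hypothetical and are not taken from the study's code.

```python
def dummy_code(records, column):
    """Replace one categorical column with a 0/1 indicator per observed option."""
    # Every observed option becomes its own indicator; no reference level is dropped.
    categories = sorted({row[column] for row in records})
    coded = []
    for row in records:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        coded.append(new_row)
    return coded


# Hypothetical toy records, for illustration only.
participants = [
    {"age": 31, "orientation": "straight"},
    {"age": 27, "orientation": "bisexual"},
    {"age": 40, "orientation": "gay"},
]
coded = dummy_code(participants, "orientation")
```

Each participant ends up with exactly one indicator set to 1 per categorical variable, which is why the number of predictors grows with the number of response options.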
Analyses. All data were analyzed at the individual level with the full sample, with
men only, and with women only. Additionally, the data from dyads in which both members
of the couple had responded to the questionnaire were also analyzed separately for men and
women, including both actor and partner effects in the model. The data were analyzed using
Python 3.7 and the code can be found here: [blinded for peer-review]. Each dataset was
analyzed using either a random forest regressor (Breiman, 2001b), or a balanced random
forest classifier (Breiman, 2001b; Chen et al., 2004) for continuous and categorical outcomes,
respectively. A random forest is an ensemble of decision trees, each of which is trained on a
bootstrapped sub-sample of the data in order to reduce overfitting. The forest can model highly non-linear
relationships in the data, and therefore represents a significantly more flexible model than a
logistic regression. In cases where one class occurs much more often than another, many
classifiers may learn to predict the majority class well, but not learn important associations
necessary to predict the minority class. The balanced random forest variant, for categorical
outcomes, is designed to provide better results in scenarios where there may be a class
imbalance in the dataset. In the current study, there is imbalance between participants who
had engaged in infidelity and those who had not. The balanced random forest is able to
mitigate the problems associated with unequal class ‘support’ by undersampling the majority
class in the bootstrapping process, thereby balancing the classes during training.
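The undersampling step can be illustrated with a minimal sketch of how the rows for a single tree might be drawn. It assumes, following the general idea in Chen et al. (2004), that each class is sampled with replacement down to the size of the minority class; the function `balanced_bootstrap` is hypothetical and is not the imbalanced-learn implementation, whose defaults may differ.

```python
import random


def balanced_bootstrap(labels, rng):
    """Draw row indices for one tree with an equal number of rows per class."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    minority_size = min(len(indices) for indices in by_class.values())
    sample = []
    for indices in by_class.values():
        # Sampling with replacement; the majority class is cut down to the minority size.
        sample.extend(rng.choices(indices, k=minority_size))
    return sample


# Hypothetical 90/10 imbalance: 90 participants without and 10 with past infidelity.
labels = [0] * 90 + [1] * 10
sample = balanced_bootstrap(labels, random.Random(0))
counts = {c: sum(1 for i in sample if labels[i] == c) for c in (0, 1)}
```

Because a fresh balanced sample is drawn for every tree, different majority-class rows contribute to different trees, so the forest as a whole still sees most of the majority-class data.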
In general, random forest models are sensitive to hyperparameter settings (such as the
number of estimators, or the maximum depth of the decision tree). However, tuning
hyperparameters requires a separate validation data split which reduces the effective sample
size available for training and testing. Therefore, we used the default “imbalanced-learn”
balanced random forest classifier (IMBLEARN cite) and the default “scikit-learn” random
forest regressor (Pedregosa et al., 2011) with k-fold cross-validation. The out-of-bag error is
a built-in metric frequently used to estimate the performance of random forests (Joel et al.,
2017, 2020), but in some circumstances this metric has been shown to overestimate the true
error (Janitza & Hornung, 2018; Mitchell, 2011). By using a k-fold cross-validation
approach, instead of the out-of-bag error, we were able to test the model over the entire
dataset, and to acquire estimates for the standard error (see below). It is essential that the
trained model is tested on a separate partition of the dataset, even for less complex linear
models, when any data-driven decisions are made (Heyman & Smith Slep, 2001; Yarkoni &
Westfall, 2017).
A ten-fold cross-validation scheme was used to train and test the model. This means
the total dataset is randomly split into ten equally sized folds. The model is trained on nine
out of ten folds, tested on the tenth, and the test fold performance is recorded. This is repeated
until all ten folds have been used as a test set. The average performance, as well as the
standard error across the ten folds, provide an estimate of model performance on unseen data.
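The ten-fold scheme just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: `cross_validate` and `majority_baseline` are hypothetical names, and the baseline simply predicts the most common training label as a stand-in for the random forest.

```python
import math
import random


def k_fold_indices(n_rows, k, rng):
    """Shuffle the row indices and deal them into k folds."""
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_rows, k, fit_and_score, rng):
    """Train on k-1 folds, test on the held-out fold, and repeat for every fold."""
    folds = k_fold_indices(n_rows, k, rng)
    scores = []
    for test_fold in folds:
        held_out = set(test_fold)
        train = [i for i in range(n_rows) if i not in held_out]
        scores.append(fit_and_score(train, test_fold))
    mean = sum(scores) / k
    variance = sum((s - mean) ** 2 for s in scores) / (k - 1)
    standard_error = math.sqrt(variance / k)
    return mean, standard_error, folds


# Hypothetical outcome vector and stand-in model, for illustration only.
outcomes = [0] * 70 + [1] * 30


def majority_baseline(train, test):
    """Predict the most common training label and return test-fold accuracy."""
    positives = sum(outcomes[i] for i in train)
    prediction = 1 if positives * 2 > len(train) else 0
    return sum(1 for i in test if outcomes[i] == prediction) / len(test)


mean_score, se_score, folds = cross_validate(
    len(outcomes), 10, majority_baseline, random.Random(0)
)
```

Every row appears in exactly one test fold, so the averaged score uses the entire dataset for testing while never scoring a row with a model that was trained on it.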
For the continuous outcome (intentions toward infidelity), the metrics for test data model
performance are the mean squared error (the average squared difference between
the prediction and the observed value), the R², and the variance explained. For the binary
outcomes (past online and in-person infidelity) the metrics for test data model performance
are the precision, recall, F1-score, and Matthews Correlation Coefficient (MCC). These
metrics provide a more complete picture than an accuracy score, particularly for imbalanced
data. For instance, if a dataset contained a 90/10 imbalance, an accuracy of 90% could be
achieved simply by predicting the majority class for all new datapoints, and is therefore
meaningless. In contrast, precision is the ratio of true positives to the sum of true positives
and false positives; recall is the ratio of true positives to the sum of true positives and false
negatives; and the F1-score is the harmonic mean of precision and recall. Arguably the best
summary statistic for imbalanced classification problems is the MCC
(Boughorbel et al., 2017; Chicco & Jurman, 2020; Matthews, 1975). The MCC provides a
score bounded within [-1, 1] and is directly analogous to Pearson’s correlation coefficient. If
MCC=0 then the classifier is no better than random chance, if MCC=1 then the classifier
achieves perfect prediction, and if MCC=-1 the classifier perfectly predicts the opposite of
the correct class.
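The metrics defined above can be computed directly from the counts of true and false positives and negatives. The sketch below is illustrative only: the function names and toy data are hypothetical, and the choice to return 0 whenever a denominator is zero is an assumption made here for convenience. The second function illustrates the regression metrics, where the variance explained is taken to be one minus the ratio of residual variance to outcome variance.

```python
import math


def safe_divide(numerator, denominator):
    """Return 0.0 when the denominator is zero (a convention assumed here)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and MCC for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = safe_divide(tp, tp + fp)
    recall = safe_divide(tp, tp + fn)
    f1 = safe_divide(2 * precision * recall, precision + recall)
    mcc = safe_divide(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def regression_metrics(y_true, y_pred):
    """Mean squared error, R squared, and variance explained."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_residual = sum(residuals) / n
    mse = sum(r ** 2 for r in residuals) / n
    outcome_variance = sum((t - mean_true) ** 2 for t in y_true) / n
    residual_variance = sum((r - mean_residual) ** 2 for r in residuals) / n
    return {"mse": mse,
            "r2": 1 - mse / outcome_variance,
            "variance_explained": 1 - residual_variance / outcome_variance}


# Hypothetical 90/10 dataset: always predicting the majority class.
y_true = [0] * 90 + [1] * 10
always_majority = classification_metrics(y_true, [0] * 100)

# Hypothetical classifier: 7 of 10 positives found, 20 false alarms.
y_pred = [1] * 20 + [0] * 70 + [1] * 7 + [0] * 3
informative = classification_metrics(y_true, y_pred)

# A prediction that is off by a constant: R squared is penalized, variance explained is not.
shifted = regression_metrics([1, 2, 3, 4], [2, 3, 4, 5])
```

In the first case accuracy is 0.9 while recall and the MCC are both 0, which is the situation the text warns about: a high accuracy that says nothing about the minority class.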

The last model to be trained as part of the k-fold cross-validation process is saved,
and explained using the “SHapley Additive exPlanations” package (SHAP) (Lundberg et al.,
2017, 2019, 2020). The SHAP package is a unified framework for undertaking model
explainability, and derives from the seminal game theoretic work of Lloyd Shapley (Shapley,
1952). The framework conceives of predictors as collaborating agents seeking to maximize a
common goal (i.e., the regressor performance). The approach involves systematically
evaluating changes in model performance in response to including or restricting the influence
from different combinations of predictors. Traditional approaches (e.g., using the coefficients
from a linear model, or importances from a random forest) can be unreliable and ‘inconsistent’,
whereas the Shapley approach has been shown to provide explanations with certain theoretical
guarantees (Lundberg et al., 2020). The SHAP TreeExplainer function provides estimates
of the per-datapoint, per-predictor impact on model output, as well as the average predictor
impacts. The function also provides estimates of the impact of per-datapoint pairwise
interactions on model output. For the analysis, the default settings of the SHAP package
TreeExplainer were used, and the entire dataset was fed to the model for explanation. The
combination of the powerful function approximation capabilities of random forests with the
consistent and meaningful estimations of per-datapoint, per-predictor impact on model output
enables a reliable and informative exploration of predictor importance, as well as a means to
identify key predictor interactions.
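The game-theoretic idea behind Shapley values can be illustrated by brute force on a toy model. The sketch below is not the SHAP TreeExplainer algorithm, which uses a much faster tree-specific method; it assumes that an "absent" predictor is simply replaced by a baseline value, and the names `shapley_values` and `toy_model` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one datapoint by enumerating all coalitions."""
    n = len(instance)

    def value(coalition):
        # Predictors outside the coalition are set to their baseline value.
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for feature in range(n):
        others = [i for i in range(n) if i != feature]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                # Marginal contribution of the feature when added to this coalition.
                total += weight * (value(coalition | {feature}) - value(coalition))
        contributions.append(total)
    return contributions


def toy_model(x):
    """Hypothetical model: one main effect plus one pairwise interaction."""
    return 2 * x[0] + x[1] * x[2]


phi = shapley_values(toy_model, instance=[1, 1, 1], baseline=[0, 0, 0])
```

In this toy case the main effect is credited entirely to the first predictor, the interaction is shared equally between the other two, and the contributions add up to the difference between the prediction for the datapoint and the prediction at the baseline.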

References
Albright, J. M. (2008). Sex in America online: An exploration of sex, marital status, and
sexual identity in internet sex seeking and its impacts. Journal of Sex Research, 45(2),
175–186. https://doi.org/10.1080/00224490801987481
Allen, E. S., Atkins, D. C., Baucom, D. H., Snyder, D. K., Gordon, K. C., & Glass, S. P.
(2006). Intrapersonal, interpersonal, and contextual factors in engaging in and
responding to extramarital involvement. Clinical Psychology: Science and Practice,
12(2), 101–130. https://doi.org/10.1093/clipsy.bpi014
Amato, P. R., & Previti, D. (2004). People’s reasons for divorcing: Gender, social class, the
life course, and adjustment. Journal of Family Issues, 24(5), 602–626.
https://doi.org/10.1177/0192513x03254507
Atkins, D. C., Baucom, D. H., & Jacobson, N. S. (2001). Understanding infidelity: Correlates
in a national random sample. Journal of Family Psychology, 15(4), 735–749.
https://doi.org/10.1037//0893-3200.15.4.735
Atwater, L. (1982). The Extramarital Connection: Sex, Intimacy, and Identity. Irvington.
Betzig, L. (1989). Causes of conjugal dissolution: A cross-cultural study. Current
Anthropology, 30(5), 654–676.
Blow, A. J., & Hartnett, K. (2005a). Infidelity in committed relationships I: A methodological
review. Journal of Marital and Family Therapy, 31(2), 183–216.
https://doi.org/10.1111/j.1752-0606.2005.tb01555.x
Blow, A. J., & Hartnett, K. (2005b). Infidelity in committed relationships II: A substantive
review. Journal of Marital and Family Therapy, 31(2), 217–233.
https://doi.org/10.1111/j.1752-0606.2005.tb01556.x
Bohlmeijer, E., Klooster, P. M., Fledderus, M., Veehof, M., & Baer, R. (2011). Psychometric
properties of the five facet mindfulness questionnaire in depressed adults and
development of a short form. Assessment, 18(3), 308–320.
https://doi.org/10.1177/1073191111408231
Boughorbel, S., Jarray, F., & El-Anbari, M. (2017). Optimal classifier for imbalanced data
using Matthews Correlation Coefficient metric. PLOS ONE, 12(6), e0177678.
https://doi.org/10.1371/journal.pone.0177678
Breiman, L. (2001a). Statistical modeling: The two cultures. Statistical Science, 16(3).
Breiman, L. (2001b). Random forests. Machine Learning, 45(1), 5–32.
https://doi.org/10.1023/A:1010933404324
Burdette, A. M., Ellison, C. G., Sherkat, D. E., & Gore, K. A. (2007). Are there religious
variations in marital infidelity? Journal of Family Issues, 28(12), 1553–1581.
https://doi.org/10.1177/0192513X07304269
Chen, C., Liaw, A., & Breiman, L. (2004). Using random forest to learn imbalanced data.
Chicco, D., & Jurman, G. (2020). The advantages of the Matthews correlation coefficient
(MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics,
21(1), 6. https://doi.org/10.1186/s12864-019-6413-7
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
https://doi.org/10.1037/0033-2909.112.1.155
Diener, E., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The Satisfaction With Life
Scale. Journal of Personality Assessment, 49(1), 71–75.
https://doi.org/10.1207/s15327752jpa4901_13
Fincham, F. D., & May, R. W. (2017). Infidelity in romantic relationships. Current
Opinion in Psychology, 13, 70–74.
https://doi.org/10.1016/j.copsyc.2016.03.008
Fisher, T. D., & Hall, R. G. (1988). A scale for the comparison of the sexual attitudes of
adolescents and their parents. The Journal of Sex Research, 24(1), 90–100.
https://doi.org/10.1080/00224498809551400
Glass, S. P., & Wright, T. L. (1985). Sex differences in type of extramarital involvement and
marital dissatisfaction. Sex Roles, 12(9–10), 1101–1120.
https://doi.org/10.1007/BF00288108
Großmann, I., Hottung, A., & Krohn-Grimberghe, A. (2019). Machine learning meets partner
matching: Predicting the future relationship quality based on personality traits. PLoS
ONE, 14(3), 1–16. https://doi.org/10.1371/journal.pone.0213569
Haseli, A., Shariati, M., Nazari, A. M., Keramat, A., & Emamian, M. H. (2019). Infidelity
and its associated factors: A systematic review. Journal of Sexual Medicine, 16(8),
1155–1169. https://doi.org/10.1016/j.jsxm.2019.04.011
Hendrick, S. S., & Hendrick, C. (2002). Linking romantic love with sex: Development of the
perceptions of love and sex scale. Journal of Social and Personal Relationships, 19(3),
361–378. https://doi.org/10.1177/0265407502193004
Heyman, R. E., & Smith Slep, A. M. (2001). The hazards of predicting divorce without
crossvalidation. Journal of Marriage and Family, 63(2), 473–479.
https://doi.org/10.1111/j.1741-3737.2001.00473.x
Janitza, S., & Hornung, R. (2018). On the overestimation of random forest’s out-of-bag error.
PLoS ONE, 13(8), e0201904. https://doi.org/10.1371/journal.pone.0201904
Joel, S., Eastwick, P. W., Allison, C. J., Arriaga, X. B., Baker, Z. G., Bar-Kalifa, E.,
Bergeron, S., Birnbaum, G., Brock, R. L., Brumbaugh, C. C., Carmichael, C. L., Chen,
S., Clarke, J., Cobb, R. J., Coolsen, M. K., Davis, J., Jong, D. C. de, Debrot, A., DeHaas,
E. C., … Wolf, S. (2020). Machine learning uncovers the most robust self-report
predictors of relationship quality across 43 longitudinal couples studies. Manuscript
submitted for publication. https://doi.org/10.1073/pnas.1917036117
Joel, S., Eastwick, P. W., & Finkel, E. J. (2017). Is romantic desire predictable? Machine
learning applied to initial romantic attraction. Psychological Science, 28, 1478–1489.
https://doi.org/10.1177/0956797617714580
Jones, D. N., Olderbak, S. G., & Figueredo, A. J. (2010). Intentions Towards Infidelity Scale.
https://doi.org/10.4324/9781315881089.CH87
Labrecque, L. T., & Whisman, M. A. (2017). Attitudes toward and prevalence of extramarital
sex and descriptions of extramarital partners in the 21st century. Journal of Family
Psychology, 31(7), 952–957. https://doi.org/10.1037/fam0000280
Lawrance, K., & Byers, E. S. (1992). Development of the interpersonal exchange model of
sexual satisfaction in long-term relationships. The Canadian Journal of Human
Sexuality, 1, 123–128. https://doi.org/10.1111/j.1475-6811.1995.tb00092.x
Liu, C. (2000). A theory of marital sexual life. Journal of Marriage and Family, 62(2), 363–
374. https://doi.org/10.1111/j.1741-3737.2000.00363.x
Lundberg, S. M., Allen, P. G., & Lee, S.-I. (2017). A unified approach to interpreting model
predictions. Neural Information Processing Systems. https://github.com/slundberg/shap
Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R.,
Himmelfarb, J., Bansal, N., & Lee, S. I. (2020). From local explanations to global
understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56–67.
https://doi.org/10.1038/s42256-019-0138-9
Lundberg, S. M., Erion, G. G., & Lee, S.-I. (2019). Consistent individualized feature
attribution for tree ensembles. http://github.com/slundberg/shap
Mark, K. P., & Haus, K. R. (2019). Extradyadic relations. In A. C. Michalos (Ed.),
Encyclopedia of Quality of Life and Well-Being Research. (pp. 2102–2105). Springer.
Mark, K. P., Janssen, E., & Milhausen, R. R. (2011). Infidelity in heterosexual couples:
Demographic, interpersonal, and personality-related predictors of extradyadic sex.
Archives of Sexual Behavior, 40(5), 971–982. https://doi.org/10.1007/s10508-011-9771-
Martins, A., Pereira, M., Andrade, R., Dattilio, F. M., Narciso, I., & Canavarro, M. C. (2016).
Infidelity in dating relationships: Gender-specific correlates of face-to-face and online
extradyadic involvement. Archives of Sexual Behavior, 45(1), 193–205.
https://doi.org/10.1007/s10508-015-0576-3
Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of
T4 phage lysozyme. BBA - Protein Structure, 405(2), 442–451.
https://doi.org/10.1016/0005-2795(75)90109-9
Mattingly, B. A., Wilson, K., Clark, E. M., Bequette, A. W., & Weidler, D. J. (2010). Foggy
faithfulness: Relationship quality, religiosity, and the Perceptions of Dating Infidelity
Scale in an adult sample. Journal of Family Issues, 31(11), 1465–1480.
https://doi.org/10.1177/0192513X10362348
McDaniel, B. T., Drouin, M., & Cravens, J. D. (2017). Do you have anything to hide?
Infidelity-related behaviors on social media sites and marital satisfaction. Computers in
Human Behavior, 66, 88–95. https://doi.org/10.1016/j.chb.2016.09.031
Mitchell, M. W. (2011). Bias of the random forest out-of-bag (OOB) error for certain input
parameters. Open Journal of Statistics, 01(03), 205–211.
https://doi.org/10.4236/ojs.2011.13024
Olson, M. M., Russell, C. S., Higgins-Kessler, M., & Miller, R. B. (2002). Emotional
processes following disclosure of an extramarital affair. Journal of Marital and Family
Therapy, 28(4), 423–434. https://doi.org/10.1111/j.1752-0606.2002.tb00367.x
Owen, J., Rhoades, G. K., & Stanley, S. M. (2013). Sliding versus deciding in relationships:
Associations with relationship quality, commitment, and infidelity. Journal of Couple
and Relationship Therapy, 12(2), 135–149.
https://doi.org/10.1080/15332691.2013.779097
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,
Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D.,
Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in
Python. Journal of Machine Learning Research, 12, 2825–2830. http://scikit-learn.org.
Perel, E. (2017). The State Of Affairs: Rethinking Infidelity - a book for anyone who has ever
loved. Hachette UK.
Petersen, J. L., & Hyde, J. S. (2010). A meta-analytic review of research on gender
differences in sexuality, 1993–2007. Psychological Bulletin, 136(1), 21–38.
https://doi.org/10.1037/a0017504
Rosenberg, M. (1965). Society and the Adolescent Self-Image. Princeton University Press.
https://www.jstor.org/stable/j.ctt183pjjh
Rubin, Z. (1970). Measurement of romantic love. Journal of Personality and Social
Psychology, 16(2), 265–273. https://doi.org/10.1037/h0029841
Selterman, D., Garcia, J. R., & Tsapelas, I. (2019). Motivations for extradyadic infidelity
revisited. Journal of Sex Research, 56(3), 273–286.
https://doi.org/10.1080/00224499.2017.1393494
Shapley, L. S. (1952). A Value for n-Person Games. RAND Corporation.
Spanier, G. B., & Margolis, R. L. (1983). Marital separation and extramarital sexual
behavior. The Journal of Sex Research, 19(1), 23–48.
https://doi.org/10.1080/00224498309551167
Spector, I. P., Carey, M. P., & Steinberg, L. (1996). The sexual desire inventory:
Development, factor structure, and evidence of reliability. Journal of Sex and Marital
Therapy, 22(3), 175–190. https://doi.org/10.1080/00926239608414655
Thompson, A. E., & O’Sullivan, L. F. (2016). Drawing the line: The development of a
comprehensive assessment of infidelity judgments. Journal of Sex Research, 53(8), 910–
926. https://doi.org/10.1080/00224499.2015.1062840
Treas, J., & Giesen, D. (2000). Sexual infidelity among married and cohabiting Americans.
Journal of Marriage and Family, 62(1), 48–60. https://doi.org/10.1111/j.1741-
3737.2000.00048.x
Wei, M., Russell, D. W., Mallinckrodt, B., & Vogel, D. L. (2007). The Experiences in Close
Relationship Scale (ECR)-short form: Reliability, validity, and factor structure. Journal
of Personality Assessment, 88(2), 187–204. https://doi.org/10.1080/00223890701268041
Whisman, M. A., Dixon, A. E., & Johnson, B. (1997). Therapists’ perspectives of couple
problems and treatment issues in couple therapy. Journal of Family Psychology, 11(3),
361–366. https://doi.org/10.1037/0893-3200.11.3.361
Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology:
Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100–
1122. https://doi.org/10.1177/1745691617693393
Yousefi, N., Farsani, K., Shakiba, A., Hemmati, S., & Nabavi Hesar, J. (2014). Halbert Index
of Sexual Desire (HISD) questionnaire validation. Scientific Journal of Clinical
Psychology & Personality, 2(9), 107–118.

Table 1
The Overall Results for Infidelity, Infidelity Online, and Intention toward Infidelity across the Three Samples
Sample 1 Sample 2 Sample 3

Outcome Class Pre Rec F1 MCC a Pre Rec F1 MCC Pre Rec F1 MCC
Infidelity
All: 0 .81 (.02) .65 (.02) .72 (.02) .28 .92 (.01) .80 (.02) .85 (.01) .36 (.06) .92 (.02) .71 (.02) .80 (.02 .30 (.04)
1 .48 (.03) .69 (.02) .56 (.02) (.03) .40 (.05) .62 (.06) .48 (.06) .32 (.03) .69 (.06) .42
(.03)
Men: 0 .70 (.05) .66 (.03) .66 (.03) .28 .91 (.03) .69 (.03) .78 (.02) .32 (.03) .91 (.03) .64 (.05) .74 .15 (.10)
1 .58 (.04) .63 (.05) .59 (.03) (.08) .34 (.03) .73 (.07) .44 (.04) .20 (.06) .58 (.13) (.03)
.28
(.07)
Men dyadic: 0 .92 (.02) .77 (.03) .84 (.02) .42 (.06) .90 (.03) .54 (.03) .67 .08 (.09)
1 .43 (.06) .75 (.07) .52 (.05) .16 (.04) .57 (.13) (.03)
.25
(.06)
Women: 0 .84 (.02) .65 (.03) .73 (.02) .25 .92 (.02) .79 (.03) .85 (.03) .35 (.09) .92 (.02) .73 (.02) .81 .33 (.05)
1 .39 (.03) .62 (.05) .46 (.03) (.04) .36 (.07) .64 (.10) .79 (.08) .35 (.04) .67 (.07) (.01)
.45
(.04)
Women dyadic: 0 .93 (.02) .78 (.04) .84 (.02) .35 (.08) .93 (.02) .70 (.04) .79 .23 (.08)
1 .36 (.06) .65 (.11) .44 (.07) .26 (.06) .54 (.11) (.03)
.34
(.08)
Infidelity online
All: 0 .87 (.02) .70 (.02) .77 (.01) .36 .94 (.02) .80 (.02) .86 (.01) .38 (.06)
1 .46 (.03) .71 (.02) .55 (.02) (.02) .36 (.06) .71 (.08) .44 (.06)
Men: 0 .72 (.06) .64 (.04) .67 (.04) .28 .94 (.03) .80 (.03) .86 (.02) .33 (.05)
1 .56 (.04) .65 (.06) .59 (.04) (.08) .36 (.04) .71 (.09) .44 (.04)
Men dyadic: 0 .90 (.03) .67 (.02) .77 (.02) .24 (.05)
1 .27 (.04) .68 (.08) .37 (.05)
Women: 0 .88 (.02) .63 (.02) .73 (.02) .18 .98 (.01) .85 (.04) .90 (.02) .49 (.07)
1 .27 (.03) .60 (.07) .36 (.04) (.05) .41 (.07) .79 (.10) .51 (.08)
Women dyadic: 0 .96 (.01) .87 (.03) .91 (.02) .40 (.11)
1 .39 (.11) .62 (.14) .45 (.11)
% Var MSE R2
34
Intention toward
infidelity
All 42.0 0.82 (.08) .47
(.05) (.05)
Men 58.0 0.79 (.09) .56
(.04) (.04)
Men dyadic 58.8 0.85 (.10) .54
(.06) (.06)
Women 31.6 0.93 (.14) .26
(.05) (.04)
Women dyadic 40.5 0.80 (.10) .36
(.05) (.06)
Note. Standard error across the ten folds is in parentheses. 0 = no infidelity, 1 = infidelity. Pre = precision, Rec = recall, MCC = Matthews
correlation coefficient, % Var = percentage of variance explained, MSE = mean squared error.
a. MCC is the overall effect size of the classification; it takes into account the true and false positives and negatives in each class and
provides an overall measure of accuracy. The MCC can be interpreted akin to Pearson’s correlation coefficient, with effect sizes of small
= .1, medium = .3, and large = .5.
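
As an illustration of how the metrics defined in the note relate to one another, the following minimal sketch computes precision, recall, F1 for the positive class, and the MCC from the four cells of a confusion matrix. The function name and the counts are hypothetical and are not taken from the study; the sketch is not the authors' analysis code.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 (positive class) and MCC from confusion counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts. Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC uses all four cells, so it stays informative when classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical counts for an imbalanced outcome (40 positive cases out of 200).
metrics = classification_metrics(tp=30, fp=40, fn=10, tn=120)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With imbalanced classes, recall for the minority class can be high while precision stays low, which is why a single summary such as the MCC is reported alongside the per-class values.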

Figure 1
The Top-10 Most Important Predictors for In-Person Infidelity in Sample 1
[Figure not reproduced. Panels and the predictor labels recoverable from the extraction:
Infidelity in Person, All: Never had anal; Solitary desire; Total desire; Attachment avoidance; Relationship length; Love before sex; Never masturbate partner; ATSS total; Relationship satisfaction; Never used sex toy.
Infidelity in Person, Men: ATSS total; Never had anal; FFMQ-SF describe; Sex demonstrates love; Sex declining; Dyadic desire; HISD desire.
Infidelity in Person, Women: Never had anal; ATSS total; Bisexual; Love most important.]
Figure 2
The Top-10 Most Important Predictors for In-Person Infidelity in Sample 2
[Figure not reproduced. Panels and the predictor labels recoverable from the extraction:
Infidelity in Person, All: Relationship satisfaction; Romantic Love Scale; Sexual satisfaction; Dyadic desire; Relationship length; Age; Physical health; Had vaginal sex past month; White.
Infidelity in Person, Men: Receive oral past month; Age; Oral contraceptive; Had vaginal sex past month.
Infidelity in Person, Women: Physical health; Age; White; Never had anal.]
Figure 3
The Top-10 Most Important Predictors for Infidelity in Sample 2
[Figure not reproduced. Panels and the predictor labels recoverable from the extraction:
Infidelity General, All: ITIS score; Ever had anal; Relationship length; Attend service weekly; Solitary desire; Graduated college; Never had anal.
Infidelity General, Men: Relationship length; Never attend service; Life satisfaction; Self-esteem; Age; ITIS score; Barrier birth control.
Infidelity General, Women: ITIS score; Ever had anal; Self-esteem; Life satisfaction.]
Figure 4
The Top-10 Most Important Predictors for Online Infidelity in Sample 1
[Figure not reproduced. Panels and the predictor labels recoverable from the extraction:
Infidelity Online, All: Never had anal; Man; Relationship length; Woman; Gay; Hormonal contraception; Infidelity past week; ATSS total.
Infidelity Online, Men: Straight; Never had anal; Gay; Female partner; Hormonal contraception; FFMQ-SF non-react.
Infidelity Online, Women: Never had anal; ATSS total; Bisexual; No religious service; FFMQ-SF observe; Never masturbate partner.]
Figure 5
The Top-10 Most Important Predictors for Online Infidelity in Sample 2
[Figure not reproduced. Panels and the predictor labels recoverable from the extraction:
Infidelity Online, All: Solitary desire; Romantic Love Scale; Relationship satisfaction; Ever had anal; Vaginal sex past month; Never had anal; Graduated college.
Infidelity Online, Men: Ever had anal; Graduated college; Vaginal sex past month; Never had anal.
Infidelity Online, Women: Ever had anal; Natural birth control; Straight; Implant birth control.]
Figure 6
The Top-10 Most Important Predictors for Intention toward Future Infidelity in Sample 3
[Figure not reproduced. Panels and the predictor labels recoverable from the extraction:
Intention toward Infidelity, All: Relationship satisfaction; Had engaged in infidelity; Solitary desire; Relationship length; Attend weekly service; Bisexual; Never had anal.
Intention toward Infidelity, Men: No religion; Bisexual; Straight.
Intention toward Infidelity, Women: Had engaged in infidelity; Age; Life satisfaction.]
Figure 7
The Results for the Most Important Moderators for Infidelity and Intention toward Infidelity
in Sample 3
[Figure not reproduced. Panels: Engaged in Infidelity (All, Men, Women) and Intention toward Infidelity (All, Men, Women). Variable labels recoverable from the extraction: Attend service weekly; Graduated college; ITIS score; Ever had anal; Relationship satisfaction; No religion.]



Participants and Procedure