

PREDICTING INFIDELITY WITH MACHINE LEARNING


Is Infidelity Predictable? Using Interpretable Machine Learning to Identify the Most

Important Predictors of Infidelity

Laura M. Vowels

School of Psychology, University of Southampton

Matthew J. Vowels

Centre for Computer Vision, Speech and Signal Processing, University of Surrey

Kristen P. Mark

Department of Family Medicine and Community Health, University of Minnesota


Abstract

Infidelity is a common occurrence in relationships and can have a devastating impact on both

partners’ well-being. A large body of literature has attempted to identify factors that can explain or

predict infidelity but has been unable to estimate the relative importance of each predictor.

We used a machine learning algorithm, random forest (an interpretable, highly non-linear

ensemble of decision trees), to predict in-person and online infidelity and intentions toward future

infidelity across three samples (two dyadic samples; N = 1846). We also used a game

theoretic explanation technique, Shapley values, which allowed us to estimate the effect size

of each predictor variable on infidelity. The present study showed that infidelity was

somewhat predictable overall with interpersonal factors (relationship satisfaction, love,

desire, relationship length) being the most predictive. The results suggest that addressing

relationship difficulties early in the relationship can help prevent future infidelity.

Keywords: Infidelity; Interpersonal Relationships; Machine learning; Random forest; Shapley

values
Is Infidelity Predictable? Using Interpretable Machine Learning to Identify the Most

Important Predictors of Infidelity

Infidelity is the most commonly reported cause of divorce in the United States (Amato

& Previti, 2004; Mark et al., 2011) and across at least 160 cultures (Betzig, 1989). The fallout

from infidelity can have devastating consequences for both members of the couple in

relationships including feelings of discontent, depression, blame, and frustration (Thompson

& O’Sullivan, 2016). In fact, infidelity is considered the third most difficult problem to work

with in therapy and second only to abuse in having the most damaging impact on

relationships (Whisman et al., 1997). Prevalence estimates for lifetime infidelity range

from 20% to 52% depending on the way infidelity is defined and measured (Mark et al., 2011;

Mark & Haus, 2019; Thompson & O’Sullivan, 2016). Definitions of infidelity vary

widely across studies, but infidelity can broadly be defined as engaging in emotional or sexual

relations outside of the agreed-upon bounds of the relationship (Mark & Haus, 2019), and

may include behaviors such as flirting with someone, having an emotional connection, sexual

intercourse, or using pornography (Blow & Hartnett, 2005b). With the emergence of the

internet and smartphones, computer-mediated behaviors (e.g., sexting, sending explicit

photos, or watching live webcam porn) have also become more commonplace as forms of

infidelity (Albright, 2008).

Because of its potentially devastating impact on individuals and relationships, many

studies to date have attempted to understand factors that may explain and predict infidelity

which include demographic, intraindividual, and interindividual variables (Mark & Haus,

2019). From an evolutionary perspective, men should be more motivated to engage in sexual

infidelity to maximize their reproductive success. Indeed, many studies have found that men

are more likely to engage in sex outside of a relationship (Labrecque & Whisman, 2017;

Petersen & Hyde, 2010) whereas women may be more likely to engage in emotional
infidelity (Selterman et al., 2019). Other studies have found similar levels of infidelity

between men and women especially when both sexual and emotional forms of infidelity are

considered (Allen et al., 2006; Fincham & May, 2017; Mark et al., 2011; Treas & Giesen,

2000).

Other demographic variables that have been previously associated with infidelity

include relationship status, education, and religion. Some studies have found that more

committed individuals are less likely to engage in infidelity (Amato & Previti, 2004; Fincham

& May, 2017) and highly educated individuals are more likely to engage in infidelity (Atkins

et al., 2001; Martins et al., 2016; Treas & Giesen, 2000), whereas other studies have found

the opposite pattern or no difference for education (Allen et al., 2006; Fincham & May,

2017). Finally, individuals with no religious affiliation have been reported to be more likely

to engage in infidelity in some studies (Burdette et al., 2007; Fincham & May, 2017;

Mattingly et al., 2010) but others have not found religious affiliation to be a significant

predictor (Haseli et al., 2019; Mark et al., 2011).

In addition to demographic variables, there are other intraindividual factors that have

been linked to infidelity in previous studies. For example, individuals with more permissive

sexual attitudes have been shown to be more likely to engage in infidelity (Fincham & May,

2017; Haseli et al., 2019; Martins et al., 2016). Similarly, higher sexual interest in both men

and women has been associated with a higher likelihood of engaging in sexual infidelity

(Fincham & May, 2017; Treas & Giesen, 2000). Several studies have found that individual

differences in attachment predict infidelity. Specifically, more anxious (i.e., individuals who

feel unlovable and unworthy and thus seek excessive reassurance and support in

relationships) and avoidant (i.e., individuals who do not trust in others’ capacity to be there

for them and thus focus on independence and self-reliance) individuals are more likely to
engage in infidelity compared to more secure individuals (i.e., individuals who feel lovable

and trust others; Fincham & May, 2017; Haseli et al., 2019; McDaniel et al., 2017).

There are also interpersonal factors that are associated with greater likelihood of

infidelity in relationships. Although not consistent across all studies, most studies have found

relationship satisfaction to be a significant predictor of infidelity (Atkins et al., 2001;

Fincham & May, 2017; Glass & Wright, 1985; Haseli et al., 2019; Owen et al., 2013; Spanier

& Margolis, 1983). Dissatisfaction with one’s sexual relationship, especially related to a

decline in frequency of sex as relationship length increases, has also been associated with

greater likelihood of infidelity for men (Liu, 2000). Furthermore, incompatibility between

partners in terms of sexual attitudes has been associated with infidelity, at least for women

(Haseli et al., 2019; Mark et al., 2011).

While a number of predictors have been found to be associated with infidelity, the

findings are often inconsistent and the studies suffer from poor methodologies (Blow &

Hartnett, 2005a, 2005b). Previous research has also exclusively utilized traditional linear

models, which are ill-equipped to handle a large number of predictors simultaneously, are

unable to estimate non-linear associations or complex interactions, and tend to produce

unreliable estimates that leave models completely uninterpretable (Breiman, 2001a; Lundberg

et al., 2020; Yarkoni & Westfall, 2017). A small number of studies in relationship science to

date have used machine learning to overcome issues with linear models (Großmann et al.,

2019; Joel et al., 2017, 2020). However, to date none of these studies have been able to

estimate the size or the direction of the effect of each individual predictor variable on the

model outcome.

Recent developments in machine learning have provided tools that allow

interpretation of the results through explanations of machine learning models (Lundberg et

al., 2017, 2019). This work is particularly interesting because it enables researchers to
combine the use of powerful machine learning algorithms and state-of-the-art model

explainability tools that can provide not only accurate predictions but also increase our

understanding of which factors are the most important in predicting the outcome. The latter is

of particular importance because one of the principal aims of psychology is to develop

understanding (Grosz et al., 2020). In the present study, we take advantage of this new

development in machine learning by using random forests (Breiman, 2001b) with Shapley

values (Lundberg et al., 2017, 2019) to estimate the effect size and direction of the effect of

each variable predicting past infidelity. A random forest is an interpretable ensemble of decision

trees that can handle highly non-linear relationships and complex interactions without

overfitting to the data and can estimate a large number of predictors simultaneously, enabling us

to compare the effect sizes across different variables.

The main aims of the present study were to determine whether we could predict

sexual and online infidelity as well as future intentions toward infidelity and estimate which

variables contribute the most variance in the outcome. Because the study was exploratory in

nature and machine learning is more suitable for exploratory research (Yarkoni & Westfall,

2017), we did not make any a priori hypotheses. However, we used k-fold cross-validation,

in which the model is trained on one part of the data and tested on another. Therefore, this

technique evaluates the model generalizability on unseen test data effectively providing a

confirmatory analysis. We used data from three different studies to further aid

generalizability of our results: one in which data were collected from individuals (Sample 1)

and two datasets in which data were collected from both members of the couple (Samples 2

and 3). Because many previous studies have found differences between men and women, we

analyzed each dataset together for all participants and separately for men and women. In the

latter two samples we also estimated the models including both dyad members’ variables as
predictors in order to explore whether partner variables are also associated with self’s

outcome.
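The k-fold cross-validation procedure described above can be illustrated with a minimal sketch. The number of folds, the random seed, and the function name `k_fold_indices` are assumptions made for illustration only; the manuscript does not report these details here, and this is not the authors' code.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: k and seed are illustrative assumptions,
    not values taken from the study.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i in range(k):
        test_idx = folds[i]
        # Training data are all folds except the held-out one.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Each participant appears in exactly one test fold, so every prediction
# that is evaluated comes from a model that never saw that participant.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Because performance is only ever scored on the held-out fold, the resulting metrics describe how well the model generalizes rather than how well it fits the training data.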

Results

Prevalence of Infidelity. Most participants in Sample 1 were currently in a

relationship but only one member of the couple responded to the survey. They were asked

about infidelity in their current or most recent relationship: 32.0% of a total of 891

participants (43.4% of men; 25.7% of women) had engaged in in-person infidelity compared

to 26.6% in online infidelity (41.6% of men; 18.5% of women). In Sample 2, both members

of the couple responded to the surveys and reported on engagement in sexual infidelity in

person or online in their current relationship: 17.4% of a total of 404 participants (18.8% of

men; 15.9% of women) had engaged in in-person infidelity compared to 14.1% in online

infidelity (16.8% of men; 11.4% of women).

Finally, in Sample 3, bisexual individuals who were currently in a romantic mixed-sex

relationship were invited to participate and their partners were also invited to complete the

survey. Most participants’ partners also completed the survey but the data also included some

bisexual individuals whose partners did not complete the survey. Because over a quarter of

the sample were consensually non-monogamous, we did not use the questions regarding

engagement in sexual activity with someone other than partner as a measure of infidelity.

Instead, we used whether participants had engaged in sexual activity that could have hurt

partner’s feelings. We also measured participants’ intentions toward future infidelity. In

Sample 3, 16.5% of a total of 552 participants (12.3% of men; 18.1% of women) had engaged

in a sexual behavior with someone other than partner that could hurt partner’s feelings. On

average, participants reported low intentions toward engaging in infidelity (M = -1.55, SD =

1.30, range -3 to 3; men: M = -1.36, SD = 1.42, range -3 to 2; women: M = -1.67, SD = 1.17,

range -3 to 2.43).


Prediction Accuracy. We estimated models for all participants as well as for men and

women separately. In Samples 2 and 3, we also estimated the models with and without

partner effects for men and women. We also estimated the models for each outcome. This

resulted in a total of 26 models. The results for the overall model performances can be found

in Table 1. We report precision, recall, and F1 scores for each class (0 = no infidelity, 1 =

infidelity) as well as an overall measure of the model performance using Matthews

correlation coefficient (MCC). The MCC can be interpreted as an overall effect

size for the model using established effect size guidelines for Pearson’s correlation: .1 =

small, .3 = medium, and .5 = large effect (Cohen, 1992).
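To make the reported metrics concrete, the following sketch computes precision, recall, F1, and the MCC from a binary confusion matrix using their standard definitions. The labels are hypothetical and the function names are illustrative; none of this is taken from the authors' analysis code.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels (1 = infidelity, 0 = no infidelity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def precision_recall_f1(tp, fp, fn):
    """Per-class metrics; pass (tn, fn, fp) to score class 0 instead of class 1."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; uses all four cells of the matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical, imbalanced example: 3 of 10 participants report infidelity.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
class1 = precision_recall_f1(tp, fp, fn)   # infidelity class
class0 = precision_recall_f1(tn, fn, fp)   # no-infidelity class
overall = mcc(tp, tn, fp, fn)
```

Unlike accuracy, the MCC stays near zero for a model that simply predicts the majority class, which is why it is a sensible summary when the infidelity class is the minority.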

Overall, the effect size for in-person infidelity for all participants was between .28 and

.36 indicating a medium effect size. The effect size for men was between .15 and .32 when

only actor effects were included in the models and between .08 (Sample 3) and .42 (Sample

2) when partner effects were also included. The effect size for women was between .25

and .35 when only actor effects were included in the models and between .23 and .35 when

both actor and partner effects were included in the models. Overall, including partner effects

in the models only improved the model performance for men in Sample 2 (.32 compared

to .42). The prediction effect size for online infidelity was medium to large for all participants

(.36 to .38). The effect size for men was between .28 and .33 and for women between .18 and

.49. When both actor and partner effects were included in the models, the overall effect size

decreased from .33 to .24 for men and from .49 to .40 for women suggesting that partner

effects did not add any information and may even detract from the model performance.

Finally, in addition to predicting infidelity as a class, we also used the model to

predict intention toward infidelity in Sample 3. Overall, we could predict 42.0% of the

variance for all participants. The model was better at predicting men’s (58.0%) intention

toward engaging in infidelity compared to women’s (31.6%). Adding partner effects into the
model did not change the model performance for men (58.0% compared to 58.8%) but

improved it for women (31.6% to 40.5%).
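The percentages of variance reported above are presumably the coefficient of determination (R²) computed on held-out data; that reading is an assumption, since the metric is not named in this section. Under that assumption, a minimal sketch with hypothetical intention scores on the -3 to 3 scale is:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted intentions toward infidelity.
observed = [-3.0, -2.0, -1.0, 0.0, 1.0]
predicted = [-2.5, -2.0, -1.5, 0.5, 0.5]
explained = r_squared(observed, predicted)
```

A value of 0 corresponds to a model that does no better than always predicting the sample mean, and a value of 1 to perfect prediction.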

The Most Important Predictors of Infidelity. In addition to using the models to

predict infidelity, we also estimated each predictor variable’s contribution to the model

performance using Shapley values. We include the top-10 most important predictors for each

model in Figures 1-6. Due to space limitations, we only provide results for the models

without partner effects given that partner effects did not generally improve the models’

predictive ability. However, for interested readers, all results can be found on the OSF project

page (https://osf.io/ehzkm/?view_only=f9232534d9f84541a38a2fec228fc72d) including the

importances for the top-20 variables. The left side of each figure provides the mean effect of

each variable on the model outcome for each class. The right side of the figure provides the

estimates for each individual participant. Red indicates a higher value of the predictor

variable and blue indicates a lower value. For example, red is equal to 1 and blue is equal to 0

for binary variables. For the outcome variable, points on the right side of the figure show an

increase in the likelihood of engaging in infidelity whereas points to the left of the middle point show a

decreased likelihood of engaging in infidelity. It is important to note that the three samples

differed somewhat in the predictor and outcome variables that were available and therefore

the results for the most important predictors vary somewhat across the samples. For the sake

of brevity, we have not discussed each predictor variable in the top-10 in detail as all of the

results can be seen in the figures. We have provided examples of interpretation and discussed

the most interesting and/or consistent predictors below.
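For readers unfamiliar with Shapley values, the idea can be sketched by brute force on a toy model. The sketch below enumerates every subset of predictors and averages each predictor's marginal contribution; "absent" predictors are set to a baseline value, which is one common convention and an assumption here. The toy model, its coefficients, and the predictor names are hypothetical, and this exhaustive approach is not the tree-based algorithm of Lundberg et al. that the study relies on.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one observation by enumerating all subsets.

    Predictors outside the subset take their baseline value (an assumed
    convention). Cost grows as 2**n, so this is only viable for toy models.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical risk score with one interaction term."""
    satisfaction, desire, length = z
    return 0.5 - 0.3 * satisfaction + 0.2 * desire + 0.1 * desire * length

x = [1.0, 1.0, 1.0]          # one hypothetical participant
baseline = [0.0, 0.0, 0.0]   # hypothetical reference values
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction for this participant and the prediction at the baseline, and the interaction term is split equally between the two predictors involved. The sign of each value gives the direction of the effect, which is what the red and blue points in the figures convey for each participant.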

There were several variables that were included in the top-10 most predictive

variables across all three samples (Figures 1-3) across most of the analyses (all, men,

women): relationship satisfaction, solitary desire, dyadic desire, relationship length, and some

sexual activities (had anal sex, oral sex, or vaginal sex). Overall, higher scores on
relationship satisfaction predicted a decreased likelihood of having engaged in infidelity and

lower satisfaction an increased likelihood of engaging in infidelity. However, some highly

satisfied individuals were also more likely to have engaged in infidelity suggesting a more

complex relationship between relationship satisfaction and infidelity. Higher solitary and

dyadic desire as well as longer relationship length predicted an increase in likelihood of

having engaged in infidelity across the samples. Higher sexual satisfaction and romantic love

in Samples 2 and 3 also predicted a decreased likelihood of having engaged in infidelity.

More liberal attitudes toward sexuality in Sample 1 also predicted a higher likelihood of

having engaged in infidelity.

Online infidelity was only measured in Samples 1 (Figure 4) and 2 (Figure 5). Across

the two samples, having never had anal sex with the current partner decreased the likelihood

of having engaged in infidelity, and longer relationship length and higher sexual desire increased

the likelihood of having engaged in infidelity. Relationship and sexual satisfaction were only

in the top-10 predictors in Sample 2. Romantic love was also predictive of online infidelity in

Sample 2. Use of hormonal contraceptives decreased the likelihood of men having engaged in

online infidelity in Sample 1 whereas it increased the likelihood of both men and women

having engaged in online infidelity in Sample 2.

Finally, we also measured intentions toward infidelity in Sample 3 (see Figure 6).

Higher relationship and sexual satisfaction as well as romantic love predicted a decrease in

intentions to engage in infidelity for both men and women. Both higher dyadic and solitary

desire predicted an increase in likelihood of engaging in infidelity. Desire discrepancy,

however, was not in the top-10 predictors. Participants who attended weekly religious

services also had higher intentions toward engaging in infidelity compared to participants

who did not attend religious services weekly. Individuals who had been in their relationship
for longer also had higher intentions toward engaging in infidelity compared to individuals

whose relationship was newer.

Moderator Variables. We also examined which interactions may have contributed to

the overall prediction. Due to space limitations, we have only provided figures for the

interactions for Sample 3 as examples because the sample had reports of both past infidelity

as well as intentions toward future infidelity (see Figure 7). Figures with all possible

interactions and simple interaction plots can be found on the OSF project page for each

analysis. In the OSF figures of interaction matrices, purple indicates no interaction and

yellow indicates the strongest interaction.

An interaction between intentions toward engaging in infidelity and being a college

graduate (all), attending weekly religious services (women), and ever having had anal sex (men) also

contributed to the prediction of having engaged in infidelity. For example, participants who

had graduated college and had more conservative sexual attitudes had the highest likelihood

of having engaged in infidelity whereas participants who had graduated college and had more

liberal attitudes were the least likely to have engaged in infidelity. Of participants who had

not graduated college, more conservative participants were less likely to have engaged in

infidelity compared to more liberal participants. An interaction between relationship

satisfaction and romantic love (all) and no religion (women and men) also contributed to the

prediction of intentions toward infidelity. For men, those who were not religious and were

satisfied in their relationship were more likely to have intentions toward engaging in

infidelity compared to less satisfied participants. In contrast, men with religious affiliation

were less likely to have intentions to engage in infidelity if they were more satisfied in the

relationship whereas when they were less satisfied, they were more likely to have higher

intentions toward engaging in infidelity. The pattern of the interaction between relationship

satisfaction and no religion was opposite for women.


Discussion

Infidelity is relatively common with up to half the people in relationships having

engaged in infidelity (Mark et al., 2011; Mark & Haus, 2019; Thompson & O’Sullivan, 2016)

with potentially devastating consequences for relationships causing distress (Thompson &

O’Sullivan, 2016) and often divorce (Amato & Previti, 2004). Infidelity is likely to affect not

only the couple members but also their children, extended family, and friends. It is important

to identify potential risk factors for infidelity in order to target interventions that could

prevent infidelity from occurring in the first place. The purpose of the present study was to

identify potential factors associated with infidelity and to quantify and compare different

factors to better understand which variables are the most strongly associated with infidelity.

A large body of literature has attempted to identify which factors contribute to

infidelity but has suffered from methodological and conceptual inconsistencies making the

results difficult to interpret (Blow & Hartnett, 2005a). Furthermore, the studies have relied

exclusively on linear models, which are often completely uninterpretable due to problems

such as incorrect specification of the underlying causal structure, multicollinearity,

unattainable parametric assumptions, and inability to examine complex associations

(Breiman, 2001a; Lundberg et al., 2020; Yarkoni & Westfall, 2017). The present study is the

first of its kind to examine predictors of infidelity using interpretable predictive models:

random forests (Breiman, 2001b) with Shapley values (Lundberg et al., 2017, 2019). Based

on our findings, the short answer to the question posed in the title, “is infidelity predictable?”,

is somewhat. The effect sizes that take into account the true and false positives and negatives

of both classes ranged from small (.08) to large (.49) across analyses and samples

suggesting that even though we were able to predict infidelity generally well above chance

level, there are also other factors that we had not accounted for.
While we examined the predictive accuracy of our models, our main aim was to

compare a range of different factors in their ability to predict infidelity. A recent systematic

review found that while demographics and individual characteristics are inconsistently

associated with infidelity, relationship variables tend to be more consistent across studies

(Haseli et al., 2019). We also found that relationship characteristics (relationship satisfaction,

relationship length, dyadic desire, sexual satisfaction, romantic love, and some sexual

activities within the relationship) were consistently in the top-10 most important predictors

across different samples. These findings suggest that addressing relationship issues early on

in the relationship may buffer against the likelihood of one partner going out of the

relationship to seek fulfilment. However, it is also important to note that while individuals

who were more satisfied in their relationship were generally less likely to engage in

infidelity, a subsample of highly satisfied individuals had engaged in infidelity in the past.

This may either reflect the idea that infidelity does also occur in happy relationships (Perel,

2017) or perhaps couples have worked through the infidelity and by the time they responded

to the survey were satisfied in their relationship (Atwater, 1982; Olson et al., 2002).

Furthermore, online infidelity has become more commonplace given the technological

advances in recent years (Albright, 2008). Therefore, we also examined predictors of online

infidelity in two of the three samples. Interestingly, one of the strongest predictors of a

decreased likelihood of having engaged in infidelity online was never having had anal sex in the

present relationship. This may reflect more restrictive attitudes toward sexuality overall.

Indeed, attitudes toward sexuality were measured in Sample 1 and ranked among the top-10

predictors of online infidelity. However, the relationship was more complex with the most

liberal sexual attitudes predicting an increase in likelihood of having engaged in infidelity

whereas more moderate and conservative attitudes predicted a decrease. These results are in

line with other studies that have found that more permissive sexual attitudes have been
associated with an increased likelihood of having engaged in infidelity (Fincham & May,

2017; Haseli et al., 2019; Martins et al., 2016). Higher relationship length and sexual desire

also increased the likelihood of having engaged in online infidelity. However, sexual and

relationship satisfaction were only among the top predictors in one of the two samples.

Because the studies were all cross-sectional in nature and some characteristics (e.g.,

relationship quality) may have changed since having engaged in infidelity, we also examined

future intentions toward infidelity in Sample 3. The results showed that higher relationship

and sexual satisfaction as well as romantic love predicted a decrease in intentions to engage

in infidelity whereas previous infidelity, dyadic and solitary desire, as well as longer

relationship length predicted increased intentions to engage in infidelity. Attending religious

services weekly also increased intentions to engage in infidelity for both men and women

whereas having no religion was associated with less intention toward infidelity in men.

Although not consistent, religion generally predicted a decreased likelihood of having

engaged in infidelity which is in line with previous research (Burdette et al., 2007; Fincham

& May, 2017; Mattingly et al., 2010). Therefore, it is possible that individuals who are more

religious fantasize about engaging in infidelity but are also less likely to actually act on those

fantasies. This is a potentially interesting proposition and warrants further investigation.

The results of the present study corroborate many of the existing studies and,

akin to a recent systematic review (Haseli et al., 2019), show that the most robust predictors of

infidelity lie within the relationship: individuals who are more satisfied and in love in their

relationship are less likely to have engaged in infidelity and have lower intentions to engage in

infidelity in the future. However, there are also a number of factors that have previously been

associated with infidelity that were not among the most important predictors in the present

study: education (Atkins et al., 2001; Martins et al., 2016; Treas & Giesen, 2000),

relationship status (Amato & Previti, 2004; Fincham & May, 2017), and attachment
(Fincham & May, 2017; Haseli et al., 2019; McDaniel et al., 2017). We only examined

attachment in Sample 1 and higher attachment avoidance did predict an increased likelihood

of having engaged in infidelity in the total sample but was not among the top-10 predictors

for men or women. Attachment anxiety was not predictive of past infidelity. Furthermore,

many previous studies suggest that men are more likely to engage in sexual infidelity than

women (Labrecque & Whisman, 2017; Petersen & Hyde, 2010). In the present study, being a

man was only an important predictor of past online infidelity in one sample supporting

studies that have found that the gender gap in infidelity is decreasing (Allen et al., 2006;

Fincham & May, 2017; Mark et al., 2011; Treas & Giesen, 2000).

The present study adds to our understanding of the most important predictors for

infidelity across three samples. We used a powerful interpretable machine learning technique

that allowed us to produce reliable estimates of the effect sizes of each variable both for the

mean effect as well as the spread of the individual effects (Lundberg et al., 2017, 2019).

Using this method, we were also able to compare a large number of predictors simultaneously

and estimate any non-linear associations and complex interactions. We also examined both

in-person and online infidelity as well as intention toward future infidelity.

However, the study also had a number of limitations that should be considered. First,

we used a single-item measure of in-person and online infidelity and only used a validated

measure for intention toward future infidelity. We were thus unable to account for specific

infidelity behaviors and did not examine emotional infidelity. Future research is needed to

examine a wider range of infidelity behaviors to better understand whether the same

predictors generalize across multiple forms of infidelity or whether these are predicted by

different variables. The results from the present study suggest that these may be somewhat

different given that the most important predictors of in-person and online infidelity also

varied. Second, while we examined infidelity across three large samples, two of which
included data from both members of the couple, the studies were all cross-sectional and it is

not clear how recently the infidelity occurred. Therefore, some of the factors may have

changed from when the infidelity occurred to when the participants completed the survey.

This is a difficulty across most other studies on infidelity but future research should examine

infidelity over time or conduct surveys of individuals who have just engaged in infidelity.

Third, over 30% of the participants in Sample 1 reported past infidelity. However, the

number of participants who had engaged in infidelity in the dyadic samples was much lower.

This made it more difficult for the algorithm to accurately predict infidelity which is reflected

in lower precision and recall for the infidelity class compared to no infidelity. We used

balanced random forests in order to mitigate this issue but we still had fewer data available from

people with past infidelity. Finally, while random forests are a powerful tool that will take

advantage of any correlations and interactions in the data, no matter how non-linear, they cannot

be used to estimate causality. However, in the absence of a means to reliably estimate

causality when examining factors relating to infidelity (after all we cannot create experiments

in which we make people engage in infidelity), we believe that using a predictive model is

perhaps the best option.
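The balanced random forests mentioned above address class imbalance at the level of each tree's training sample. One common formulation, assumed here because the manuscript does not describe the implementation, draws a bootstrap sample of the minority class and an equally sized sample, with replacement, from the majority class before each tree is grown. The helper name and the example labels below are hypothetical.

```python
import random

def balanced_bootstrap(labels, seed=0):
    """Return row indices for one tree's class-balanced bootstrap sample.

    Assumed scheme: sample with replacement from every class, drawing as
    many rows per class as the smallest class contains.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_minority = min(len(rows) for rows in by_class.values())
    sample = []
    for rows in by_class.values():
        sample.extend(rng.choices(rows, k=n_minority))
    return sample

# Hypothetical dyadic-style sample: 2 of 10 participants report infidelity.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
rows_for_tree = balanced_bootstrap(labels, seed=1)
```

Each tree therefore sees both classes equally often, but the total number of distinct minority cases is unchanged, which is consistent with the lower precision and recall reported for the infidelity class.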

In conclusion, the present study provides the most robust and reliable evidence of

factors associated with past in-person and online infidelity as well as intentions toward future

infidelity. The results showed that relationship variables were the most robust predictors of

both past and future infidelity whereas demographics and individual differences variables

were not consistently associated with infidelity. These results suggest that intervening early

on in relationships when difficulties first arise may be the best way to prevent future

infidelity. Furthermore, because sexual desire was one of the most robust predictors of

infidelity, discussing sexual needs and desires and finding ways to meet those needs in

relationships is also likely to decrease the risk of infidelity.


Methods

Sample 1

Participants and Procedure

The data were collected as part of a larger cross-sectional study. Participants were

recruited through mTurk and were asked to complete an online survey and were paid 30 cents

for the task. Recruitment was also conducted through social networking sites (e.g., Facebook,

Twitter), email listservs, and targeted recruitment for sexual minority participants on online

forums. Participants recruited through these channels were entered into a draw to win one of

four $40 Amazon gift cards. Participants were eligible for the study if they were over 18

years of age and had experience with at least one romantic relationship. Ethical approval was

obtained from the [blinded for peer-review] institutional review board and all participants

received a written informed consent at the start of the baseline survey. Details of the

procedure can be found in [blinded for peer review].

A total of 1,097 participants consented to participate. Participants who had not

completed the study, had a large amount of missing data, or were missing the outcome

variable were removed from the analyses. Therefore, the final sample consisted of 891

participants; 557 (62.5%) cis-gender women, 279 (31.3%) cis-gender men, and 25 (2.8%)

genderqueer. Most of the participants were straight (n = 483; 53.9%), 189 (21.2%) identified

as bisexual, 101 (11.3%) gay, and 60 (6.7%) lesbian. The majority of the participants were White

(88.4%), married or cohabiting (62.7%), had at least one child (24.5%), had at least some

level of college (95.8%), and did not identify with any religion (54.5%). The average age of

the participants was 32.7 years (SD = 9.63) and the average relationship length for those who

were in a relationship was 6.21 years (SD = 7.12).

Measures
We included all measures as predictor variables that were collected in the study,

which included a total of 95 variables after recoding all categorical variables into dummy

variables. These included demographic questions on age, race/ethnicity, gender, sexual

orientation, relationship status, children, and education. Participants also completed questions

around their contraceptive use, sexual behaviors, whether they wanted sex or communication

more or less than they were currently engaging in, and mental and physical health. The

outcome, infidelity, was measured using a single-item question for in-person infidelity (“I had

sex (e.g., vaginal sex, anal sex, oral sex) with someone other than my current partner”) and

online infidelity (“I interacted sexually with someone other than my current partner on the

Internet (had chat room sex, web cam sex, etc.)”). Both questions were dichotomized with

yes = 1 and no = 0. The following constructs were assessed using previously validated

questionnaires:

Sexual desire was assessed using the Sexual Desire Inventory (SDI; Spector et al.,

1996). The scale was used as both a single scale (13 items) as well as divided into dyadic

(nine items) and solitary desire (four items) and assesses an individual’s interest in sexual

activity over the past month with higher scores being indicative of higher sexual desire.

Sexual desire was also assessed using the Halbert Index for Sexual Desire (HISD; Yousefi et

al., 2014) which measures sexual desire using 25 items with higher scores being indicative of

higher sexual desire. Sexual satisfaction was assessed using the General Measure of Sexual

Satisfaction Scale (GMSEX; Lawrance & Byers, 1992). The GMSEX is a 5-item measure

used to assess satisfaction with the sexual relationship. Relationship satisfaction was assessed

using the General Measure of Relationship Satisfaction (GMREL; Lawrance & Byers, 1992).

Both GMREL and GMSEX are scored on a 7-point semantic differential scale and higher

scores are indicative of greater satisfaction. Dispositional mindfulness was measured using

the Five Facet Mindfulness Questionnaire – short form (FFMQ-SF; Bohlmeijer et al., 2011).
The scale comprises a total of 24 items that are divided into five subscales: being

non-reactive, observant, acting with awareness, describing feelings, and non-judgmental attitude.

The items are scored on a 5-point Likert scale with higher scores indicating participants’

agreement with the statement. Attitudes Toward Sexuality Scale (ATSS; Fisher & Hall, 1988)

was used to assess participants’ attitudes toward sexuality. The scale comprises 13 items

that are measured on a 5-point Likert scale, with higher scores indicating more liberal

attitudes and lower scores more conservative attitudes. The Perceptions of Love and Sex Scale (PLSS;

Hendrick & Hendrick, 2002) measures one’s perceptions of love and sex and comprises four

subscales: love is most important (six items), sex demonstrates love (four items), love comes

before sex (four items), and sex is declining (three items). The items are measured on a 5-

point Likert scale with higher scores indicating higher agreement. Attachment style was

assessed using the Experiences in Close Relationship Scale – Short Form (ECR-S; Wei et al.,

2007). The ECR-S consists of two 6-item Likert scales: one for anxiety and one for

avoidance. Higher scores indicate higher levels of insecure attachment.

Sample 2

Participants and Procedure

We used baseline data from a longitudinal study of couples. The couples were

recruited through various listservs, websites, and social media (e.g., Facebook, Twitter).

Participants who were 18 years of age or older, in a mixed sex relationship for a minimum of

three years, currently living with that partner, with no children under the age of one, and not

pregnant (or with a pregnant partner) at the time, met the inclusion criteria and were directed

to provide their partner’s email address. Partners were then emailed the same information that

the initial potential participant was provided and asked the same eligibility criteria questions.

If the partner also met eligibility criteria and agreed to participate, they were both sent

individual unique links to the baseline survey. Participants who completed the baseline were
provided with a $10 gift card ($20/couple). Ethical approval was obtained from the [blinded

for peer-review] institutional review board and all participants received a written informed

consent at the start of the baseline survey. Details of the procedure can be found in

[blinded for peer review].

The sample consisted of 202 mixed-sex couples (404 individuals). The majority of

participants (89%) were from the United States, with a minority of the participants from

Canada (11%). The mean age of the sample was 32.5 years (SD = 8.90) and the mean relationship length of the

couples was 9.19 (SD = 6.85) years. The majority of the sample identified as heterosexual

(93%), with a minority identifying as bisexual (5%), questioning or uncertain (1%), and other

(1%). The majority of participants were White (89%) and this was a fairly educated sample,

with 96% indicating they had attended at least some college.

Measures

The study used many of the same measures as Sample 1 and had a total of 66

variables1. The following questionnaires were not available in the sample: attachment styles

(ECR-S), attitudes toward sexuality (ATSS), Halbert Index of Sexual Desire (HISD), trait

mindfulness (FFMQ-SF), and perception of love and sex (PLSS). The study had an additional

scale measuring romantic love, the Romantic Love Scale (Rubin, 1970). The scale consists of

13 items that are meant to measure affiliative and dependent need, a predisposition to help,

and orientation of exclusiveness and absorption. The scale is scored on a 9-point scale with

higher scores indicating higher romantic love. For dyadic analyses, both dyad members’

scores were included as predictors. The outcome measures were the same as in Sample 1.

Sample 3

Participants and Procedure

1 The lower number of variables in the dataset is mainly due to the sample consisting of dyadic mixed-sex
couples; many of the variables therefore had fewer categories and thus fewer dummy-coded variables (e.g.,
relationship status, sexual orientation).
The final sample consists of couples in which at least one member of the dyad

identified as bisexual. Participants were recruited for the current study utilizing targeted

recruitment in bisexual spaces primarily online (e.g., bisexual-focused websites, Facebook,

Twitter, and Reddit). The recruitment messaging explicitly stated that the study aimed to

recruit bisexual individuals and their partners in mixed-sex relationships. A participant met

eligibility criteria if they were over the age of 18, identified as bisexual, were in a romantic

mixed-sex relationship at the time of the survey, and were willing to provide the email

address of their partner to also participate. The respondent first completed the online survey

in which they provided an email address for their partner who was then contacted to complete

the survey. Ethical approval was obtained from the [blinded for peer-review] institutional

review board and all participants received a written informed consent at the start of the

baseline survey. Details of the procedure can be found in [blinded for peer review].

A total of 552 participants completed the baseline survey. Of those, there were 354

individuals who contributed to a dyad (177 couples) and 198 individuals whose partner did

not complete the survey. There were a total of 203 (37%) men, 337 (61%) women, and 12

(2%) transgender/non-binary; 153 (28%) were straight and 380 (69%) were bisexual.

Participants were 29 years old on average (SD = 6.95; range 18-50). The vast majority of the

participants were White (n = 447; 81%), married (n = 299; 54%), and had completed at least

some college (n = 480; 87%). Many participants did not identify with a specific religious

identity (n = 309; 56%) or were Christian (n = 170; 31%). On average, participants had been

in their current relationship for 6.10 years (SD = 5.36); 400 (72%) of those relationships were

monogamous and 152 (28%) were consensually non-monogamous.

Measures

The sample included all the same measures as Sample 2 but also had some additional

measures. These include self-esteem (Rosenberg, 1965) which is a 10-item, 5-point Likert
scale with higher scores indicating higher self-esteem, and satisfaction with life scale (Diener

et al., 1985) which is a 5-item, 7-point Likert scale with higher scores indicating better life

satisfaction. The outcome measures used were also different because a quarter of the sample

was consensually non-monogamous and therefore likely to be regularly engaging in sexual

activity with someone other than their primary partner. For this sample, we measured

infidelity using a question “Have you done something sexual with another person that could

hurt the relationship?”. We also used a measure of intentions toward infidelity. The Intentions

Toward Infidelity Scale (ITIS; Jones et al., 2010) assesses the likelihood of someone being

unfaithful to their partner. The scale consists of seven items and the response options range

from -3 (Not at all likely) to +3 (Extremely likely).

Data Analysis

Data Preparation. All categorical variables were dummy coded (0 and 1) with each

option included in the models. Any variables that were essentially the same as the outcome

variable were removed from the analyses. Any missing variables were imputed using

Random forest multiple imputation. Less than 0.1% of the data were missing, and any

missing data points were imputed using the scikit-learn package Iterative Imputer (Pedregosa

et al., 2011) with a Bayesian ridge estimator.
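
The dummy coding step described above can be illustrated with a minimal sketch. It uses only the Python standard library rather than the packages named in the text, and the function name `dummy_code` and the example responses are hypothetical. In line with the description, every category receives its own 0/1 column and no reference category is dropped.

```python
def dummy_code(values):
    """Return one 0/1 indicator column per category, keeping every option."""
    categories = sorted(set(values))
    return {
        f"is_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical relationship-status responses (illustrative only, not study data).
status = ["married", "dating", "single", "married"]
coded = dummy_code(status)

for name, column in coded.items():
    print(name, column)
```

Keeping every category is harmless for tree-based models, although it would introduce collinearity in an unregularized linear model.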

Analyses. All data were analyzed at the individual level with the full sample, with

men only, and with women only. Additionally, the data from dyads in which both members

of the couple had responded to the questionnaire were also analyzed separately for men and

women including both actor and partner effects in the model. The results were analyzed using

Python 3.7 and the code can be found here: [blinded for peer-review]. Each dataset was

analyzed using either a random forest regressor (Breiman, 2001b), or a balanced random

forest classifier (Breiman, 2001b; Chen et al., 2004) for continuous and categorical outcomes,

respectively. A random forest is an ensemble of decision trees, each trained on a bootstrapped

subsample of the data in order to reduce overfitting. The ensemble can model highly non-linear

relationships in the data, and therefore represents a significantly more flexible model than a

logistic regression. In cases where one class occurs much more often than another, many

classifiers may learn to predict the majority class well, but not learn important associations

necessary to predict the minority class. The balanced random forest variant, for categorical

outcomes, is designed to provide better results in scenarios where there may be a class

imbalance in the dataset. In the current study, there is imbalance between participants who

had engaged in infidelity and those who had not. The balanced random forest is able to

mitigate the problems associated with unequal class ‘support’ by undersampling the majority

class in the bootstrapping process, thereby balancing the classes during training.
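
The undersampling idea can be sketched as follows. This is a simplified standard-library illustration of drawing one class-balanced bootstrap sample, in the spirit of Chen et al. (2004); it is not the imbalanced-learn implementation, and the function name `balanced_bootstrap` and the 90/10 toy labels are assumptions made for the example.

```python
import random


def balanced_bootstrap(labels, rng):
    """Return row indices for one class-balanced bootstrap sample.

    Draws, with replacement, as many rows from each class as there are
    rows in the smallest class, so that a tree trained on the sample
    sees both classes equally often.
    """
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    n_minority = min(len(rows) for rows in by_class.values())
    sample = []
    for rows in by_class.values():
        sample.extend(rng.choices(rows, k=n_minority))
    return sample


# Toy labels with a 90/10 imbalance (1 = infidelity reported, 0 = not reported).
labels = [0] * 90 + [1] * 10
rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)
counts = {c: sum(labels[i] == c for i in sample) for c in (0, 1)}
print(counts)
```

In a forest, a fresh balanced sample of this kind would be drawn for every tree, so most majority-class rows are still used somewhere in the ensemble.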

In general, random forest models are sensitive to hyperparameter settings (such as the

number of estimators, or the maximum depth of the decision tree). However, tuning

hyperparameters requires a separate validation data split which reduces the effective sample

size available for training and testing. Therefore, we use the default “imbalanced-learn”

balanced random forest classifier (IMBLEARN cite) and the default “scikit-learn” random

forest regressor (Pedregosa et al., 2011) with k-fold cross-validation. The out-of-bag error is

a built-in metric frequently used to estimate the performance of random forests (Joel et al.,

2017, 2020), but in some circumstances this metric has been shown to be biased above the true

error (Janitza & Hornung, 2018; Mitchell, 2011). By using a k-fold cross-validation

approach, instead of the out-of-bag error, we were able to test the model over the entire

dataset, and to acquire estimates for the standard error (see below). It is essential that the

trained model is tested on a separate partition of the dataset, even for less complex linear

models, when any data-driven decisions are made (Heyman & Smith Slep, 2001; Yarkoni &

Westfall, 2017).
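
A minimal sketch of k-fold cross-validation with a per-fold score, its mean, and its standard error is given below. It uses only the standard library; the helper names, the toy mean-predictor model, and the choice to compute the standard error as the standard deviation of the fold scores divided by the square root of k are assumptions for illustration, not a description of the study code.

```python
import random
import statistics


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k near-equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, targets, fit, score, k=10):
    """Train on k-1 folds, score on the held-out fold, repeat k times."""
    scores = []
    for test_rows in k_fold_indices(len(targets), k):
        held_out = set(test_rows)
        train_rows = [i for i in range(len(targets)) if i not in held_out]
        model = fit([features[i] for i in train_rows],
                    [targets[i] for i in train_rows])
        scores.append(score(model,
                            [features[i] for i in test_rows],
                            [targets[i] for i in test_rows]))
    mean = statistics.mean(scores)
    standard_error = statistics.stdev(scores) / len(scores) ** 0.5
    return mean, standard_error, scores


# Toy stand-ins for a real model: predict the training mean, score by MSE.
def fit_mean(train_x, train_y):
    return statistics.mean(train_y)


def mse(model, test_x, test_y):
    return statistics.mean((y - model) ** 2 for y in test_y)


xs = [[float(i)] for i in range(50)]
ys = [float(i % 5) for i in range(50)]
mean_mse, se_mse, fold_scores = cross_validate(xs, ys, fit_mean, mse, k=10)
print(round(mean_mse, 3), round(se_mse, 3))
```

Because every row appears in exactly one test fold, the model is evaluated on the whole dataset without ever being scored on data it was trained on.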
A ten-fold cross-validation scheme was used to train and test the model. This means

the total dataset is randomly split into ten equally sized folds. The model is trained on nine

out of ten folds, tested on the tenth, and the test fold performance is recorded. This is repeated

until all ten folds have been used as a test set. The average performance, as well as the

standard error across the ten folds, provide an estimate of model performance on unseen data.

For the continuous outcome (intentions toward infidelity), the metrics for test data model

performance are the mean-squared error (which is the average squared difference between

the prediction and the observed value), R², and the variance explained. For the binary

outcomes (past online and in-person infidelity) the metrics for test data model performance

are the precision, recall, F1-score, and Matthews Correlation Coefficient (MCC). These

metrics provide a more complete picture than an accuracy score, particularly for imbalanced

data. For instance, if a dataset contained a 90/10 imbalance, an accuracy of 90% could be

achieved simply by predicting the majority class for all new datapoints, and is therefore

meaningless. In contrast, precision is the ratio of true positives to the sum of true positives

and false positives; recall is the ratio of true positives to the sum of true positives and false

negatives, and the F1-score is the harmonic mean of precision and recall. These metrics

therefore provide a more complete picture about a classifier’s performance on imbalanced

data. Arguably the best summary statistic for imbalanced classification problems is MCC

(Boughorbel et al., 2017; Chicco & Jurman, 2020; Matthews, 1975). The MCC provides a

score bounded between -1 and 1 and is directly analogous to Pearson’s correlation coefficient. If

MCC=0 then the classifier is no better than random chance, if MCC=1 then the classifier

achieves perfect prediction, and if MCC=-1 the classifier perfectly predicts the opposite of

the correct class.
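
The definitions above can be checked with a short standard-library sketch that computes the metrics from confusion-matrix counts and reproduces the 90/10 example. The function name `binary_metrics` is hypothetical, and returning 0 when a denominator is zero is a common convention adopted here as an assumption.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1, and MCC for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# 90/10 imbalance: always predicting the majority class looks accurate
# but never identifies a single minority-class case.
y_true = [0] * 90 + [1] * 10
majority = binary_metrics(y_true, [0] * 100)
perfect = binary_metrics(y_true, y_true)
print(majority)
print(perfect)
```

The majority-class predictor reaches an accuracy of 0.9 while its recall for the minority class and its MCC are both 0, which is the behavior the text describes.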


The last model to be trained as part of the k-fold cross-validation process is saved,

and explained using the “SHapley Additive exPlanations” package (SHAP) (Lundberg et al.,

2017, 2019, 2020). The SHAP package is a unified framework for undertaking model

explainability, and derives from the seminal game theoretic work of Lloyd Shapley (Shapley,

1952). The framework conceives of predictors as collaborating agents seeking to maximize a

common goal (i.e., the regressor performance). The approach involves systematically

evaluating changes in model performance in response to including or restricting the influence

from different combinations of predictors. Traditional approaches (e.g., using the coefficients

from a linear model, or importances from a random forest) are unreliable and ‘inconsistent’,

and the Shapley approach has been shown to provide explanations with certain theoretic

guarantees (Lundberg et al., 2020). The SHAP TreeExplainer function provides estimations

of the per-datapoint, per-predictor impact on model output, as well as the average predictor

impacts. This function also provides estimations of the impact of per-datapoint pairwise

interactions on model output. For the analysis the default settings of the SHAP package

TreeExplainer were used, and the entire dataset was fed to the model for explanation. The

combination of the powerful function approximation capabilities of random forests with the

consistent and meaningful estimations of per-datapoint, per-predictor impact on model output

enables a reliable and informative exploration of predictor importance, as well as a means to

identify key predictor interactions.
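
To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a single prediction by averaging each predictor's marginal contribution over all orderings. It is a brute-force illustration of the definition, not the algorithm used by the SHAP TreeExplainer; the toy model, the all-zero baseline, and the function names are assumptions made for the example.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction, averaged over all orderings.

    Predictors that have not yet been "switched on" in an ordering keep
    their baseline value; each predictor is credited with the change in
    model output observed when it is switched to its actual value.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for feature in order:
            current[feature] = instance[feature]
            value = predict(current)
            totals[feature] += value - previous
            previous = value
    return [total / len(orderings) for total in totals]


# Hypothetical toy model with an interaction between the first two predictors.
def toy_model(x):
    return 2.0 * x[0] + x[1] + 3.0 * x[0] * x[1] - 0.5 * x[2]


instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The interaction term is split equally between the two predictors involved, and the values sum to the difference between the prediction for the instance and the prediction for the baseline. Enumerating all orderings grows factorially with the number of predictors, which is why tree-specific algorithms are needed in practice.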


References

Albright, J. M. (2008). Sex in America online: An exploration of sex, marital status, and

sexual identity in internet sex seeking and its impacts. Journal of Sex Research, 45(2),

175–186. https://doi.org/10.1080/00224490801987481

Allen, E. S., Atkins, D. C., Baucom, D. H., Snyder, D. K., Gordon, K. C., & Glass, S. P.

(2006). Intrapersonal, interpersonal, and contextual factors in engaging in and

responding to extramarital involvement. Clinical Psychology: Science and Practice,

12(2), 101–130. https://doi.org/10.1093/clipsy.bpi014

Amato, P. R., & Previti, D. (2004). People’s reasons for divorcing: Gender, social class, the

life course, and adjustment. Journal of Family Issues, 24(5), 602–626.

https://doi.org/10.1177/0192513x03254507

Atkins, D. C., Baucom, D. H., & Jacobson, N. S. (2001). Understanding infidelity: Correlates

in a national random sample. Journal of Family Psychology, 15(4), 735–749.

https://doi.org/10.1037//0893-3200.15.4.735

Atwater, L. (1982). The Extramarital Connection: Sex, Intimacy, and Identity. Irvington.

Betzig, L. (1989). Causes of conjugal dissolution: A cross-cultural study. Current

Anthropology, 30(5), 654–676.

Blow, A. J., & Hartnett, K. (2005a). Infidelity in committed relationships I: A methodological

review. In Journal of Marital and Family Therapy (Vol. 31, Issue 2, pp. 183–216).

Blackwell Publishing Inc. https://doi.org/10.1111/j.1752-0606.2005.tb01555.x

Blow, A. J., & Hartnett, K. (2005b). Infidelity in committed relationships II: A substantive

review. In Journal of Marital and Family Therapy (Vol. 31, Issue 2, pp. 217–233).

Blackwell Publishing Inc. https://doi.org/10.1111/j.1752-0606.2005.tb01556.x

Bohlmeijer, E., Klooster, P. M., Fledderus, M., Veehof, M., & Baer, R. (2011). Psychometric

properties of the five facet mindfulness questionnaire in depressed adults and


development of a short form. Assessment, 18(3), 308–320.

https://doi.org/10.1177/1073191111408231

Boughorbel, S., Jarray, F., & El-Anbari, M. (2017). Optimal classifier for imbalanced data

using Matthews Correlation Coefficient metric. PLOS ONE, 12(6), e0177678.

https://doi.org/10.1371/journal.pone.0177678

Breiman, L. (2001a). Statistical modeling: The two cultures. In Statistical Science (Vol. 16,

Issue 3).

Breiman, L. (2001b). Random forests. Machine Learning, 45(1), 5–32.

https://doi.org/10.1023/A:1010933404324

Burdette, A. M., Ellison, C. G., Sherkat, D. E., & Gore, K. A. (2007). Are there religious

variations in marital infidelity? Journal of Family Issues, 28(12), 1553–1581.

https://doi.org/10.1177/0192513X07304269

Chen, C., Liaw, A., & Breiman, L. (2004). Using random forest to learn imbalanced data.

Chicco, D., & Jurman, G. (2020). The advantages of the Matthews correlation coefficient

(MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics,

21(1), 6. https://doi.org/10.1186/s12864-019-6413-7

Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.

https://doi.org/10.1037/0033-2909.112.1.155

Diener, E., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The Satisfaction With Life

Scale. Journal of Personality Assessment, 49(1), 71–75.

https://doi.org/10.1207/s15327752jpa4901_13

Fincham, F. D., & May, R. W. (2017). Infidelity in romantic relationships. In Current

Opinion in Psychology (Vol. 13, pp. 70–74). Elsevier.

https://doi.org/10.1016/j.copsyc.2016.03.008

Fisher, T. D., & Hall, R. G. (1988). A scale for the comparison of the sexual attitudes of
adolescents and their parents. The Journal of Sex Research, 24(1), 90–100.

https://doi.org/10.1080/00224498809551400

Glass, S. P., & Wright, T. L. (1985). Sex differences in type of extramarital involvement and

marital dissatisfaction. Sex Roles, 12(9–10), 1101–1120.

https://doi.org/10.1007/BF00288108

Großmann, I., Hottung, A., & Krohn-Grimberghe, A. (2019). Machine learning meets partner

matching: Predicting the future relationship quality based on personality traits. PLoS

ONE, 14(3), 1–16. https://doi.org/10.1371/journal.pone.0213569

Haseli, A., Shariati, M., Nazari, A. M., Keramat, A., & Emamian, M. H. (2019). Infidelity

and its associated factors: A systematic review. In Journal of Sexual Medicine (Vol. 16,

Issue 8, pp. 1155–1169). Elsevier B.V. https://doi.org/10.1016/j.jsxm.2019.04.011

Hendrick, S. S., & Hendrick, C. (2002). Linking romantic love with sex: Development of the

perceptions of love and sex scale. Journal of Social and Personal Relationships, 19(3),

361–378. https://doi.org/10.1177/0265407502193004

Heyman, R. E., & Smith Slep, A. M. (2001). The hazards of predicting divorce without

crossvalidation. Journal of Marriage and Family, 63(2), 473–479.

https://doi.org/10.1111/j.1741-3737.2001.00473.x

Janitza, S., & Hornung, R. (2018). On the overestimation of random forest’s out-of-bag error.

PLoS ONE, 13(8), e0201904. https://doi.org/10.1371/journal.pone.0201904

Joel, S., Eastwick, P. W., Allison, C. J., Arriaga, X. B., Baker, Z. G., Bar-Kalifa, E.,

Bergeron, S., Birnbaum, G., Brock, R. L., Brumbaugh, C. C., Carmichael, C. L., Chen,

S., Clarke, J., Cobb, R. J., Coolsen, M. K., Davis, J., Jong, D. C. de, Debrot, A., DeHaas,

E. C., … Wolf, S. (2020). Machine learning uncovers the most robust self-report

predictors of relationship quality across 43 longitudinal couples studies. Manuscript

submitted for publication. https://doi.org/10.1073/pnas.1917036117


Joel, S., Eastwick, P. W., & Finkel, E. J. (2017). Is romantic desire predictable? Machine

learning applied to initial romantic attraction. Psychological Science, 28, 1478–1489.

https://doi.org/10.1177/0956797617714580

Jones, D. N., Olderbak, S. G., & Figueredo, A. J. (2010). Intentions Towards Infidelity Scale.

https://doi.org/10.4324/9781315881089.CH87

Labrecque, L. T., & Whisman, M. A. (2017). Attitudes toward and prevalence of extramarital

sex and descriptions of extramarital partners in the 21st century. Journal of Family

Psychology, 31(7), 952–957. https://doi.org/10.1037/fam0000280

Lawrance, K., & Byers, E. S. (1992). Development of the interpersonal exchange model of

sexual satisfaction in long-term relationships. The Canadian Journal of Human

Sexuality, 1, 123–128. https://doi.org/10.1111/j.1475-6811.1995.tb00092.x

Liu, C. (2000). A theory of marital sexual life. Journal of Marriage and Family, 62(2), 363–

374. https://doi.org/10.1111/j.1741-3737.2000.00363.x

Lundberg, S. M., Allen, P. G., & Lee, S.-I. (2017). A unified approach to interpreting model

predictions. Neural Information Processing Systems. https://github.com/slundberg/shap

Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R.,

Himmelfarb, J., Bansal, N., & Lee, S. I. (2020). From local explanations to global

understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56–67.

https://doi.org/10.1038/s42256-019-0138-9

Lundberg, S. M., Erion, G. G., & Lee, S.-I. (2019). Consistent individualized feature

attribution for tree ensembles. http://github.com/slundberg/shap

Mark, K. P., & Haus, K. R. (2019). Extradyadic relations. In A. C. Michalos (Ed.),

Encyclopedia of Quality of Life and Well-Being Research. (pp. 2102–2105). Springer.

Mark, K. P., Janssen, E., & Milhausen, R. R. (2011). Infidelity in heterosexual couples:

Demographic, interpersonal, and personality-related predictors of extradyadic sex.


Archives of Sexual Behavior, 40(5), 971–982. https://doi.org/10.1007/s10508-011-9771-

Martins, A., Pereira, M., Andrade, R., Dattilio, F. M., Narciso, I., & Canavarro, M. C. (2016).

Infidelity in dating relationships: Gender-specific correlates of face-to-face and online

extradyadic involvement. Archives of Sexual Behavior, 45(1), 193–205.

https://doi.org/10.1007/s10508-015-0576-3

Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of

T4 phage lysozyme. BBA - Protein Structure, 405(2), 442–451.

https://doi.org/10.1016/0005-2795(75)90109-9

Mattingly, B. A., Wilson, K., Clark, E. M., Bequette, A. W., & Weidler, D. J. (2010). Foggy

faithfulness: Relationship quality, religiosity, and the Perceptions of Dating Infidelity

Scale in an adult sample. Journal of Family Issues, 31(11), 1465–1480.

https://doi.org/10.1177/0192513X10362348

McDaniel, B. T., Drouin, M., & Cravens, J. D. (2017). Do you have anything to hide?

Infidelity-related behaviors on social media sites and marital satisfaction. Computers in

Human Behavior, 66, 88–95. https://doi.org/10.1016/j.chb.2016.09.031

Mitchell, M. W. (2011). Bias of the random forest out-of-bag (OOB) error for certain input

parameters. Open Journal of Statistics, 01(03), 205–211.

https://doi.org/10.4236/ojs.2011.13024

Olson, M. M., Russell, C. S., Higgins-Kessler, M., & Miller, R. B. (2002). Emotional

processes following disclosure of an extramarital affair. Journal of Marital and Family

Therapy, 28(4), 423–434. https://doi.org/10.1111/j.1752-0606.2002.tb00367.x

Owen, J., Rhoades, G. K., & Stanley, S. M. (2013). Sliding versus deciding in relationships:

Associations with relationship quality, commitment, and infidelity. Journal of Couple

and Relationship Therapy, 12(2), 135–149.


https://doi.org/10.1080/15332691.2013.779097

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,

Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D.,

Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in

Python. Journal of Machine Learning Research, 12, 2825–2830. http://scikit-learn.org.

Perel, E. (2017). The State Of Affairs: Rethinking Infidelity - a book for anyone who has ever

loved. Hachette UK.

Petersen, J. L., & Hyde, J. S. (2010). A meta-analytic review of research on gender

differences in sexuality, 1993-2007. Psychological Bulletin, 136(1), 21–38.

https://doi.org/10.1037/a0017504

Rosenberg, M. (1965). Society and the Adolescent Self-Image. Princeton University Press.

https://www.jstor.org/stable/j.ctt183pjjh

Rubin, Z. (1970). Measurement of romantic love. Journal of Personality and Social

Psychology, 16(2), 265–273. https://doi.org/10.1037/h0029841

Selterman, D., Garcia, J. R., & Tsapelas, I. (2019). Motivations for extradyadic infidelity

revisited. Journal of Sex Research, 56(3), 273–286.

https://doi.org/10.1080/00224499.2017.1393494

Shapley, L. S. (1952). A Value for n-Person Games. RAND Corporation.

Spanier, G. B., & Margolis, R. L. (1983). Marital separation and extramarital sexual

behavior. The Journal of Sex Research, 19(1), 23–48.

https://doi.org/10.1080/00224498309551167

Spector, I. P., Carey, M. P., & Steinberg, L. (1996). The sexual desire inventory:

Development, factor structure, and evidence of reliability. Journal of Sex and Marital

Therapy, 22(3), 175–190. https://doi.org/10.1080/00926239608414655

Thompson, A. E., & O’Sullivan, L. F. (2016). Drawing the line: The development of a
comprehensive assessment of infidelity judgments. Journal of Sex Research, 53(8), 910–

926. https://doi.org/10.1080/00224499.2015.1062840

Treas, J., & Giesen, D. (2000). Sexual infidelity among married and cohabiting Americans.

Journal of Marriage and Family, 62(1), 48–60. https://doi.org/10.1111/j.1741-

3737.2000.00048.x

Wei, M., Russell, D. W., Mallinckrodt, B., & Vogel, D. L. (2007). The Experiences in Close

Relationship Scale (ECR)-short form: Reliability, validity, and factor structure. Journal

of Personality Assessment, 88(2), 187–204. https://doi.org/10.1080/00223890701268041

Whisman, M. A., Dixon, A. E., & Johnson, B. (1997). Therapists’ perspectives of couple

problems and treatment issues in couple therapy. Journal of Family Psychology, 11(3),

361–366. https://doi.org/10.1037/0893-3200.11.3.361

Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology:

Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100–

1122. https://doi.org/10.1177/1745691617693393

Yousefi, N., Farsani, K., Shakiba, A., Hemmati, S., & Nabavi Hesar, J. (2014). Halbert Index

of Sexual Desire (HISD) questionnaire validation. Scientific Journal of Clinical

Psychology & Personality, 2(9), 107–118.


Table 1

The Overall Results for Infidelity, Infidelity Online, and Intention toward Infidelity across the Three Samples

                      Sample 1                                    Sample 2                                    Sample 3
Outcome        Class  Pre        Rec        F1         MCC a      Pre        Rec        F1         MCC        Pre        Rec        F1         MCC

Infidelity
All            0      .81 (.02)  .65 (.02)  .72 (.02)  .28 (.03)  .92 (.01)  .80 (.02)  .85 (.01)  .36 (.06)  .92 (.02)  .71 (.02)  .80 (.02)  .30 (.04)
               1      .48 (.03)  .69 (.02)  .56 (.02)             .40 (.05)  .62 (.06)  .48 (.06)             .32 (.03)  .69 (.06)  .42 (.03)
Men            0      .70 (.05)  .66 (.03)  .66 (.03)  .28 (.08)  .91 (.03)  .69 (.03)  .78 (.02)  .32 (.03)  .91 (.03)  .64 (.05)  .74 (.03)  .15 (.10)
               1      .58 (.04)  .63 (.05)  .59 (.03)             .34 (.03)  .73 (.07)  .44 (.04)             .20 (.06)  .58 (.13)  .28 (.07)
Men dyadic     0                                                  .92 (.02)  .77 (.03)  .84 (.02)  .42 (.06)  .90 (.03)  .54 (.03)  .67 (.03)  .08 (.09)
               1                                                  .43 (.06)  .75 (.07)  .52 (.05)             .16 (.04)  .57 (.13)  .25 (.06)
Women          0      .84 (.02)  .65 (.03)  .73 (.02)  .25 (.04)  .92 (.02)  .79 (.03)  .85 (.03)  .35 (.09)  .92 (.02)  .73 (.02)  .81 (.01)  .33 (.05)
               1      .39 (.03)  .62 (.05)  .46 (.03)             .36 (.07)  .64 (.10)  .79 (.08)             .35 (.04)  .67 (.07)  .45 (.04)
Women dyadic   0                                                  .93 (.02)  .78 (.04)  .84 (.02)  .35 (.08)  .93 (.02)  .70 (.04)  .79 (.03)  .23 (.08)
               1                                                  .36 (.06)  .65 (.11)  .44 (.07)             .26 (.06)  .54 (.11)  .34 (.08)

Infidelity online
All            0      .87 (.02)  .70 (.02)  .77 (.01)  .36 (.02)  .94 (.02)  .80 (.02)  .86 (.01)  .38 (.06)
               1      .46 (.03)  .71 (.02)  .55 (.02)             .36 (.06)  .71 (.08)  .44 (.06)
Men            0      .72 (.06)  .64 (.04)  .67 (.04)  .28 (.08)  .94 (.03)  .80 (.03)  .86 (.02)  .33 (.05)
               1      .56 (.04)  .65 (.06)  .59 (.04)             .36 (.04)  .71 (.09)  .44 (.04)
Men dyadic     0                                                  .90 (.03)  .67 (.02)  .77 (.02)  .24 (.05)
               1                                                  .27 (.04)  .68 (.08)  .37 (.05)
Women          0      .88 (.02)  .63 (.02)  .73 (.02)  .18 (.05)  .98 (.01)  .85 (.04)  .90 (.02)  .49 (.07)
               1      .27 (.03)  .60 (.07)  .36 (.04)             .41 (.07)  .79 (.10)  .51 (.08)
Women dyadic   0                                                  .96 (.01)  .87 (.03)  .91 (.02)  .40 (.11)
               1                                                  .39 (.11)  .62 (.14)  .45 (.11)

Intention toward infidelity
Outcome        % Var        MSE          R2
All            42.0 (.05)   0.82 (.08)   .47 (.05)
Men            58.0 (.04)   0.79 (.09)   .56 (.04)
Men dyadic     58.8 (.06)   0.85 (.10)   .54 (.06)
Women          31.6 (.05)   0.93 (.14)   .26 (.04)
Women dyadic   40.5 (.05)   0.80 (.10)   .36 (.06)

Note. Standard error across the ten folds is in brackets. 0 = no infidelity, 1 = infidelity. Pre = precision, Rec = recall, MCC = Matthews correlation coefficient, % Var = percentage of variance explained, MSE = mean squared error.

a. MCC is the overall effect size of the classification. It takes into account the true and false positives and negatives in each class and provides an overall measure of accuracy. The MCC can be interpreted akin to Pearson’s correlation coefficient, with effect sizes of small = .1, medium = .3, and large = .5.
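
As an illustration of how the metrics in Table 1 relate to one another, the following minimal sketch computes precision, recall, F1, and MCC from the four cells of a binary confusion matrix. It is not the analysis code used in the study; the function name `classification_metrics` and the example counts are hypothetical and chosen only to show the arithmetic.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, and F1 for the positive class, plus MCC.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives, and true negatives (hypothetical inputs here).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # MCC uses all four cells, so it stays informative under class imbalance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical counts: 30 infidelity cases detected, 10 missed,
# 20 false alarms, 140 correctly classified as no infidelity.
example = classification_metrics(tp=30, fp=20, fn=10, tn=140)
print(example)
```

Because the MCC depends on both classes at once, a single value is reported per model, whereas precision, recall, and F1 are reported separately for class 0 and class 1.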


Figure 1

The Top-10 Most Important Predictors for In-Person Infidelity in Sample 1

Panel "Infidelity in Person, All": Never had anal; Solitary desire; Total desire; Attachment avoidance; Relationship length; Love before sex; Never masturbate partner; ATSS total; Relationship satisfaction; Never used sex toy.

Panel "Infidelity in Person, Men": ATSS total; Solitary desire; Love before sex; Never had anal; Never used sex toy; FFMQ-SF describe; Sex demonstrates love; Sex declining; Dyadic desire; HISD desire.

Panel "Infidelity in Person, Women": Dyadic desire; Never used sex toy; Never had anal; Love before sex; Relationship length; Dyadic desire; ATSS total; Bisexual; Solitary desire; Love most important.
Figure 2

The Top-10 Most Important Predictors for In-Person Infidelity in Sample 2

Panel "Infidelity in Person, All": Relationship satisfaction; Romantic Love Scale; Solitary desire; Sexual satisfaction; Dyadic desire; Relationship length; Age; Physical health; Had vaginal sex past month; White.

Panel "Infidelity in Person, Men": Relationship satisfaction; Romantic Love Scale; Solitary desire; Receive oral past month; Sexual satisfaction; Age; Dyadic desire; Oral contraceptive; Had vaginal sex past month; Relationship length.

Panel "Infidelity in Person, Women": Romantic Love Scale; Solitary desire; Sexual satisfaction; Relationship length; Physical health; Relationship satisfaction; Dyadic desire; Age; White; Never had anal.
Figure 3

The Top-10 Most Important Predictors for Infidelity in Sample 2

Panel "Infidelity General, All": ITIS score; Ever had anal; Relationship length; Attend service weekly; Solitary desire; Sexual satisfaction; Graduated college; Relationship satisfaction; Dyadic desire; Never had anal.

Panel "Infidelity General, Men": Relationship length; Never attend service; Life satisfaction; Romantic Love Scale; Self-esteem; Solitary desire; Age; ITIS score; Graduated college; Barrier birth control.

Panel "Infidelity General, Women": ITIS score; Ever had anal; Dyadic desire; Graduated college; Self-esteem; Solitary desire; Sexual satisfaction; Romantic Love Scale; Life satisfaction; Relationship satisfaction.
Figure 4

The Top-10 Most Important Predictors for Online Infidelity in Sample 1

Panel "Infidelity Online, All": Never had anal; Man; Relationship length; Woman; Gay; Hormonal contraception; Total desire; Solitary desire; Infidelity past week; ATSS total.

Panel "Infidelity Online, Men": Straight; Solitary desire; Never had anal; Gay; Female partner; Hormonal contraception; Total desire; Relationship length; FFMQ-SF non-react; Love before sex.

Panel "Infidelity Online, Women": Never had anal; Relationship length; ATSS total; Never used sex toy; Bisexual; No religious service; Solitary desire; FFMQ-SF observe; Never masturbate partner; Total desire.
Figure 5

The Top-10 Most Important Predictors for Online Infidelity in Sample 2

Panel "Infidelity Online, All": Solitary desire; Romantic Love Scale; Relationship satisfaction; Dyadic desire; Sexual satisfaction; Ever had anal; Vaginal sex past month; Never had anal; Relationship length; Graduated college.

Panel "Infidelity Online, Men": Sexual satisfaction; Solitary desire; Romantic Love Scale; Dyadic desire; Relationship satisfaction; Ever had anal; Graduated college; Vaginal sex past month; Oral contraceptive; Never had anal.

Panel "Infidelity Online, Women": Romantic Love Scale; Solitary desire; Dyadic desire; Relationship satisfaction; Oral contraceptive; Ever had anal; Relationship length; Natural birth control; Straight; Implant birth control.
Figure 6

The Top-10 Most Important Predictors for Intention toward Future Infidelity in Sample 3

Panel "Intention toward Infidelity, All": Relationship satisfaction; Romantic Love Scale; Had engaged in infidelity; Solitary desire; Dyadic desire; Relationship length; Sexual satisfaction; Attend weekly service; Bisexual; Never had anal.

Panel "Intention toward Infidelity, Men": Relationship satisfaction; No religion; Relationship length; Bisexual; Attend weekly service; Solitary desire; Sexual satisfaction; Romantic Love Scale; Straight; Dyadic desire.

Panel "Intention toward Infidelity, Women": Relationship satisfaction; Romantic Love Scale; Had engaged in infidelity; Sexual satisfaction; Dyadic desire; Solitary desire; Age; Life satisfaction; Attend weekly service; Relationship length.
Figure 7

The Results for the Most Important Moderators for Infidelity and Intention toward Infidelity in Sample 3

Panels: Engaged in Infidelity (All, Women, Men) and Intention toward Infidelity (All, Men, Women).

Axis variables: ITIS score in the Engaged in Infidelity panels; Relationship satisfaction in the Intention toward Infidelity panels. Other variables shown: Attend service weekly; Graduated college; Romantic Love Scale; Ever had anal; No religion (Intention toward Infidelity, Men and Women).
