Research Article – Vol. 01, Issue 03, No. 01
Journal of Artificial Intelligence and System Modelling
Journal Web Page: https://jaism.bilijipub.com
Predicting the Matching Possibility of Online Dating Youths Using a Novel Machine Learning Algorithm
Karthikeyan Palanisamy 1, Muthumani Muralidharan 2*
1. Assistant Professor, School of CS & IT, JAIN (Deemed-to-be University), Bengaluru
2. PPG College of Arts & Science, Coimbatore, Tamilnadu, India
Highlights
➢ Speed dating offers efficient and convenient face-to-face connections for busy singles.
➢ The study uses LGBC combined with HGSOA, FFO, and MO for hybrid relationship forecasting.
➢ The LGBC model shows lower accuracy (0.938) compared to LGHS (0.945) and LGMO (0.956).
➢ The HGFF model excels with 0.965 accuracy, the best for predicting early relationship dynamics.
Article Info
Received: 29 April 2024
Received in revised form: 26 June 2024
Accepted: 29 June 2024
Available online: 30 June 2024
Keywords
Speed Dating
Online Dating
Henry Gas Solubility Optimization Algorithm
Flying Fox Optimization
Mayflies Optimization
Machine Learning
Abstract
In today's fast-paced society, many choose speed dating since it is efficient and convenient. Speed
dating events are organized to allow busy singles to meet a variety of potential partners in a short
timeframe, thereby maximizing their chances of making connections. It creates an organized setting
that encourages brief but significant contacts, allowing people to quickly assess chemistry and
compatibility. Furthermore, in the digital age, when online dating can be impersonal, speed dating
provides face-to-face connection, which increases authenticity and reduces the ambiguity of online
profiles. In general, speed dating appeals to modern daters who want quick and tangible results in
their search for romance. This research project aims to gain insights into forecasting the course of
relationships created during initial meetings utilizing cutting-edge Machine Learning (ML)
approaches. Light Gradient Boosting Classification (LGBC) serves as a foundational framework,
and an innovative approach is introduced by combining it with the Henry Gas Solubility
Optimization Algorithm (HGSOA), Flying Fox Optimization (FFO), and Mayflies Optimization
(MO), resulting in a hybrid model. The investigation reveals that throughout the training phase, the
LGBC model achieved a lower accuracy of 0.938, suggesting its comparative inferiority to the LGHS
and LGMO models, which achieved accuracies of 0.945 and 0.956, respectively. Nonetheless, the
hybrid HGFF model emerged as the most accurate, outperforming all other competitors with
an accuracy of 0.965. As a result, it is often regarded as the best model for anticipating
relationship dynamics during early meetings, providing vital insights into the complexities of
relationships on first dates.
1. Introduction
Speed dating events are typically set up with circular or
semi-circular tables and pairs of seats facing each other [2].
Participants are allocated a table number or location and
cycle across the room to meet new individuals. Before the
event begins, each participant is handed a scorecard or
form on which they can record the names or identification
numbers of those they are interested in [3].
Speed dating is a planned matchmaking technique
that allows people to quickly meet and connect with
possible romantic partners. Companies or groups typically
sponsor speed dating events, which are generally hosted in
pubs, restaurants, or community centers [1]. The concept
consists of a series of brief, timed interactions between
individuals, each lasting from three to ten minutes.
Corresponding Author: Muthumani Muralidharan
Email: principalppgcas@ppg.edu.in
Once the event begins, attendees participate in chats
with one another, getting to know one another in the
allotted time. These chats are usually informal and
lighthearted, covering themes such as hobbies, interests,
and goals [4]. The strict time limit pushes
participants to create a rapid impression and determine
whether they have a connection with the person they are
speaking with. At the end of each round, participants
receive a signal to go to the next table or exchange partners.
This rotation repeats until every person has had a chance to
meet everyone at the event. After that, attendees send their
scorecards to the event planners, indicating which people
they want to see again [4].
If two participants have a common interest, the event
organizers promote the sharing of contact information,
allowing them to interact beyond the speed dating event. If
no mutual interest is demonstrated, the participants part
ways, and the organizers may offer options for future events
or matching services [5], [6], [7]. Overall, speed dating
provides a quick and effective means for singles to meet
possible love partners, as opposed to traditional dating
techniques, which may be time-consuming and
unpredictable. It creates a calm and social environment in
which people may develop important relationships in a
short period [8], [9], [10].
ML has significant benefits for forecasting the
continuation of partnerships after a first date [11]. ML
algorithms excel in detecting small patterns indicative of
relationship potential by analyzing large datasets
containing diverse behavioral, vocal, and nonverbal signs
seen during interactions [12]. These algorithms use
complex analytical approaches to interpret data quickly and
effectively, resulting in accurate predictions that
outperform human capabilities [12]. One key advantage is
the objectivity inherent in ML models. Without personal
human judgment, which may be influenced by prejudices
and preconceptions, algorithms rely only on observable
facts, reducing the possibility of personal biases clouding
the research. This objective review improves decision-making by giving people a better grasp of the relationship's
potential. Furthermore, ML enables individualized
predictions based on the unique qualities and preferences
of the persons involved [13]. By examining varied datasets
that include a variety of demographic and behavioral
factors, computers can account for individual variances,
resulting in more nuanced and accurate projections. This
tailored method improves the relevance and application of
forecasts, allowing individuals to make educated decisions
based on their circumstances. ML models may also be
refined and improved continually. Algorithms improve
their forecast accuracy and flexibility over time through
iterative learning procedures that incorporate fresh data,
feedback, and results. This continuous development
guarantees that forecasts remain current and responsive to
changing dating dynamics and social trends, increasing
their usefulness and dependability over time [11].
Although many researchers have embraced speed dating's scientific promise, such approaches, like other
methodological breakthroughs, should be approached with
caution (Finkel et al. [14]). For example, while speed dating
has significant external validity in certain areas, it may lack
it in others [15]. After all, speed dating events differ
significantly from traditional methods of meeting love
partners, and these distinctions may appeal mainly to a
minority of singles. Such external validity difficulties,
however, are not unique to speed dating. Scholars have yet
to establish (a) how romantic relationships begin at church
socials differ from those starting at work, at the beach, or
on the subway (e.g., perhaps interactions starting at church
benefit from spiritual rather than sexual reliability, whereas
interactions starting at the beach show the opposite
pattern); or (b) how the character traits of individuals who
meet partners in one setting differ from the personalities of
people who meet partners in another. Future study might reveal
whether particular methods of finding partners are more
appropriate for some people than others. Another possible
risk is that speed dating may not result in romantic
attraction. The academic usefulness of speed-dating
methods would be significantly reduced if speed-daters
were only infrequently attracted to each other or initiated
post-event contact (compared to parallel frequency in other
contexts). Fortunately, preliminary data shows that speed
dating may be an exceptionally efficient way of introducing
people who then pursue follow-up dates with one other
[14].
1.1. Objective
In today’s fast-paced society, speed dating has become
a popular method for individuals to efficiently meet
potential partners amidst their busy schedules. These
events offer participants a structured environment to
engage in brief, face-to-face interactions aimed at assessing
compatibility and chemistry quickly. Unlike online dating
platforms, which can sometimes feel impersonal and
fraught with uncertainties related to profile accuracy and
authenticity, speed dating provides immediate
interpersonal feedback. This direct interaction appeals to
modern daters seeking tangible results in their quest for
meaningful relationships. While existing research
acknowledges the effectiveness of speed dating in
facilitating initial connections, this study seeks to advance
the field by integrating cutting-edge Machine Learning
(ML) techniques to predict the trajectory of relationships
formed during these brief encounters. Specifically, we
employ the Light Gradient Boosting Classification (LGBC)
as a foundational framework and introduce a novel hybrid
model that incorporates the Henry Gas Solubility
Optimization Algorithm (HGSOA), Flying Fox
Optimization (FFO), and Mayflies Optimization (MO). This
hybrid approach aims to enhance predictive accuracy
beyond what traditional models achieve, offering insights
into the complex dynamics at play during initial meetings.
Furthermore, this study contributes to the literature by not
only forecasting relationship outcomes but also by
demonstrating the efficacy of hybrid ML models in
enhancing prediction accuracy. By comparing the hybrid
model (HGFF) against standard LGBC and other variants
(LGHS and LGMO), significant improvements in accuracy
were demonstrated, underscoring the applicability of
advanced ML techniques in understanding and predicting
relationship development from initial interactions.
Refinement approaches such as feature selection,
hyperparameter tuning, or algorithm selection can enhance
prediction accuracy [26]. ML may help forecast the continuation of
speed dating encounters by studying participant
interactions. Using historical data and prediction
algorithms, event organizers may improve matching
success and provide a more enjoyable experience for
attendees [27].
2. Datasets and Methods
2.1. Data Collection
The dataset consists of interactions between
participants during these events, focusing on various
attributes relevant to initial impressions and compatibility
assessment. In total, the dataset comprises 1000 samples.
Data preprocessing involves several steps to ensure quality
and consistency. Firstly, missing data points, primarily
from rating scales and demographic information, were
handled using mean imputation for numeric variables and
mode imputation for categorical variables. Outliers,
identified through box plots and domain knowledge, were
either corrected or removed to prevent skewing results
during model training. Furthermore, to maintain data
integrity, categorical variables were encoded using one-hot
encoding, while numeric variables were standardized to a
mean of 0 and a standard deviation of 1. This normalization
step aimed to mitigate the influence of varying scales and
magnitudes across different features, ensuring fair
representation in our predictive models.
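As a sketch, the preprocessing steps above (mean/mode imputation, one-hot encoding, and z-score standardization) can be expressed in a few lines of pandas; the column names here are illustrative assumptions, not the dataset's actual schema:

```python
import pandas as pd

def preprocess(df: pd.DataFrame, numeric_cols, categorical_cols) -> pd.DataFrame:
    """Mean/mode imputation, one-hot encoding, and z-score standardization."""
    df = df.copy()
    # Mean imputation for numeric variables
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].mean())
    # Mode imputation for categorical variables
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode()[0])
    # One-hot encode categorical variables
    df = pd.get_dummies(df, columns=categorical_cols)
    # Standardize numeric variables to mean 0 and standard deviation 1
    for col in numeric_cols:
        df[col] = (df[col] - df[col].mean()) / df[col].std(ddof=0)
    return df

# Toy frame with missing values (attribute names are assumptions)
raw = pd.DataFrame({
    "age": [24.0, 27.0, None, 31.0],
    "attractiveness": [7.0, None, 6.0, 8.0],
    "gender": ["m", "f", None, "f"],
})
clean = preprocess(raw, ["age", "attractiveness"], ["gender"])
```

The same steps could equivalently be assembled from scikit-learn's `SimpleImputer`, `OneHotEncoder`, and `StandardScaler` inside a pipeline.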
1.2. Study Procedure
ML can predict whether or not speed dating sessions
will continue by examining multiple data points about the
interactions of the participants [16]. Initially, data is
gathered from previous speed dating events, including
demographics, hobbies, discussion duration, mutual
interest, and body language indicators [17]. These data
points are used as features to train prediction algorithms
[18]. During the training phase, ML methods such as
logistic regression, decision trees, and neural networks are
used. These algorithms learn from previous data,
recognizing patterns and links between characteristics and
the outcome variable: whether participants choose to
maintain their connection after the event [19], [20].
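As a minimal illustration of this training step, the sketch below fits one of the named baselines (logistic regression) on synthetic stand-in features; the feature names and the generative rule are assumptions for demonstration only:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for historical speed-dating features (names are assumed):
# columns = [shared_interests, conversation_quality, mutual_interest_score]
X = rng.normal(size=(500, 3))
# Assumed generative rule: higher feature values raise the odds of a continued match
y = (X @ np.array([1.0, 0.5, 1.5]) + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
test_accuracy = clf.score(X_te, y_te)  # fraction of held-out pairs predicted correctly
```

A decision tree or neural network would slot into the same fit/score pattern, which is what makes comparing the baselines mentioned above straightforward.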
Feature engineering is critical for retrieving useful data.
Participants' ages, genders, common interests,
conversation quality, and nonverbal signs such as eye
contact are all possible features [21]. The algorithm learns
to weigh these factors and determine which combinations
indicate sustained interest [22].
Throughout prediction, the trained model examines
real-time or post-event data to determine the likelihood of
sustained interest between pairings [23], [24]. This
prediction helps event organizers find possible matches,
which improves the speed dating experience. Using ML,
organizers may provide targeted follow-up services or
suggestions, enhancing the success rate of matchmaking
during the event [25]. The model's performance is assessed
using evaluation criteria such as accuracy, precision, recall,
and area under the ROC curve. Refinement approaches
such as feature selection and hyperparameter tuning are applied iteratively.
2.2. Description
On the first date, couples usually pay attention to a
variety of aspects. These include physical attractiveness,
conversational flow, common interests, and mutual
compatibility. Body language and nonverbal clues play an
important part in determining interest and comfort levels.
Additionally, people frequently evaluate their date's
abilities to communicate, politeness, and overall attitude.
Emotional connection and chemistry are quite important,
and they frequently influence whether a second date is
wanted. The atmosphere and activities chosen for the date
may have a big influence on the entire experience and
impression left. Finally, first dates provide a chance for
initial impressions and future relationship prospects to
develop spontaneously.
When comparing the components included in a first
date to one another, it is clear that they are interconnected
and contribute to the entire experience. Physical appeal
frequently sparks attention, acting as an initial lure.
Conversational flow and common interests, on the other
hand, strengthen the bond and promote emotional
involvement. Body language and nonverbal clues
supplement spoken communication by providing insight
into mutual compatibility. Similarly, measuring manners
and behavior indicates common ideals and respect.
"Together" stresses the combination of these elements,
emphasizing the synergy and coherence among persons. In
essence, each aspect influences the quality of the contact
and the possibility of forming a meaningful relationship.
The influence of additional factors, such as shared interests,
partner intelligence, and physical attractiveness, on the
sustenance of a relationship is delineated in Fig. 1.
Fig. 1. Correlation of the input and outputs
2.1.1. Feature Selection
Feature selection is a critical aspect of the ML process,
aimed at identifying and utilizing the most relevant features
from the dataset to improve model performance and
interpretability. The primary goal of feature selection is to
reduce the dimensionality of the dataset by eliminating
irrelevant or redundant features while retaining those that
contribute the most to the predictive power of the model.
This process offers several benefits, including improved
model accuracy, reduced overfitting, faster training times,
and enhanced interpretability of the model. There are
various techniques for feature selection, ranging from
simple filter methods based on statistical tests or
correlation analysis to more complex wrapper methods,
which involve evaluating different subsets of features using
a specific ML algorithm.
Fig. 2 presents the F-statistic feature selection result of
the input variables. The F-statistic feature selection
method, also known as analysis of variance (ANOVA), is a
statistical technique used to assess the significance of
individual features in a dataset concerning the target
variable. It is commonly employed in regression and
classification tasks to identify the most relevant features for
predicting the target variable. The F-statistic assesses the
relationship between each predictor variable and the target
variable by evaluating the variability explained by the
predictor compared to the variability not explained. This
method is particularly suited for the dataset, which includes
a mix of numeric and categorical features related to
participant demographics, ratings, and perceived
attributes. This approach not only streamlined model
complexity but also improved interpretability by focusing
on variables that most significantly influence relationship
predictions. The empirical evaluation indicated that feature
selection using the F-statistic method enhanced model
performance by reducing overfitting and increasing
prediction accuracy.
In this study, a selection of six features has been
meticulously curated to optimize results and circumvent
the inclusion of irrelevant features during the training
process.
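A minimal sketch of this selection step using scikit-learn's ANOVA F-statistic scorer, with synthetic data standing in for the study's features and k = 6 as stated above:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in: 12 candidate features, only some of which carry signal
X, y = make_classification(n_samples=300, n_features=12, n_informative=5,
                           n_redundant=2, random_state=42)

# ANOVA F-statistic scoring, keeping the six best features as in the study
selector = SelectKBest(score_func=f_classif, k=6)
X_selected = selector.fit_transform(X, y)

kept = selector.get_support(indices=True)  # column indices of retained features
```

`selector.scores_` exposes the per-feature F-statistics, which is what a plot such as Fig. 2 would typically be drawn from.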
Fig. 2. Feature Selection result of the input variables
2.3. Light Gradient Boosting Classification (LGBC)
Because ensemble classifiers perform better in
classification than individual classifiers, they have attracted
more interest in the fields of ML and pattern recognition.
To improve classification accuracy, predictions from
different (𝑠𝑖𝑛𝑔𝑙𝑒) classifiers are combined using the
majority voting procedure. Several techniques, such as
Random Forest (RF), bagging, boosting, stacking, and
others, are frequently used to build ensemble classifiers. In
the context of this research, which focuses only on using
LightGBM as an ensemble strategy, the fundamentals of
boosting will be particularly examined. The boosting
strategy involves training a set of separate classifiers one at
a time to improve the performance of the weaker ones.
Training data with the same weights is used in the first step
of the iteration, and these weights are recalibrated during
the training phase. Samples misclassified by the weaker
classifiers of previous iterations are given higher
weights, which corrects their classification in the next
iteration. Various boosting strategies are used in the field of
remote sensing and ML. Two examples of these techniques
are gradient-boosted decision trees (GBDT) and gradient-boosting machines (GBM).
A large number of recent studies have investigated and
evaluated new ensemble learning methods for remotely
sensed picture categorization. The CCF algorithm (2015)
and XGBoost (2016) are two examples. LGBM is a relative
newcomer to the ensemble learning scene and has attracted
a lot of interest from the ML field. Recent ML and data
science contests have shown that it performs better than
other boosting frameworks, especially when dealing with
massive datasets. Please refer to the next paragraph for an
overview of LGBM; Ke et al.'s study [28] contains more
detailed information. The LightGBM classifications were
carried out using the LightGBM
Python package. Decision tree algorithms serve as the
foundation for the gradient-boosting framework known as
LGBM.
Unlike previous ensemble learning algorithms, which
use the level-wise method, the leaf-wise technique is used
for tree development. Two cutting-edge techniques that set
the LGBM platform apart are exclusive feature bundling
(EFB) and gradient-based one-side sampling (GOSS).
Rather than using all instances, GOSS uses a subset made
up of smaller instances, whereas EFB combines exclusive
properties into smaller bundles. When these techniques are
used in LGBM, it produces benefits like faster learning
times and higher accuracy than other gradient-boosting
frameworks. Specifically, the short training time and low
memory use have led to the name "LGBM." For best results,
it is necessary to tune the model's parameters, which
include the boosting type, maximum depth, learning rate, and leaf count.
$X_{i,j}(t+1) = X_{i,j}(t) + F \times r \times \gamma \times (X_{j,best}(t) - X_{i,j}(t)) + F \times r \times \alpha \times (S_{i,j}(t) \times X_{best}(t) - X_{i,j}(t))$,
$\gamma = \beta \times exp\left(-\frac{F_{best}(t)+\varepsilon}{F_{i,j}(t)+\varepsilon}\right)$   (4)
with $r$ a random number and $\alpha$, $\beta$, and $\gamma$ constants. $F_{i,j}(t)$
is the fitness of gas $i$ in cluster $j$, and $F_{best}(t)$ is the best fitness
value across all clusters; typically, the fitness is taken as the value of the
objective function.
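A NumPy sketch of the position update in Eq. (4); the constants α, β, ε and the ±1 direction flag F are illustrative assumptions, not values prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(1)

def hgso_position_update(X, X_cluster_best, X_best, S, F_fit, F_best,
                         alpha=1.0, beta=1.0, eps=0.05):
    """One HGSO position update per Eq. (4); constants are illustrative."""
    r = rng.random(X.shape)                         # random number r
    F = rng.choice([-1.0, 1.0], size=X.shape)       # assumed +/-1 direction flag
    gamma = beta * np.exp(-(F_best + eps) / (F_fit + eps))
    return (X
            + F * r * gamma[:, None] * (X_cluster_best - X)
            + F * r * alpha * (S[:, None] * X_best - X))

# Toy run: 5 gases in 3 dimensions
X = rng.random((5, 3))
X_new = hgso_position_update(X, X_cluster_best=X.mean(0), X_best=X[0],
                             S=rng.random(5), F_fit=rng.random(5), F_best=0.1)
```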
2.4. The Henry Gas Solubility Optimization Algorithm (HGSO)
This section covers the basic idea of the HGSO. The
algorithm was recently proposed by Hashim et al. [29] and
is based on Henry's law. Its procedure can be explained
mathematically as follows:
Initialization: Select the number of gases ($N$) and
their initial locations. Initializing the partial pressure on the
gases is necessary. The gas's location, indicated by $x_{i,j}$, may
be found using the formula below:
$X_i(t+1) = X_{min} + r(X_{max} - X_{min})$   (1)
2.4.1. Escape from local optimum
This step's objective is to move away from the local
optimum. The following formula is used to determine the
number of worst agents $N_w$:
$N_w = N \times (rand \times (c_2 - c_1) + c_1)$, with $c_1 = 0.1$ and $c_2 = 0.2$   (5)
The number of search agents is denoted by 𝑁.
2.4.2. Update the position of the worst agents
For calculating the new ranks of the 𝑁𝑤 worst gases, an
alternative formula has been utilized. The following is the
formula:
$G_{i,j} = G_{min(i,j)} + r \times (G_{max(i,j)} - G_{min(i,j)})$   (6)
with $X_{max}$ and $X_{min}$ the upper and lower limits, and
$r$ a random number between 0 and 1.
Clustering: Selecting the number of clusters is an
essential process. The same kind of gases ought to be in the
same group.
Evaluation: There are two steps to this. Finding the
greatest gas inside each cluster is the first step toward
identifying the best gas overall. The gases can be ranked
using the objective function. 𝑋𝑗,𝑏𝑒𝑠𝑡 indicates the location of
the best gas within each cluster (𝑗), while 𝑋𝑏𝑒𝑠𝑡 indicates the
position of the best gas across clusters.
Update Henry's coefficient: Each cluster's Henry's
coefficient is calculated using the following formula:
$H_j(t+1) = H_j(t) \times exp\left(-C_j\left(\frac{1}{T(t)} - \frac{1}{T^{\theta}}\right)\right), \quad T(t) = exp\left(-\frac{t}{iter}\right)$   (2)
with $r$ a random number, and $G_{min(i,j)}$ and $G_{max(i,j)}$ the
lower and upper bounds.
2.5. Flying Foxes Optimization Algorithm (FFO)
FFO, or flying foxes optimization, is a population-based stochastic method that makes use of the strategies that
FF employ to withstand extreme heat. It uses a
hybrid algorithm structure that depends on the
attraction constant, replacement list, and population size.
These variables affect how well the algorithm performs.
In Eq. (2), $T(t)$ is the temperature at the $t$-th iteration, $iter$ is
the maximum number of iterations, and $T^{\theta}$ is a constant
throughout iterations.
Update solubility: Each gas's solubility is
determined using the formula below:
$S_{i,j}(t) = K \times H_j(t+1) \times P_{i,j}(t)$   (3)
2.5.1. Functioning of Flying Foxes Algorithm
Some of the biggest bat species on the planet are
represented by FF. Since they are unable to echolocate,
their ability to move about in space depends on their
awareness of their surroundings.
They return to their habitat trees after their evening
feeding. Foxes that fly look for cooler trees to rest upon to
shield themselves from heat waves that arise in the
morning. The majority of the time, FFs suffocate each other
and perish when they find a tree with a suitable level of heat
first.
𝑃𝑖𝑗 (𝑡) represents the partial pressure on gas 𝑖 in cluster
𝑗, whereas 𝐾 is a constant.
Update position: The following equation is used to
update the particle positions:
2.5.2. The Application of FFO Algorithm
The starting point of this new paradigmatic method is
an arbitrary collection of initial positions, one for each FF.
A vector with m-dimensional components, $x = (x_1, \dots, x_m)$,
is used to represent these positions. The objective function
then evaluates these solutions. To ensure survival in the
event of extreme heat, each FF searches for a cooler tree.
2.5.4. Death and Replacement of Flying Foxes
FF die for a variety of reasons. For example, in their
search for the coolest tree, they can wind up in a very
distant area with extreme heat, in which case they cannot
avoid dying. An alternative is to use a replacement list
($RL$) built from the $N_L$ unique optimum solutions: an
arbitrary number $n \in [2, N_L]$ is generated, and the position
of a newly generated FF is denoted by the equation that follows:
$x_{i,j}^{t+1} = \frac{\sum_{k=1}^{n} RL_{k,j}^{t}}{n}$   (10)
2.5.3. Movement of FF
Given that FF follow one another's trails and seek the
closest tree, they are likely to migrate to a new tree
to escape excessive heat if their habitat
tree does not offer a comfortable minimum temperature.
This movement can be formulated as:
$x_{i,j}^{t+1} = x_{i,j}^{t} + a \cdot rand \cdot (cool_j - x_{i,j}^{t})$   (7)
At iteration $t$, $a$ is a constant value, $rand \sim U(0,1)$, and
$cool_j$ denotes the position of the FF in the tree with the
lowest temperature; with $x_i^0 \sim U(x_{min}, x_{max})$, $x_{i,j}^t$ is the
$j$-th component of FF $i$. Eq. (7) is applied when
$|f(cool) - f(x_i)| > \delta_1/2$, where $cool$ is the flying fox's
position vector at the coolest spot ever identified, i.e., the
best solution to date, and the parameter $\delta_1$ corresponds to
the maximum distance at which two flying foxes may be
considered close to one another. As an FF approaches the
tree with the lowest temperature, i.e., $|f(cool) - f(x_i)| \le \delta_1/2$,
it looks for the closest free space to avoid suffocating; this
phenomenon is described by Eqs. (8) and (9).
Thus, at iteration $t$, $RL_k^t$ represents the $k$-th FF on the
$RL$. The goal of Eq. (10) is to increase the likelihood of
finding a suitable location. Suffocation by other members
of the colony is another way that FF might perish. In this
instance, a probability is established based on the number
of flying foxes detected at the spots with the lowest
temperatures prior to the conclusion of an iteration:
$p_D = \frac{n_c - 1}{population\ size}$   (11)
where $n_c$ is the number of FF with an objective function
value comparable to the ideal solution.
2.5.5. Crossover Process
The process of genetic crossover is used to mate two
FF. First, two parents are chosen at random from the
population, ensuring they are not identical. The crossover
procedure produces two offspring, as follows:
$offspring_1 = L \cdot R_1 + (1 - L) \cdot R_2$
$offspring_2 = L \cdot R_2 + (1 - L) \cdot R_1$   (12)
The movement toward the coolest spot is further described
by the following equations:
$nx_{i,j}^{t+1} = x_{i,j}^{t} + rand_{1,j} \cdot (cool_j - x_{i,j}^{t}) + rand_{2,j} \cdot (x_{R1,j}^{t} - x_{R2,j}^{t})$   (8)
$x_{i,j}^{t+1} = \begin{cases} nx_{i,j}^{t+1} & \text{if } j = k \text{ or } rnd_j \ge p_a \\ x_{i,j}^{t} & \text{otherwise} \end{cases}$   (9)
where $p_a$ is a probability constant, $rnd_j$ is an arbitrary
number between 0 and 1, $rand \sim U(0,1)$, and $x_{R1}^t$ and $x_{R2}^t$ are
two arbitrary members of the current population.
Eventually, $k$ is selected at random from $\{1, 2, \dots, m\}$,
ensuring that $x_i^{t+1}$ takes at least one component from
$nx_i^{t+1}$ so that the new solution and the existing one
do not duplicate each other. The computed solutions are
then evaluated, and an FF is accepted as a new solution
once it has located the tree with the lowest temperature;
otherwise, it returns to its most recent position.
In the crossover, randomly picked members of the population $R_1$ and
$R_2$ are identified, whereas a randomly produced value $L$
falls between 0 and 1. Fig. 3 illustrates the process of the
FFO algorithm.
Fig. 3. Process of the FFO.
Male mayflies update their velocities based on their
current velocities, their distances from their historical best
and the global best locations, and the historical best
trajectories if $f(x_i) > f(xh_i)$:
$v_i(t+1) = g \cdot v_i(t) + a_1 e^{-\beta r_p^2}[xh_i - x_i(t)] + a_2 e^{-\beta r_g^2}[x_g - x_i(t)]$   (14)
2.6. Mayflies Optimization Algorithm (MO)
For the MO algorithm, male and female mayflies in
swarms would be distinguished. Additionally, the male
mayflies would always be stronger, which would improve
their optimization. The people in the MO algorithm would
update their locations based on their current positions
𝑝𝑖 (𝑡) and velocity 𝑣𝑖 (𝑡) at the current iteration, just like the
individuals in swarms of the 𝑃𝑆𝑂 𝑎𝑙𝑔𝑜𝑟𝑖𝑡ℎ𝑚:
$p_i(t+1) = p_i(t) + v_i(t+1)$   (13)
Eq. (13) would be used by all male and female mayflies
to update their locations. Their velocities would be updated
in various ways, though.
2.6.1. Movements of Male Mayflies
During iterations, male mayflies (MM) in swarms continue the
process of exploration or exploitation. Their present fitness
values, $f(x_i)$, and their historical best fitness values along
their trajectories, $f(xh_i)$, are used to adjust their velocities.
In Eq. (14), the variable $g$ decreases linearly from its
highest value, and the terms are balanced by the constants
$a_1$, $a_2$, and $\beta$. The second norm is the Cartesian distance
between individuals and their historical optimal locations
within swarms:
$\|x_i - x_j\| = \sqrt{\sum_{k=1}^{n} (x_{ik} - x_{jk})^2}$   (15)
In contrast, the MM update their velocities from
the present one with a random dance coefficient $d$ if
$f(x_i) < f(xh_i)$:
$v_i(t+1) = g \cdot v_i(t) + d \cdot r_1$   (16)
where $r_1$ represents a uniformly distributed random number
chosen from the domain $[-1, 1]$. An additional constant,
$a_3$, is also utilized to balance the velocities, and $r_{mf}$
represents the Cartesian distance between a female and the
male she mates with. The FM, in turn, update their
velocities from the present one with another arbitrary
dance coefficient $fl$ if $f(y_i) \ge f(x_i)$:
$v_i(t+1) = g \cdot v_i(t) + fl \cdot r_2$   (18)
2.6.2. Movements of Female Mayflies
The FM update their velocities in a different way.
Since, according to biology, female mayflies can only live
for one to seven days at most, they hurry to locate MM in
order to mate and procreate. Consequently, they revise
their velocities according to the MM with whom they wish
to mate: in the MO algorithm, the best female mates with
the best male, the second-best female with the second-best
male, and so on. Thus, in the case of the $i$-th FM, if
$f(y_i) < f(x_i)$:
$v_i(t+1) = g \cdot v_i(t) + a_3 e^{-\beta r_{mf}^2}[x_i(t) - y_i(t)]$   (17)
In Eq. (18), $r_2$ is likewise a uniformly distributed random
number in the domain $[-1, 1]$.
2.6.3. Mating of mayflies
The top half of mayflies would mate and produce a pair
of offspring, who would randomly develop from their
parents.
$offspring_1 = L * male + (1 - L) * female$   (19)
$offspring_2 = L * female + (1 - L) * male$   (20)
where $L$ is a random value drawn from a Gaussian
distribution. The procedure of the MO algorithm is
presented in Fig. 4.
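The blend crossover of Eqs. (19)-(20) can be sketched as follows; the Gaussian parameters for $L$ are assumptions, since the text only states the distribution family:

```python
import numpy as np

rng = np.random.default_rng(2)

def mayfly_crossover(male, female):
    """Blend crossover of Eqs. (19)-(20); Gaussian parameters for L are assumed."""
    L = rng.normal(loc=0.5, scale=0.1)          # random Gaussian blend factor
    offspring1 = L * male + (1 - L) * female    # Eq. (19)
    offspring2 = L * female + (1 - L) * male    # Eq. (20)
    return offspring1, offspring2

male = np.array([1.0, 2.0, 3.0])
female = np.array([3.0, 2.0, 1.0])
o1, o2 = mayfly_crossover(male, female)
```

A useful property of this blend is that the two offspring always sum to the two parents, so the population's centroid is preserved by mating.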
Fig. 4. MOA procedure.
2.7. Performance Criteria
The efficacy of any classification method hinges on a
thorough assessment of its performance, unveiling its
strengths and weaknesses. This evaluation process
demands a careful selection of metrics, guided by various
factors including data characteristics, error costs, and
project objectives. Eqs. (21)-(24) present the formulas of
the utilized metrics:
$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$   (21)
$Precision = \frac{TP}{TP + FP}$   (22)
$Recall = TPR = \frac{TP}{P} = \frac{TP}{TP + FN}$   (23)
$F_1\ score = \frac{2 \times Recall \times Precision}{Recall + Precision}$   (24)
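These four metrics can be computed directly from confusion-matrix counts; the counts below are illustrative only:

```python
def classification_metrics(tp, tn, fp, fn):
    """Eqs. (21)-(24) computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also the true-positive rate TP/P
    f1 = 2 * recall * precision / (recall + precision)
    return accuracy, precision, recall, f1

# Toy confusion matrix (counts are illustrative)
acc, prec, rec, f1 = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```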
Initiating and nurturing a romantic bond is seen as a
paramount preoccupation for individuals. Speed dating,
serving as a facilitator, provides individuals with the
opportunity to cultivate meaningful connections. However,
the primary aim of this paper is to delve into the realm of
ML algorithms to predict the compatibility of couples upon
their inaugural rendezvous. The outcomes generated by the
assembled models, which integrate diverse models with
specified optimizers, are subjected to meticulous scrutiny
via an array of plots and tables. The overarching objective
is to pinpoint the most efficacious model endowed with
supreme functionality, thus advancing the understanding
of relationship dynamics through computational
methodologies.
(22)
(23)
3.1. Convergence Curve
A convergence curve records model performance over
several training cycles. The best model is usually recognized
when the curve hits a plateau or stabilizes, signifying peak
performance. This is characterized by little loss or error and
high accuracy or other related measures. In contrast, the
poorest model is generally identified by erratic or
consistent bad performance, with the curve failing to
converge or even deteriorating with time. In most cases,
such events indicate inadequate learning or model
overfitting. Monitoring the convergence curve allows
practitioners to determine where the model achieves its
maximum efficacy and suggest areas for development or
intervention.
The convergence curve depicted in Fig. 5 serves as a
comprehensive illustration of the performance exhibited by
the LGHS, LGMO, and LGFF models. Upon scrutiny, it
becomes evident that the LGHS model attains its optimal
state starting from the 100th iteration, boasting an
impressive accuracy of 0.938. Notably, this model initiates
its journey with a modest accuracy of 0.4 in the initial
iteration, steadily advancing to 0.8 by the 80th iteration. In
contrast, the LGMO model, with its accuracy peaking at
0.952 in the 100th iteration, showcases superior
functionality when compared to the LGHS model. Despite
starting at an accuracy of 0.5, it progressively improves to
0.8 by the 80th iteration, indicating commendable
advancement. On a different note, the LGFF model stands
out with its remarkable accuracy of 0.968, underscoring its
heightened predictive potential relative to its counterparts.
However, its accuracy at the 100th iteration registers at 0.7,
albeit with a notable acceleration in development compared
to the other models.
They show that while the LGBC model initially
improves rapidly, its convergence stabilizes at a moderate
accuracy level. In contrast, models utilizing hybrid
(24)
When a scenario ends negatively, the acronym FP
stands for a positive projection, while TP indicates a
positive forecast that matches the fortunate occurrence.
When TN is used for a negative prediction, the predicted
result is similar to the actual negative event. When things
work out well, the FN signal indicates a bleak future.
Accuracy measures the proportion of correctly
predicted outcomes (both positive and negative) relative to
the total number of predictions. In this study, accuracy
indicates the overall correctness of relationship predictions
made by each model, reflecting its ability to correctly
identify successful and unsuccessful matches. Precision
quantifies the proportion of true positive predictions
(correctly predicted successful relationships) out of all
positive predictions made by the model. It emphasizes the
model's ability to avoid false positives, crucial in ensuring
the reliability of matchmaking predictions during speed
dating events. Recall (also known as sensitivity) calculates
the proportion of true positive predictions identified by the
model out of all actual positive instances. In the context of
this study, recall highlights the model's effectiveness in
capturing all potential successful matches, minimizing the
risk of overlooking promising connections. The F1-score
represents the harmonic mean of precision and recall,
offering a balanced assessment of a model's performance by
considering both false positives and false negatives. It
provides a consolidated measure of predictive accuracy that
considers both the completeness (recall) and correctness
(precision) of the model's predictions.
3. Results and Discussion
10
optimization algorithms like LGHS and LGMO exhibit
consistent and accelerated convergence, achieving higher
accuracies within fewer iterations. The hybrid HGFF model
emerges as the top performer, demonstrating both rapid
convergence and superior accuracy throughout training.
This convergence analysis informs the preference for the
HGFF model due to its effective integration of advanced
optimization techniques, highlighting its robustness in
predicting relationship outcomes from initial speed dating
interactions.
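The four metrics of Eqs. (21)-(24) reduce to a few lines of code over the confusion-matrix counts; this is a generic sketch, not the authors' evaluation script:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. (21)-(24) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (21)
    precision = tp / (tp + fp)                          # Eq. (22)
    recall = tp / (tp + fn)                             # Eq. (23)
    f1 = 2 * recall * precision / (recall + precision)  # Eq. (24)
    return accuracy, precision, recall, f1
```

When precision and recall coincide, their harmonic mean (the F1-score) equals the common value, which is a quick way to spot implementation errors.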
Fig. 5. 3D wall plot for the convergence curve of the hybrid models.
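The convergence curves in Fig. 5 track the best accuracy found so far at each iteration. A generic recorder for any iterative optimizer might look like the following sketch; the callback interface is an assumption for illustration, not the paper's implementation:

```python
def record_convergence(step_and_score, n_iters=100):
    """Return the best-so-far score after each of n_iters iterations.

    step_and_score advances the optimizer by one iteration and returns
    the score of the current candidate solution.
    """
    best = float("-inf")
    curve = []
    for _ in range(n_iters):
        best = max(best, step_and_score())
        curve.append(best)  # best-so-far, so the curve never decreases
    return curve
```

For example, feeding it the per-iteration scores 0.4, 0.6, 0.5, 0.8 yields the monotone curve [0.4, 0.6, 0.6, 0.8], which is the plateau-seeking shape discussed above.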
Table 1 presents the outcomes of the developed models
across the training, testing, and overall phases. Notably, in the
training phase, the LGFF model emerges as the optimal
performer, achieving an accuracy of 0.965. Furthermore,
both the LGMO and LGHS models exhibit commendable
performance, with accuracies of 0.956 and 0.945,
respectively, positioning them as models with good and
acceptable performance. Conversely, the LGBC model,
identified as the base model, displays comparatively weaker
performance, with an accuracy of 0.938. This
observation underscores the varying degrees of
effectiveness among the models, with LGFF leading the
pack and LGBC trailing behind in terms of accuracy and
predictive capability.
In the testing phase, it is observed that the precision
values of 0.912 and 0.929 for the LGBC and LGHS models,
respectively, indicate weaker performance compared to the
LGMO model, which achieves a precision value of 0.948.
However, it is noted that the precision value of the LGMO
model is surpassed by that of the LGFF model, which
attains a precision value of 0.975. Thus, the relative
performance of the models is highlighted, with LGFF
demonstrating superior precision compared to LGMO,
LGHS, and LGBC. In the All phase, it is noted that the LGFF
and LGMO models, boasting recall values of 0.968 and
0.953, respectively, are identified as the best and second-best models in this comparison. Following them, the LGHS
and LGBC models, with recall values of 0.938 and 0.927,
respectively, are indicated to lack significant potential in
the prediction process. Thus, the relative rankings of the
models are underscored, with LGFF and LGMO emerging
as top performers, while LGHS and LGBC exhibit
comparatively lower effectiveness in terms of recall.
Table 1. The outcome of the showcased developed models.

Section | Model | Accuracy | Precision | Recall | F1-score
Train   | LGBC  | 0.938 | 0.942 | 0.938 | 0.939
Train   | LGHS  | 0.945 | 0.947 | 0.945 | 0.946
Train   | LGMO  | 0.956 | 0.958 | 0.956 | 0.956
Train   | LGFF  | 0.965 | 0.967 | 0.965 | 0.966
Test    | LGBC  | 0.903 | 0.912 | 0.903 | 0.906
Test    | LGHS  | 0.923 | 0.929 | 0.923 | 0.925
Test    | LGMO  | 0.945 | 0.948 | 0.945 | 0.946
Test    | LGFF  | 0.974 | 0.975 | 0.974 | 0.974
All     | LGBC  | 0.927 | 0.933 | 0.927 | 0.929
All     | LGHS  | 0.938 | 0.941 | 0.938 | 0.939
All     | LGMO  | 0.953 | 0.955 | 0.953 | 0.953
All     | LGFF  | 0.968 | 0.969 | 0.968 | 0.969

In the line-symbol plot depicted in Fig. 6, the performance of the hybrid models across the various phases is illustrated. For instance, the LGBC model exhibits its weakest performance in the testing phase, with a recall value of 0.903, while its highest performance is demonstrated in the training phase, where it attains a precision value of 0.942. This highlights the notably different outcomes of the LGBC model between the testing and training phases. The best performance of the LGHS model is observed in the training phase, where it achieves a precision value of 0.947, contrasting with its recall value of 0.923 in the testing phase. Overall, the performance of the LGHS model in the training phase surpasses that in the other phases. However, its performance is in general weaker than that of the LGFF and LGMO models, underscoring the relative effectiveness of the LGHS model in comparison to its counterparts across the different phases, with the LGFF and LGMO models demonstrating superior performance overall.
The performance of the LGFF model is broadly consistent
across the training and All phases, with accuracies of 0.965
and 0.968, respectively, while its accuracy in the testing
phase stands notably higher at 0.974. Indeed, the LGFF
model showcases its peak functionality during the testing
phase, attaining an impressive precision value of 0.975, in
contrast to its recall value of 0.965 in the training phase.
Such variance underscores the dynamic nature of model
performance across the different phases, ultimately
highlighting the LGFF model's superior functionality in the
testing phase relative to the training phase.
Fig. 6. Line-symbol plot for the performance of the models through the metrics.
The comparison of model performance, as illustrated
in Table 2, reveals insights into their efficacy under both
Matched and Unmatched conditions. Notably, both the
LGBC and LGHS models exhibit identical functionality,
each achieving a precision value of 0.97 in the Unmatched
condition, which underscores their comparable predictive
capabilities in scenarios where matching conditions are not
met. Following this, the LGMO model displays a slightly
elevated precision value of 0.98, only a marginal difference
compared to the LGFF model's precision value of 0.99
within the same condition. This subtle variance underscores
the nuanced distinctions between the models, with LGMO
closely trailing LGFF in predictive accuracy under such
conditions.
This analysis sheds light on the intricate dynamics of
model performance, revealing both commonalities and
divergences among them. While LGBC and LGHS
demonstrate consistent outcomes, LGMO and LGFF
emerge as closely competitive alternatives, with LGFF
exhibiting slightly superior predictive accuracy. Such
insights are invaluable for refining model selection and
deployment strategies in various contexts, emphasizing the
importance of thorough evaluation and comparison. In the
Matched condition, discernible disparities in precision
values among the models come to light. Notably, the
precision value of the LGFF model, at 0.88, is the highest
among the models, showcasing a comparatively high degree
of accuracy in predicting outcomes. This nuanced
understanding underscores the importance of precision in
model evaluation and selection, as even slight differences
can significantly impact the reliability of predictions in
real-world applications.
Table 2. Condition-based categorization for the performance of the developed models.

Model | Precision (Unmatched) | Precision (Matched) | Recall (Unmatched) | Recall (Matched) | F1-score (Unmatched) | F1-score (Matched)
LGBC  | 0.97 | 0.76 | 0.94 | 0.87 | 0.96 | 0.81
LGHS  | 0.97 | 0.80 | 0.95 | 0.87 | 0.96 | 0.83
LGMO  | 0.98 | 0.84 | 0.96 | 0.91 | 0.97 | 0.87
LGFF  | 0.99 | 0.88 | 0.97 | 0.95 | 0.98 | 0.91
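The condition-based scores of Table 2 are per-class metrics computed from a 2×2 confusion matrix. A generic sketch (the matrix below is made up for illustration, not the paper's data):

```python
import numpy as np

def per_condition_metrics(cm, labels=("Unmatched", "Matched")):
    """Per-class precision, recall, and F1 from a confusion matrix.

    cm[i, j] = number of samples with true class i predicted as class j.
    """
    out = {}
    for k, name in enumerate(labels):
        tp = cm[k, k]
        fp = cm[:, k].sum() - tp  # predicted as class k but actually another class
        fn = cm[k, :].sum() - tp  # actually class k but predicted as another class
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        out[name] = {"precision": precision, "recall": recall, "f1": f1}
    return out
```

This makes explicit why the Matched scores in Table 2 trail the Unmatched ones: with fewer matched samples, each false positive or false negative weighs more heavily on that class's precision and recall.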
In Fig. 7, the radar plot provides a visual
representation of the predictive models' performance,
showcasing their values against the measured metrics. For
instance, when examining the unmatched condition, the
LGFF model emerges as the leading performer, having
successfully achieved 829 out of 852 measured values,
underscoring its efficacy in predictive accuracy within this
particular context. Following closely behind, the LGMO
model secures the second position, attaining a
commendable 820 out of 852 measured values. While
slightly trailing the LGFF model, the LGMO model's
performance remains noteworthy, reflecting its proficiency
in predicting outcomes.

In the matched condition, scrutiny reveals distinct
patterns in the performance of the various predictive models.
The LGHS model, with a tally of 159 out of 182 measured
values, emerges as a frontrunner, indicating its superior
functionality over the LGBC model, which manages to
attain 158 out of 182 measured values. However, despite its
commendable performance, the LGHS model is outshined
by the LGMO model, which achieves a higher count of 165
out of 182 measured values, thereby showcasing its
heightened predictive efficacy. It is noteworthy that the
LGFF model demonstrates the highest level of accuracy
among all models, with an impressive tally of 172 out of
182 measured values. This observation highlights the
diverse range of predictive capabilities among the models
and underscores the significance of accurate measurement
in evaluating their performance under varying conditions.
Fig. 7. Radar plot for the model performance considering the separated conditions
Fig. 8 presents the confusion matrix, offering insights
into the accuracy and misclassifications of the different models
under both "Matched" and "Unmatched" conditions in
predicting relationship outcomes from speed dating
interactions. The confusion matrix not only illustrates the
overall accuracy of each model but also delineates the specific
types of misclassifications. In the "Unmatched" condition,
the LGMO model achieves a notable accuracy of 97%,
misclassifying 17 participants. Conversely, under the
"Matched" condition, the LGMO model shows slightly
reduced accuracy at 83.75%, with 32 misclassifications.
This highlights the model's robust predictive potential,
particularly in scenarios where matches are less evident. In
contrast, the LGFF model outshines the others in the
"Unmatched" condition with an exceptional accuracy rate
of 99%, misclassifying only 10 participants. This
underscores the LGFF model's superior predictive
performance in identifying successful matches, making it
the preferred choice in this comparison. Under the
"Matched" condition, the LGFF model achieves an
impressive accuracy of 87% with 23 misclassifications,
further solidifying its effectiveness in diverse prediction
scenarios.
Fig. 8. Confusion matrices evaluating the accuracy of each model.
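A confusion matrix like those in Fig. 8 can be assembled directly from label vectors; a minimal sketch with hypothetical 0/1 (Unmatched/Matched) labels:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

The off-diagonal cells are exactly the misclassification counts discussed above, while the diagonal holds the correctly classified participants.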
3.2. Limitations and Future Study
This study employs advanced machine learning to
predict relationship outcomes from speed dating events.
However, its findings are limited by a dataset from specific
locations, requiring validation across diverse demographics
for broader applicability. Improving predictive models
involves refining feature engineering and integrating
additional data types like social media interactions.
Complex models, while accurate, lack interpretability,
suggesting a need for explainable AI techniques.
Longitudinal research is needed to understand relationship
dynamics over time for better predictive accuracy. Ethical
considerations, including privacy and fairness, must be
addressed in deploying these models. Future research
should validate models across diverse cultural contexts and
explore interventions to enhance matchmaking
effectiveness in online dating and speed dating contexts.
4. Conclusion
In this research, the application of Light Gradient
Boost Classification (LGBC) combined with optimization
algorithms, specifically Mayflies Optimization (MO), the Henry
Gas Solubility Optimization Algorithm (HGSOA), and
Flying Fox Optimization (FFO), was investigated to
improve prediction accuracy in the context of speed dating.
The goal was to use these algorithms together to improve
LGBC's prediction skills and give useful insights for
enhancing matching outcomes. The experimentation
results revealed significant findings, notably during the
training period. The LGBC and LGHS models showed
weaker performance, with accuracies of 0.938 and 0.945,
respectively. The LGMO model performed strongly, with an
accuracy of 0.956, while the LGFF model outperformed all
others with an accuracy of 0.965.
This highlights the need to use optimization methods, such
as FFO, in combination with LGBC to improve forecast
accuracy. The findings highlight the significance of using
sophisticated optimization approaches to improve the
forecasting capabilities of ML models in speed-dating settings. By combining LGBC's synergistic power
with optimization algorithms, the potential to drastically
enhance matching outcomes is revealed. These findings
have ramifications beyond speed dating, extending to a
variety of fields where predictive accuracy is critical for
decision-making. Moving ahead, further study may look
deeper into the processes by which optimization techniques
improve LGBC performance. Furthermore, more research
into the applicability of these methodologies in real-world
speed dating situations might yield useful insights for
matchmaking services and associated sectors. Overall, the
work adds to the expanding body of knowledge in predictive
modeling and optimization, opening the way for more
efficient matching algorithms and decision-making
processes in a variety of scenarios.
Ethical approval
All authors have been personally and actively involved in substantial work leading to the paper, and will take public responsibility for its content.

Competing interests
The authors declare no competing interests.

Authorship Contribution Statement
Muthumani Muralidharan: Writing-Original draft preparation, Conceptualization, Supervision, Project administration.
Karthikeyan Palanisamy: Methodology, Software.

Data Availability
On request.

Declarations
Not applicable.

Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.

Author Statement
The manuscript has been read and approved by all the authors; the requirements for authorship, as stated earlier in this document, have been met; and each author believes that the manuscript represents honest work.

Funding
Not applicable.