Research Article – Vol. 01, Issue 03, No. 01
Journal of Artificial Intelligence and System Modelling
Journal Web Page: https://jaism.bilijipub.com

Predicting the Matching Possibility of Online Dating Youths Using a Novel Machine Learning Algorithm

Karthikeyan Palanisamy 1, Muthumani Muralidharan 2,*
1 Assistant Professor, School of CS & IT, JAIN (Deemed-to-be University), Bengaluru
2 PPG College of Arts & Science, Coimbatore, Tamilnadu, India
* Corresponding Author: Muthumani Muralidharan. Email: principalppgcas@ppg.edu.in

Highlights
➢ Speed dating offers efficient and convenient face-to-face connections for busy singles.
➢ The study uses LGBC combined with HGSOA, FFO, and MO for hybrid relationship forecasting.
➢ The LGBC model shows lower accuracy (0.938) compared to LGHS (0.945) and LGMO (0.956).
➢ The hybrid LGFF model excels with an accuracy of 0.965, making it the best model for predicting early relationship dynamics.

Article Info
Received: 29 April 2024; Received in revised form: 26 June 2024; Accepted: 29 June 2024; Available online: 30 June 2024

Keywords: Speed Dating; Online Dating; Henry Gas Solubility Optimization Algorithm; Flying Fox Optimization; Mayflies Optimization; Machine Learning

Abstract
In today's fast-paced society, many choose speed dating because it is efficient and convenient. Speed dating events are organized to allow busy singles to meet a variety of potential partners in a short timeframe, thereby maximizing their chances of making connections. They create an organized setting that encourages brief but significant contact, allowing people to quickly assess chemistry and compatibility. Furthermore, in the digital age, when online dating can be impersonal, speed dating provides face-to-face connection, which increases authenticity and reduces the ambiguity of online profiles. In general, speed dating appeals to modern daters who want quick and tangible results in their search for romance. This research project aims to gain insights into forecasting the course of relationships created during initial meetings using cutting-edge Machine Learning (ML) approaches. Light Gradient Boosting Classification (LGBC) serves as the foundational framework, and an innovative approach is introduced by combining it with the Henry Gas Solubility Optimization Algorithm (HGSOA), Flying Fox Optimization (FFO), and Mayflies Optimization (MO), resulting in hybrid models. The investigation reveals that, throughout the training phase, the LGBC model achieved a lower accuracy of 0.938, indicating its comparative inferiority to the LGHS and LGMO models, which achieved accuracies of 0.945 and 0.956, respectively. Nonetheless, the hybrid LGFF model emerged as the most accurate, outperforming all other competitors with an accuracy of 0.965. As a result, it is regarded as the best model for anticipating relationship dynamics during early meetings, providing vital insights into the complexities of relationships on first dates.

1. Introduction
Speed dating is a planned matchmaking technique that allows people to quickly meet and connect with possible romantic partners. Speed dating events are typically set up with circular or semi-circular tables and pairs of seats facing each other [2]. Participants are allocated a table number or location and cycle across the room to meet new individuals. Before the event begins, each participant is handed a scorecard or form on which they can record the names or identification numbers of those they are interested in [3].
Companies or groups typically sponsor speed dating events, which are generally hosted in pubs, restaurants, or community centers [1]. The format consists of a series of brief, timed interactions between individuals, each lasting from three to ten minutes. Once the event begins, attendees engage in conversation with one another, getting acquainted within the allotted time. These chats are usually informal and lighthearted, covering topics such as hobbies, interests, and goals [4]. The restricted time limit pushes participants to form a rapid impression and determine whether they have a connection with the person they are speaking with. At the end of each round, participants receive a signal to move to the next table or exchange partners. This rotation repeats until every person has had a chance to meet everyone at the event. Afterwards, attendees submit their scorecards to the event planners, indicating which people they want to see again [4]. If two participants express a common interest, the event organizers facilitate the sharing of contact information, allowing them to interact beyond the speed dating event. If no mutual interest is demonstrated, the participants part ways, and the organizers may offer options for future events or matching services [5], [6], [7]. Overall, speed dating provides a quick and effective means for singles to meet potential partners, as opposed to traditional dating techniques, which can be time-consuming and unpredictable. It creates a relaxed and social environment in which people may develop meaningful relationships in a short period [8], [9], [10].

ML has significant benefits for forecasting the continuation of partnerships after a first date [11]. ML algorithms excel at detecting subtle patterns indicative of relationship potential by analyzing large datasets containing diverse behavioral, vocal, and nonverbal signs observed during interactions [12]. These algorithms use complex analytical approaches to interpret data quickly and effectively, resulting in accurate predictions that can outperform human judgment [12]. One key advantage is the objectivity inherent in ML models. Without personal human judgment, which may be influenced by prejudices and preconceptions, algorithms rely only on observable facts, reducing the possibility of personal biases clouding the analysis. This objective review improves decision-making by giving people a better grasp of the relationship's prospects. Furthermore, ML enables individualized predictions based on the unique qualities and preferences of the persons involved [13]. By examining varied datasets that include a range of demographic and behavioral factors, models can account for individual variances, resulting in more nuanced and accurate projections. This tailored method improves the relevance and applicability of forecasts, allowing individuals to make informed decisions based on their circumstances. ML models may also be refined and improved continually. Algorithms improve their forecast accuracy and flexibility over time through iterative learning procedures that incorporate fresh data, feedback, and results. This continuous development ensures that forecasts remain current and responsive to changing dating dynamics and social trends, increasing their usefulness and dependability over time [11].
Although many researchers have embraced speed dating's scientific promise, such approaches, like other methodological breakthroughs, should be approached with caution (Finkel et al. [14]). For example, while speed dating has significant external validity in certain areas, it may lack it in others [15]. After all, speed dating events differ significantly from traditional methods of meeting romantic partners, and these distinctions may appeal mainly to a minority of singles. Such external validity difficulties, however, are not unique to speed dating. Scholars have yet to establish (a) how romantic relationships that begin at church socials differ from those starting at work, at the beach, or on the subway (e.g., perhaps interactions starting at church benefit from spiritual rather than sexual reliability, whereas interactions starting at the beach show the opposite pattern); or (b) how the character traits of individuals who meet partners in one setting differ from the personalities of people who meet partners in another. Future study might reveal whether particular methods of finding partners are more appropriate for some people than others. Another possible risk is that speed dating may not result in romantic attraction. The academic usefulness of speed-dating methods would be significantly reduced if speed-daters were only infrequently attracted to each other or rarely initiated post-event contact (compared to parallel frequency in other contexts). Fortunately, preliminary data show that speed dating may be an exceptionally efficient way of introducing people who then pursue follow-up dates with one another [14].

1.1. Objective
In today's fast-paced society, speed dating has become a popular method for individuals to efficiently meet potential partners amidst their busy schedules. These events offer participants a structured environment to engage in brief, face-to-face interactions aimed at assessing compatibility and chemistry quickly. Unlike online dating platforms, which can sometimes feel impersonal and fraught with uncertainties related to profile accuracy and authenticity, speed dating provides immediate interpersonal feedback. This direct interaction appeals to modern daters seeking tangible results in their quest for meaningful relationships. While existing research acknowledges the effectiveness of speed dating in facilitating initial connections, this study seeks to advance the field by integrating cutting-edge Machine Learning (ML) techniques to predict the trajectory of relationships formed during these brief encounters. Specifically, we employ Light Gradient Boosting Classification (LGBC) as a foundational framework and introduce a novel hybrid model that incorporates the Henry Gas Solubility Optimization Algorithm (HGSOA), Flying Fox Optimization (FFO), and Mayflies Optimization (MO). This hybrid approach aims to enhance predictive accuracy beyond what traditional models achieve, offering insights into the complex dynamics at play during initial meetings. Furthermore, this study contributes to the literature not only by forecasting relationship outcomes but also by demonstrating the efficacy of hybrid ML models in enhancing prediction accuracy. By comparing the hybrid model (LGFF) against standard LGBC and the other variants (LGHS and LGMO), significant improvements in accuracy are illustrated, underscoring the applicability of advanced ML techniques in understanding and predicting relationship development from initial interactions.

1.2. Study Procedure
ML can predict whether or not speed dating connections will continue by examining multiple data points about the interactions of the participants [16]. Initially, data is gathered from previous speed dating events, including demographics, hobbies, discussion duration, mutual interest, and body language indicators [17]. These data points are used as features to train prediction algorithms [18]. During the training phase, ML methods such as logistic regression, decision trees, and neural networks are used. These algorithms learn from previous data, recognizing patterns and links between characteristics and the outcome variable: whether participants choose to maintain their connection after the event [19], [20]. Feature engineering is critical for extracting useful information. Participants' ages, genders, common interests, conversation quality, and nonverbal signs such as eye contact are all possible features [21]. The algorithm learns to weigh these factors and determine which combinations indicate sustained interest [22]. For prediction, the trained model examines real-time or post-event data to determine the likelihood of sustained interest between pairings [23], [24]. This prediction helps event organizers find possible matches, which improves the speed dating experience. Using ML, organizers may provide targeted follow-up services or suggestions, enhancing the success rate of matchmaking during the event [25]. The model's performance is assessed using evaluation criteria such as accuracy, precision, recall, and area under the ROC curve. Iterative use of refinement approaches such as feature selection, hyperparameter tweaking, or algorithm selection can enhance predictive accuracy [26].
ML may help forecast the continuation of speed dating encounters by studying participant interactions. Using historical data and prediction algorithms, event organizers may improve matching success and provide a more enjoyable experience for attendees [27].

2. Datasets and Methods
2.1. Data Collection
The dataset consists of interactions between participants during these events, focusing on various attributes relevant to initial impressions and compatibility assessment. In total, the dataset comprises 1000 samples. Data preprocessing involved several steps to ensure quality and consistency. First, missing data points, primarily from rating scales and demographic information, were handled using mean imputation for numeric variables and mode imputation for categorical variables. Outliers, identified through box plots and domain knowledge, were either corrected or removed to prevent skewing the results during model training. Furthermore, to maintain data integrity, categorical variables were encoded using one-hot encoding, while numeric variables were standardized to a mean of 0 and a standard deviation of 1. This normalization step mitigates the influence of varying scales and magnitudes across different features, ensuring fair representation in the predictive models.
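A minimal sketch of this preprocessing pipeline using scikit-learn follows; the file and column names (`speed_dating.csv`, `age`, `match`, etc.) are illustrative placeholders, not the dataset's actual fields:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real dataset's fields may differ.
numeric_cols = ["age", "conversation_rating", "shared_interests"]
categorical_cols = ["gender", "occupation"]

preprocess = ColumnTransformer([
    # Numeric: mean imputation, then standardization to mean 0 / std 1.
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    # Categorical: mode imputation, then one-hot encoding.
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

df = pd.read_csv("speed_dating.csv")           # placeholder file name
X = preprocess.fit_transform(df.drop(columns=["match"]))
y = df["match"]                                # 1 = matched, 0 = unmatched
```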
2.2. Description
On the first date, couples usually pay attention to a variety of aspects. These include physical attractiveness, conversational flow, common interests, and mutual compatibility. Body language and nonverbal cues play an important part in conveying interest and comfort levels. Additionally, people frequently evaluate their date's communication skills, politeness, and overall attitude. Emotional connection and chemistry are quite important, and they frequently influence whether a second date is wanted. The atmosphere and activities chosen for the date may have a big influence on the entire experience and the impression left. Finally, first dates provide a chance for initial impressions and future relationship prospects to develop spontaneously.

When comparing the components of a first date to one another, it is clear that they are interconnected and contribute to the entire experience. Physical appeal frequently sparks attention, acting as an initial lure. Conversational flow and common interests, on the other hand, strengthen the bond and promote emotional involvement. Body language and nonverbal cues supplement spoken communication by providing insight into mutual compatibility. Similarly, observing manners and behavior indicates shared values and respect. "Together" stresses the combination of these elements, emphasizing the synergy and coherence among persons. In essence, each aspect influences the quality of the contact and the possibility of forming a meaningful relationship. The influence of additional factors, such as shared interests, partner intelligence, and physical attractiveness, on the sustenance of a relationship is delineated in Fig. 1.

Fig. 1. Correlation of the inputs and outputs

2.2.1. Feature Selection
Feature selection is a critical aspect of the ML process, aimed at identifying and utilizing the most relevant features from the dataset to improve model performance and interpretability. The primary goal of feature selection is to reduce the dimensionality of the dataset by eliminating irrelevant or redundant features while retaining those that contribute the most to the predictive power of the model. This process offers several benefits, including improved model accuracy, reduced overfitting, faster training times, and enhanced interpretability of the model. There are various techniques for feature selection, ranging from simple filter methods based on statistical tests or correlation analysis to more complex wrapper methods, which involve evaluating different subsets of features using a specific ML algorithm. Fig. 2 presents the F-statistic feature selection results for the input variables.

The F-statistic feature selection method, also known as analysis of variance (ANOVA), is a statistical technique used to assess the significance of individual features in a dataset with respect to the target variable. It is commonly employed in regression and classification tasks to identify the most relevant features for predicting the target variable. The F-statistic assesses the relationship between each predictor variable and the target variable by evaluating the variability explained by the predictor compared to the variability left unexplained. This method is particularly suited to the dataset, which includes a mix of numeric and categorical features related to participant demographics, ratings, and perceived attributes. The approach not only streamlined model complexity but also improved interpretability by focusing on the variables that most significantly influence relationship predictions. The empirical evaluation indicated that feature selection using the F-statistic method enhanced model performance by reducing overfitting and increasing prediction accuracy. In this study, a selection of six features was meticulously curated to optimize results and avoid the inclusion of irrelevant features during the training process.

Fig. 2. Feature selection results for the input variables
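As a sketch, the ANOVA F-statistic ranking described above can be reproduced with scikit-learn's `f_classif` scorer; keeping six features mirrors the selection reported in this study, while `X` and `y` are the preprocessed matrix and match labels from the previous step:

```python
from sklearn.feature_selection import SelectKBest, f_classif

# Rank features by their ANOVA F-statistic against the match label
# and keep the six highest-scoring ones, as done in this study.
selector = SelectKBest(score_func=f_classif, k=6)
X_selected = selector.fit_transform(X, y)

# Per-feature F-scores, useful for a ranking plot like Fig. 2.
print(selector.scores_)
```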
2.3. Light Gradient Boosting Classification (LGBC)
Because ensemble classifiers perform better in classification than individual classifiers, they have attracted growing interest in the fields of ML and pattern recognition. To improve classification accuracy, predictions from different (single) classifiers are combined using a majority voting procedure. Several techniques, such as Random Forest (RF), bagging, boosting, and stacking, are frequently used to build ensemble classifiers. In the context of this research, which focuses on using LightGBM as the ensemble strategy, the fundamentals of boosting are examined in particular. The boosting strategy involves training a set of separate classifiers one at a time to improve the performance of the weaker ones. Training data with equal weights are used in the first iteration, and these weights are recalibrated during the training phase: samples misclassified by the weaker classifiers of previous iterations are given higher weights, which corrects their classification in the next iteration. Various boosting strategies are used in ML; two examples are gradient-boosted decision trees (GBDT) and gradient-boosting machines (GBM). A large number of recent studies have investigated and evaluated new ensemble learning methods, such as the CCF algorithm (2015) and XGBoost (2016). LGBM is a relative newcomer to the ensemble learning scene and has attracted a lot of interest from the ML field. Recent ML and data science contests have shown that it performs better than other boosting frameworks, especially when dealing with massive datasets. The next paragraph gives an overview of LGBM; Ke et al.'s study [28] contains more detailed information. The LightGBM classifications in this work were carried out using the LightGBM Python package.

Decision tree algorithms serve as the foundation for the gradient-boosting framework known as LGBM. Unlike previous ensemble learning algorithms, which use a level-wise method, a leaf-wise technique is used for tree growth. Two cutting-edge techniques that set the LGBM platform apart are exclusive feature bundling (EFB) and gradient-based one-side sampling (GOSS). Rather than using all instances, GOSS trains on a reduced subset of instances, whereas EFB combines mutually exclusive features into smaller bundles. When these techniques are used in LGBM, they produce benefits such as faster learning times and higher accuracy than other gradient-boosting frameworks. The short training time and low memory use, in particular, have led to the name "Light GBM." For best results, it is necessary to tune the model's parameters, which include the boosting type, maximum depth, learning rate, and leaf count.
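A minimal sketch of fitting the base LGBC model with the LightGBM Python package follows, exposing the parameters named above (boosting type, maximum depth, learning rate, leaf count); the values shown are illustrative defaults, not the tuned settings used in the paper:

```python
from lightgbm import LGBMClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_selected, y, test_size=0.2, random_state=42)

# Parameter values here are illustrative, not the paper's tuned settings.
model = LGBMClassifier(
    boosting_type="gbdt",   # boosting type
    max_depth=-1,           # no depth limit; trees grow leaf-wise
    learning_rate=0.1,      # shrinkage per boosting round
    num_leaves=31,          # leaf count per tree
    n_estimators=200,
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```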
2.4. The Henry Gas Solubility Optimization Algorithm (HGSO)
This section covers the basic idea of the HGSO. The HGSO was proposed recently by Hashim et al. [29] and is based on Henry's law. The procedure can be explained mathematically as follows.

Initialization: Select the number of gases (search agents) $N$ and their initial locations, and initialize the partial pressures of the gases. The location of gas $i$ is generated by

$$X_i(t+1) = X_{\min} + r\,(X_{\max} - X_{\min}) \quad (1)$$

with $X_{\max}$ and $X_{\min}$ the upper and lower bounds, and $r$ a random number between 0 and 1.

Clustering: Selecting the number of clusters is an essential step; gases of the same kind ought to be in the same cluster.

Evaluation: This has two steps: finding the best gas inside each cluster, and then identifying the best gas overall. The gases are ranked using the objective function; $X_{j,best}$ denotes the location of the best gas within cluster $j$, while $X_{best}$ denotes the position of the best gas across all clusters.

Update Henry's coefficient: Each cluster's Henry's coefficient is updated as

$$H_j(t+1) = H_j(t) \times \exp\!\left(-C_j\left(\frac{1}{T(t)} - \frac{1}{T^{\theta}}\right)\right), \qquad T(t) = \exp\!\left(-\frac{t}{iter}\right) \quad (2)$$

with $T(t)$ the temperature at the $t$-th iteration, $iter$ the maximum number of iterations, and $T^{\theta}$ a constant.

Update solubility: Each gas's solubility is determined by

$$S_{i,j}(t) = K \times H_j(t+1) \times P_{i,j}(t) \quad (3)$$

where $P_{i,j}(t)$ is the partial pressure on gas $i$ in cluster $j$ and $K$ is a constant.

Update position: The positions are updated according to

$$X_{i,j}(t+1) = X_{i,j}(t) + F \times r \times \gamma \times \left(X_{j,best}(t) - X_{i,j}(t)\right) + F \times r \times \alpha \times \left(S_{i,j}(t) \times X_{best}(t) - X_{i,j}(t)\right), \qquad \gamma = \beta \times \exp\!\left(-\frac{F_{best}(t) + \varepsilon}{F_{i,j}(t) + \varepsilon}\right) \quad (4)$$

with $r$ a random number, and $\alpha$, $\beta$, and $\varepsilon$ constants. $F_{i,j}(t)$ is the fitness of gas $i$ in cluster $j$, and $F_{best}(t)$ is the best fitness value across all clusters; typically, the fitness is taken to be the value of the objective function.

2.4.1. Escape from local optimum
The objective of this step is to move away from the local optimum. The number of worst agents $N_w$ is determined by

$$N_w = N \times \left(rand \times (c_2 - c_1) + c_1\right), \qquad c_1 = 0.1, \; c_2 = 0.2 \quad (5)$$

where $N$ denotes the number of search agents.

2.4.2. Update the position of the worst agents
An alternative formula is used to calculate the new positions of the $N_w$ worst gases:

$$G_{i,j} = G_{\min(i,j)} + r \times \left(G_{\max(i,j)} - G_{\min(i,j)}\right) \quad (6)$$

with $r$ a random number, and $G_{\min(i,j)}$ and $G_{\max(i,j)}$ the lower and upper bounds.
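A compact NumPy sketch of the HGSO loop following Eqs. (1)-(6) is given below; the constants, the direction flag, and the choice $T^{\theta} = 1$ are illustrative assumptions, since the text does not fix them:

```python
import numpy as np

def hgso(fitness, dim, n_gases=20, n_clusters=4, iters=100,
         x_min=0.0, x_max=1.0, K=1.0, alpha=1.0, beta=1.0,
         c1=0.1, c2=0.2, eps=0.05):
    """Minimal HGSO sketch of Eqs. (1)-(6); constants are illustrative."""
    rng = np.random.default_rng(0)
    X = x_min + rng.random((n_gases, dim)) * (x_max - x_min)   # Eq. (1)
    H = rng.random(n_clusters)                                 # Henry's coefficients
    P = rng.random(n_gases)                                    # partial pressures
    C = rng.random(n_clusters)
    cluster = np.arange(n_gases) % n_clusters                  # gas-to-cluster assignment
    for t in range(1, iters + 1):
        F = np.apply_along_axis(fitness, 1, X)
        best = np.argmin(F)                                    # global best gas
        T = np.exp(-t / iters)                                 # temperature
        H = H * np.exp(-C * (1.0 / T - 1.0))                   # Eq. (2), assuming T_theta = 1
        for i in range(n_gases):
            j = cluster[i]
            members = np.where(cluster == j)[0]
            jbest = members[np.argmin(F[members])]             # best gas in cluster j
            S = K * H[j] * P[i]                                # Eq. (3)
            gamma = beta * np.exp(-(F[best] + eps) / (F[i] + eps))
            flag = rng.choice([-1, 1])                         # direction flag F (assumed)
            r = rng.random()
            X[i] += (flag * r * gamma * (X[jbest] - X[i])
                     + flag * r * alpha * (S * X[best] - X[i]))  # Eq. (4)
        Nw = int(n_gases * (rng.random() * (c2 - c1) + c1))    # Eq. (5)
        for i in (np.argsort(F)[-Nw:] if Nw > 0 else []):
            X[i] = x_min + rng.random(dim) * (x_max - x_min)   # Eq. (6)
        X = np.clip(X, x_min, x_max)
    return X[np.argmin(np.apply_along_axis(fitness, 1, X))]
```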
2.5. Flying Foxes Optimization Algorithm (FFO)
FFO, or flying fox optimization, is a population-based stochastic method inspired by the strategies flying foxes (FF) employ to withstand extreme heat. It uses a hybrid algorithm structure that depends on an attraction constant, a replacement list, and the population size; these variables affect how well the algorithm performs.

2.5.1. Functioning of the Flying Foxes Algorithm
Flying foxes are among the biggest bat species on the planet. Since they are unable to echolocate, their ability to move about in space depends on their awareness of their surroundings. They return to their habitat trees after their evening feeding. Flying foxes look for cooler trees to rest in to shield themselves from the heat waves that arise in the morning. Most of the time, FF that first find a tree with a suitable level of heat suffocate each other and perish.

2.5.2. The Application of the FFO Algorithm
The starting point of this method is an arbitrary collection of positions for each FF, with $x_i^0 \sim U(x_{\min}, x_{\max})$. These positions are represented as $m$-dimensional vectors, $x = (x_1, \ldots, x_m)$, and the objective function then evaluates the corresponding solutions. To ensure its survival in the event of extreme heat, an FF searches for a cooler tree until it has located the tree with the lowest temperature; if it fails, it returns to its most recent place.

2.5.3. Movement of FF
Given that FF follow one another's trails and seek the nearest tree, they will likely migrate to a new tree to escape excessive heat if their habitat tree does not offer a comfortable minimum temperature. This movement is formulated as

$$x_{i,j}^{t+1} = x_{i,j}^{t} + a \cdot rand \cdot (cool_j - x_{i,j}^{t}) \quad (7)$$

where, at iteration $t$, $a$ is a constant, $rand \sim U(0,1)$, $x_{i,j}^{t}$ is the $j$-th component of FF $i$, and $cool$ denotes the position of the FF in the tree with the lowest temperature, i.e., the coolest spot ever identified and the best solution to date. Eq. (7) is applied when $|f(cool) - f(x_i)| > \delta_1$, where the parameter $\delta_1$ corresponds to the maximum distance at which two flying foxes may be considered close to one another. As an FF approaches the tree with the lowest temperature, i.e., when $|f(cool) - f(x_i)| \le \delta_1$, it looks for the nearest free space to avoid suffocating. This phenomenon is described by the following equations:

$$nx_{i,j}^{t+1} = x_{i,j}^{t} + rand_{1,j} \cdot (cool_j - x_{i,j}^{t}) + rand_{2,j} \cdot (x_{R1,j}^{t} - x_{R2,j}^{t}) \quad (8)$$

$$x_{i,j}^{t+1} = \begin{cases} nx_{i,j}^{t+1}, & \text{if } j = k \text{ or } rnd_j \ge p_a \\ x_{i,j}^{t}, & \text{otherwise} \end{cases} \quad (9)$$

where $p_a$ is a probability constant, $rnd_j$ is an arbitrary number between 0 and 1, $rand \sim U(0,1)$, and $x_{R1}^{t}$ and $x_{R2}^{t}$ are two arbitrary members of the current population. Finally, $k$ is selected at random in $\{1, 2, \ldots, m\}$, ensuring that $x_{i,j}^{t+1}$ takes at least one component from $nx_{i,j}^{t+1}$ so that the new solution and the existing one do not duplicate each other. The computed solutions are then evaluated, and the FF is accepted as a new solution if it improves on the current one.

2.5.4. Death and Replacement of Flying Foxes
FF die for a variety of reasons. For example, in their search for the coolest tree, they can end up in a very distant area with extreme heat, in which case they cannot avoid dying. An alternative is to use a replacement list ($RL$) built from the distinct optimal solutions found so far. An arbitrary number $n \in [2, N_L]$ is generated, and the position of a newly generated FF is given by

$$x_{i,j}^{t+1} = \frac{\sum_{k=1}^{n} RL_{k,j}^{t}}{n} \quad (10)$$

where, at iteration $t$, $RL_k^t$ represents the $k$-th FF on the $RL$. The goal of Eq. (10) is to increase the likelihood of finding a suitable location. Suffocation by other members of the colony is another way FF might perish. In this case, a probability is established based on the number of flying foxes found in the areas with the lowest temperatures before the end of an iteration:

$$p_D = \frac{n_c - 1}{\text{population size}} \quad (11)$$

where $n_c$ is the number of FF with an objective value comparable to the best solution.

2.5.5. Crossover Process
A genetic crossover process is used to let two FF mate. First, two distinct parents, $R_1$ and $R_2$, are chosen at random from the population. The crossover produces two offspring:

$$offspring_1 = L \cdot R_1 + (1 - L) \cdot R_2, \qquad offspring_2 = L \cdot R_2 + (1 - L) \cdot R_1 \quad (12)$$

where $L$ is a randomly generated value between 0 and 1. Fig. 3 illustrates the process of the FFO algorithm.

Fig. 3. Process of the FFO.
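A minimal sketch of the FFO movement and crossover steps, following Eqs. (7)-(9) and (12), is shown below; the step-size constant `a`, probability `pa`, and threshold `delta1` are illustrative values not fixed by the text:

```python
import numpy as np

def ffo_move(X, f, cool, delta1=0.1, a=0.5, pa=0.5, seed=0):
    """One FFO movement step (Eqs. (7)-(9)); constants are illustrative.

    X     : (n, m) flying-fox positions
    f     : objective function, lower is better
    cool  : (m,) coolest position found so far (best solution)
    delta1: closeness threshold from the text
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    f_cool = f(cool)
    X_new = X.copy()
    for i in range(n):
        if abs(f_cool - f(X[i])) > delta1:
            # Far from the coolest tree: drift toward it, Eq. (7).
            X_new[i] = X[i] + a * rng.random(m) * (cool - X[i])
        else:
            # Close to the coolest tree: spread out to avoid "suffocation".
            r1, r2 = rng.choice(n, size=2, replace=False)
            nx = (X[i] + rng.random(m) * (cool - X[i])
                       + rng.random(m) * (X[r1] - X[r2]))     # Eq. (8)
            k = rng.integers(m)                                # forced coordinate
            keep = rng.random(m) >= pa
            keep[k] = True
            X_new[i] = np.where(keep, nx, X[i])                # Eq. (9)
    return X_new

def ffo_crossover(R1, R2, seed=0):
    """Eq. (12): blend two parent foxes into two offspring."""
    L = np.random.default_rng(seed).random()
    return L * R1 + (1 - L) * R2, L * R2 + (1 - L) * R1
```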
2.6. Mayflies Optimization Algorithm (MO)
In the MO algorithm, male and female mayflies in the swarm are distinguished; the male mayflies are always the stronger, which improves the optimization. As in the PSO algorithm, the individuals in the MO algorithm update their locations based on their current positions $p_i(t)$ and velocities $v_i(t)$ at the current iteration:

$$p_i(t+1) = p_i(t) + v_i(t+1) \quad (13)$$

Eq. (13) is used by all male and female mayflies to update their locations; their velocities, however, are updated in different ways.

2.6.1. Movements of Male Mayflies
During the iterations, the male mayflies (MM) in the swarm continue the process of exploration or exploitation. Their present fitness values, $f(x_i)$, and their historical best fitness values along their trajectories, $f(x_{hi})$, are used to adjust the velocity. A male mayfly updates its velocity based on its current velocity, its distance from the global best location, and its historical best trajectory if $f(x_i) > f(x_{hi})$:

$$v_i(t+1) = g \cdot v_i(t) + a_1 e^{-\beta r_p^2}\left[x_{hi} - x_i(t)\right] + a_2 e^{-\beta r_g^2}\left[x_g - x_i(t)\right] \quad (14)$$

Here $g$ decreases linearly from its highest value, and the terms are balanced by the constants $a_1$, $a_2$, and $\beta$. The distances $r_p$ and $r_g$ are Cartesian distances between individuals and their historical optimal location or the global best within the swarm, computed with the second norm:

$$\|x_i - x_j\| = \sqrt{\sum_{k=1}^{n}\left(x_{ik} - x_{jk}\right)^2} \quad (15)$$

In contrast, the MM update their velocities from the present one with a random dance coefficient $d$ if $f(x_i) < f(x_{hi})$:

$$v_i(t+1) = g \cdot v_i(t) + d \cdot r_1 \quad (16)$$

where $r_1$ is a uniformly distributed random number in $[-1, 1]$.

2.6.2. Movements of Female Mayflies
The female mayflies (FM) update their velocities differently. Since, biologically, they can only live for one to seven days at most, the FM hurry to locate the MM in order to mate and procreate. Consequently, they revise their velocities according to the MM with whom they wish to mate: in the MO algorithm, the best female pairs with the best male, the second-best female with the second-best male, and so on. Thus, for the $i$-th FM, if $f(y_i) < f(x_i)$:

$$v_i(t+1) = g \cdot v_i(t) + a_3 e^{-\beta r_{mf}^2}\left[x_i(t) - y_i(t)\right] \quad (17)$$

Here $a_3$ is an additional constant that is also used to balance the velocities, and $r_{mf}$ is the Cartesian distance between the male and female. Otherwise, the FM updates its velocity from the present one with another random dance coefficient $fl$:

$$v_i(t+1) = g \cdot v_i(t) + fl \cdot r_2 \quad (18)$$

where $r_2$ is likewise a uniformly distributed random number in $[-1, 1]$.

2.6.3. Mating of Mayflies
The top half of the mayflies mate and produce a pair of offspring, which develop randomly from their parents:

$$offspring_1 = L \cdot male + (1 - L) \cdot female \quad (19)$$

$$offspring_2 = L \cdot female + (1 - L) \cdot male \quad (20)$$

where $L$ is a random value drawn from a Gauss distribution. The procedure of the MO algorithm is presented in Fig. 4.

Fig. 4. MOA procedure.
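A sketch of one MO iteration over Eqs. (13)-(20) follows; the constants are illustrative, and since the two velocity-update conditions in the text appear inconsistent, the sketch resolves Eq. (18) as the complementary case of Eq. (17):

```python
import numpy as np

def mo_step(x_m, v_m, x_f, v_f, hbest, gbest, f,
            g=0.8, a1=1.0, a2=1.5, a3=1.5, beta=2.0,
            d=0.1, fl=0.1, seed=0):
    """One Mayflies Optimization step (Eqs. (13)-(18)); constants illustrative.

    x_m, v_m: (n, m) male positions/velocities; x_f, v_f: female ones
    hbest   : each male's historical best position; gbest: global best
    f       : objective function, lower is better
    """
    rng = np.random.default_rng(seed)
    n, m = x_m.shape
    for i in range(n):
        if f(x_m[i]) > f(hbest[i]):
            rp = np.linalg.norm(x_m[i] - hbest[i])   # Eq. (15) distances
            rg = np.linalg.norm(x_m[i] - gbest)
            v_m[i] = (g * v_m[i]
                      + a1 * np.exp(-beta * rp**2) * (hbest[i] - x_m[i])
                      + a2 * np.exp(-beta * rg**2) * (gbest - x_m[i]))  # Eq. (14)
        else:
            v_m[i] = g * v_m[i] + d * rng.uniform(-1, 1, m)             # Eq. (16)
        if f(x_f[i]) < f(x_m[i]):
            rmf = np.linalg.norm(x_m[i] - x_f[i])
            v_f[i] = (g * v_f[i]
                      + a3 * np.exp(-beta * rmf**2) * (x_m[i] - x_f[i]))  # Eq. (17)
        else:
            v_f[i] = g * v_f[i] + fl * rng.uniform(-1, 1, m)              # Eq. (18)
    x_m += v_m                                                            # Eq. (13)
    x_f += v_f
    return x_m, v_m, x_f, v_f

def mo_mate(male, female, seed=0):
    """Eqs. (19)-(20): Gaussian-weighted crossover, per the text."""
    L = np.random.default_rng(seed).normal(size=male.shape)
    return L * male + (1 - L) * female, L * female + (1 - L) * male
```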
2.7. Performance Criteria
The efficacy of any classification method hinges on a thorough assessment of its performance, unveiling its strengths and weaknesses. This evaluation process demands a careful selection of metrics, guided by various factors including data characteristics, error costs, and project objectives. Eqs. (21)-(24) present the formulas of the utilized metrics:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \quad (21)$$

$$Precision = \frac{TP}{TP + FP} \quad (22)$$

$$Recall = TPR = \frac{TP}{TP + FN} \quad (23)$$

$$F1\text{-}score = \frac{2 \times Recall \times Precision}{Recall + Precision} \quad (24)$$

Here, TP (true positive) denotes a positive prediction that matches an actual positive outcome, and TN (true negative) denotes a negative prediction that matches an actual negative outcome. FP (false positive) denotes a positive prediction for a case that actually ends negatively, while FN (false negative) denotes a negative prediction for a case that actually ends positively.

Accuracy measures the proportion of correctly predicted outcomes (both positive and negative) relative to the total number of predictions. In this study, accuracy indicates the overall correctness of relationship predictions made by each model, reflecting its ability to correctly identify successful and unsuccessful matches. Precision quantifies the proportion of true positive predictions (correctly predicted successful relationships) out of all positive predictions made by the model. It emphasizes the model's ability to avoid false positives, crucial in ensuring the reliability of matchmaking predictions during speed dating events. Recall (also known as sensitivity) calculates the proportion of true positive predictions identified by the model out of all actual positive instances. In the context of this study, recall highlights the model's effectiveness in capturing all potential successful matches, minimizing the risk of overlooking promising connections. The F1-score represents the harmonic mean of precision and recall, offering a balanced assessment of a model's performance by considering both false positives and false negatives. It provides a consolidated measure of predictive accuracy that considers both the completeness (recall) and correctness (precision) of the model's predictions.
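Continuing the earlier sketch, Eqs. (21)-(24) correspond directly to scikit-learn's standard scorers, applied here to the held-out test split:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# y_test / y_pred come from the fitted classifier of Section 2.3.
y_pred = model.predict(X_test)
print("Accuracy :", accuracy_score(y_test, y_pred))    # Eq. (21)
print("Precision:", precision_score(y_test, y_pred))   # Eq. (22)
print("Recall   :", recall_score(y_test, y_pred))      # Eq. (23)
print("F1-score :", f1_score(y_test, y_pred))          # Eq. (24)
```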
3. Results and Discussion
Initiating and nurturing a romantic bond is seen as a paramount preoccupation for individuals, and speed dating, serving as a facilitator, provides individuals with the opportunity to cultivate meaningful connections. The primary aim of this paper is to apply ML algorithms to predict the compatibility of couples upon their first meeting. The outcomes generated by the assembled models, which integrate diverse models with specified optimizers, are subjected to meticulous scrutiny via an array of plots and tables. The overarching objective is to pinpoint the most efficacious model, thus advancing the understanding of relationship dynamics through computational methodologies.

3.1. Convergence Curve
A convergence curve records model performance over successive training cycles. The best model is usually recognized when the curve hits a plateau or stabilizes, signifying peak performance, characterized by low loss or error and high accuracy or other related measures. In contrast, the poorest model is generally identified by erratic or consistently bad performance, with the curve failing to converge or even deteriorating with time; such behavior typically indicates inadequate learning or overfitting. Monitoring the convergence curve allows practitioners to determine where the model achieves its maximum efficacy and suggests areas for improvement or intervention.

The convergence curve depicted in Fig. 5 illustrates the performance exhibited by the LGHS, LGMO, and LGFF models. Upon scrutiny, it becomes evident that the LGHS model attains its optimal state from the 100th iteration, with an accuracy of 0.938; this model starts from a modest accuracy of 0.4 in the initial iteration and steadily advances to 0.8 by the 80th iteration. In contrast, the LGMO model, whose accuracy peaks at 0.952 at the 100th iteration, shows superior functionality compared to the LGHS model; despite starting at an accuracy of 0.5, it progressively improves to 0.8 by the 80th iteration, indicating commendable advancement. The LGFF model stands out with a remarkable final accuracy of 0.968, underscoring its heightened predictive potential relative to its counterparts; although its accuracy registers at only 0.7 for much of the run, its development accelerates notably compared to the other models. The curves show that while the base LGBC model initially improves rapidly, its convergence stabilizes at a moderate accuracy level. In contrast, models utilizing hybrid optimization algorithms such as LGHS and LGMO exhibit consistent and accelerated convergence, achieving higher accuracies within fewer iterations. The hybrid LGFF model emerges as the top performer, demonstrating both rapid convergence and superior accuracy throughout training. This convergence analysis informs the preference for the LGFF model due to its effective integration of advanced optimization techniques, highlighting its robustness in predicting relationship outcomes from initial speed dating interactions.

Fig. 5. 3D wall plot of the convergence curves of the hybrid models.
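The paper does not spell out how the optimizers are coupled to LGBC; a common design, sketched below under that assumption, is to let each metaheuristic search LightGBM's hyperparameter space with cross-validated accuracy as the fitness, which is also the quantity tracked by the convergence curves above:

```python
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.model_selection import cross_val_score

# Assumed coupling: a candidate solution encodes LightGBM hyperparameters,
# and its fitness is the negated cross-validated accuracy, so that the
# metaheuristic sketches above, written as minimizers, can be reused.
def fitness(theta):
    learning_rate, num_leaves, max_depth = theta
    clf = LGBMClassifier(
        learning_rate=float(np.clip(learning_rate, 0.01, 0.3)),
        num_leaves=int(np.clip(num_leaves, 8, 256)),
        max_depth=int(np.clip(max_depth, 2, 16)),
    )
    acc = cross_val_score(clf, X_train, y_train, cv=5,
                          scoring="accuracy").mean()
    return -acc  # minimize negative accuracy

# e.g., best = hgso(fitness, dim=3) would yield an LGHS-style hybrid.
```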
Table 1 presents the outcomes of the developed models across the training, testing, and overall (All) phases. Notably, in the training phase, the LGFF model emerges as the optimal performer, achieving an accuracy of 0.965. Furthermore, both the LGMO and LGHS models exhibit commendable performance, with accuracies of 0.956 and 0.945, respectively, positioning them as models with good and acceptable performance. Conversely, the LGBC model, the base model, displays comparatively weaker performance, with an accuracy of 0.938. This observation underscores the varying degrees of effectiveness among the models, with LGFF leading the pack and LGBC trailing behind in accuracy and predictive capability. In the testing phase, the precision values of 0.912 and 0.929 for the LGBC and LGHS models, respectively, indicate weaker performance compared to the LGMO model, which achieves a precision of 0.948. The precision of the LGMO model is in turn surpassed by that of the LGFF model, which attains 0.975. Thus, LGFF demonstrates superior precision compared to LGMO, LGHS, and LGBC. In the All phase, the LGFF and LGMO models, with recall values of 0.968 and 0.953, respectively, are identified as the best and second-best models in this comparison. Following them, the LGHS and LGBC models, with recall values of 0.938 and 0.927, respectively, show less potential in the prediction process. Thus, LGFF and LGMO emerge as top performers, while LGHS and LGBC exhibit comparatively lower effectiveness in terms of recall.

Table 1. The outcomes of the developed models.

| Section | Model | Accuracy | Precision | Recall | F1-score |
|---------|-------|----------|-----------|--------|----------|
| Train   | LGBC  | 0.938    | 0.942     | 0.938  | 0.939    |
| Train   | LGHS  | 0.945    | 0.947     | 0.945  | 0.946    |
| Train   | LGMO  | 0.956    | 0.958     | 0.956  | 0.956    |
| Train   | LGFF  | 0.965    | 0.967     | 0.965  | 0.966    |
| Test    | LGBC  | 0.903    | 0.912     | 0.903  | 0.906    |
| Test    | LGHS  | 0.923    | 0.929     | 0.923  | 0.925    |
| Test    | LGMO  | 0.945    | 0.948     | 0.945  | 0.946    |
| Test    | LGFF  | 0.974    | 0.975     | 0.974  | 0.974    |
| All     | LGBC  | 0.927    | 0.933     | 0.927  | 0.929    |
| All     | LGHS  | 0.938    | 0.941     | 0.938  | 0.939    |
| All     | LGMO  | 0.953    | 0.955     | 0.953  | 0.953    |
| All     | LGFF  | 0.968    | 0.969     | 0.968  | 0.969    |

The line-symbol plot in Fig. 6 illustrates the performance of the hybrid models across the phases. For instance, the LGBC model exhibits its weakest performance in the testing phase, with a recall of 0.903, whereas its highest performance appears in the training phase, where it attains a precision of 0.942. This highlights the varying degrees of performance across phases, with the LGBC model showing notably different outcomes between the testing and training phases. The best performance of the LGHS model is observed in the training phase, where it achieves a precision of 0.947, contrasting with its recall of 0.923 in the testing phase. Overall, the performance of the LGHS model in the training phase surpasses that in the other phases; however, this model is generally weaker than the LGFF and LGMO models, which demonstrate superior performance overall. The performance of the LGFF model maintains a remarkable level of consistency through the training and All phases, with accuracies of 0.965 and 0.968, respectively; its accuracy in the testing phase stands notably higher, at 0.974. The LGFF model shows its peak functionality in the testing phase, with an impressive precision of 0.975, in contrast to the training phase, where its recall is 0.965. Such variance underscores the dynamic nature of model performance across phases, ultimately highlighting the LGFF model's superior functionality in the testing phase relative to the training phase.

Fig. 6. Line-symbol plot of the models' performance across the metrics

The comparison of model performance, as illustrated in Table 2, reveals insights into their efficacy under both Matched and Unmatched conditions. Notably, both the LGBC and LGHS models exhibit identical functionality, each achieving a precision of 0.97 in the Unmatched condition. This parity underscores their comparable predictive capabilities in scenarios where matching conditions are not met. Following this, the LGMO model displays a slightly elevated precision of 0.98, only marginally below the LGFF model's precision of 0.99 within the same condition. This subtle variance underscores the nuanced distinctions between the models, with LGMO closely trailing LGFF in predictive accuracy under such conditions. This analysis sheds light on the intricate dynamics of model performance, revealing both commonalities and divergences among them. While LGBC and LGHS demonstrate consistent outcomes, LGMO and LGFF emerge as closely competitive alternatives, with LGFF exhibiting slightly superior predictive accuracy.
Such insights are invaluable for refining model selection and deployment strategies in various contexts, emphasizing the importance of thorough evaluation and comparison. In the Matched condition, discernible disparities in precision values among the models come to light: the LGFF model again leads, with a precision of 0.88, showcasing the highest degree of accuracy in predicting outcomes under this condition. This nuanced understanding underscores the importance of precision in model evaluation and selection, as even slight differences can significantly impact the reliability of predictions in real-world applications.

Table 2. Condition-based categorization of the performance of the developed models

| Model | Precision (Unmatched) | Precision (Matched) | Recall (Unmatched) | Recall (Matched) | F1-score (Unmatched) | F1-score (Matched) |
|-------|------|------|------|------|------|------|
| LGBC  | 0.97 | 0.76 | 0.94 | 0.87 | 0.96 | 0.81 |
| LGHS  | 0.97 | 0.80 | 0.95 | 0.87 | 0.96 | 0.83 |
| LGMO  | 0.98 | 0.84 | 0.96 | 0.91 | 0.97 | 0.87 |
| LGFF  | 0.99 | 0.88 | 0.97 | 0.95 | 0.98 | 0.91 |

In Fig. 7, the radar plot provides a visual representation of the predictive models' performance, showcasing their predicted values against the measured ones. For instance, in the Unmatched condition, the LGFF model emerges as the leading performer, having correctly predicted 829 out of 852 measured values, underscoring its efficacy in predictive accuracy within this context. Following closely behind, the LGMO model secures the second position, attaining a commendable 820 out of 852 measured values; while slightly trailing the LGFF model, the LGMO model's performance remains noteworthy, reflecting its proficiency in predicting outcomes. In the Matched condition, scrutiny reveals distinct patterns in the performance of the models. The LGHS model, with a tally of 159 out of 182 measured values, shows superior functionality over the LGBC model, which attains 158 out of 182 measured values. However, the LGHS model is outshined by the LGMO model, which achieves a higher count of 165 out of 182 measured values, showcasing its heightened predictive efficacy. Notably, the LGFF model demonstrates the highest level of accuracy among all models, with an impressive tally of 172 out of 182 measured values. This observation highlights the diverse range of predictive capabilities among the models and underscores the significance of accurate measurement in evaluating their performance under varying conditions.

Fig. 7. Radar plot of the model performance considering the separated conditions

Fig. 8 presents the confusion matrices, offering insights into the accuracy and misclassifications of the different models under both "Matched" and "Unmatched" conditions in predicting relationship outcomes from speed dating interactions.
The confusion matrix not only illustrates the overall accuracy of each model but also delineates the specific types of misclassifications. In the "Unmatched" condition, the LGMO model achieves a notable accuracy of 97%, misclassifying 17 participants. Conversely, under the "Matched" condition, the LGMO model shows slightly reduced accuracy at 83.75%, with 32 misclassifications. This highlights the model's robust predictive potential, particularly in scenarios where matches are less evident. In contrast, the LGFF model outshines the others in the "Unmatched" condition with an exceptional accuracy rate of 99%, misclassifying only 10 participants. This underscores the LGFF model's superior predictive performance in identifying successful matches, making it the preferred choice in this comparison. However, the LGMO model maintains its competitive edge by demonstrating consistent accuracy across both conditions, reinforcing its reliability in predicting relationship outcomes; under the "Matched" condition, it achieves an accuracy of 87% with 23 misclassifications, further solidifying its effectiveness in diverse prediction scenarios.

Fig. 8. Confusion matrices evaluating the accuracy of each model

3.2. Limitations and Future Study
This study employs advanced machine learning to predict relationship outcomes from speed dating events. However, its findings are limited by a dataset drawn from specific locations, requiring validation across diverse demographics for broader applicability. Improving the predictive models involves refining the feature engineering and integrating additional data types such as social media interactions. Complex models, while accurate, lack interpretability, suggesting a need for explainable AI techniques. Longitudinal research is needed to understand relationship dynamics over time and improve predictive accuracy. Ethical considerations, including privacy and fairness, must be addressed in deploying these models. Future research should validate the models across diverse cultural contexts and explore interventions to enhance matchmaking effectiveness in online dating and speed dating contexts.

4. Conclusion
In this research, the application of Light Gradient Boosting Classification (LGBC) combined with optimization algorithms, specifically Mayflies Optimization (MO), the Henry Gas Solubility Optimization Algorithm (HGSOA), and Flying Fox Optimization (FFO), was investigated to improve prediction accuracy in the context of speed dating. The goal was to use these algorithms together to improve LGBC's predictive capability and provide useful insights for enhancing matching outcomes. The experimental results revealed significant findings, notably during the training period. Both the LGBC and LGHS models showed weaker performance, with accuracies of 0.938 and 0.945, respectively. In contrast, the LGMO model performed strongly, with an accuracy of 0.956 suggesting higher predictive efficacy; however, the LGFF model outperformed all others, with an accuracy of 0.965. This highlights the value of using optimization methods, such as FFO, in combination with LGBC to improve forecast accuracy. The findings highlight the significance of using sophisticated optimization approaches to improve the forecasting capabilities of ML models in speed-dating settings. By combining LGBC's power synergistically with optimization algorithms, the potential to drastically enhance matching outcomes is revealed. These findings have ramifications beyond speed dating, extending to a variety of fields where predictive accuracy is critical for decision-making. Moving ahead, further study may look deeper into the processes by which optimization techniques improve LGBC performance. Furthermore, more research into the applicability of these methodologies in real-world speed dating situations might yield useful insights for matchmaking services and associated sectors. Overall, this work adds to the expanding body of knowledge in predictive modeling and optimization, paving the way for more efficient matching algorithms and decision-making processes in a variety of scenarios.
Funding
Not applicable

Ethical approval
All authors have been personally and actively involved in substantial work leading to the paper, and will take public responsibility for its content.

Competing interests
The authors declare no competing interests.

Authorship Contribution Statement
Muthumani Muralidharan: Writing - Original draft preparation, Conceptualization, Supervision, Project administration. Karthikeyan Palanisamy: Methodology, Software.

Data Availability
On request

Declarations
Not applicable

Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.

Author Statement
The manuscript has been read and approved by all the authors, the requirements for authorship, as stated earlier in this document, have been met, and each author believes that the manuscript represents honest work.

REFERENCES
[1] P. W. Eastwick, L. B. Luchies, E. J. Finkel, and L. L. Hunt, "The predictive validity of ideal partner preferences: A review and meta-analysis," Psychol Bull, vol. 140, no. 3, p. 623, 2014.
[2] E. J. Finkel, P. W. Eastwick, and J. Matthews, "Speed-dating as an invaluable tool for studying romantic attraction: A methodological primer," Pers Relatsh, vol. 14, no. 1, pp. 149–166, 2007.
[3] S. Davidoff, M. K. Lee, A. K. Dey, and J. Zimmerman, "Rapidly exploring application design through speed dating," in UbiComp 2007: Ubiquitous Computing: 9th International Conference, UbiComp 2007, Innsbruck, Austria, September 16–19, 2007, Proceedings 9, Springer, 2007, pp. 429–446.
[4] J. B. Asendorpf, L. Penke, and M. D. Back, "From dating to mating and relating: Predictors of initial and long-term outcomes of speed-dating in a community sample," Eur J Pers, vol. 25, no. 1, pp. 16–30, 2011.
[5] J. Turowetz and M. M. Hollander, "Assessing the experience of speed dating," Discourse Stud, vol. 14, no. 5, pp. 635–658, 2012.
[6] O. Muurlink and C. Poyatos Matas, "From romance to rocket science: Speed dating in higher education," Higher Education Research & Development, vol. 30, no. 6, pp. 751–764, 2011.
[7] S. Bhargava and R. Fisman, "Contrast effects in sequential decisions: Evidence from speed dating," Review of Economics and Statistics, vol. 96, no. 3, pp. 444–457, 2014.
[8] N. Korobov and J. Laplante, "Using improprieties to pursue intimacy in speed-dating interactions," Stud Media Commun, vol. 1, no. 1, pp. 15–33, 2013.
[9] J. Zimmerman and J. Forlizzi, "Speed dating: Providing a menu of possible futures," She Ji: The Journal of Design, Economics, and Innovation, vol. 3, no. 1, pp. 30–50, 2017.
[10] N. D. Tidwell, P. W. Eastwick, and E. J. Finkel, "Perceived, not actual, similarity predicts initial attraction in a live romantic context: Evidence from the speed-dating paradigm," Pers Relatsh, vol. 20, no. 2, pp. 199–215, 2013.
[11] I. Großmann, A. Hottung, and A. Krohn-Grimberghe, "Machine learning meets partner matching: Predicting the future relationship quality based on personality traits," PLoS One, vol. 14, no. 3, p. e0213569, 2019.
[12] A. Baxter, J. A. Maxwell, K. L. Bales, E. J. Finkel, E. A. Impett, and P. W. Eastwick, "Initial impressions of compatibility and mate value predict later dating and romantic interest," Proceedings of the National Academy of Sciences, vol. 119, no. 45, p. e2206925119, 2022.
[13] J. H. Cho, J. Lee, and S. Y. Sohn, "Predicting future technological convergence patterns based on machine learning using link prediction," Scientometrics, vol. 126, pp. 5413–5429, 2021.
[14] J. L. Goetz, D. Keltner, and E. Simon-Thomas, "Compassion: An evolutionary analysis and empirical review," Psychol Bull, vol. 136, no. 3, p. 351, 2010.
[15] A. A. Eaton and S. Rose, "Has dating become more egalitarian? A 35-year review using Sex Roles," Sex Roles, vol. 64, pp. 843–862, 2011.
[16] S. Dreiseitl, L. Ohno-Machado, H. Kittler, S. Vinterbo, H. Billhardt, and M. Binder, "A comparison of machine learning methods for the diagnosis of pigmented skin lesions," J Biomed Inform, vol. 34, no. 1, pp. 28–36, 2001.
[17] L. Munkhdalai, T. Munkhdalai, O.-E. Namsrai, J. Y. Lee, and K. H. Ryu, "An empirical comparison of machine-learning methods on bank client credit assessments," Sustainability, vol. 11, no. 3, p. 699, 2019.
[18] C. Kampichler, R. Wieland, S. Calmé, H. Weissenberger, and S. Arriaga-Weiss, "Classification in conservation biology: A comparison of five machine-learning methods," Ecol Inform, vol. 5, no. 6, pp. 441–450, 2010.
[19] J. Alzubi, A. Nayyar, and A. Kumar, "Machine learning from theory to algorithms: An overview," in Journal of Physics: Conference Series, IOP Publishing, 2018, p. 012012.
[20] B. Mahesh, "Machine learning algorithms: A review," International Journal of Science and Research (IJSR), vol. 9, no. 1, pp. 381–386, 2020.
[21] P. Flach, Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press, 2012.
[22] J. G. Carbonell, R. S. Michalski, and T. M. Mitchell, "An overview of machine learning," Mach Learn, pp. 3–23, 1983.
[23] A. Dagliati et al., "Machine learning methods to predict diabetes complications," J Diabetes Sci Technol, vol. 12, no. 2, pp. 295–302, 2018.
[24] Y. Luo, P. Szolovits, A. S. Dighe, and J. M. Baron, "Using machine learning to predict laboratory test results," Am J Clin Pathol, vol. 145, no. 6, pp. 778–788, 2016.
[25] A. Mackenzie, "The production of prediction: What does machine learning want?" European Journal of Cultural Studies, vol. 18, no. 4–5, pp. 429–445, 2015.
[26] L. Sandra, F. Lumbangaol, and T. Matsuo, "Machine learning algorithm to predict student's performance: A systematic literature review," TEM Journal, vol. 10, no. 4, 2021.
[27] N. Jean, M. Burke, M. Xie, W. M. Davis, D. B. Lobell, and S. Ermon, "Combining satellite imagery and machine learning to predict poverty," Science, vol. 353, no. 6301, pp. 790–794, 2016.
[28] A. Sharma and B. Singh, "AE-LGBM: Sequence-based novel approach to detect interacting protein pairs via ensemble of autoencoder and LightGBM," Comput Biol Med, vol. 125, p. 103964, 2020.
[29] M. A. El-Shorbagy, A. Bouaouda, H. A. Nabwey, L. Abualigah, and F. A. Hashim, "Advances in Henry Gas Solubility Optimization: A physics-inspired metaheuristic algorithm with its variants and applications," IEEE Access, 2024.