As a result, the BPS model recognizes the interdependence of biological, psychological, and social factors in understanding and treating pain [17]. Researchers have recently shown that language is useful for interpreting and quantifying aspects of the pain experience beyond qualia or severity. The psycholinguistic and affective characteristics of pain-related words have been examined in order to set appropriate guidelines for future research on painful emotions [
18]. Detecting rumors in Arabic tweets poses challenges due to linguistic nuances. Gumaei et al. [19] proposed an XGBoost-based approach that leverages diverse features and achieves a top accuracy of 97.18% on a public dataset, surpassing recent rumor detection methods. Using a large-scale dataset comprising 2.5 million surveys and 1.8 million tweets, a study by Aggarwal et al. [
20] investigates the prediction of community-level pain using Twitter posts and reveals significant variations in pain expressions across different communities in the United States, highlighting the potential for Twitter-based interventions in community-focused pain management. Sawhney et al. [21] propose SISMO, a hierarchical attention model that accounts for the ordinal structure of social media suicide risk assessment. It uses a soft probability distribution to accommodate different risk levels and shows good results on real-world Reddit posts annotated by professionals. Patients with chronic pain were evaluated for their reaction to placebo using quantitative language patterns collected from semi-structured interviews [
22], while pain disparities in underprivileged communities were discovered using comprehensive text mining of EHRs [23]. Social media posts have also been studied for a variety of pain-related purposes, including tracking patients over time and identifying new pain phenotypes [24], geographically monitoring and characterizing opioid use [25], exploring how pain is socially [26], and identifying population-level increases in pain conditions and symptoms [27]. A study conducted by Kumar and Albuquerque [
28] utilized the XLM-R transformer with zero-shot transfer learning for sentiment analysis in resource-poor Indian languages. A study by Caldo et al. [29] investigates the emotional patterns of effective spine pathology web pages, indicating their potential importance in comprehending chronic pain and affecting health-related behaviors. The findings highlight the necessity of examining the BPS components of pain and raise ethical issues for digital health information providers. Mullins et al. [30] examined tweets about pain in Ireland over a 2-week period and found that the most common terms were headache (90%) and migraine (66%). The majority of tweets came from women, and the study also identified the dominant category of advice on back pain management. A longitudinal study by Deng et al. [31] used NLP tools to analyze tweets regarding migraines. User behavior profiles, such as tweeting frequency, popular words, and sentiment expression, were reported and examined. Many expressive tweets conveyed negative emotion, particularly those with high frequency and extreme sentiment, including the use of profanity. Guo et al. [
32] examined the use of social networking platforms (Reddit and Twitter) as a resource for studying migraine, including the availability of relevant discussions and the development of a text classification system. Yang et al. [33] investigate the application of DL neural networks and NLP to text data from medical discharge summaries, with an emphasis on patient phenotyping. The study emphasizes the importance of data quality, quantity, and token selection in obtaining higher performance, particularly in chronic pain classification. To examine headache and migraine discussions in Japan, Germany, and France, researchers analyzed social media data from several platforms, revealing linguistic trends, treatment references, and demographic information [
34]. Table 1 presents a thorough overview of the latest advancements in the field.
In our research, we showcase the efficacy of using deep neural networks with static embeddings to identify pain in Hindi text data. To capture the sequential relationships between words and discern the nuanced patterns related to pain characteristics in text, we employ the IndicBERT model. IndicBERT, a specialized variant of the BERT model for Indic languages such as Hindi, contributes to pain detection in social media posts by capturing the contextual nuances and linguistic patterns specific to social media language. Word embeddings are an integral part of its architecture: by encoding the semantic meaning of words and phrases, they allow the model to recognize pain indicators and emotional language in social media text. Incorporating these embeddings in pain detection models improves performance by facilitating a better understanding of word meanings within pain-related text.
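As an illustration only, the following minimal sketch shows how a binary pain classifier could be wired on top of IndicBERT with the Hugging Face `transformers` library. It is not the pipeline used in this work: the checkpoint id `ai4bharat/indic-bert`, the label names, and the helper functions `decode` and `classify` are assumptions introduced for the example, and the classification head is untrained until the model is fine-tuned on labelled pain data.

```python
import math
from typing import List, Sequence

MODEL_NAME = "ai4bharat/indic-bert"  # assumed Hugging Face checkpoint id
LABELS = ["no_pain", "pain"]         # hypothetical label set for illustration


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: Sequence[float]) -> str:
    """Map one row of classifier logits to the most probable label name."""
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))]


def classify(texts: List[str]) -> List[str]:
    """Run texts through a sequence classifier built on IndicBERT.

    Assumes `transformers`, `torch`, and `sentencepiece` are installed.
    The classification head is randomly initialised here, so the predictions
    are meaningless until the model is fine-tuned on labelled data.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=len(LABELS)
    )
    model.eval()
    # Tokenize the batch; padding and truncation keep the tensors rectangular.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return [decode(row.tolist()) for row in logits]
```

In this sketch the heavy dependencies are imported inside `classify`, so the decoding helpers can be exercised without downloading the model; calling `classify` on a list of Hindi posts would return one label per post once a fine-tuned checkpoint is substituted for `MODEL_NAME`.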