Unobtrusive Monitoring of Physical Weakness: A Simulated Approach

Longfei Chen (longfei.chen@ed.ac.uk, ORCID 0000-0002-3935-802), Muhammad Ahmed Raza (m.a.raza@ed.ac.uk, ORCID 0000-0003-4477-7375), Craig Innes (craig.innes@ed.ac.uk, ORCID 0000-0002-6329-4136), Subramanian Ramamoorthy (s.ramamoorthy@ed.ac.uk, ORCID 0000-0002-6300-5103), and Robert B. Fisher (rbf@inf.ed.ac.uk, ORCID 0000-0001-6860-9371)
The University of Edinburgh, 10 Crichton Street, Edinburgh, Scotland, UK, EH8 9AB
Abstract.

Aging and chronic conditions affect older adults’ daily lives, making early detection of developing health issues crucial. Weakness, common in many conditions, alters physical movements and daily activities subtly. However, detecting such changes can be challenging due to their subtle and gradual nature. To address this, we employ a non-intrusive camera sensor to monitor individuals’ daily sitting and relaxing activities for signs of weakness. We simulate weakness in healthy subjects by having them perform physical exercise and observing the behavioral changes in their daily activities before and after workouts. The proposed system captures fine-grained features related to body motion, inactivity, and environmental context in real-time while prioritizing privacy. A Bayesian Network is used to model the relationships between features, activities, and health conditions. We aim to identify specific features and activities that indicate such changes and determine the most suitable time scale for observing the change. Results show 0.97 accuracy in distinguishing simulated weakness at the daily level. Fine-grained behavioral features, including non-dominant upper body motion speed and scale, and inactivity distribution, along with a 300-second window, are found most effective. However, individual-specific models are recommended as no universal set of optimal features and activities was identified across all participants.

Keywords: Unobtrusive Monitoring, Computer Vision, Weakness, Behavior, Bayesian Network
CCS Concepts: Applied computing → Health informatics

1. Introduction

With natural aging and chronic health conditions, older adults’ daily lives are significantly affected (UK, 2019; WHO, 2022). The morbidity associated with aging and the decline in physical and mental abilities are among the main causes of their suffering (Xavier et al., 2003). Weakness (or asthenia) is a common phenotype that often accompanies many prevalent health conditions (Patient, 2021; healthline, 2023; medicalnewstoday, 2023). It manifests as physical weakness or a lack of energy in specific body parts or the entire body. Such weakness can alter the way older adults move and perform daily tasks, as they may need to compensate for their physical limitations by using different muscles or adopting altered postures (Patient, 2021). This behavior change provides valuable insights into the health status of older adults. Early detection of the signs of developing health conditions can be crucial for prompt intervention and treatment (Owen et al., 2022). However, detecting behavioral changes related to weakness can be challenging, particularly because the signs of slowly progressing conditions are often subtle, especially in the early stages (Fogg et al., 2022; Rantz et al., 2015). As a result, such changes may not be immediately noticeable through snapshot clinical assessments, or through observations by caregivers, families, or even the individuals themselves (Cook et al., 2018, 2022; König et al., 2015b).

Sensors provide a reliable and objective means of assessing the condition of elderly individuals. There is a growing emphasis on innovative smart medical devices with care transitioning from hospitals to homes. According to (Sahni et al., 2023), the integration of AI-powered healthcare monitoring devices has the potential to reduce annual healthcare spending in the US by 5 to 10 percent. Health monitoring devices enhance the sense of safety and reduce isolation among older adults, while also saving caregivers time through automation (Pol et al., 2016). These devices offer unobtrusive, long-term monitoring, outperforming traditional manual approaches (Schmitter-Edgecombe et al., 2022).

Among the array of non-intrusive health monitoring devices available, cameras are deemed to be one of the suitable options for long-term monitoring of the progression condition of older adults. Cameras provide abundant information about individuals and their surroundings by capturing data in the form of videos or images. They can extract the kinematics of the subject at a clinically acceptable level (Nakano et al., 2020; Scano et al., 2014), provide functionality to detect the presence of humans, objects, or pets, and facilitate an analysis of the environmental context. These attributes make camera-based monitoring highly interpretable, since the extracted information can be used to provide semantic explanations (Schmitter-Edgecombe et al., 2022). For instance, if a person experiences a fall or loses consciousness (manifested by an absence of body movement), the camera can promptly convey a visual message of the post-event scenario to a designated individual, to assess the situation and provide help. It’s also important to note that camera sensors still pose challenges (Schmitter-Edgecombe et al., 2022), including privacy concerns, measurement accuracy, and adaptability to diverse environments, e.g., occlusions, low light, etc.

Simulated health conditions are becoming increasingly common in aging research, offering valuable insights into various conditions and contributing to improved care for older adults. Weakness in older adults often occurs alongside multimorbidity [1] (Fried et al., 2001; Larson and Wilbur, 2020a), which means it is influenced by various confounding health factors. To focus specifically on the phenotype of weakness and observe the related physical behavioral changes, we simulate weakness in healthy subjects by having them perform physical exercise workouts, and then observe their behavioral changes before and after the workout. We believe that the weakness observed post-workout serves as a reasonable approximation of the natural weakness resulting from older adults’ aging and health conditions. Despite having differing underlying causes, they share similar physical phenotypes, such as physical discomfort or pain, reduced energy, slowed movement, decreased strength and endurance, altered gait or movement patterns, decreased range of motion, increased resting or recovery time, and low physical activity (Fried et al., 2001; Larson and Wilbur, 2020a). Simulating weakness in healthy subjects is an attempt to control confounding variables, enabling us to focus on the phenotype of weakness while minimizing the influence of other concurrent health conditions and inter-personal differences.

[1] Multimorbidity is the coexistence of two or more chronic conditions. A total of 67% of older adults have multimorbidity, with the prevalence increasing with age: 50% for those under 65, 62% for those aged 65-74, and 81.5% for those aged 85 and over.

To capture behavioral data, we employ a single fixed RGB-D camera (See Fig 1 for an example scenario). Subjects are monitored performing common daily activities within a designated area in a room. The camera extracts body motion, body inactivity features, and environmental context in real time, with no visual information stored to prioritize privacy. We model the dependencies between behavior observations, activity types, and environmental context with a Bayesian Network. Furthermore, multiple classifiers and information-based methods are used to assess the importance of features and activities related to the weakness condition. We also explore various temporal windows to determine the optimal time scale for effectively observing behavioral changes and classifying between normal and weakness states.

This study aims to address the following research questions:

1) Can we accurately monitor changes in people’s behavior, specifically those caused by simulated weakness, during common daily activities?

2) What specific behavioral features and activities demonstrate the most significant changes?

3) What is the optimal time scale, or temporal period, for observing these changes?

By investigating these questions, our objective is to gain insights into whether automatically quantifying behavioral changes in a simulated scenario is an effective way of explaining changes in long-term health conditions. Section 4.2 shows that it is possible to distinguish the changes in behavior related to weakness using a participant-specific trained model. Section 4.4 shows that there are preferable features and activities, but these are participant-specific. Section 4.3 shows that the 300-second time window had the best results.

Figure 1. A compact system designed for monitoring older adults in their homes employs an RGB-D camera and a computer processor. This system prioritizes privacy by discarding image/video data after extracting the necessary information. Motion, inactivity features, and environmental context are extracted in real time. Detected movement pixels are shown in red.

2. Related Works

Changes in physical mobility can serve as valuable indicators for various common conditions in older adults (Grimmer et al., 2019), including frailty, stroke, arthritis, Alzheimer’s, depression, natural aging, and Parkinson’s (Larson and Wilbur, 2020b). For example, frailty is characterized by weight loss, weakness, exhaustion, slowness, and low activity (Panhwar et al., 2019a). Stroke can lead to motor problems such as weakness, paralysis, and coordination and balance issues (Institute, 2023). Arthritis involves joint inflammation, leading to pain, stiffness, and reduced mobility, predominantly in the hands, knees, and hips. Multiple sclerosis may cause muscle weakness, stiffness, and tremors (Barhum, 2022). Moreover, Alzheimer’s disease can also impact motor function, including coordination and balance difficulties (Nakanishi et al., 2021). These conditions significantly influence the behaviors of older adults, ranging from mild to severe impairment, depending on the severity and progression of the diseases. As a result, older adults may encounter difficulties in performing daily activities that require strength and coordination. While many studies focus on detecting significant events, like falls (Igual et al., 2013), it is also crucial to recognize gradual health changes, especially considering the well-known phenomenon of change-blindness, where sufficiently slow changes are not perceptible by humans (but could be detected through automated record-keeping).

Sensors enable the creation of a comprehensive behavioral profile generated from continuous monitoring over the long-term (Cook, 2020). These sensors encompass a variety of devices, including wearables (Jansen et al., 2022; Picerno et al., 2021), smart home devices (Turjamaa et al., 2019; Sprint et al., 2016), and cameras (Scott et al., 2022; König et al., 2015b), among others. They have demonstrated their effectiveness in various healthcare applications. Some older adults view sensor-based monitoring as a highly beneficial approach, which not only enhances their safety but also promotes independent living, enabling them to remain active while respecting their natural lifestyle choices (Pol et al., 2016; Rantz et al., 2015). In particular, ‘zero-interaction’ sensing technologies are favored (Schütz et al., 2022).

2.1. Sensors

Wearable sensors have emerged as valuable tools for measuring healthcare-related parameters in older adults. These sensors accurately measure human motion, localization, and tracking, making them suitable for various applications, such as frailty assessment, fall risk evaluation, monitoring chronic neurological diseases, promoting active living, and cognitive assessment (Cook, 2020; Picerno et al., 2021). One study demonstrated that inertial sensors effectively assess frailty, providing an objective measure of an individual’s physical condition (Panhwar et al., 2019b). Wearable inertial sensors have also been utilized to predict Functional Independence Measure Scores in patients undergoing inpatient rehabilitation (Sprint et al., 2015). Moreover, they have been used to monitor turning movements associated with cognitive function in elderly participants (Mancini et al., 2016). However, the adoption of wearable sensor-based monitoring hinges on the ability of older adults with varying cognitive abilities to consistently wear and charge these devices, and on issues of stigma, both of which may impede widespread acceptance (Narasimhan et al., 2021).

Ambient sensors, often integrated into the living environment, encompass a range of devices such as passive infrared (PIR) motion sensors, magnet/contact switches, temperature sensors, light sensors, humidity sensors, vibration sensors, pressure sensors, and radio-frequency identification (RFID) sensors (Cook, 2020). These sensors offer the capability to monitor diverse health-related parameters in older adults over extended periods. Smart home technology has been leveraged for healthcare applications, including assessing social isolation, cognitive health, functional health, and behavioral changes related to conditions such as radiation treatment, insomnia, depression, or dementia (Prabhu et al., 2022; Dawadi et al., 2013a, b; Aramendi et al., 2018; Sprint et al., 2016; Cook et al., 2018). However, smart homes face challenges due to the inherent complexity of the sensor system, where each sensor typically serves a specific function, potentially leading to reliability and accuracy issues (Wagner et al., 2022). Monitoring multiple individuals simultaneously and distinguishing the impact of pets within the environment are also challenging tasks, primarily due to the limited ability to understand high-level semantic context (Majumder et al., 2017).

Camera sensors have emerged as powerful tools for monitoring the physical and contextual behavior of individuals, offering extensive and high-dimensional data capture capabilities. Recent advancements in computer vision algorithms and hardware have significantly enhanced their potential. In comparison to wearable sensors and basic ambient sensors, cameras provide a more comprehensive view of the monitored subject, offering rich, interpretable information about the individual, the surrounding environment, and their interactions (Scott et al., 2022). Notably, camera sensors are employed in various healthcare applications, including Parkinson’s disease, stroke, epilepsy, and frailty assessments, as they can capture both upper and lower limb kinematic measurements (Scott et al., 2022). For example, researchers have utilized the Kinect RGBD camera to monitor the body movements of pre-frail and frail elderly individuals (Liao et al., 2019). Wearable cameras have been employed to automatically identify sedentary periods in older adults, which have been associated with poor health outcomes (Leask et al., 2015). The fusion of camera data with other sensors, such as accelerometers, light and door sensors, microphones, and wearable devices, has been shown to effectively monitor dementia patients and detect long-term health changes in older adults (Karakostas et al., 2016; Rantz et al., 2015). Emerging impulse-radar sensors and depth sensors show promise for healthcare purposes as they do not require individuals to wear or operate any devices, offering similar information as video cameras while providing enhanced privacy protection (Momin et al., 2022; Wagner et al., 2022).

Sensor accuracy: Sensor selection plays a crucial role in smart home data analysis. Studies have shown that motion sensors are the most informative for activity recognition, with areas of high movement, such as the kitchen and living room, being particularly significant (Cook and Holder, 2011). Some research endeavors have employed multiple motion sensors with small fields of view to estimate walking speed when individuals pass by, alongside passive infrared motion sensors to detect changes in heat sources (Hayes et al., 2008). However, ambient motion sensors may struggle to distinguish different levels of motion intensity, such as mild and high exertion of body parts (Hayes et al., 2008). Wearable sensors exhibit high accuracy in estimating limb motion (Auepanwiriyakul et al., 2020), but they can be inconvenient for long-term use. Conversely, cameras can provide reasonably accurate motion estimations. For instance, a 3D markerless motion capture technique demonstrated it could correctly reproduce the movements of participants within the accuracy of 30 mm (Nakano et al., 2020). Another study compared upper-limb kinematics collected from a Kinect v1 against a 6 Camera 3D marker-based system, where shoulder elevation angle had a reported difference of 3.32 degrees ± 2.80 degrees (Scano et al., 2014). Both errors of measurement were reported to be clinically acceptable.

2.2. Features

Here, features are essential numerical descriptions extracted from the health monitoring data. They serve as indicators for identifying signs of diseases, monitoring chronic conditions, and establishing personalized healthcare records. These features encompass various aspects of an individual’s behavior and physiology.

Motor-related features encompass a wide range of parameters that shed light on an individual’s physical capabilities. Metrics like walk distance and walking speed are key indicators of mobility and overall health (Aramendi et al., 2018; Schmitter-Edgecombe, 2015). Gait length is an important measure related to walking patterns (Rantz et al., 2015). Unconventional indicators like patterns of using a computer mouse can provide insights into cognitive function (Cook et al., 2018). Parameters such as speed, acceleration, and frequency of hand and wrist movements, trunk speed, and wrist speed offer valuable information (Scott et al., 2022). Studies have shown that slower execution speed may be detectable in the early stages of cognitive decline (König et al., 2015b). For instance, in one study (Romdhane et al., 2012), participants’ task execution time, walk speed, step length, and the amount of error or omissions in task completion were used to assess Alzheimer’s disease symptoms.

Time-related features include measures such as activity density map by hours and days (Dawadi et al., 2013a), event/activity duration (Dawadi et al., 2013b; Aramendi et al., 2018; Schmitter-Edgecombe, 2015), time of the day (Sprint et al., 2016; Leask et al., 2015), the size of the sliding window (Sprint et al., 2016), time between two events (Sprint et al., 2016), amount of time spent outside the home (Prabhu et al., 2022; Rantz et al., 2015; Aramendi et al., 2018), amount of time sitting (Dahmen and Cook, 2021; Leask et al., 2015), and the distribution of time spent in different home areas (Prabhu et al., 2022). Studies have suggested that features like extended periods of sitting may be indicative of weakness (Prabhu et al., 2022).

Activity-level features include information such as activity type (Leask et al., 2015), duration of activities (Dawadi et al., 2013a), number of sensor/event logs (Dawadi et al., 2013a, b), number of complete tasks/interruptions/omissions (Dawadi et al., 2013a), activity regularity (Cook, 2020), etc. Additionally, statistics such as minimum, maximum, sum, median, standard deviation, zero crossings, correlation, and skewness are often computed for these features (Cook et al., 2018; Schmitter-Edgecombe, 2015). Leask et al. (Leask et al., 2015) used activity type, environment, and interactions to understand sedentary behaviors in older adults. König et al. (König et al., 2015a) did feature selection on gait and event features extracted from a video event monitoring system for assessment of autonomy.
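As an illustration of the summary statistics listed above, the following minimal sketch computes them for a single one-dimensional feature series. The function name `summarize`, the choice of counting zero crossings on the mean-centered series, and the population form of skewness are assumptions made for this example; they are not taken from the cited works.

```python
import numpy as np

def summarize(values):
    """Return common summary statistics for a 1-D feature series.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    std = x.std()  # population standard deviation (ddof=0)

    # Zero crossings: sign changes of the mean-centered series,
    # ignoring samples that sit exactly on the mean.
    signs = np.sign(centered)
    signs = signs[signs != 0]
    zero_crossings = int(np.sum(signs[1:] != signs[:-1]))

    # Population skewness; defined here as 0 for a constant series.
    skewness = float(np.mean(centered ** 3) / std ** 3) if std > 0 else 0.0

    return {
        "min": float(x.min()),
        "max": float(x.max()),
        "sum": float(x.sum()),
        "median": float(np.median(x)),
        "std": float(std),
        "zero_crossings": zero_crossings,
        "skewness": skewness,
    }

stats = summarize([1, 2, 3, 4, 5])
```

Correlation is omitted because it requires a second series; `np.corrcoef` could be applied to a pair of feature series in the same way.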

Lower-body features are widely employed for diverse purposes in older adult monitoring; however, they may not be suitable for some occasions such as when the older adults spend most of their time sitting at their favorite easy chair during the day (Leask et al., 2015). Moreover, the motion features extracted in the above-mentioned works are often quite coarse, and existing studies typically do not explore the relationships among these variables or model their dependencies, even though some variables may be highly correlated with others. In our study, we aim at daily sitting scenarios at home and extract fine-grained motion features of individuals down to the finger movement level, while modeling the relationship among behavioral features, the health states, and the environmental context.

2.3. Activities

Activities also need to be carefully chosen to effectively monitor people’s health states (Dawadi et al., 2013b). Some research uses specifically designed tasks to assess health states, which are clinically verified. In the study (Dawadi et al., 2013b), participants were asked to perform a sequence of 8 activities representing instrumental activities of daily living (IADLs) that can be disrupted in Mild Cognitive Impairment (MCI) and are more significantly disrupted in Alzheimer’s Disease (AD). In another study (Aramendi et al., 2018), a predefined set of activities, including basic activities (such as walking or sitting) and IADLs (e.g., cooking, eating, or personal hygiene activities), reflecting money/self-management skills and travel/event memory abilities, were found to be most related to the sensor behavior data. Studies (Joumier et al., 2011) and (Romdhane et al., 2012) used an automatic video monitoring system in a room to compare the motor abilities (e.g., walking speed) of a controlled healthy group and Alzheimer’s disease patients with three designed tasks that consider different levels of autonomy (both walking and IADLs); statistically significant differences were observed. However, walking exercise in the room is limited to a short distance and time, which is less reliable (Joumier et al., 2011). Another study (König et al., 2015b) monitored three instrumental activities, such as preparing the pillbox, preparing tea, and making/receiving a phone call, and the results showed that the system could distinguish between healthy and mild cognitive impairment patients based on task execution time.

While designed tasks and environment modifications of usual behavior have their merits, the ideal strategy to capture functional decline accurately and reliably is to observe the daily behavior of individuals where they spend most of their time: at home (Aramendi et al., 2018).

Most smart homes can monitor natural daily activities for the long term. A report (Burwell and Jackson, 1994) shows that functioning in five core Activities of Daily Living (ADLs) is typically used to describe the extent of chronic disability among the elderly. These core ADLs include (1) bathing; (2) dressing; (3) using the toilet; (4) transferring from bed to chair, and (5) feeding oneself. In the study (Dahmen and Cook, 2021), the focus was on activities such as cooking, eating, sleeping, personal hygiene, taking medicine, working, leaving home, entering home, bathing, relaxing, bed-toilet transition, washing dishes, and other activities. In the work (Rantz et al., 2015), activities monitored included bathroom use, sleep, eating, drinking, gait, falls, and others. However, these activities occur in different rooms, which increases the complexity of the monitoring system by requiring many sensors to be placed around. In the work (Schmitter-Edgecombe, 2015), five activities of daily living (ADLs) were used for monitoring, including relaxing activities, such as watching TV, reading, and napping, which typically take place in a single location other than the bedroom and are important for characterizing daily routines and assessing functional independence. A study (Leask et al., 2015) shows that sedentary periods are common in older adults’ daily lives and are related to their health status. This provides a good opportunity to monitor older adults’ health conditions when they typically remain sitting at a fixed location at home with their upper body visible and moving most of the time. These sitting and relaxing activities informed our decisions when choosing monitoring scenarios.

2.4. Time Scale

Various time scales are employed for different monitoring purposes. In the study (Sprint et al., 2016), smart home sensor data are utilized, and a 1-week time window is selected to compare behaviors between windows, with the aim of detecting health events such as radiation treatment, insomnia, and falls. In the work (Rantz et al., 2015), the sensors generate an event every 7 seconds when continuous motion is detected. Additionally, physiological parameters are calculated during sleep, time spent away from home, gait speed, stride length, and stride time on a 24-hour basis. In the study (Aramendi et al., 2018), changing time-series statistics for each variable are computed using a sliding window of length 7 days. Each designed activity for participants in the study (Dawadi et al., 2013b) takes an average of 4 minutes to complete, while the testing session for eight activities lasts approximately 1 hour. Romdhane et al. (Romdhane et al., 2012) employ the same automatic video system to track healthy and Alzheimer’s participants. Three tasks, considering different levels of autonomy, are designed, ranging from 10 minutes to 30 minutes.

Only a few studies have investigated the appropriate time scale for monitoring performance. In one study, the researchers explored how the activity features extracted at different window sizes affected the performance of predicting standard clinical assessment scores (Schmitter-Edgecombe, 2015). They found that a window of 30 days was suitable for extracting features for this purpose. Another study used smartphone data and broke the continuous data into windows of fixed durations, ranging from 1 to 16 seconds (Dernbach et al., 2012). It was shown that shorter windows performed better in classifying activities. In another study (Dahmen and Cook, 2021), different window sizes for the sensor events were considered, ranging from 10 to 150 events. However, the window was fixed based on the number of sensor events, not on actual time, which may not work well when the types of sensors or scenarios are changed. These studies focused on how the time scale affects detector performance, and the best-performing detectors, features, and window sizes were observed to vary significantly among different (or similar) conditions and across different subjects (Schmitter-Edgecombe, 2015). In this work, we explore various temporal windows for feature extraction, and we aggregate temporal windows to longer time spans to determine the optimal time scale for effectively classifying health states.
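The difference between time-based and event-count windows can be made concrete with a small sketch. It groups timestamped feature samples into fixed-duration windows and averages each window, so the window length stays the same regardless of how many samples a sensor emits. The function name `window_means`, the sample format, and the use of the mean as the per-window aggregate are assumptions for illustration, not the procedure used in this work.

```python
from collections import defaultdict

def window_means(samples, window_s):
    """Average (timestamp_s, value) samples within fixed-duration windows.

    Returns {window_index: mean_value}, where window k covers
    [k * window_s, (k + 1) * window_s). Hypothetical helper.
    """
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[int(t // window_s)].append(v)
    return {k: sum(vs) / len(vs) for k, vs in sorted(buckets.items())}

# Illustrative samples: (seconds since start, feature value).
samples = [(0, 1.0), (100, 3.0), (299, 2.0), (300, 4.0), (650, 6.0)]

short = window_means(samples, 300)  # {0: 2.0, 1: 4.0, 2: 6.0}
long = window_means(samples, 600)   # {0: 2.5, 1: 6.0}
```

Changing `window_s` is then the only step needed to compare time scales on the same data.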

3. Method

Our goal is to automatically identify behavioral changes related to changes in the health or weakness of an individual using camera data. To achieve this objective, we first capture data on the daily behaviors of individuals using an RGB-D video camera (Section 3.1). Next, we extract important behavioral features from this data (Section 3.2). To infer the weakness states of an individual, we frame it as a classification problem (Section 3.3). Following this, we employ multiple classifiers and information-based methods to rank the importance of the features and activities with respect to the weakness health condition (Section 3.4). Furthermore, we identify the optimal time scale for effectively observing behavioral changes by investigating various temporal windows (Section 3.5). Finally, we model the dependencies among features using a Bayesian Network (Section 3.6). We build models for each individual and aggregate the results for a more comprehensive analysis of the weakness condition’s impact.

3.1. Data Capture

Five healthy participants (age/gender/handedness of P1–P5: 35’M/right, 35’F/right, 25’M/right, 60’M/right, 25’F/right) were monitored in designated room areas while engaged in five common daily activities. An RGB-D camera was positioned to observe the designated room areas, which were equipped with a chair, a desk, and other home items, as depicted in Fig. 1. The five activities were: 1) reading a book, 2) using a personal computer (PC), 3) eating, 4) taking a nap, and 5) watching television (TV). Participants were predominantly seated while performing these activities, with their upper body visible most of the time. The camera monitored both the participants’ physical body movements and their surroundings, extracting pertinent features. The camera was connected to a computer processor for real-time processing of the captured visual data.

Participants initiated monitoring by pressing the start button on the processor whenever they commenced one of the five listed daily activities. The camera then captured their behaviors in real time throughout the monitoring period. Participants had the flexibility to cease an activity at any time, resulting in varying durations for each trial. The duration of activities ranged from minutes to hours based on individual preferences and circumstances. To label the activity, participants manually assigned an activity label to each monitoring period after the recording session, selecting exactly one of the five available activities. Occasionally, two activities might coincide, such as eating while watching TV. In such cases, participants assigned only the single activity label they considered dominant.

To simulate weakness, participants underwent a workout, and their behavior was monitored for periods both before and after the workout to ascertain behavioral disparities. Since the fitness level and exercise habits varied widely among participants, ranging from rarely exercising to regular gym training more than twice a week, we asked participants to choose a workout based on their exercise level, whether weight training or cardio, and to perform it for up to 30 minutes until they subjectively felt tired. Similarly, according to whether or not they had undergone a workout, participants assigned a health state label to each monitoring period, choosing from three options: “normal” (before the workout), “same-day weakness” (on the same day after the workout), and “next-day weakness” (on the next day after the workout) [2]. The monitoring process occurred in a naturalistic setting, devoid of scripting, interruption, or intervention. To preserve a naturalistic setting rather than a designed task, participants were not bound by constraints; they had the liberty to skip activities or monitoring days in their daily routine. For instance, some participants never take a nap during the day, while others regularly exercise at night (i.e., no record of “same-day weakness”). We therefore did not force them to carry out all activities, which makes the protocol easily applicable to real-life monitoring and simplifies data entry for participants. Consequently, the distribution of activity and health state labels is imbalanced, closely resembling natural daily routines. The distribution of these labels is shown in Table 1.

[2] As the volunteers knew whether or not they had exercised, there is probably some bias in the self-reported labels.

This study received approval from the School of Informatics Ethics Committee, and individual agreements were made with the participants. To ensure personal privacy, the features were extracted in real time and saved as text logs. These logs maintained the anonymity of the participants; no image or video data was retained. Personally identifiable information, including names, faces, addresses, and other sensitive details, remained unrecorded. Overall, 260 activity monitoring records spanning 67 days were collected across all participants.

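Table 1. Statistics of the monitoring records from participants. Age, handedness, & gender are reported in the main text. Health and Activity columns give the percentage of records; Duration columns give the mean duration in minutes.

| ID | Monitor Records | Total Hours | Across Days | Health: normal (%) | Health: weak (%) | Activity: 1.read (%) | Activity: 2.PC (%) | Activity: 3.eat (%) | Activity: 4.nap (%) | Activity: 5.watch (%) | Duration: 1.read (min) | Duration: 2.PC (min) | Duration: 3.eat (min) | Duration: 4.nap (min) | Duration: 5.watch (min) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 178 | 56.8 | 22 | 33.7 | 66.3 | 24.2 | 33.2 | 15.7 | 10.1 | 16.9 | 19.8 | 24.3 | 9.9 | 13.5 | 20 |
| P2 | 47 | 13.7 | 23 | 46.8 | 53.2 | 10.6 | 40.4 | - | 12.8 | 36.2 | 13.8 | 20.7 | - | 11 | 17.1 |
| P3 | 16 | 5.4 | 10 | 31 | 69 | 18.75 | 18.75 | 18.75 | 18.75 | 25 | 26 | 21.3 | 8.6 | 22.1 | 22.9 |
| P4 | 7 | 3 | 5 | 42.9 | 57.1 | 57.1 | - | - | - | 42.9 | 23.3 | - | - | - | 28.7 |
| P5 | 12 | 4.8 | 7 | 33.3 | 66.7 | 25 | 25 | 25 | - | 25 | 29.8 | 22.8 | 7.4 | - | 52.5 |
| Σ | 260 | 83.7 | 67 | 37.5 | 63.5 | 27.1 | 23.5 | 11.9 | 6.5 | 29.2 | 22.5 | 22.3 | 8.6 | 15.5 | 28.2 |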

3.2. Feature Extraction

Utilizing the rich visual information available, we extract 62 behavioral features, including aspects of human body movement and inactivity. Additionally, we examine four environmental contexts potentially related to an individual’s behavior: objects, weather conditions, room lighting, and the time of day. For a comprehensive list of these features, refer to Table 2. To enhance the interpretability of the impact of health conditions, all features have semantic meaning. The objective is to identify effective descriptors for behavioral changes associated with health conditions, differentiating between normal and weakness states by discerning the significance of various features from the captured data.

Inactivity detection / motion estimation: Movement and inactivity features are crucial for health monitoring purposes, as they provide insights into physical activity levels, mobility, frailty, fall risk, cognitive health, and more, as mentioned in Section 2. The inactivity detection algorithm involves several key steps. It first applies non-parametric background modeling, then performs human region detection and non-human region suppression, followed by pixel-level motion existence detection (Chen and Fisher, 2023). Any period between two motion events that lasts at least 1 second is labeled as inactivity. This approach is sensitive enough to detect subtle movements such as finger motions and remains robust under various lighting conditions, including low light and the presence of TV light. The method achieved a 0% false positive rate with a ±3 frames temporal tolerance and a 3% false negative rate for motion detection, as reported in (Chen and Fisher, 2023).
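The inactivity rule above (a gap of at least 1 second between two motion events) can be illustrated with a minimal Python sketch. It assumes the motion detector has already produced a list of timestamps, in seconds, at which motion was observed; the function names and the `min_gap` parameter are hypothetical and are not taken from the original implementation.

```python
from typing import List, Tuple


def inactivity_periods(motion_times: List[float],
                       min_gap: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) gaps between consecutive motion events
    that last at least `min_gap` seconds."""
    times = sorted(motion_times)
    gaps = []
    for prev, curr in zip(times, times[1:]):
        if curr - prev >= min_gap:
            gaps.append((prev, curr))
    return gaps


def inactivity_ratio(gaps: List[Tuple[float, float]],
                     total_duration: float) -> float:
    """Fraction of the record spent in inactivity."""
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")
    return sum(end - start for start, end in gaps) / total_duration


def inactivity_rate_per_minute(gaps: List[Tuple[float, float]],
                               total_duration: float) -> float:
    """Number of inactivity events per minute of recording."""
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")
    return len(gaps) / (total_duration / 60.0)
```

Under these assumptions, the two helper functions correspond loosely to the "ratio of inactivity" and "number of inactivity events per minute" features.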

Table 2. List of extracted features from monitoring records.

| Feature | Size | Description |
|---|---|---|
| Health (H) | 1x2 | normal, weakness |
| Activity (A) | 1x5 | read, PC, eat, nap, watch |
| 1. Lighting of room | $R^{1}$ | luma Y’601 of the captured scene |
| 2. Time of the day | 1x3 | 6 am - 2 pm, 2 pm - 10 pm, 10 pm - 6 am |
| 3. Weather | 1x2 | suitable* / not suitable (for outdoor activity) |
| 4. Objects | $R^{20}$ | likelihood of presence of the 20 most frequent home objects |
| 5. Duration of activity | $R^{1}$ | seconds (see main text for method) |
| 6. Ratio of inactivity | $R^{1}$ | % |
| 7. No. of inactivity | $R^{1}$ | count of inactivity events (per min.) (see main text) |
| 8. Duration of movement | $R^{1}$ | seconds (see main text) |
| 9. Ratio of movement | $R^{1}$ | % |
| 10. Movement pixel count | $R^{1}$ | $10^{4}$ pixels/s (see main text) |
| 12. Density of movement | $R^{1}$ | % (movement pixel count / human pixel count) |
| 14. Scale/Spread of movement | $R^{1}$ | % (movement pixels bbox size / human bbox size) |
| 16. Mean speed of movement | $R^{1}$ | pixel/s |
| 19. Distance of movement | $R^{1}$ | pixel/5 min. |
| 20 – 23. Mean speed quartiles | $R^{1}$ | Q1 - Q4 |
| 28. STD of speed | $R^{1}$ | pixel/s |

Fine-grained features (each $R^{1}$; see footnotes for explanation):

| Feature No. | Definition |
|---|---|
| 11 | 10 in MPO* |
| 13 | 12 in MPO |
| 15 | 14 in MPO |
| 17 | 16 in MPO |
| 18 | 28 in MPO |
| 24 – 27 | 20 – 23 in MPO |
| 44 – 46 | 12 in TRL* |
| 35 – 37 | 12 in TRL-MPO |
| 41 – 43 | 14 in TRL |
| 32 – 34 | 14 in TRL-MPO |
| 38 – 40 | 16 in TRL |
| 29 – 31 | 16 in TRL-MPO |
| 47 – 56 | inactivity duration distribution (%); ranges (seconds): [1,2), [2,5), [5,10), [10,30), [30,60), [60,∞), [2,∞), [5,∞), [10,∞), [30,∞) |
| 57 – 66 | movement duration distribution (%); ranges (seconds): [1,2), [2,5), [5,10), [10,30), [30,60), [60,∞), [2,∞), [5,∞), [10,∞), [30,∞) |

MPO* (Movement Period Only) refers to statistics computed only from the period during which movement is considered, excluding data from inactive periods. For instance, the mean speed during MPO is typically higher compared to the mean speed over the entire video duration because there are instances when the person is not moving. TRL* (Top, Right, and Left of body regions) involves dividing the detected human region (bounding box) into 2x2 sub-boxes. The Top body region contains sub-boxes 1 and 2, the Right body region contains sub-boxes 1 and 3, and the Left body region contains sub-boxes 2 and 4. Suitable* weather conditions are selected from a list of 27 weather types. The ‘suitable’ conditions include Clear, Fair, Cloudy, and Overcast, while the other 23 weather types are considered ‘not suitable’ for outdoor activity, including Fog, Rain, Sleet, Snowfall, Storm, Hail, and so on. (See full list at (Meteostat, 2021)).

Once the moving pixels are extracted, a dense 2D optical flow estimator (Lucas and Kanade, 1981) is applied to them. This optical flow analysis helps describe the intensity of the body movement, in terms of 2D velocity. This is advantageous for capturing small movements in the upper body, such as those involving the head and hands, which can be significant in daily upper-body activities. Then, various statistics are derived from the data to provide valuable features. These include metrics like inactivity count, inactivity duration, movement distance, standard deviation, movement distribution, and interquartile range (IQR).
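To make the difference between whole-record statistics and movement-period-only (MPO) statistics concrete, the following Python sketch derives a few of the listed features from a per-frame speed series. It assumes the optical flow magnitudes have already been reduced to one speed value per frame (0 for frames without detected motion); the `speeds` and `fps` inputs, the dictionary keys, and the use of the population standard deviation are illustrative assumptions, not the original implementation.

```python
import statistics
from typing import Dict, Sequence


def speed_statistics(speeds: Sequence[float], fps: float) -> Dict[str, float]:
    """Summarize a per-frame speed series (pixel/s).

    Frames with speed 0 are treated as inactive; MPO statistics
    use only the frames in which movement was detected.
    """
    if not speeds or fps <= 0:
        raise ValueError("need a non-empty speed series and a positive fps")
    moving = [s for s in speeds if s > 0.0]
    dt = 1.0 / fps  # duration of one frame in seconds
    return {
        "mean_speed": statistics.fmean(speeds),
        "mean_speed_mpo": statistics.fmean(moving) if moving else 0.0,
        "std_speed": statistics.pstdev(speeds),
        "distance": sum(s * dt for s in speeds),       # pixels travelled
        "movement_ratio": len(moving) / len(speeds),   # share of moving frames
    }
```

In this sketch the MPO mean is never lower than the whole-record mean, which matches the remark in the Table 2 footnotes that the mean speed during MPO is typically higher.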

Combining inactivity detection with optical flow is particularly beneficial because it represents small movements at the pixel level while restricting the flow to pixels with true motion, which suppresses noise from the dense optical flow estimation. This makes it suitable for activities involving numerous fine motor movements. Additionally, using a fixed camera to monitor a stationary area keeps the distance between the camera and the monitored person relatively consistent. This consistency makes the 2D optical flow magnitude a reliable estimator of motion intensity. This contrasts with the potential inaccuracies associated with estimating 3D joint coordinates from noisy depth measurements, as well as the challenges of detecting key joints under occlusion, for subtle movements, and in low-light environments.

Fine-grained descriptors: Fine-grained descriptors provide additional insight into the spatial density and scale of movement, enhancing our ability to characterize the range and intensity of body movements. The density of movement is the proportion of movement pixels to the total number of pixels within the body region. The scale of movement, in turn, is the ratio between the size of the movement bounding box and the size of the human bounding box. Furthermore, we extract features from different body regions, including the Top, Right, and Left sides (TRL) of the detected human body, which lets us analyze parts of the body separately. We also employ a fine-grained temporal descriptor that computes statistics from the periods of movement only (MPO), rather than from the entire period. Finally, the inactivity and motion durations are binned into multiple intervals to compute their detailed distributions. See Table 2 for details of the fine-grained descriptors. These descriptors give deeper insight into the nature of the movements. For instance, they help identify cases where a single hand movement dominates the activity or where head movement takes precedence. They can also highlight significant temporal pauses within an activity.
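The spatial descriptors can be sketched in a few lines of Python. The sketch assumes that moving pixels are given as (x, y) image coordinates, that boxes use an exclusive maximum corner, and that the 2x2 sub-boxes are numbered row by row (1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right in the image). The numbering and all function names are assumptions made for illustration; only the region definitions (Top = sub-boxes 1 and 2, Right = 1 and 3, Left = 2 and 4) come from the text.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]           # (x, y) image coordinates
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), max exclusive


def bbox_area(box: Box) -> int:
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def movement_density(n_moving: int, n_human: int) -> float:
    """Movement pixel count divided by human pixel count."""
    return n_moving / n_human if n_human > 0 else 0.0


def movement_scale(moving: List[Pixel], human_box: Box) -> float:
    """Area of the box enclosing all moving pixels over the human box area."""
    area = bbox_area(human_box)
    if not moving or area == 0:
        return 0.0
    xs = [x for x, _ in moving]
    ys = [y for _, y in moving]
    move_box = (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
    return bbox_area(move_box) / area


def trl_counts(moving: List[Pixel], human_box: Box) -> Dict[str, int]:
    """Count moving pixels in the Top, Right, and Left body regions."""
    x0, y0, x1, y1 = human_box
    x_mid, y_mid = (x0 + x1) / 2, (y0 + y1) / 2
    counts = {"top": 0, "right": 0, "left": 0}
    for x, y in moving:
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue  # ignore pixels outside the human box
        if y < y_mid:
            counts["top"] += 1      # sub-boxes 1 and 2
        if x < x_mid:
            counts["right"] += 1    # sub-boxes 1 and 3 (image left)
        else:
            counts["left"] += 1     # sub-boxes 2 and 4 (image right)
    return counts
```

Per-region density, scale, and speed features could then be obtained by applying the same statistics to the pixels of each region.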

Environmental context: The environmental context is also an essential aspect of the monitoring system. We employ a pre-trained YOLOv5 model (Jocher, 2020) capable of detecting common objects, including humans. The model exhibits a mean Average Precision (mAP) of 56.8% on the COCO val2017 dataset (Lin et al., 2014). Real-time local weather information is accessed through an online Weather API (Meteostat, 2021). The room illumination is calculated from the image by referencing the Rec. 601 luma standard (ITU-R, 2008). Time of the day is divided into three 8-hour segments, enabling us to capture temporal variations in daily routines. These environmental context features enrich our understanding of the monitored person.
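As an illustration of the simpler context features, the Python sketch below computes the Rec. 601 luma of an R'G'B' pixel using the standard weights (0.299, 0.587, 0.114), maps an hour of the day onto the three 8-hour segments of Table 2, and applies the weather suitability rule from the Table 2 footnotes. Averaging the luma over all pixels to obtain a single room-lighting value, as well as the function names and segment labels, are assumptions made for this sketch.

```python
from typing import Iterable, Tuple

# Weather types listed as 'suitable' in the Table 2 footnotes.
SUITABLE_WEATHER = {"Clear", "Fair", "Cloudy", "Overcast"}


def rec601_luma(r: float, g: float, b: float) -> float:
    """Rec. 601 luma Y' of one gamma-encoded R'G'B' pixel."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def mean_scene_luma(pixels: Iterable[Tuple[float, float, float]]) -> float:
    """Average luma over all pixels (assumed definition of room lighting)."""
    values = [rec601_luma(r, g, b) for r, g, b in pixels]
    if not values:
        raise ValueError("no pixels given")
    return sum(values) / len(values)


def time_of_day_segment(hour: int) -> str:
    """Map an hour (0-23) onto one of the three 8-hour segments."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if 6 <= hour < 14:
        return "6am-2pm"
    if 14 <= hour < 22:
        return "2pm-10pm"
    return "10pm-6am"


def weather_suitable(condition: str) -> bool:
    """True if the weather type counts as suitable for outdoor activity."""
    return condition in SUITABLE_WEATHER
```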

[Figure 2 panels, left to right: Inactivity ratio (Act.); Left body speed (Act.); Inactivity ratio (Hea.); Left body speed (Hea.)]

Figure 2. Example of two features that differ among activities and between health states for a participant in all activities. The left two plots show how the feature values vary according to the activity. The right two plots show how the feature values vary (slightly) between the Healthy and Weakness states.

3.3. Health State Classification

Our first question is whether patterns associated with weakness can be distinguished from typical behavior using the measured features. We treat this as a classification problem whose goal is to discriminate between normal health and weakness. Post-workout weakness varies substantially in duration and can appear and dissipate within 72 hours (Footnote 3: The duration of post-workout weakness can depend on various factors, including an individual’s fitness level, the intensity of the workout, the type of exercise performed, and overall health. Generally, post-workout weakness may be experienced immediately after an intense exercise session and can persist for several hours to a couple of days (Mizumura and Taguchi, 2016).). We therefore merge the categories “same-day weakness” and “next-day weakness” into a single “weakness” label, distinct from the “normal” label (i.e., pre-workout).

For each monitoring record, we denote the health states as $\mathcal{H}=\{H_{1},H_{2}\}$, the activities as $\mathcal{A}=\{A_{1},A_{2},\ldots,A_{5}\}$, and the extracted features (i.e., behavioral and environmental) as $\mathcal{F}=\{F_{1},F_{2},\ldots,F_{66}\}$. Then the record data can be denoted as $\mathcal{D}=\{(H,A,F)\mid H\in\mathcal{H},A\in\mathcal{A},F\in\mathcal{F}\}$. The goal is to identify the combination of these elements that maximizes the performance of the classifiers for health state classification. The optimal set of features and activities can be succinctly expressed as:

$$(S_{A}^{*},S_{F}^{*})=\operatorname*{argmax}_{d\in\mathcal{D}(S_{A}^{*},S_{F}^{*})}\mathrm{HSA}(H_{d}\mid A_{d},F_{d}) \tag{1}$$

where HSA() estimates the health state classification accuracy, $S_{A}^{*}$ is a subset of $\mathcal{A}$, $S_{F}^{*}$ is a subset of $\mathcal{F}$, and $\mathcal{D}(S_{A}^{*},S_{F}^{*})$ is a subset of $\mathcal{D}$. We require the selected data size, $n=|\mathcal{D}(S_{A}^{*},S_{F}^{*})|$, to satisfy $n>0.4\,n_{\mathcal{D}}$, where $n_{\mathcal{D}}=|\mathcal{D}|$. This ensures that the selected samples are adequately representative despite the large imbalance in activity types within the monitoring data.

To address this classification task, we examine all possible combinations of activities performed by the participants. This amounts to a total of $2^{m}$ combinations, where $m$ is the number of activity types monitored for a participant. Although it would be ideal to find a single, generalized set of activities and features that allow reliable health state classification for all current participants (and future users as well), limitations in the current data require an alternative approach. We instead investigate the possibility of estimating individual health states through personalized sets of features and activities. This also makes practical sense because not all people would do the same activities, and not all people would respond in the same way to a given activity. Healthcare applications tailored to each individual are a growing trend (Kosorok and Laber, 2019). For each combination of activities, we gradually introduce features for training and testing the classifiers, following a forward selection approach. The results of classifying normal and weakness health states with three different classifiers are presented in Fig. 4.
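The search described above can be sketched as follows in Python. The first function enumerates the non-empty activity combinations and keeps those that satisfy the data size constraint $n>0.4\,n_{\mathcal{D}}$; the second is a generic greedy forward selection. The exact selection procedure is not specified in the text, so the greedy variant, the stopping rule, and the `evaluate` callback (standing in for training and testing a classifier) are assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple


def activity_subsets(counts: Dict[str, int],
                     min_fraction: float = 0.4) -> List[Tuple[str, ...]]:
    """Non-empty activity subsets whose record count exceeds
    `min_fraction` of all records."""
    total = sum(counts.values())
    activities = sorted(counts)
    kept = []
    for size in range(1, len(activities) + 1):
        for combo in combinations(activities, size):
            n = sum(counts[a] for a in combo)
            if n > min_fraction * total:
                kept.append(combo)
    return kept


def forward_selection(features: Sequence[str],
                      evaluate: Callable[[Tuple[str, ...]], float]
                      ) -> Tuple[List[str], float]:
    """Greedy forward selection: repeatedly add the feature that gives
    the highest score, and stop when no candidate improves the score."""
    selected: List[str] = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining:
        score, feature = max(
            (evaluate(tuple(selected + [f])), f) for f in remaining
        )
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score
```

In practice `evaluate` would wrap a cross-validated classifier restricted to the records of one activity subset, so the two functions would be nested: one forward selection run per retained activity combination.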

Table 3. Feature ranking and Activity ranking methods. The top block (SVM, BN, RF) lists classifiers used for feature ranking evaluations, the middle block (FDR, MI, CFS) lists information-based feature ranking methods, and the bottom block (BC, NWA, Cb) gives the rank aggregation methods used.

| Method | Description | Score Calculator |
|---|---|---|
| SVM (Cortes and Vapnik, 1995) | Support Vector Machine: find the optimal decision boundary that maximizes the margin between the classes. | $s_{i}=\text{accuracy of classifying health states } H \text{ by feature } F_{i}$ |
| BN (Friedman et al., 1997) | Bayesian Network: model the probabilistic dependencies between the features and the target class. | as for SVM |
| RF (Breiman, 2001) | Random Forest: build an ensemble of decision trees to improve generalization and reduce overfitting. | as for SVM |
| FDR (Fisher, 1936) | Fisher Discriminant Ratio: emphasize the discriminative power of each feature in distinguishing between the classes. | $s_{i}=(\mu^{(H_{1})}(F_{i})-\mu^{(H_{2})}(F_{i}))^{2}/(\sigma^{(H_{1})}(F_{i})+\sigma^{(H_{2})}(F_{i}))$ |
| MI (Cover, 1999) | Mutual Information: measure the amount of information that a feature provides for class prediction. | $s_{i}=P(F_{i},H_{1})\log\frac{P(F_{i},H_{1})}{P(F_{i})P(H_{1})}+P(F_{i},H_{2})\log\frac{P(F_{i},H_{2})}{P(F_{i})P(H_{2})}$ |
| CFS (Hall, 1999) | Correlation-based Feature Selection: consider the relevance and redundancy of features in relation to the target class and each other. | $s_{i}=\text{abs}(\text{corr}(F_{i},H))/\text{mean}(\text{corr}(F_{i},F_{j}))$ |
| BC (Emerson, 2013) | Borda Count: aggregates the scores by assigning points to each feature based on its rank from each method and summing up the points to obtain the final ranking. | $R_{agg}=\sum_{k=1}^{K}(n-r_{k}+1)$, where $n$ is the number of features and $r_{k}$ is the rank of the feature according to method $k$. |
| NWA (Cochran, 1977) | Normalization and Weighted Average: the scores from each method are normalized to a common scale and then combined using weighted averaging. | $R_{agg}=\sum_{k=1}^{K}w_{k}\cdot\text{normalize}(s_{k})/\sum_{k=1}^{K}w_{k}$, where $w_{k}$ is the weight assigned to method $k$ and $\text{normalize}(s_{k})$ is the normalized score of method $k$. |
| Cb (Woehr et al., 2015) | Consensus-based: find a consensus among the rankings from different methods. | $R_{agg}=\text{mode}(r_{1}^{1:m},r_{2}^{1:m},\ldots,r_{k}^{1:m})$, where $\text{mode}()$ returns the most frequent features that appear among the top rankings (with top-$m$ features for activity ranking). |

3.4. Feature and Activity Ranking

After extracting features from participants’ daily common activities considering both their behavioral and environmental context, our second objective is to identify the most relevant features and activities that are effective indicators of the health states. To accomplish this, we rank activities and features based on their capacity to distinguish weakness from normal states. To ensure the stability and consistency of these rankings, multiple methods are typically employed (Mohdiwale et al., 2021).

In this context, we first explore the relationship between the $i$-th feature, denoted as $F_{i}\in\mathcal{F}$, and the health state variable $H$. To rank these features, we utilize a ranking method referred to as $M$, which provides a score $s_{i}$ defined as:

$$s_{i}=M(F_{i},H). \tag{2}$$

The scores can be obtained using various approaches. Once we have calculated the scores for all features, we sort them to derive the rank of all features:

$$\mathcal{R}=\mathrm{sort}(\mathcal{S}),\quad\text{where }\mathcal{S}=\{s_{1},s_{2},\ldots,s_{|\mathcal{F}|}\}. \tag{3}$$

However, different ranking methods often take into account specific properties of the features, resulting in inconsistent results. Here, we utilize a diverse set of techniques for feature ranking, including three classifiers, Bayesian Network (BN), Random Forest (RF), and Support Vector Machine (SVM), and three information-based methods, Fisher Discriminant Ratio (FDR), Mutual Information (MI), and Correlation-based Feature Selection (CFS). Each of these basic ranking methods considers distinct aspects of the features, as outlined in Table 3.
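Two of the information-based scores can be sketched directly from their definitions in Table 3. The Python example below computes the FDR score from the per-class samples of one feature and the mutual information between a discretized feature and the health label, then sorts features by score. Using the population standard deviation, discretizing the feature for MI, and the function names are assumptions made for this sketch.

```python
import math
import statistics
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def fisher_ratio(x_normal: Sequence[float], x_weak: Sequence[float]) -> float:
    """(mu_1 - mu_2)^2 / (sigma_1 + sigma_2), following Table 3."""
    num = (statistics.fmean(x_normal) - statistics.fmean(x_weak)) ** 2
    den = statistics.pstdev(x_normal) + statistics.pstdev(x_weak)
    if den > 0:
        return num / den
    return float("inf") if num > 0 else 0.0


def mutual_information(values: Sequence[Hashable],
                       labels: Sequence[Hashable]) -> float:
    """Mutual information (in nats) between a discrete feature and the label."""
    n = len(values)
    joint = Counter(zip(values, labels))
    p_value = Counter(values)
    p_label = Counter(labels)
    mi = 0.0
    for (v, h), count in joint.items():
        p_vh = count / n
        mi += p_vh * math.log(p_vh / ((p_value[v] / n) * (p_label[h] / n)))
    return mi


def rank_features(scores: Dict[str, float]) -> List[str]:
    """Feature names sorted from highest to lowest score, as in Eq. (3)."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)
```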

Considering scores derived from the $k$th basic method as $\mathcal{S}_{k}$ and the rank derived from this method as $\mathcal{R}_{k}$, we use aggregation techniques to integrate the ranking results from all the above basic methods. The three aggregate approaches include Borda Count (BC), Normalization and Weighted Average (NWA), and Consensus-based (Cb). Each of these aggregation approaches combines the score derived from the individual methods.

A final score $S_{\mathrm{agg}}$ that represents the overall importance of the features for the health states is derived as an unweighted sum of normalized scores:

$$S_{\mathrm{agg}}=\mathrm{normalize}(\mathrm{BC}(\mathcal{R}_{1},\mathcal{R}_{2},\ldots,\mathcal{R}_{k}))+\mathrm{normalize}(\mathrm{NWA}(\mathcal{S}_{1},\mathcal{S}_{2},\ldots,\mathcal{S}_{k}))+\mathrm{normalize}(\mathrm{Cb}(\mathcal{R}_{1},\mathcal{R}_{2},\ldots,\mathcal{R}_{k})). \tag{4}$$

The same pipeline is employed for activity ranking. The importance scores for each activity are determined by examining the top features within each activity that demonstrate the ability to distinguish between the health states. By leveraging a diverse range of techniques, we aim to provide a comprehensive and accurate ranking of the most significant factors contributing to the identification of weakness health states. The top-ranked activities and features for each participant are shown in Fig. 7 and Fig. 8 respectively. The figures show that detection of weakness in every person is possible, but is best characterized by a different set of features and activities for each participant.
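The aggregation in Eq. (4) can be illustrated with a small Python sketch. It assumes that every basic method supplies one score per feature (higher is better), that min-max scaling is used for normalization, that the NWA weights default to equal values, and that the consensus score of a feature is the number of methods that place it in their top-$m$ list. These choices and all function names are assumptions; the text does not fix them.

```python
from typing import Dict, List, Optional

Scores = Dict[str, float]


def to_ranks(scores: Scores) -> Dict[str, int]:
    """1-based rank of each feature (1 = highest score)."""
    ordered = sorted(scores, key=lambda f: scores[f], reverse=True)
    return {f: i + 1 for i, f in enumerate(ordered)}


def normalize(scores: Scores) -> Scores:
    """Min-max scaling to [0, 1]; constant inputs map to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {f: 0.0 for f in scores}
    return {f: (s - lo) / (hi - lo) for f, s in scores.items()}


def borda(rankings: List[Dict[str, int]]) -> Scores:
    """Borda count: sum over methods of (n - r_k + 1)."""
    n = len(rankings[0])
    return {f: float(sum(n - r[f] + 1 for r in rankings)) for f in rankings[0]}


def weighted_average(score_sets: List[Scores],
                     weights: Optional[List[float]] = None) -> Scores:
    """Weighted average of per-method normalized scores."""
    weights = weights or [1.0] * len(score_sets)
    normed = [normalize(s) for s in score_sets]
    total = sum(weights)
    return {f: sum(w * s[f] for w, s in zip(weights, normed)) / total
            for f in score_sets[0]}


def consensus(rankings: List[Dict[str, int]], top_m: int) -> Scores:
    """Number of methods that rank the feature within their top-m."""
    return {f: float(sum(1 for r in rankings if r[f] <= top_m))
            for f in rankings[0]}


def aggregate(score_sets: List[Scores], top_m: int = 2) -> Scores:
    """Unweighted sum of the three normalized aggregate scores."""
    rankings = [to_ranks(s) for s in score_sets]
    parts = [normalize(borda(rankings)),
             normalize(weighted_average(score_sets)),
             normalize(consensus(rankings, top_m))]
    return {f: sum(p[f] for p in parts) for f in score_sets[0]}
```

With these definitions the final score of each feature lies between 0 and 3, and sorting by it gives the aggregated ranking.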

3.5. Optimal Time Scale

Normal activities vary in duration, ranging from minutes to hours. Therefore, it is crucial to determine the optimal time scale for effectively monitoring statistics related to health states during these activities. By systematically analyzing different time windows, the time scales that yield the most informative and discriminative features can be estimated, thereby enhancing the precision of the health state classification and behavioral change detection.

Different temporal windows for feature extraction, and different time spans for aggregating the results from temporal windows, are considered, with durations drawn from {30 s, 60 s, 120 s, 300 s, 600 s, 1200 s}. Short windows (i.e., < 30 s) increase the computational burden and make it hard to capture longer-term characteristics, while overly long windows (> 20 minutes) may mix different activities together. For the longer time scales, infrequent activities may have too few samples to train the classifiers effectively.

Subsequently, we aggregate the results obtained from temporal windows over longer time spans, such as 8 hours (refer to the ‘time-of-day’ feature in Table 2) and a full day, to obtain a more robust result, assuming that the health state stays consistent during the time span. The effectiveness of temporal windows for classifying health states is illustrated in Fig. 5.
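A minimal Python sketch of this two-level scheme is given below. It splits a record into fixed-length windows and then combines window-level predictions within each longer span. The text does not state the aggregation rule, so the majority vote used here, the choice to drop a trailing partial window, and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def split_into_windows(duration_s: int, window_s: int) -> List[Tuple[int, int]]:
    """Consecutive full windows (start, end) in seconds; a trailing
    partial window is dropped."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return [(start, start + window_s)
            for start in range(0, duration_s - window_s + 1, window_s)]


def majority_vote(labels: List[str]) -> str:
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def aggregate_by_span(window_preds: Iterable[Tuple[Hashable, str]]
                      ) -> Dict[Hashable, str]:
    """Combine window-level predictions into one label per time span.

    Each item is (span_key, predicted_label), where span_key identifies
    the longer span, e.g. (day, time-of-day segment) or just the day.
    """
    groups: Dict[Hashable, List[str]] = defaultdict(list)
    for span, label in window_preds:
        groups[span].append(label)
    return {span: majority_vote(labels) for span, labels in groups.items()}
```

For example, a 700-second record with 300-second windows yields two windows, and their predictions would be pooled with all other windows from the same 8-hour segment or day.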

3.6. Modelling with Bayesian Network

A Bayesian Network is used to model the relationship between health states, environmental factors, behaviors, and activity types. This network captures the probabilistic dependencies among these variables, enabling inferences and predictions about the most likely health states based on observations. Furthermore, the learned model facilitates causal inference, allowing identification of the health-related features that are most likely to influence health outcomes.

Figure 3. Structure of the Bayesian Network. Continuous variables are circles and discrete variables are rectangles. Shaded nodes represent observable features. The unobservable ‘Goal’ node is omitted. The network can be extended to a dynamic model by adding temporal links.

The structure of the Bayesian Network is depicted in Fig. 3. This model incorporates health states and behavioral features into the classic Goal-Environment-Activity model. The dependencies of the observed variables are as follows:

‘Activity’ is influenced by both the individual’s ‘health state’ and the ‘environment’. For example, when an individual feels weak, they may tend to choose less intense activity types. Environmental factors, such as the presence of specific objects or the time of day, can also influence the selection of activities. For instance, the presence of specific objects may be a good clue that an individual is eating or working on the PC, and certain activities may be preferred at specific times of the day. In good weather, individuals may be more likely to engage in outdoor activities, while they may opt for indoor activities during rainy or cold weather.

The individual’s ‘behavior’ is dependent on both the ‘activity type’ and their ‘health state’. For example, when feeling weak, the individual’s movements may be slower, and they may take more rest time. On the other hand, when performing activities such as napping or working on a PC, the frequency and pattern of their movements may differ significantly. Fig. 2 shows an example of how behavioral features differ among different activities.

The ‘goal’ (i.e., user intention) is considered unobservable and is omitted from the network. While it may play a role in influencing activity choices and behavior, it is not directly observed or measured in this context. The ‘health state’ and the ‘environment’ are considered independent of each other in the model. This means that changes in the environment do not directly influence the individual’s health state, and vice versa.

Based on the dependencies described above, the Bayesian Network equation for the health state ($H$), environment ($E$), activity type ($A$), and features ($F$) can be written as follows:

(5) $P(H,E,A,F)=P(H)\cdot P(E)\cdot P(A\mid H,E)\cdot P(F\mid A,H),$

where $P(H)$ represents the probability distribution of the health state, $P(E)$ represents the probability distribution of the environment, and $P(A\mid H,E)$ represents the conditional probability of the activity type given the health state and the environment. $P(F\mid A,H)$ represents the conditional probability of the features given the activity type and the health state, as:

(6) $P(F\mid A,H)=P(mf\mid A,H)\cdot P(if\mid A,H),$

where $mf$ and $if$ are motion features and inactivity features, respectively.

To infer the health state (H) given the observed variables, we can use Bayes’ theorem. Bayes’ theorem allows us to update our beliefs about the health state based on the observed evidence. The equation for performing inference in this case is as follows:

(7) $P(H\mid A,E,F)={P(H,E,A,F)}/{P(A,E,F)}={P(H)\cdot P(E)\cdot P(A\mid H,E)\cdot P(F\mid A,H)}/{P(A,E,F)},$

where $P(H\mid A,E,F)$ represents the posterior probability of the health state given the observed activity type, environment, and behavioral features, and $P(A,E,F)$ is the evidence, which is calculated as the sum of the joint probabilities over all possible health states: $P(A,E,F)=\sum_{H}P(H)\cdot P(E)\cdot P(A\mid H,E)\cdot P(F\mid A,H)$.
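As an illustration of the factorization and the posterior over the health state, the following is a minimal sketch with fully discrete variables. All state names and probability values are hypothetical placeholders chosen for the example (they are not taken from the paper), and the features are reduced to a single discretized movement-speed variable.

```python
# Hypothetical toy conditional probability tables (CPTs); every number here
# is illustrative only and does not come from the paper.
P_H = {"normal": 0.7, "weak": 0.3}
P_E = {"morning": 0.5, "afternoon": 0.5}

# P(A | H, E), keyed by (health, environment).
P_A = {
    ("normal", "morning"): {"read": 0.6, "nap": 0.4},
    ("normal", "afternoon"): {"read": 0.5, "nap": 0.5},
    ("weak", "morning"): {"read": 0.3, "nap": 0.7},
    ("weak", "afternoon"): {"read": 0.2, "nap": 0.8},
}

# P(F | A, H) for one discretized feature, keyed by (activity, health).
P_F = {
    ("read", "normal"): {"slow": 0.3, "fast": 0.7},
    ("read", "weak"): {"slow": 0.6, "fast": 0.4},
    ("nap", "normal"): {"slow": 0.8, "fast": 0.2},
    ("nap", "weak"): {"slow": 0.9, "fast": 0.1},
}


def joint(h, e, a, f):
    """Joint probability P(H, E, A, F) following the factorization in Eq. (5)."""
    return P_H[h] * P_E[e] * P_A[(h, e)][a] * P_F[(a, h)][f]


def posterior_health(a, e, f):
    """Posterior P(H | A, E, F) as in Eq. (7): joint divided by the evidence."""
    unnormalized = {h: joint(h, e, a, f) for h in P_H}
    evidence = sum(unnormalized.values())  # P(A, E, F), summed over H
    return {h: p / evidence for h, p in unnormalized.items()}


post = posterior_health(a="read", e="afternoon", f="slow")
for state, prob in post.items():
    print(f"P(H={state} | A=read, E=afternoon, F=slow) = {prob:.3f}")
```

Because $E$ is observed and independent of $H$, the factor $P(E)$ appears in both numerator and evidence and cancels; it is kept in the sketch only to mirror the equations.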

If the activity type is unknown on some occasions, it also needs to be inferred:

(8) $P(A,H\mid E,F)={P(H)\cdot P(E)\cdot P(A\mid H,E)\cdot P(F\mid A,H)}/{P(E,F)},$

where $P(E,F)=\sum_{A,H}P(H)\cdot P(E)\cdot P(A\mid H,E)\cdot P(F\mid A,H)$.
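The joint inference of activity and health state can be sketched as follows for continuous features. This is only an illustration under stated assumptions: the feature likelihood is taken to be a product of two univariate Gaussians (one motion feature and one inactivity feature, following the product form of Eq. (6)), the environment is fixed to one observed value so that $P(E)$ cancels in the normalization, and all parameter values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical parameters for one fixed, observed environment value.
P_H = {"normal": 0.5, "weak": 0.5}
P_A_GIVEN_H = {  # P(A | H, E = observed value)
    "normal": {"watch": 0.5, "nap": 0.5},
    "weak": {"watch": 0.3, "nap": 0.7},
}
# (mean, std) of the motion feature mf and inactivity feature if_, per (A, H).
MF_PARAMS = {
    ("watch", "normal"): (1.0, 0.3), ("watch", "weak"): (0.7, 0.3),
    ("nap", "normal"): (0.3, 0.2), ("nap", "weak"): (0.2, 0.2),
}
IF_PARAMS = {
    ("watch", "normal"): (10.0, 5.0), ("watch", "weak"): (20.0, 5.0),
    ("nap", "normal"): (30.0, 10.0), ("nap", "weak"): (40.0, 10.0),
}


def posterior_activity_health(mf, if_):
    """P(A, H | E, F) as in Eq. (8), normalized over all (A, H) pairs."""
    unnormalized = {}
    for h, p_h in P_H.items():
        for a, p_a in P_A_GIVEN_H[h].items():
            likelihood = (gaussian_pdf(mf, *MF_PARAMS[(a, h)])
                          * gaussian_pdf(if_, *IF_PARAMS[(a, h)]))
            unnormalized[(a, h)] = p_h * p_a * likelihood
    evidence = sum(unnormalized.values())  # proportional to P(E, F)
    return {pair: p / evidence for pair, p in unnormalized.items()}


post_ah = posterior_activity_health(mf=0.7, if_=20.0)
# Marginalizing the activity gives the posterior over the health state alone.
p_weak = sum(p for (a, h), p in post_ah.items() if h == "weak")
print(max(post_ah, key=post_ah.get), round(p_weak, 3))
```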

4. Results

4.1. Experiment Methodology

Health state classification. For health state classification, the training data is drawn from the monitoring records of each participant (details are shown in Table 1). To achieve optimal performance, activities and features are iteratively added to the classifiers using a forward selection approach. At each stage of the iterative process, all classifiers are trained and tested using 5-fold cross-validation with shuffled samples, where 4 folds are used for training and 1 for validation. For the Bayesian network, the structure is predefined as shown in Fig. 3. The size of each node is based on the sizes of the selected features, as shown in Table 2, and the initial probabilities of all nodes are set randomly. The parameters of each adjustable node are then set to their ML/MAP values using batch EM (Murphy, 2002). For the SVM, a two-class SVM is used to classify normal and weakness health states. For the RF, 100 trees are used. For the CNN-GRU, one convolution layer (kernel size 3) and one GRU layer (128 hidden units) are used; the initial learning rate is set to 0.001, and the model is trained for 100 epochs. For the LSTM, one layer with 120 hidden units is used, with the same hyperparameters as for the CNN-GRU. The models with the best validation loss are selected.
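The shuffled 5-fold split described above can be sketched in a few lines of standard-library Python. The function name `kfold_indices` and the fixed seed are assumptions made for this illustration; the paper does not specify how the folds were generated.

```python
import random


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]     # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


# Each sample is used for validation exactly once and for training k-1 times.
for fold_id, (train, validation) in enumerate(kfold_indices(23, k=5)):
    print(fold_id, len(train), len(validation))
```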

Due to the inherent imbalance in activity classes, the training process prioritized F1-macro as the primary evaluation metric, aiming for robust performance across all activity classes. In each training loop, the feature or activity yielding the highest F1-macro score was added to the classifiers. This process continued until a predetermined number of features (30 in this case) were included and all activity combinations were represented. Finally, the best-performing model was selected, typically after fewer than 19 features had been added. To minimize bias in the results caused by a lack of data from infrequently performed activities, we only selected combinations of activities that cover over 40% of the training data (Equation 1) for optimal classification performance. This approach ensures that the classification model is trained on a representative sample of activities, preventing it from being overly influenced by data from a single activity (top-1).
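The greedy loop described above can be sketched as follows. This is a generic illustration, not the authors' code: `score_fn` is a hypothetical callback that would train a classifier on a candidate subset and return its cross-validated F1-macro, and the toy scoring function at the end only stands in for that step.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def forward_select(candidates, score_fn, max_items=30):
    """Greedily add the candidate that maximizes score_fn; return the best subset seen."""
    selected, history = [], []
    remaining = list(candidates)
    while remaining and len(selected) < max_items:
        best = max(remaining, key=lambda c: score_fn(selected + [c]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), score_fn(selected)))
    return max(history, key=lambda item: item[1])


# Toy stand-in for "train a classifier and return its F1-macro".
toy_gain = {"speed": 0.3, "scale": 0.2, "noise": -0.1}
best_subset, best_score = forward_select(
    toy_gain, lambda subset: sum(toy_gain[name] for name in subset))
print(best_subset, best_score)
print(f1_macro([0, 0, 1, 1], [0, 1, 1, 1]))
```

The loop keeps adding candidates up to the limit and only afterwards picks the best-scoring subset, which mirrors the description that the best model was typically found well before the 30-feature limit.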

Optimal time scale. To determine the optimal timescale, each monitoring record is segmented into temporal windows of fixed length (ranging from 30 seconds to 1200 seconds), and the features within each temporal window are used for classification. The classification labels obtained from each window are aggregated to a record-level label using majority voting. The record-level labels are further aggregated to 8-hour and daily levels based on their timestamps, also using majority voting.
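The windowing and voting pipeline can be sketched as below. Several details are assumptions made for the illustration because the text does not state them: windows are non-overlapping, an incomplete trailing window is dropped, ties in the vote go to the label seen first, and the 8-hour blocks are aligned to midnight. `toy_classifier` is a hypothetical stand-in for the trained model.

```python
from collections import Counter, defaultdict
from datetime import datetime


def split_windows(samples, window_len):
    """Cut a record into non-overlapping windows, dropping a short tail."""
    last_start = len(samples) - window_len
    return [samples[i:i + window_len] for i in range(0, last_start + 1, window_len)]


def majority_vote(labels):
    """Most frequent label; ties go to the label that was seen first."""
    return Counter(labels).most_common(1)[0][0]


def classify_record(samples, window_len, classify_window):
    """Label each window, then vote to obtain one record-level label."""
    windows = split_windows(samples, window_len) or [samples]  # short record: one window
    return majority_vote([classify_window(w) for w in windows])


def aggregate(record_labels, key_fn):
    """Group (timestamp, label) pairs by key_fn(timestamp) and vote per group."""
    groups = defaultdict(list)
    for timestamp, label in record_labels:
        groups[key_fn(timestamp)].append(label)
    return {key: majority_vote(labels) for key, labels in groups.items()}


def day_key(ts):
    return ts.date()


def eight_hour_key(ts):
    return (ts.date(), ts.hour // 8)


def toy_classifier(window):
    return "weak" if sum(window) / len(window) < 0.5 else "normal"


record = [0.2] * 600 + [0.9] * 300            # 900 one-second samples
record_label = classify_record(record, 300, toy_classifier)

record_labels = [
    (datetime(2024, 1, 1, 9), "normal"),
    (datetime(2024, 1, 1, 14), "weak"),
    (datetime(2024, 1, 1, 15), "weak"),
    (datetime(2024, 1, 2, 10), "normal"),
]
daily = aggregate(record_labels, day_key)
eight_hourly = aggregate(record_labels, eight_hour_key)
print(record_label, daily, eight_hourly)
```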

Feature and activity ranking. For ranking features and activities, the same set of basic classifiers (BN, SVM, RF) is used. Both forward and backward selection are applied to each activity to assess the performance of the classifiers. For each information-based ranking method (FDR, MI, CFS), the top 5 and 10 features are selected for consideration. The ranking scores generated by the different basic methods are illustrated in Fig. 10. Different basic ranking methods yield slightly different scores, indicating that a single method lacks robustness for activity ranking. Consequently, aggregating different methods can lead to a more reliable ranking.

Subsequently, aggregation methods (BC, NWA, Cb) are used to derive aggregated scores from the aforementioned six methods. The scores from the different aggregation methods are then normalized to a range between zero and one and summed to obtain the final score (Equation 4).
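The aggregation step can be illustrated with a small sketch. Only the Borda count and the final "normalize to [0, 1] and sum" step are shown; the averaging function is a simplified stand-in for a normalization-based aggregation and is not claimed to match the paper's NWA or Cb definitions, and Equation 4 itself is not reproduced here. All scores are hypothetical.

```python
def borda_count(score_tables):
    """Sum rank points per item: 0 for the lowest score in a table, n-1 for the highest."""
    totals = {}
    for table in score_tables:
        ranked = sorted(table, key=table.get)  # ascending, worst item first
        for points, item in enumerate(ranked):
            totals[item] = totals.get(item, 0) + points
    return totals


def min_max(scores):
    """Rescale scores to the range [0, 1]; constant inputs map to 0."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {item: 0.0 for item in scores}
    return {item: (value - low) / (high - low) for item, value in scores.items()}


def mean_of_normalized(score_tables):
    """Simplified stand-in for a normalization-based aggregation (equal weights)."""
    normalized = [min_max(table) for table in score_tables]
    return {item: sum(t[item] for t in normalized) / len(normalized)
            for item in normalized[0]}


def final_score(aggregated_tables):
    """Normalize each aggregated score table to [0, 1] and sum them per item."""
    normalized = [min_max(table) for table in aggregated_tables]
    return {item: sum(t[item] for t in normalized) for item in normalized[0]}


# Hypothetical scores from two basic ranking methods for three activities.
basic_scores = [
    {"nap": 0.9, "watch": 0.8, "pc": 0.1},
    {"nap": 0.7, "watch": 0.9, "pc": 0.2},
]
borda = borda_count(basic_scores)
final = final_score([borda, mean_of_normalized(basic_scores)])
print(sorted(final, key=final.get, reverse=True))
```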

4.2. Health State Classification Results

Fig. 4 presents the results for classifying normal and weakness health states at the record-level, with each sample representing a complete monitoring record corresponding to an entire activity duration. When training our models with all activities at the record-level, we achieved average accuracy (F1-micro) of 0.71, 0.84, and 0.82 for BN, RF, and SVM, respectively, distinguishing between normal and weakness among all participants. By exploring different temporal windows for feature extraction, we observed an improvement in the average accuracy of the three classifiers to 0.84 when considering all activities across all participants. Upon selecting representative activities by assessing all possible activity combinations for each participant, we identified certain combinations of activities (selected combinations must have over 40% data coverage; details are given in Section 4.1) that yielded an average accuracy of 0.89 for the three classifiers among all participants. Furthermore, by carefully choosing both the optimal activity combinations and timescales for feature extraction, we achieved an enhanced accuracy of 0.94 for the three classifiers.

Figure 4. F1-scores for inferring health states on all monitoring records with three classifiers (Bayesian Network (BN), Random Forest (RF), and Support Vector Machine (SVM)). allAct: classify with all activities; actUnk: classify with all activities with activity labels unknown; + TWin: classify with all activities with optimal temporal windows for feature extraction; + actSel: classify with optimally selected combination of activities (>40% data coverage; details are given in Section 4.1); + actSel + TWin: classify with optimally selected combination of activities and optimal temporal windows for feature extraction.

4.3. Optimal Time Scale Results

Fig. 5 provides a comprehensive overview of the Bayesian Network’s performance at different time scales, both for training and testing across all activities. Employing different temporal windows from 30s to 1200s affects the performance of the model for health state inference. Notably, in our experiments, the 300s temporal window exhibited the best performance, achieving the highest accuracy across all five participants (P1–P5) when distinguishing between normal and weakness health states across all activity types. For three participants (P1, P3, and P5), the 600s window performed equally well.

[Figure 5 panels: Record-level, 8-hour level, Daily-level]

Figure 5. F1-scores of classifying health states using the Bayesian Network across different temporal windows (30 to 1200 seconds) and aggregated for different time spans (record-level, 8-hour level, and daily-level).
Figure 6. Comparison of the optimal performance of three models (Bayesian Network, CNN-GRU, and LSTM) for health state classification on our dataset, at record-level, 8-hour level, and daily-level, following activity selection and temporal window selection.

Moreover, further refinement of the model by selecting the best activity combinations alongside the optimal temporal window of 300s significantly improved the accuracy. It reached 0.95 ($\sigma = 0.07$) at the 8-hour level and 0.97 ($\sigma = 0.04$) at the daily-level, averaged over all participants. For comparison, we employed two deep neural networks, CNN-GRU and LSTM, to implicitly learn features from the data without feature engineering. Our results indicate that the BN with hand-crafted features outperformed the deep models on our dataset (see Fig. 6) when considering both optimal time scale and activity combinations. This highlights the robust performance of the BN model and its ability to interpret complex relationships between variables, making it a good choice for modeling features, activities, and health conditions in this context.

4.4. Feature and Activity Ranking Results

The results of the activity ranking can be found in Fig. 7. The activity ‘nap’ appeared three times in the top-2 rankings (i.e., highest ranking scores among activities), and ‘watch’ also appeared three times. In contrast, ‘PC’ was ranked as the least important activity three times. Two participants (P1 and P3) participated in all five types of activities, while the others were not available for some of the activities.

[Figure 7 panels: P1, P2, P3, P4, P5]

Figure 7. Activity ranking results for each participant, after aggregation with Borda Count (BC), Normalization and Weighted Average (NWA), and Consensus-based (Cb). The horizontal axis displays the final ranking scores by Equation 4. The vertical axis lists each activity, arranged from top to bottom in ascending order of their ranking scores, where larger scores are better. The results indicate that ‘watch’ and ‘nap’ ranked among the top two most important activities across participants. Conversely, ‘PC’ was consistently ranked as the least important activity.
[Figure 8 panels: P1, P2, P3, P4, P5]

Figure 8. Feature ranking results for each participant. Aggregated with Borda Count (BC), Normalization and Weighted Average (NWA), and Consensus-based (Cb). The horizontal axis displays the final ranking scores by Equation 4, where larger scores are better. The vertical axis lists top-20 features for each participant, arranged from top to bottom in increasing order of their final ranking scores. Fine-grained motion features (scale, density, and speed) and inactivity distributions emerged as significant indicators of health states, with notable variations in the top features across participants.

The top-20 ranked features for each participant are displayed in Fig. 8, while the overall feature ranking for all participants is presented in Table 4. The results highlight the significance of both movement and inactivity features. Among the top 10 features averaged for all participants, movement speed appeared three times, movement density also occurred three times, and movement scale appeared four times. Additionally, among the top 20 features, inactivity distribution was observed three times.

When comparing weakness states to normal states within each individual, specific changes were observed for each participant based on their own feature rankings. Table 4 shows the percentage change of the top-ranked features for each participant. The results so far suggest that there are no features that are generally useful across all participants. See Section 5 for more discussion.

For Participant 1, the feature rankings suggested a reduction in the movement scale on the non-dominant side of the body and a decrease in peak movement speed when comparing weakness to normal states. More specifically, the movement scale and speed in the left side of the body decreased by 13.1% and 14.9%, respectively, across all activities.

Participant 2’s feature rankings indicated a reduction in short to middle inactivity periods but an increase in long inactivity periods, with a decrease of 31.9% in the 0- to 2-second range and a remarkable increase of 196.1% in the range of 60 seconds or longer.

Participant 3’s feature rankings suggested paying attention to the movement speed during the movement period only, which decreased by 18.1% (and by 7.4% on the left side of the body). Similar to Participant 2, there were evident changes in inactivity distribution.

For Participant 4, the feature rankings suggested focusing on the movement speed during the movement period only, revealing increases for the second and third quartiles, and on the left side of the body. Movement distribution also changed, suggesting an increase in middle-duration movement and a reduction in long-duration movement, relative to an increase in long-duration inactivity.

Finally, for Participant 5, the feature rankings highlighted reductions in movement scale and density during the movement period only. Notably, the movement scale of the left side of the body decreased by 20.5%, and the speed of the upper body part decreased by 5.4%. Furthermore, the first to fourth quartiles of movement speed in the movement period only exhibited obvious decreases.
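For reference, a percentage change from normal to weakness of the kind reported above can be computed as in the sketch below. The assumption that the change is taken between the mean feature values of the two states is ours; the text does not spell out the exact formula, and the sample values are hypothetical.

```python
from statistics import fmean


def percent_change(normal_values, weak_values):
    """Relative change (%) of the mean feature value from normal to weakness."""
    mean_normal = fmean(normal_values)
    mean_weak = fmean(weak_values)
    return 100.0 * (mean_weak - mean_normal) / mean_normal


# Hypothetical movement-speed samples for one participant.
change = percent_change([2.0, 2.0], [1.0, 2.0])
print(f"{change:+.1f}%")
```

A negative value denotes a reduction under weakness and a positive value an increase, matching the sign convention used in Table 4.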

4.5. Anomaly Detection using a Bayes Net

The preceding sections investigated the distinction between normal and weak health states by framing it as a classification problem with simulated weakness data. We then demonstrated the model’s ability to differentiate between these states and identified promising indicators for detecting such changes. However, real-world health data often presents challenges: controlling health states is difficult, and imbalanced data is common, with fewer abnormal cases than normal ones. This makes traditional classification approaches less practical.

Anomaly detection, in which a model trained only on normal data is used to identify abnormal situations, offers a more suitable solution in these scenarios. We can leverage it to detect rare abnormal instances in imbalanced data. Therefore, a Bayesian network using only data from normal health states was constructed. This “normal model” aims to accurately capture the typical patterns within normal healthy days. Its ability to identify abnormal cases (weak days) based on their deviations from the expected normal patterns was then tested.

Specifically, we grouped monitoring records at the daily-level and trained a Bayes Net model for each participant solely on samples from normal days. The net can be seen in Fig. 3. The model construction can be represented as:

(9) $\mathcal{M}_{nor}=BN(\mathcal{D}_{nor})$

where $\mathcal{D}_{nor}=\{(\mathcal{H}=h_{nor},\mathcal{A},\mathcal{F})\}$. Here, $\mathcal{M}_{nor}$ denotes the normal model, and $\mathcal{D}_{nor}$ is the data with normal health states $h_{nor}$.

A leave-one-day-out strategy was used for training and testing, i.e., during each training iteration, samples from one normal day were held out for testing. For training, only normal days with multiple monitoring records were used, excluding those with only one sample, to obtain a more stable model with less variation. After training, the model was evaluated on data from weak days by averaging the output log-likelihoods for each day.
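The evaluation protocol can be sketched as follows. To keep the example self-contained, the Bayesian Network is replaced by a much simpler stand-in (independent Gaussians per feature), so this illustrates only the leave-one-day-out and per-day averaging logic, not the authors' model. Scoring weak days with a model fitted on all normal days is also an assumption, and the data are hypothetical.

```python
import math
from statistics import fmean, pstdev


def fit_gaussians(samples):
    """Fit an independent Gaussian (mean, std) to each feature column."""
    columns = list(zip(*samples))
    return [(fmean(col), max(pstdev(col), 1e-6)) for col in columns]


def log_likelihood(model, x):
    """Log-density of one feature vector under the independent Gaussians."""
    return sum(-0.5 * math.log(2.0 * math.pi * s * s) - (v - m) ** 2 / (2.0 * s * s)
               for (m, s), v in zip(model, x))


def day_score(model, day_samples):
    """Average log-likelihood over all records of one day."""
    return fmean([log_likelihood(model, x) for x in day_samples])


def leave_one_day_out(normal_days):
    """Score each normal day with a model trained on the other normal days."""
    scores = {}
    for held_out in normal_days:
        train = [x for day, xs in normal_days.items() if day != held_out for x in xs]
        scores[held_out] = day_score(fit_gaussians(train), normal_days[held_out])
    return scores


def score_weak_days(normal_days, weak_days):
    """Score weak days with a model trained on all normal days (an assumption)."""
    model = fit_gaussians([x for xs in normal_days.values() for x in xs])
    return {day: day_score(model, xs) for day, xs in weak_days.items()}


# Hypothetical records: [movement speed, inactivity duration] per record.
normal_days = {
    "d1": [[1.0, 10.0], [1.1, 11.0]],
    "d2": [[0.9, 9.0], [1.0, 10.5]],
    "d3": [[1.05, 10.2], [0.95, 9.8]],
}
weak_days = {"w1": [[0.5, 20.0], [0.6, 19.0]]}

normal_scores = leave_one_day_out(normal_days)
weak_scores = score_weak_days(normal_days, weak_days)
print(normal_scores, weak_scores)
```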

Fig. 9 shows the test sample output log-likelihoods for normal and weak days for two participants (those with sufficient data for individualized model building). For personalized models, we first select from the top five ranked features for each participant, as listed in the top part of Table 4. Then, to assess feature generalizability, we build normal models using the top five features ranked across all participants, as detailed in the bottom part of Table 4. As visualized, the normal models consistently assign higher log-likelihoods to the normal test days than to the weak days. Furthermore, Cohen’s d between the log-likelihood values of the two groups was calculated. The results of over 0.9 and 0.7 indicate large and moderate effect sizes, respectively, signifying noticeable and meaningful differences between the means of the normal and weak groups. Moreover, using personalized feature sets, when supported by sufficient personal data, can improve the detection of abnormal health states compared to generic feature sets.
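Cohen’s d can be computed as in the sketch below, which uses the common pooled-standard-deviation form. The text does not state which variant of the effect size was used, so this choice is an assumption, and the log-likelihood values are hypothetical.

```python
import math
from statistics import fmean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (fmean(group_a) - fmean(group_b)) / math.sqrt(pooled_var)


# Hypothetical per-day mean log-likelihoods for normal and weak days.
d_value = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(d_value)
```

Under the usual rule of thumb, values around 0.5 are read as moderate and values of about 0.8 or more as large.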

Figure 9. Output log-likelihoods generated by the “normal models” for the normal and weak days of two participants. The horizontal axis displays the monitoring day index for each participant (P1 or P2). Normal days with zero or one record were excluded from model training. The solid line depicts the mean log-likelihood for normal days, while the dotted line represents the mean for weak days. The top row displays the output of models built with personalized best features (Top-2 and Top-3 features for P1-P2, respectively). The bottom row shows the output of models built with top-5 generalized features. Both rows demonstrate noticeable and meaningful differences between the means of the normal and weakness groups.

5. Discussion

The focus of this study was on observing natural and common daily scenarios and identifying health state changes caused by weakness (simulated by a workout session). The objective was to gain insights into the effectiveness of automatically quantifying behavioral changes and explaining these changes. We specifically targeted common daily activities that people frequently engage in at home, such as reading, napping, watching TV, using a PC, and eating, particularly while sitting on their favorite chairs or couches. This scenario setting provided a simple and feasible solution for monitoring their behaviors unobtrusively over the long term.

We utilized a fixed RGB-D camera with a specific emphasis on capturing upper-body movements. During real-time, anonymous processing, we extracted explicit motion, inactivity, and environmental features while considering their dependencies. The results showed that our method could distinguish between normal and weak states effectively. By selecting the appropriate activity and suitable temporal window for behavioral features and further aggregating data over longer time spans, our method achieved an accuracy of 0.97 ($\sigma = 0.04$) at the daily-level with 5-fold cross-validation.

When performing long-term monitoring, including all kinds of activities during a monitoring period is not always the best choice. The average F1-micro score shown in Fig. 4, averaged across all classifiers, participants, and activities, is 0.79. This might be due to significant variations in feature values among different activities, which can be larger than the differences between health states, as illustrated in Fig. 2. Selecting prominent activities is beneficial for detecting a change in health states because the contribution of each activity is uneven, with participants exhibiting distinct behavioral patterns in each activity. After selecting the optimal combination of activities, the F1-score improved by about 10%. Here, we chose combinations of activities that accounted for more than 40% of the data coverage rather than selecting only the top-1 activity. This decision was made because some activities are naturally rare, which could introduce bias in classification results due to the limited number of samples. Not knowing the activity types slightly reduces the performance of health state classification; however, it may help alleviate the difficulty of automatic activity recognition, especially in cases where manually recording the activity type is challenging. When the activity type is unknown, we achieved an average F1 of 0.819 at the record-level using RF and SVM, representing a 1.4% drop compared to training with known activity types (BN is excluded here due to model complexity, as the sample size in each activity at the record-level is too small, resulting in overfitting for Participant 5).

Selecting suitable temporal windows also enhances the performance of classifying health states. The results in Fig. 5 show that 5- to 10-minute windows are the best segments for feature extraction using BN. This approach is feasible in terms of computational cost and data storage for long-term monitoring, allowing the method to record statistics at regular intervals, such as the average movement speed or motion and inactivity distributions. Furthermore, aggregating the results from temporal windows into longer time spans effectively enhances performance. When there are more monitoring records within a specific period, such as within 8 hours or a day, the results become more robust. For example, Participant 1 has more than 8 activity monitoring records per day, and aggregation can enhance F1 by more than 15% when moving from the record-level to the daily-level. On average, a 5% improvement across all participants was observed. In conclusion, extracting behavioral features every several minutes and then aggregating them into longer time spans is a sound approach for making decisions when inferring health states.

Table 4. (Top) Top-5 behavioral features ranked for each participant (from Fig. 8), showing their average percentage change (%) from normal to weakness. (Bottom) Top-10 behavioral features ranked across all participants, determined by averaging the ranking scores of each feature among all participants.
P1 (R-handed): Scale(L-MPO) -13.1; VQ4(MPO) -9.3; VQ4 -17.3; V(L-MPO) -14.9; Pixel(MPO) -18.7
P2 (R-handed): Inact.10-30s -7.9; Density(L-MPO) +21.2; Scale -94.2; Move.geq10s -16.1; V(T-MPO) +29.9
P3 (R-handed): Inact.5-10s +63.3; V(L-MPO) -7.4; V(MPO) -18.1; Scale(T-MPO) -8.6; Inact.No. +22.1
P4 (R-handed): Inact.2-5s -27.0; V(L-MPO) +43.1; Move.geq2s +3.7; VQ2(MPO) +36.8; Move.5-10s +61.6
P5 (R-handed): Scale(MPO) -5.7; Density(MPO) -14.0; Density(T-MPO) +19.7; V(T-MPO) -5.4; Density(L-MPO) +28.1
Top 10 behavioral features ranked among all participants
V(L-MPO): movement speed in the left side of the body in movement period only
Density(L-MPO): movement density in the left side of the body in movement period only
Scale(T-MPO): movement scale in top part in movement period only
VQ4(MPO): top 25% fastest speed in movement period only
Density(MPO): movement density in movement period only
Scale(L): movement scale in the left side of the body
Pixel: movement pixel count (similar to density)
Scale(L-MPO): movement scale in the left side of the body in movement period only
V(T-MPO): speed in top body parts in movement period only
Scale(R-MPO): movement scale in right body part in movement period only

Feature ranking reveals the behavior patterns that are crucial in distinguishing between normal and weakness health states. Among all the features, movement speed and inactivity emerge as strong indicators of weakness, followed by movement scale and movement density. Notably, fine-grained behavioral features hold greater significance in this context, particularly those related to MPO (movement period only) and L-MPO (left body region in MPO - the non-dominant side of the 5 participants). These features consistently appear in the top rankings for each participant (see Fig. 8). This underscores the importance of focusing on the non-dominant region of the body for participants who are all right-handed in our experiment. The non-dominant region predominantly encompasses non-essential movements that can be omitted when not required, whereas the dominant body parts involve movements necessary for specific tasks and cannot be omitted as easily. Nonetheless, it’s important to acknowledge that each participant exhibits their own unique set of optimal behavioral features and preferred activities. Substantial variations in behavioral characteristics exist not only among different activities but also among different participants. The ranking results reveal that there is no single set of unified behavioral features applicable to all participants.

For further illustration, Fig. 11 displays the changes in the top features in two activities that all participants have performed. ‘Watch’ exhibits less variation (and more similar trends) among participants than ‘Read’, suggesting it might be a better generalizable indicator of health state. It’s important to note that even within the same activity, the trends in feature changes can vary significantly among participants.

Hence, it becomes evident that an individualized model tailored to each person would be more appropriate for effectively monitoring health states. Interpersonal differences consistently play a significant role in shaping our understanding of the true effects of a health condition (Aramendi et al., 2018). Simulating conditions on healthy subjects and comparing within the same person allows for the removal of the influence of concurrent medical conditions and interpersonal differences, as different individuals possess distinct health baselines (e.g., gender, age, fitness level), and experience varying rates of condition progression. This approach helps to isolate and focus on the effects of weakness on individuals.

Figure 10. Example of activity ranking results for five activities (horizontal axis) of a participant (P1) using the basic methods (before aggregation). The different colored rectangles refer to the 12 different metrics for ranking. The top 5 and 10 features were used for information-based methods (FDR, MI, CFS), while forward and backward selection methods were applied for classifiers (SVM, BN, RF). (See Table 3 for details.) The height of each rectangle is the normalized score (0-1) of that metric for that activity. The vertical axis represents cumulative scores from all metrics for each activity type. Variations in scores across metrics highlight the need for aggregation to achieve a more reliable ranking. Activities with consistently high scores across metrics likely serve as better indicators of the health states.
[Figure 11 panels: Watch, Read]

Figure 11. Change from normal to weakness (%) in the top 10 behavioral features ranked among all participants in the activities. The color of a cell shows whether the feature value increased or decreased. As there are no consistent color changes when viewed horizontally, it is clear that different participants respond differently to the exercise. Compared to ‘Read,’ the ‘Watch’ activity exhibits less variation and presents more similar trends among participants. This suggests it might be a more reliable indicator of health state change. However, even within the same activity, individual participants’ feature trends can vary significantly.

Several limitations are associated with this study. Firstly, the exploration of environmental features remains incomplete due to constraints imposed by the experiment’s design. The strong “Time-of-Day” features indicating weakness for participants P2 and P3 (see Fig. 8) can be attributed to their tendency to exhibit weakness in the afternoon, likely due to their morning workout routines. We intend to investigate such biases further in future research. Additionally, factors such as room lighting, objects within the environment, and even weather conditions were constrained, as data recording occurred exclusively within participants’ homes. Furthermore, the recording process itself may have altered the naturalistic observation (Angrosino, 2016) of participants. This impact is particularly evident in the duration of activities, as highlighted in Table 1, where the majority of activities lasted less than half an hour when monitored by the camera, which is shorter than in typical real-world scenarios. Another limitation is the relatively small number of participants involved in the experiments; although we focus on individual characteristics, this could limit the generalizability of the findings. Lastly, the accuracy of motion estimation using optical flow has not been rigorously evaluated, primarily due to the challenges associated with obtaining ground truth data for body motion.

6. Conclusion

Weakness is a prevalent symptom among older adults. Detecting the subtle, slow, long-term behavioral changes associated with this condition presents a formidable challenge. In this study, we simulated weakness in healthy subjects through exercise, closely monitored their behaviors, and quantified the shift from normal to post-workout weakness. Our investigation specifically targeted common daily activities, and we established a naturalistic, unobtrusive setting for our observations. We designed fine-grained, semantically meaningful features to quantify these behavioral changes. We ranked the features most indicative of health behavior changes and identified the activities that most strongly influenced these changes. Additionally, we examined the most effective time scales for detecting changes induced by weakness. Our results demonstrate that by selecting optimal features, activities, and time scales, we achieved an accuracy of 0.95 at an 8-hour interval and 0.97 on a daily basis using a Bayesian Network.

Our approach, which leverages computer vision and machine learning techniques, provides insights into the early detection and management of health conditions in older adults. The ranking of features and activities guides the design of effective individual monitoring methods and offers explanations of the observed behavioral changes. Moreover, the exploration of suitable time scales provides practical insights for task implementation. The proposed methodology could potentially extend to the detection of broader physical signs or motor problems linked to aging and common conditions in older adults. To the best of our knowledge, this is the first work to automatically detect weakness using visual cues and provide an explanation for behavioral changes. Future work will concentrate on further investigating the environmental context and conducting longer and larger-scale studies on real data obtained from older adults.

Acknowledgements.
This research was funded by the Legal & General Group (research grant to establish the independent Advanced Care Research Centre at the University of Edinburgh). The funder had no role in the conduct of the study, the interpretation, or the decision to submit for publication. The views expressed are those of the authors and not necessarily those of Legal & General. Approval for the experiments was granted by the School of Informatics Ethics Committee.

References

  • Angrosino (2016) Michael V Angrosino. 2016. Naturalistic observation. Routledge.
  • Aramendi et al. (2018) Ane Alberdi Aramendi, Alyssa Weakley, Asier Aztiria Goenaga, Maureen Schmitter-Edgecombe, and Diane J Cook. 2018. Automatic assessment of functional health decline in older adults based on smart home data. Journal of biomedical informatics 81 (2018), 119–130.
  • Auepanwiriyakul et al. (2020) Chaiyawan Auepanwiriyakul, Sigourney Waibel, Joanna Songa, Paul Bentley, and A Aldo Faisal. 2020. Accuracy and acceptability of wearable motion tracking for inpatient monitoring using smartwatches. Sensors 20, 24 (2020), 7313.
  • Barhum (2022) Lana Barhum. 2022. Multiple Sclerosis vs. Reactive Arthritis. Retrieved November 18, 2023 from https://www.verywellhealth.com/multiple-sclerosis-vs-reactive-arthritis-5498585
  • Breiman (2001) Leo Breiman. 2001. Random forests. Machine learning 45 (2001), 5–32.
  • Burwell and Jackson (1994) B Burwell and Beth Jackson. 1994. The disabled elderly and their use of long-term care. US Department of Health and Human Services (Office of Disability, Aging, and Long-Term Care Policy) and SysteMetrics. http://aspe.hhs.gov/daltcp/reports/diseldes.htm (accessed October 16, 2017) (1994).
  • Chen and Fisher (2023) Longfei Chen and Robert B Fisher. 2023. Monitoring Inactivity of Single Older Adults at Home. arXiv preprint arXiv:2311.02249 (2023).
  • Cochran (1977) William Gemmell Cochran. 1977. Sampling techniques. John Wiley & Sons.
  • Cook (2020) Diane Cook. 2020. Sensors in support of aging-in-place: The good, the bad, and the opportunities. National Academies of Sciences, Engineering, and Medicine; Division of Behavioral and Social Sciences and Education; Board on Behavioral, Cognitive, and Sensory Sciences (2020).
  • Cook et al. (2018) Diane J Cook, Glen Duncan, Gina Sprint, and Roschelle L Fritz. 2018. Using smart city technology to make healthcare smarter. Proc. IEEE 106, 4 (2018), 708–722.
  • Cook and Holder (2011) Diane J Cook and Lawrence B Holder. 2011. Sensor selection to support practical use of health-monitoring smart environments. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1, 4 (2011), 339–351.
  • Cook et al. (2022) Diane J Cook, Miranda Strickland, and Maureen Schmitter-Edgecombe. 2022. Detecting smartwatch-based behavior change in response to a multi-domain brain health intervention. ACM Transactions on Computing for Healthcare (HEALTH) 3, 3 (2022), 1–18.
  • Cortes and Vapnik (1995) Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine learning 20 (1995), 273–297.
  • Cover (1999) Thomas M Cover. 1999. Elements of information theory. John Wiley & Sons.
  • Dahmen and Cook (2021) Jessamyn Dahmen and Diane J Cook. 2021. Indirectly supervised anomaly detection of clinically meaningful health events from smart home data. ACM Transactions on Intelligent Systems and Technology (TIST) 12, 2 (2021), 1–18.
  • Dawadi et al. (2013a) Prafulla N Dawadi, Diane J Cook, and Maureen Schmitter-Edgecombe. 2013a. Automated cognitive health assessment using smart home monitoring of complex tasks. IEEE transactions on systems, man, and cybernetics: systems 43, 6 (2013), 1302–1313.
  • Dawadi et al. (2013b) Prafulla N Dawadi, Diane J Cook, Maureen Schmitter-Edgecombe, and Carolyn Parsey. 2013b. Automated assessment of cognitive health using smart home technologies. Technology and health care 21, 4 (2013), 323–343.
  • Dernbach et al. (2012) Stefan Dernbach, Barnan Das, Narayanan C Krishnan, Brian L Thomas, and Diane J Cook. 2012. Simple and complex activity recognition through smart phones. In 2012 eighth international conference on intelligent environments. IEEE, 214–221.
  • Emerson (2013) Peter Emerson. 2013. The original Borda count and partial voting. Social Choice and Welfare 40 (2013), 353–358.
  • Fisher (1936) Ronald A Fisher. 1936. The use of multiple measurements in taxonomic problems. Annals of eugenics 7, 2 (1936), 179–188.
  • Fogg et al. (2022) Carole Fogg, Simon DS Fraser, Paul Roderick, Simon de Lusignan, Andrew Clegg, Sally Brailsford, Abigail Barkham, Harnish P Patel, Vivienne Windle, Scott Harris, et al. 2022. The dynamics of frailty development and progression in older adults in primary care in England (2006–2017): A retrospective cohort profile. BMC geriatrics 22, 1 (2022), 30.
  • Fried et al. (2001) Linda P Fried, Catherine M Tangen, Jeremy Walston, Anne B Newman, Calvin Hirsch, John Gottdiener, Teresa Seeman, Russell Tracy, Willem J Kop, Gregory Burke, et al. 2001. Frailty in older adults: evidence for a phenotype. The Journals of Gerontology Series A: Biological Sciences and Medical Sciences 56, 3 (2001), M146–M157.
  • Friedman et al. (1997) Nir Friedman, Dan Geiger, and Moises Goldszmidt. 1997. Bayesian network classifiers. Machine learning 29 (1997), 131–163.
  • Grimmer et al. (2019) Martin Grimmer, Robert Riener, Conor James Walsh, and André Seyfarth. 2019. Mobility related physical and functional losses due to aging and disease-a motivation for lower limb exoskeletons. Journal of neuroengineering and rehabilitation 16, 1 (2019), 1–21.
  • Hall (1999) Mark A Hall. 1999. Correlation-based feature selection for machine learning. Ph. D. Dissertation. The University of Waikato.
  • Hayes et al. (2008) Tamara L Hayes, Francena Abendroth, Andre Adami, Misha Pavel, Tracy A Zitzelberger, and Jeffrey A Kaye. 2008. Unobtrusive assessment of activity patterns associated with mild cognitive impairment. Alzheimer’s & Dementia 4, 6 (2008), 395–405.
  • healthline (2023) healthline. 2023. Muscle Weakness: 28 Causes, Diagnosis, Treatment & More. Retrieved November 23, 2023 from https://www.healthline.com/health/muscle-weakness#emergency-symptoms
  • Igual et al. (2013) Raul Igual, Carlos Medrano, and Inmaculada Plaza. 2013. Challenges, issues and trends in fall detection systems. Biomedical engineering online 12, 1 (2013), 66.
  • Institute (2023) UPMC Rehabilitation Institute. 2023. What Happens After A Stroke? Retrieved November 18, 2023 from https://www.upmc.com/services/rehab/rehab-institute/conditions/stroke/after-stroke
  • ITU-R (2008) ITU-R. 2008. BT.601. Retrieved Nov, 2023 from https://www.itu.int/rec/R-REC-BT.601/
  • Jansen et al. (2022) Carl-Philipp Jansen, Katharina Gordt-Oesterwind, and Michael Schwenk. 2022. Wearable Motion Sensors in Older Adults: On the Cutting Edge of Health and Mobility Research. 973 pages.
  • Jocher (2020) Glenn Jocher. 2020. Ultralytics YOLOv5. https://doi.org/10.5281/zenodo.3908559
  • Joumier et al. (2011) Véronique Joumier, Rim Romdhane, Francois Bremond, Monique Thonnat, Emmanuel Mulin, PH Robert, A Derreumaux, Julie Piano, and JR Lee. 2011. Video activity recognition framework for assessing motor behavioural disorders in Alzheimer disease patients. In International Workshop on Behaviour Analysis and Video Understanding (ICVS 2011). 9.
  • Karakostas et al. (2016) Anastasios Karakostas, Alexia Briassouli, Konstantinos Avgerinakis, Ioannis Kompatsiaris, and Magda Tsolaki. 2016. The dem@ care experiments and datasets: a technical report. arXiv preprint arXiv:1701.01142 (2016).
  • König et al. (2015a) Alexandra König, Carlos Fernando Crispim-Junior, Alvaro Gomez Uria Covella, Francois Bremond, Alexandre Derreumaux, Gregory Bensadoun, Renaud David, Frans Verhey, Pauline Aalten, and Philippe Robert. 2015a. Ecological assessment of autonomy in instrumental activities of daily living in dementia patients by the means of an automatic video monitoring system. Frontiers in aging neuroscience 7 (2015), 98.
  • König et al. (2015b) Alexandra König, Carlos Fernando Crispim Junior, Alexandre Derreumaux, Gregory Bensadoun, Pierre-David Petit, François Bremond, Renaud David, Frans Verhey, Pauline Aalten, and Philippe Robert. 2015b. Validation of an automatic video monitoring system for the detection of instrumental activities of daily living in dementia patients. Journal of Alzheimer’s Disease 44, 2 (2015), 675–685.
  • Kosorok and Laber (2019) Michael R Kosorok and Eric B Laber. 2019. Precision medicine. Annual review of statistics and its application 6 (2019), 263–286.
  • Larson and Wilbur (2020a) Scott T Larson and Jason Wilbur. 2020a. Muscle weakness in adults: evaluation and differential diagnosis. American family physician 101, 2 (2020), 95–108.
  • Larson and Wilbur (2020b) Scott T Larson and Jason Wilbur. 2020b. Muscle weakness in adults: evaluation and differential diagnosis. American family physician 101, 2 (2020), 95–108.
  • Leask et al. (2015) Calum F Leask, Juliet A Harvey, Dawn A Skelton, and Sebastien FM Chastin. 2015. Exploring the context of sedentary behaviour in older adults (what, where, why, when and with whom). European Review of Aging and Physical Activity 12, 1 (2015), 1–8.
  • Liao et al. (2019) Ying-Yi Liao, I-Hsuan Chen, and Ray-Yau Wang. 2019. Effects of Kinect-based exergaming on frailty status and physical performance in prefrail and frail elderly: A randomized controlled trial. Scientific reports 9, 1 (2019), 9353.
  • Lin et al. (2014) Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13. Springer, 740–755.
  • Lucas and Kanade (1981) Bruce D Lucas and Takeo Kanade. 1981. An iterative image registration technique with an application to stereo vision. In IJCAI’81: 7th international joint conference on Artificial intelligence, Vol. 2. 674–679.
  • Majumder et al. (2017) Sumit Majumder, Emad Aghayi, Moein Noferesti, Hamidreza Memarzadeh-Tehran, Tapas Mondal, Zhibo Pang, and M Jamal Deen. 2017. Smart homes for elderly healthcare—Recent advances and research challenges. Sensors 17, 11 (2017), 2496.
  • Mancini et al. (2016) Martina Mancini, Heather Schlueter, Mahmoud El-Gohary, Nora Mattek, Colette Duncan, Jeffrey Kaye, and Fay B Horak. 2016. Continuous monitoring of turning mobility and its association to falls and cognitive function: a pilot study. Journals of Gerontology Series A: Biomedical Sciences and Medical Sciences 71, 8 (2016), 1102–1108.
  • medicalnewstoday (2023) medicalnewstoday. 2023. Asthenia (weakness): Causes, symptoms, and treatment. Retrieved November 23, 2023 from https://www.medicalnewstoday.com/articles/asthenia-weakness
  • Meteostat (2021) Meteostat. 2021. Weather Condition Codes. Retrieved Nov, 2023 from https://dev.meteostat.net/formats.html#weather-condition-codes
  • Mizumura and Taguchi (2016) Kazue Mizumura and Toru Taguchi. 2016. Delayed onset muscle soreness: Involvement of neurotrophic factors. The journal of physiological sciences 66 (2016), 43–52.
  • Mohdiwale et al. (2021) Samrudhi Mohdiwale, Mridu Sahu, GR Sinha, Humaira Nisar, et al. 2021. Investigating feature ranking methods for sub-band and relative power features in motor imagery task classification. Journal of healthcare engineering 2021 (2021).
  • Momin et al. (2022) Md Sarfaraz Momin, Abu Sufian, Debaditya Barman, Paramartha Dutta, Mianxiong Dong, and Marco Leo. 2022. In-home older adults’ activity pattern monitoring using depth sensors: A review. Sensors 22, 23 (2022), 9067.
  • Murphy (2002) Kevin Patrick Murphy. 2002. Dynamic bayesian networks: representation, inference and learning. University of California, Berkeley.
  • Nakanishi et al. (2021) Kazuki Nakanishi, Harutoshi Sakakima, Kosuke Norimatsu, Shotaro Otsuka, Seiya Takada, Akira Tani, and Kiyoshi Kikuchi. 2021. Effect of low-intensity motor balance and coordination exercise on cognitive functions, hippocampal Aβ deposition, neuronal loss, neuroinflammation, and oxidative stress in a mouse model of Alzheimer’s disease. Experimental Neurology 337 (2021), 113590.
  • Nakano et al. (2020) Nobuyasu Nakano, Tetsuro Sakura, Kazuhiro Ueda, Leon Omura, Arata Kimura, Yoichi Iino, Senshi Fukashiro, and Shinsuke Yoshioka. 2020. Evaluation of 3D markerless motion capture accuracy using OpenPose with multiple video cameras. Frontiers in sports and active living 2 (2020), 50.
  • Narasimhan et al. (2021) Rajaram Narasimhan, Charles McGlade, et al. 2021. Current state of non-wearable sensor technologies for monitoring activity patterns to detect symptoms of mild cognitive impairment to Alzheimer’s disease. International Journal of Alzheimer’s Disease 2021 (2021).
  • Owen et al. (2022) Rebecca Owen, Katherine Berry, and Laura JE Brown. 2022. Enhancing older adults’ well-being and quality of life through purposeful activity: A systematic review of intervention studies. The Gerontologist 62, 6 (2022), e317–e327.
  • Panhwar et al. (2019a) Yasmeen Naz Panhwar, Fazel Naghdy, Golshah Naghdy, David Stirling, and Janette Potter. 2019a. Assessment of frailty: a survey of quantitative and clinical methods. BMC Biomedical Engineering 1, 1 (2019), 1–20.
  • Panhwar et al. (2019b) Yasmeen Naz Panhwar, Fazel Naghdy, Golshah Naghdy, David Stirling, and Janette Potter. 2019b. Assessment of frailty: a survey of quantitative and clinical methods. BMC Biomedical Engineering 1, 1 (2019), 1–20.
  • Patient (2021) Patient. 2021. Muscle Weakness. Retrieved November 18, 2023 from https://patient.info/signs-symptoms/tiredness-fatigue/muscle-weakness
  • Picerno et al. (2021) Pietro Picerno, Marco Iosa, Clive D’Souza, Maria Grazia Benedetti, Stefano Paolucci, and Giovanni Morone. 2021. Wearable inertial sensors for human movement analysis: A five-year update. Expert review of medical devices 18, sup1 (2021), 79–94.
  • Pol et al. (2016) Margriet Pol, Fenna Van Nes, Margo Van Hartingsveldt, Bianca Buurman, Sophia De Rooij, and Ben Kröse. 2016. Older people’s perspectives regarding the use of sensor monitoring in their home. The Gerontologist 56, 3 (2016), 485–493.
  • Prabhu et al. (2022) Deepa Prabhu, Mahnoosh Kholghi, Moid Sandhu, Wei Lu, Katie Packer, Liesel Higgins, and David Silvera-Tawil. 2022. Sensor-based assessment of social isolation and loneliness in older adults: A survey. Sensors 22, 24 (2022), 9944.
  • Rantz et al. (2015) Marilyn J Rantz, Marjorie Skubic, Mihail Popescu, Colleen Galambos, Richelle J Koopman, Gregory L Alexander, Lorraine J Phillips, Katy Musterman, Jessica Back, and Steven J Miller. 2015. A new paradigm of technology-enabled ‘Vital Signs’ for early detection of health change for older adults. Gerontology 61, 3 (2015), 281–290.
  • Romdhane et al. (2012) Rim Romdhane, Emmanuel Mulin, Alexandre Derreumeaux, Nadia Zouba, Julie Piano, L Lee, I Leroi, P Mallea, R David, M Thonnat, et al. 2012. Automatic video monitoring system for assessment of Alzheimer’s disease symptoms. The journal of nutrition, health & aging 16 (2012), 213–218.
  • Sahni et al. (2023) Nikhil Sahni, George Stein, Rodney Zemmel, and David M Cutler. 2023. The potential impact of artificial intelligence on healthcare spending. Technical Report. National Bureau of Economic Research.
  • Scano et al. (2014) Alessandro Scano, Marco Caimmi, Matteo Malosio, and Lorenzo Molinari Tosatti. 2014. Using Kinect for upper-limb functional evaluation in home rehabilitation: A comparison with a 3D stereoscopic passive marker system. In 5th IEEE RAS/EMBS international conference on biomedical robotics and biomechatronics. IEEE, 561–566.
  • Schmitter-Edgecombe (2015) Maureen Schmitter-Edgecombe. 2015. Automated clinical assessment from Smart home-based behavior data. IEEE Journal of Biomedical and Health Informatics 1 (2015).
  • Schmitter-Edgecombe et al. (2022) Maureen Schmitter-Edgecombe, Catherine Luna, and Diane J Cook. 2022. Technologies for health assessment, promotion, and intervention: Focus on aging and functional health. In Positive neuropsychology: Evidence-based perspectives on promoting brain and cognitive health. Springer, 111–138.
  • Schütz et al. (2022) Narayan Schütz, Samuel EJ Knobel, Angela Botros, Michael Single, Bruno Pais, Valérie Santschi, Daniel Gatica-Perez, Philipp Buluschek, Prabitha Urwyler, Stephan M Gerber, et al. 2022. A systems approach towards remote health-monitoring in older adults: Introducing a zero-interaction digital exhaust. NPJ digital medicine 5, 1 (2022), 116.
  • Scott et al. (2022) Bradley Scott, Martin Seyres, Fraser Philp, Edward K Chadwick, and Dimitra Blana. 2022. Healthcare applications of single camera markerless motion capture: a scoping review. PeerJ 10 (2022), e13517.
  • Sprint et al. (2016) Gina Sprint, Diane J Cook, Roschelle Shelly, Maureen Schmitter-Edgecombe, et al. 2016. Using smart homes to detect and analyze health events. Computer 49, 11 (2016), 29–37.
  • Sprint et al. (2015) Gina Sprint, Diane J Cook, Douglas L Weeks, and Vladimir Borisov. 2015. Predicting functional independence measure scores during rehabilitation with wearable inertial sensors. IEEE access 3 (2015), 1350–1366.
  • Turjamaa et al. (2019) Riitta Turjamaa, Aki Pehkonen, and Mari Kangasniemi. 2019. How smart homes are used to support older people: An integrative review. International journal of older people nursing 14, 4 (2019), e12260.
  • UK (2019) Age UK. 2019. Later Life in the United Kingdom 2019. Retrieved November 18, 2023 from https://www.ageuk.org.uk/globalassets/age-uk/documents/reports-and-publications/later_life_uk_factsheet.pdf
  • Wagner et al. (2022) Jakub Wagner, Paweł Mazurek, and Roman Z Morawski. 2022. Introduction to Healthcare-Oriented Monitoring of Persons. In Non-invasive Monitoring of Elderly Persons: Systems Based on Impulse-Radar Sensors and Depth Sensors. Springer, 1–39.
  • (76) World Health Organization (WHO). 2022. Ageing and health. Retrieved November 18, 2023 from https://www.who.int/news-room/fact-sheets/detail/ageing-and-health
  • Woehr et al. (2015) David J Woehr, Andrew C Loignon, Paul B Schmidt, Misty L Loughry, and Matthew W Ohland. 2015. Justifying aggregation with consensus-based constructs: A review and examination of cutoff values for common aggregation indices. Organizational Research Methods 18, 4 (2015), 704–737.
  • Xavier et al. (2003) Flavio MF Xavier, Marcos Ferraz, Norton Marc, Norma U Escosteguy, and Emílio H Moriguchi. 2003. Elderly people’s definition of quality of life. Brazilian Journal of Psychiatry 25 (2003), 31–39.