1. Introduction
Advancements in medicine and health care technologies have led to an increase in life expectancy over the years. It is expected that, by 2050, there will be at least 2 billion people over the age of 60 years [
1]. The Statistical Handbook of Japan, released in 2021 by the Statistics Bureau of the Ministry of Internal Affairs and Communications, revealed that, in 2015, there were about 22 million households with residents aged 65 and above, including 6 million who lived alone [
2]. Living independently, especially for the elderly, is risky because, in addition to mental problems such as memory loss, depression, and loneliness, there can be physical problems such as falling down, issues with eyesight, hearing loss, back pain, etc. [
3]. Though different remedies have been developed for different types of physical and mental ailments, with an increasing number of elderly people, it is apparent that there is a need for monitoring and anomaly detection mechanisms. A lot of research has thus contributed to recognizing, predicting, and monitoring activities inside smart homes [
4,
5].
As people get older, their involvement in different physical and mental activities declines [
6]. They go out less, engage less in activities related to physical fitness, have difficulty reading for a long time due to weakened eyesight, and so on. They also deal with issues they had not dealt with when they were younger, such as the need to take medication every day and the adverse effects of missing a meal. Similarly, falls or similar incidents tend to make the elderly cautious in their activities, impacting their confidence, activity completion, and social interactions. Therefore, it becomes imperative to track whether the elderly have completed basic day-to-day activities every day in order to detect any abnormal conditions that might have occurred or might occur [
5,
7]. There have been many advancements in recent years in human monitoring, collecting vital health statistics, and tracking human behavior [
8]. Off-the-shelf sensors can now be used in houses to provide information about light intensity, temperature, and the usage of doors and appliances [
9], making it possible to determine activities inside the house.
Research has also been carried out in health care centers, but implementing such technology in the home environment is more suitable for the elderly. The elderly have made memories over the years in their home and have possessions they cherish [
10]. Hence, they feel more comfortable living in their own home and conducting their basic everyday activities there. Moreover, hospitals and health care centers are either expensive or overbooked. The cost can be reduced by up to 52% when patients receive treatment and help in their home compared to hospitals [
11]. It is therefore necessary to develop systems that can help to enhance care for the elderly in their own homes rather than in hospitals or support homes. Professional caretakers are expensive as well, and, with the increasing number of elderly people, they tend to be overbooked and busy [
5]. Home monitoring technologies can help family members and relatives who are far away be assured about the safety and contentment of the elderly [
1]. However, their busy schedules may not allow them to monitor these activities regularly, which is why personnel dedicated to remote monitoring, such as remote caretakers or volunteers, should be assigned the monitoring responsibilities.
With these issues in mind, in this paper we propose a monitoring system, PATROL (Participatory Activity Tracking and Risk assessment for anOnymous eLderly monitoring), that can track basic activities of the elderly anonymously inside their home and detect or prevent potential risks in their day-to-day activities using a smartphone application. For the successful implementation of the PATROL system, the following requirements, motivated by the reasons below, need to be fulfilled: (Req. 1) anonymous monitoring, (Req. 2) timely monitoring and reporting of activities, and (Req. 3) easy and intuitive risk detection.
Home monitoring can be considered intrusive; in some cases, the elderly may prefer to hide things in their house if there is a video-based monitoring or surveillance system [
12]. They are also usually concerned about privacy and security, and about the types of information about them that are disclosed [
1]. This is why we propose anonymous monitoring (Req. 1), where no personal details of the elderly being monitored are disclosed to the monitoring person. Smartphones are suitable devices for regular tracking and monitoring, since many people carry them throughout the day or keep them in their vicinity. Furthermore, notifications have become an essential feature of most smartphone applications [
13]. This is why we propose a smartphone application that can be used by volunteers for tracking and monitoring the activities of elderly people. We also send frequent notifications through the smartphone application, which ensures that the monitors can access information about the activities of the elderly more quickly than with web pages (Req. 2). Continued usage of smartphone applications in general has been attributed to factors such as ease of navigation, ease of carrying out actions within the application, and appropriate visual cues [
14], which is why we propose a method for visualizing activities and detecting risks in the daily routine that not only makes risks easy to identify in the activity visualization, but also places less burden on the monitoring person (Req. 3).
Therefore, in this paper, we propose an elderly monitoring system that can be used by anonymous volunteers to check the everyday activities of the elderly and determine whether there are any risky situations in their day-to-day activities. Anonymity is maintained by not disclosing any personal or private information of the elderly to the volunteers, and likewise by not disclosing any personal or private information of the volunteers to the elderly person. Using volunteers for elderly care is a very common practice in Japan [
15], where part-time civil servants commissioned by the Minister of Health, Labor, and Welfare as volunteers, locally known as minsei-iin, are assigned to regularly check on elderly people in person, have a conversation with them, and so on. These part-time civil servants are people who volunteer to help children, elderly people, people with disabilities, etc., and have no mandatory obligation to serve in such areas. We believe that our system is an extension of this practice in the field of elderly care. Instead of visiting the elderly, our volunteers can check on them using the smartphone application even if they are not in the vicinity of the elderly. This is helpful in cases where the elderly might not want an unknown person to visit them personally, and also in cases where the number of people serving as minsei-iin might not be enough. Since our system uses multiple monitors, we ensure that the activities of the elderly are regularly checked. To maintain anonymity, even if the volunteers discover a risky situation in the daily activities of the elderly, the handling of such a situation in person is carried out by the emergency contacts of the elderly, not by the volunteers themselves. For our system, we define risk as a deviation in the start/end time and duration of activities from the usual routine of the elderly person.
We developed an Android-based smartphone application that provides information about the completion of seven basic activities: sleep, shower, medication, breakfast, lunch, dinner, and entertainment (use of television (TV)). We created a dataset by including some risky situations in the elderly activity dataset [
16] to determine whether those situations could be detected using our application design. To make the monitoring process less burdensome and more intuitive, we also included visualization features such as a candlestick chart representation of activities, a single-interface design, and textual and color codes for the current state of activities, through which it is easy to infer any deviation in the completion time and duration of activities. Similarly, we focused on quick tracking and monitoring of activities by including two types of notifications to trigger frequent use of the smartphone application: one sent every two hours, and another sent immediately after the elderly person completed an activity.
The main contributions of this paper are the following:
First, we proposed a novel system that can be used by volunteers to anonymously monitor the completion of daily activities of elderly people and report if they detect any deviation in the activities compared to the usual routine of the elderly. We developed an Android-based smartphone application that is designed with numerous visualization features and two types of notification strategies to make activity monitoring and anomaly detection easy, intuitive, quick, and less burdensome.
Second, we evaluated our smartphone application, with its visualization features and two-notification strategy, by comparing it with baseline methods (methods without the notification strategy or the visualization features) and confirmed that our proposed method not only provided better risk identification, but also placed less burden on the monitoring person. We also show that our proposed method resulted in quicker tracking and monitoring of activities.
The rest of the paper is organized as follows:
Section 2 introduces existing research and how it relates to our study. In
Section 3, we introduce our system, followed by an explanation of our smartphone application. We explain the evaluation study and its findings in
Section 4. In
Section 5, we discuss the significance of the results for our system along with the limitations of this study. Finally, we conclude with our contributions in
Section 6.
4. Implementation and Evaluation
In this section, we explain the details of the experiment conducted to analyze the application, including the dataset used for the application and the multiple versions of the PATROL application that we created, and finally present the results of our study.
4.1. Multiple Versions of PATROL Application
In order to concretely determine that our proposed method of a graphical interface (GI), as shown in
Figure 3b, is intuitive and has a higher degree of user acceptance, we needed to compare that interface with commonly used activity representation techniques. To make that distinction, we created a separate version of our application in which activities were shown in a tabular, text-based interface rather than in graphs.
Figure 7 shows the activity report interface of this version of the application. All the features of the application mentioned in
Section 3.2 are included in this version as well, so the working principle is the same regardless of the interface. This reduces confusion for the participants and ensures that the performance and perception of users are based solely on the type of interface, and not on other features of the application.
Similarly, we created a third version of our application (GR), in which we did not send notifications to the monitors when an activity was completed by a target; we only sent recurring notifications every two hours. With this version of the application, we aim to determine whether the monitors are able to report on the activities of the elderly even if they do not receive activity-based notifications (abN), and thus whether our strategy of providing both abN (activity-based notifications) and rN (recurring notifications) is effective in encouraging and motivating monitors to use the application frequently and provide continuous reports on the activities of the target.
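To make the two notification types concrete, the following Python sketch illustrates the dispatch logic described above. It is only an illustration of the policy, not the actual PATROL server code, and the function and field names are hypothetical.

```python
# Sketch of the server-side notification policy described above (assumption:
# the actual PATROL server may differ; function and field names are hypothetical).
from dataclasses import dataclass
from datetime import datetime, timedelta

RECURRING_INTERVAL = timedelta(hours=2)   # rN: recurring notification interval

@dataclass
class Notification:
    kind: str        # "rN" or "abN"
    target_id: str
    message: str
    sent_at: datetime

def on_activity_completed(version: str, target_id: str, activity: str,
                          now: datetime) -> Notification | None:
    """abN: pushed immediately when a target finishes an activity (GAR/TAR only)."""
    if version == "GR":          # the GR version receives no activity-based notifications
        return None
    return Notification("abN", target_id, f"{activity} completed", now)

def on_schedule_tick(target_id: str, last_recurring: datetime,
                     now: datetime) -> Notification | None:
    """rN: pushed every two hours regardless of the application version."""
    if now - last_recurring >= RECURRING_INTERVAL:
        return Notification("rN", target_id, "Check the latest activity report", now)
    return None
```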
Table 3 summarizes the three versions of the application created, and we will use the same labels for the versions (GAR, TAR, and GR) in future discussions. GAR refers to the proposed version of PATROL, which consists of a Graphical interface, Activity-based notifications, and Recurring notifications. We investigate the accuracy of risk identification and the burden of using our application by comparing the versions GAR and TAR (Tabular interface, Activity-based notifications, and Recurring notifications). Similarly, we examine the effectiveness of using activity-based notifications by comparing GAR with GR (Graphical interface and Recurring notifications).
4.2. Dataset
The dataset used in our experiment is taken from a real-life experiment conducted in the houses of elderly residents over the age of 60 [
16]. The activity dataset was obtained by Matsui et al. through an extensive study conducted over a period of two months, in which motion and environmental sensors were installed in each of the houses. In addition, a physical button was installed in each house, and the residents were requested to press the button whenever they started and ended an activity [
16]. The original dataset consists of activity recognition data from single- as well as two-person households. For the purpose of this research, we selected only the single-resident households, which were three in total. We use the cleaned data collected in the above-mentioned study and consider the activity recognition system to be 100% accurate (we used the ground-truth labels of activities in the dataset as the output of the activity recognition method).
The daily activities of the elderly that we want to track and monitor are mentioned in
Table 1. The original dataset, however, does not contain data related to the Medication activity. Similarly, we also wanted to include multiple activities related to the frequent use of TV. To complete our desired dataset, we added the aforementioned activities to the original dataset. The two-month period of the original study was longer than our intended experiment period of 10 days; hence, we selected data for only a 10-day period from the available two months of data. We included data from the same time period for all three single-resident households.
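The dataset preparation described above could, for example, be carried out along the following lines with pandas. The file name, column names, window start date, and the inserted Medication times are placeholders for illustration; the actual labels come from the dataset of Matsui et al. [16].

```python
# Illustrative sketch of the dataset preparation step (pandas). All concrete
# values below are assumptions for illustration only.
import pandas as pd

activities = pd.read_csv("household_A_labels.csv",
                         parse_dates=["start", "end"])   # columns: activity, start, end

# Keep the same 10-day window for every single-resident household.
window_start = pd.Timestamp("2019-11-01")
window = activities[(activities["start"] >= window_start) &
                    (activities["start"] < window_start + pd.Timedelta(days=10))]

# The original dataset has no Medication activity, so add one entry per day
# (the time of day is chosen arbitrarily here).
medication = pd.DataFrame({
    "activity": "Medication",
    "start": [window_start + pd.Timedelta(days=d, hours=8) for d in range(10)],
    "end":   [window_start + pd.Timedelta(days=d, hours=8, minutes=5) for d in range(10)],
})

augmented = (pd.concat([window, medication])
               .sort_values("start")
               .reset_index(drop=True))
```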
We added some risky situations to the dataset based on the definitions shown in
Table 2. For the purpose of our research, we included only
low and
medium-level risks. As defined, a
none risk indicates that there is no problem with the elderly. Hence, we do not need to alter the dataset for such risks, since they concur with the regular routine of the elderly. If the level of risk is
high, it indicates that the elderly person is in a serious condition and in need of immediate medical care. In such cases, no activity will be completed by the elderly, and the activity report in the application will not be updated.
However, our aim is to determine whether any deviation from the regular routine of activities can be detected using our application. Though high risks can occur suddenly, we also think that, if low- and medium-level risks are regularly monitored and detected, then high-level risks can be prevented or predicted. For this reason, we did not include high-level risks in our dataset.
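As an illustration of the risk definition used here (a deviation in start/end time and duration from the usual routine), the following sketch assigns a risk level to a single activity instance. The 30 and 60 min thresholds are placeholders; the actual boundaries between the levels are those defined in Table 2.

```python
# Minimal sketch of how a deviation-based risk level could be assigned to an
# activity, following the definition above. The 30/60 min thresholds stand in
# for the actual low/medium thresholds of Table 2 and are assumptions.
from datetime import datetime
from statistics import mean

def risk_level(history: list[tuple[datetime, datetime]],
               start: datetime, end: datetime) -> str:
    """history: (start, end) pairs of the same activity on previous days."""
    usual_start = mean(s.hour * 60 + s.minute for s, _ in history)           # minutes of day
    usual_duration = mean((e - s).total_seconds() / 60 for s, e in history)  # minutes

    start_dev = abs(start.hour * 60 + start.minute - usual_start)
    duration_dev = abs((end - start).total_seconds() / 60 - usual_duration)
    deviation = max(start_dev, duration_dev)

    if deviation < 30:       # within the usual routine
        return "none"
    if deviation < 60:       # placeholder threshold for a low risk
        return "low"
    return "medium"          # larger deviations; "high" would mean no activity at all
```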
4.3. Experiment Details
We recruited a total of nine participants (gender: 6 male, 3 female; age range: 25–34 years old, average age: 28.6 years) to take part in our evaluation study. The participants played the role of ‘Monitors’ throughout the experiment. The modified datasets of the three single-person households were used for the three ‘Targets’ in the application. The participants were divided into three study groups of three participants each. This was carried out to randomly distribute our application in such a way that each group, with an equal number of participants, would use a different version of the application at a given time compared to the other study groups. To implement this, we divided the experiment period into three phases in total.
Table 4 summarizes the study group and application interface assignment.
The three versions of the application were uploaded to the Google Play Store. Before the start of the experiment, we conducted a research and experiment introduction session that all the participants were required to attend. We explained in detail the theme of the study and experiment, their role as monitors, and the tasks they had to complete while using the application. They were also provided with a document containing all the information about the working principles of the different versions of the application, along with QR codes for each version. The document also indicated the version of the application they were supposed to use in each phase of the experiment. As a reward for participating in the experiment, the participants were provided with a gift card worth 2000 JPY.
To make the transition between interfaces easier for the participants, we included a one-day gap between the phases, during which the participants were asked to take a break. The phases were designed to last three days each. However, at the start of phase 2, we encountered some complications with the server connected to our application, and the application did not work properly until mid-day; hence, we asked the participants to continue phase 2 for one more day. Thus, in total, the experiment period consisted of 12 days, including two days of breaks. After the end of each phase, we asked the participants to fill in a questionnaire developed using Google Forms. Most of the questions had to be rated on a five-point Likert scale (1 = strongly disagree, 3 = neutral, 5 = strongly agree), while some of them were open-ended. The participants were asked to respond to questions or statements related to their perception of the version of the application, as well as the effect of the change in the version of the application, such as “The activity related notifications were helpful in monitoring the elderly as it reminded me to check the application regularly.”, “I found the change in the interface confusing.”, and “I feel the new interface needed more mental effort.” At the end of the experiment, the participants were asked to fill out a final questionnaire. The purpose of these questionnaires was to gain insight into the participants' impressions of the different versions and different notification types.
4.4. Results
The results of our study are analyzed based on the following three conditions:
Accurate detection of risky situations;
Low burden of monitoring on Monitors;
Timely detection of risky situations.
4.4.1. Accuracy of Risk Detection
In order to verify the effectiveness of our visualization technique, it is necessary to check whether the risks included in the dataset, as mentioned in
Section 4.2, are identified correctly. In this section, we report the rate at which the risks included in the dataset were correctly identified in each phase, using the different versions of the application.
Table 5 and
Table 6 show the rate of correct identification of risks based on study groups and interfaces, respectively.
From
Table 5, we can observe that StudyGroup C was the most consistent group, with the highest risk identification rate during all three phases of the experiment. The rate of correct identification also increased as the experiment progressed, which suggests that familiarity with the application helped the participants analyze the activity reports and submit their reports.
There was a slight decrease in risk identification for StudyGroup A when the interface changed from graphical (GAR) to tabular (TAR) in phase 2 of the experiment. In their questionnaires after phase 2, all of the participants in StudyGroup A agreed that the new interface needed more time to analyze, with 66.7% agreeing that the tabular interface (TI) needed more mental effort than the graphical interface (GI). When the interface changed to the graphical layout (GR) in phase 3 of the experiment, there was an increase in the correct identification rate. When asked about the change, participants stated that it was easier to understand the routine with the graph compared to the tabular layout (66.7% agree, 33.3% strongly agree).
StudyGroup B showed a considerable increase in correct risk identification, in phase 2, as shown in
Table 5, even though they had a graphical layout in both phase 1 (GR) and phase 2 (GAR). We surmise that familiarity with the application was the reason for this change. In their questionnaire after phase 2, 66.7% strongly agreed that they were familiar with the application and found it easier to use during this phase. However, in phase 3, their interface changed to the tabular layout (TAR). This led to a reduction in risk identification, with 33.3% strongly agreeing that the change in interface was confusing.
As shown in
Table 6, we found that, using GAR, the risks were identified correctly about 75.2% of the time on average. In comparison, the risks were identified correctly about 65.8% of the time using TAR. GR, which in this context has the same visualization as GAR, had a risk identification accuracy of about 68.5%. The average rate of risk identification is lower for the tabular interface (TI) than for both of the graphical interfaces (GI). This suggests that graphical interfaces provide a better understanding and identification of risks.
We also found statistically significant differences between the average risk identification rates of the three interfaces using the one-way ANOVA method (p = 0.037). A Tukey HSD post-hoc test revealed a significant pairwise difference between interfaces GAR and TAR (p = 0.032), whilst no difference was observed between GAR and GR (p = 0.2).
To investigate this further, we combined the results of GAR and GR into a single group and compared it with TAR, to clearly determine the differences between graphical and tabular interfaces for risk identification. Through a paired t-test analysis, we found that there is a significant difference between the two (p = 0.047).
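The statistical analysis reported in this section could be reproduced along the following lines with SciPy and statsmodels. The per-participant identification rates in this sketch are placeholders rather than the actual study data, and the way GAR and GR are pooled for the paired comparison (averaging per participant) is an assumption.

```python
# Sketch of the statistical comparison reported above, using SciPy/statsmodels.
# The per-participant identification rates below are placeholders, not the
# actual study data.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

gar = np.array([0.78, 0.70, 0.92, 0.68, 0.75, 0.65, 0.80, 0.72, 0.77])  # per participant
tar = np.array([0.66, 0.60, 0.75, 0.58, 0.70, 0.55, 0.72, 0.64, 0.71])
gr  = np.array([0.70, 0.66, 0.80, 0.62, 0.72, 0.60, 0.74, 0.66, 0.67])

# One-way ANOVA across the three interfaces.
f_stat, p_anova = stats.f_oneway(gar, tar, gr)

# Tukey HSD post-hoc test for pairwise differences.
rates = np.concatenate([gar, tar, gr])
labels = ["GAR"] * len(gar) + ["TAR"] * len(tar) + ["GR"] * len(gr)
print(pairwise_tukeyhsd(rates, labels, alpha=0.05))

# Graphical (GAR and GR pooled per participant) vs. tabular, paired t-test.
graphical = (gar + gr) / 2
t_stat, p_paired = stats.ttest_rel(graphical, tar)
print(f"ANOVA p={p_anova:.3f}, paired t-test p={p_paired:.3f}")
```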
4.4.2. Low Burden Evaluation
We define burden as the time taken by a participant between opening the application to check the activity report of the targets and submitting the report. We logged the time of opening the application as well as the time of reporting using the “SharedPreferences” functionality available to Android developers. These time periods were saved together in the Firebase database. We analyzed the burden time for each participant using these data and calculated an average burden time for each participant over the whole experiment period, which is shown in
Figure 8. The average burden time for each of the versions is also shown.
We can see that the burden time for GAR, on average, is always less than that for TAR. The mean burden times for GAR, TAR, and GR were observed to be 28 s, 38 s, and 52 s, respectively. As seen in
Figure 8, the burden for participant 1 while using GR is very high compared to the other participants and to the other interfaces used by the same participant. Upon inspection, it was discovered that, while using GR, the participant recorded an unusually high burden time for one particular report, which was uncharacteristic based on his other responses. Discarding this unusually high burden time, the average burden time of participant 1 was reduced from 193 s to almost 20 s. However, for the final analysis, the skewed data are kept as they are. Similarly, the burden for participant 2 while using TAR is zero because the participant did not record any response during phase 2 of the experiment.
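For reference, the burden-time aggregation could be computed from the logged timestamps roughly as follows. The record layout is hypothetical, since the actual Firebase schema is not shown here.

```python
# Sketch of the burden-time aggregation. Each logged record is assumed to hold
# the participant, the interface in use, and the open/submit timestamps; the
# record layout is hypothetical.
from collections import defaultdict
from datetime import datetime
from statistics import mean

def burden_seconds(record: dict) -> float:
    opened = datetime.fromisoformat(record["opened_at"])
    submitted = datetime.fromisoformat(record["submitted_at"])
    return (submitted - opened).total_seconds()

def average_burden(records: list[dict]) -> dict[tuple[str, str], float]:
    """Mean burden time (s) per (participant, interface) pair."""
    grouped: dict[tuple[str, str], list[float]] = defaultdict(list)
    for rec in records:
        grouped[(rec["participant"], rec["interface"])].append(burden_seconds(rec))
    return {key: mean(values) for key, values in grouped.items()}

# Example usage with two dummy records:
logs = [
    {"participant": "P1", "interface": "GAR",
     "opened_at": "2021-07-01T09:00:00", "submitted_at": "2021-07-01T09:00:26"},
    {"participant": "P1", "interface": "GAR",
     "opened_at": "2021-07-01T11:02:10", "submitted_at": "2021-07-01T11:02:40"},
]
print(average_burden(logs))   # {('P1', 'GAR'): 28.0}
```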
To analyze the link between the burden of using the application and engagement with the application over time, we calculated the average time it took to report in each phase of the experiment. The results are shown in
Figure 9. When the interface changed from graphical (GAR) to tabular (TAR), in phase 2 for StudyGroup A, we can see that the burden time was higher. In phase 3, when their interface changed back to graphical (GR), the burden time was observed to be extremely high (94 s) due to the unusual reporting by participant 1 as explained above. Discarding that particular incident, the burden time was observed to be lower than in phase 2 (28 s).
For StudyGroup B, the burden time was highest in phase 1, with 47 s, when using GR. However, the burden time decreased in phase 2 (25 s) when using GAR. This can be attributed to the participants getting familiar with the interface. In phase 3, however, when the interface changed to tabular (TAR), we can see that the average burden time increased to 37 s.
Similarly, when the interface was changed from tabular (TAR) to graphical (GR) for StudyGroup C in phase 2 of the experiment, we can see that the average burden time was lower (22 s). Even though the burden time increased in phase 3 (25 s), using GAR, it was still lower than the burden time in phase 1 (42 s). Therefore, over the course of the experiment period, we can observe that the change in interface had some effect on engagement with the application and on burden time. Familiarity with the application lowered the burden time, especially when using a graphical interface (GI).
We found a statistically significant difference in the burden time for the three interfaces using a one-way ANOVA method (p = 0.012). A Tukey HSD post-hoc test revealed a significant pairwise difference between interfaces GAR and TAR (p = 0.039), whilst no difference was observed between GAR and GR (p = 0.13).
For further investigation, we combined the results of GAR and GR into a single group and compared it with TAR; through a paired
t-test analysis, we found that there is a significant difference between the two (
p = 0.049). This analysis, along with the results from
Figure 8 and
Figure 9, helps to show that there is a significant difference between the tabular and graphical interfaces in the burden faced while using the application, with the graphical interface resulting in a lower burden for the participants.
A lower burden also resulted in higher engagement with the application.
Figure 10 shows that the total number of reports received using GAR was almost consistent across the three phases, and on average higher than when using TAR. There was a significant decrease in reports using TAR in phase 2 for StudyGroup A. This can be attributed to the change in their interface because, in the earlier phase, they had used the graphical interface (GI). They also mentioned in the questionnaire after phase 2 that the tabular interface (TI) was difficult to understand, which resulted in a lower number of reports.
We can thus conclude that GAR places a lower burden on participants than TAR and, on average, yields higher engagement and reporting. This further strengthens our proposal that a graphical interface (GI), with adequate textual information, can help monitors to identify the routine of targets and distinguish risky situations whilst spending less time and effort analyzing the interface.
4.4.3. Timely Detection
Figure 11 shows the time taken to report a completed activity during each phase, based on the type of interface. Over the three phases of the experiment, we can observe that, using a graphical interface (GI), the reports for activities were received more quickly than with the tabular interface (TI): GAR (average = 176.46 min, median = 115.01 min), TAR (average = 201.42 min, median = 118.85 min), and GR (average = 166.9 min, median = 121.12 min). Even though such high response times are not favorable, we think that there were many factors that affected the reporting time for activities.
The time at which a notification was generated, which is also the time when the activity was completed, was saved using the “SharedPreferences” functionality, as mentioned in
Section 4.4.2. Similarly, we also saved the time when the activity report was submitted. We determine the time taken to report an activity by calculating the time difference between report submission and notification generation. For StudyGroup A, when the interface changed from graphical (GAR) to tabular (TAR) in phase 2, the reporting time was higher than in phase 1, even though they received both rN (recurring notifications) and abN (activity-based notifications) in both of these phases. This can be attributed to the change in interface because, when their interface changed back to graphical (GR) in phase 3, the response time was observed to be lower than in phase 2, even though they did not receive abN. This shows that the type of visualization can have an effect on the response time for notifications received.
StudyGroup B was almost consistent in its performance throughout the first two phases of the experiment period. In phase 2, when their interface changed from GR to GAR, there was no significant change in their response time, even though they had not received abN while using GR. However, when their visualization changed to tabular (TAR) in phase 3, the response times were higher than in the previous two phases.
In contrast, StudyGroup C did not show any significant differences in response time for activities based on changes in interface or the reception of abN. When their interface changed from TAR to GR in phase 2 and from GR to GAR in phase 3, their response time for notifications did not show any notable differences. StudyGroup C thus did not show any conclusive effect of the change in visualization or notification strategy on the reception of reports on activities.
Table 7 shows the average response time of each participant while using each of the interfaces, where the lowest response time among the three interfaces is highlighted. Even though TAR included both abN and rN notifications, we found that none of the participants recorded their quickest responses while using it. Moreover, the mean response time using TAR is the highest across all the participants (except participant 2, who did not register any response during phase 2). We found that, even though they did not receive abN, four of the participants recorded their lowest mean response time using GR. GAR and GR recorded mean response times of about 176.46 min and 166.9 min, respectively, while TAR had a mean response time of 201.42 min. Even though GR had a lower average response time, we observed that the median response time for notifications was lower for GAR (115.01 min) compared to GR (121.12 min) and TAR (118.85 min). In terms of the median, therefore, reports were received more quickly using GAR than using GR or TAR.
Upon further analysis, we found statistically significant differences between the activity response times for the three interfaces using a one-way ANOVA method (p = 0.005). A Tukey HSD post-hoc test revealed a significant pairwise difference between interfaces GR and TAR (p = 0.05), whilst no difference was observed between GAR and GR (p = 0.64) or between GAR and TAR (p = 0.055).
We then combined the results of interfaces that received abN, i.e., GAR and TAR, into a single group and compared it with GR, and found that a paired t-test shows a significant difference between the two (p = 0.022).
This shows that the reception of abN does indeed have an effect on the response time for activities. To investigate this further, we determined the time ranges within which the responses to the activity notifications were received.
Table 8 shows the cumulative percentage of reports received within the given time ranges for the three versions of the application. We divide the time into 30 min intervals; however, the table only shows values up to 210 min, since the highest average response time falls within the 180–210 min range. We can see that the number of responses received does not vary greatly between the graphical interfaces. However, for the tabular interface, the response rate is lower even though abN was received. This shows that abN, when used with a graphical interface, provides a better result than when used with a tabular interface. We then investigated which interface provided the quickest response for activities.
We divided the notifications into those for regular activities and those for risky situations. Using the time taken to report activities, we determined the minimum time taken to submit a report for each activity among all the participants, and the version of the application used to submit that report. Thus, we determined with which particular version of the application we received the quickest response for each of the activities. The results are shown in
Figure 12 and
Figure 13. We can see that the risky situations were responded to more quickly when using interfaces that included abN, even though there is not much difference between interfaces in the quickest response time to non-risky notifications.
In the final questionnaire, the participants provided reasons that could also explain such high response times. Almost 45% of the participants (n = 4) mentioned that they were busy with their research/private work and could not respond to the notifications on time. We received responses such as: “I was so busy with my work”; “Busy with my research work or play a game”; “mentally busy with my own work”; “sometimes i was busy”. Similarly, two of the participants mentioned that they often forgot to check the application. This can be attributed to the different interface types used and notifications received.
Two of the participants responded in the questionnaire that they did not use the application if they did not receive any notifications, while six (66%) of them said they did not wait for the notifications to use the application but were busy with their work and could not respond immediately. We also wanted to know whether the notifications received were perceived as distracting or disturbing, to analyze whether this perception played any role in the response time. When asked if the notifications received from the application were distracting, 2 (22%) of them strongly claimed they were not disturbed, 5 (55%) said they were not disturbed, 1 of them was neutral, and 1 agreed that he was distracted. Similarly, 8 (88%) (strongly agree: 4; agree: 4) agreed that they would prefer to receive abN so that they can be regularly notified and monitor frequently, while 1 of them was neutral.
5. Discussion and Limitations
In this section, we first discuss the results and verify research questions RQ1–3 mentioned in
Section 2.3, then we show some remaining issues as limitations.
5.1. Discussion
When considering user engagement and the users' ability to identify the routine of individuals with the interface, we can conclude that the results are fairly positive towards GAR compared to TAR. Using GAR, we found that 75.2% of risky situations were correctly identified as risks, compared to 65.8% and 68.5% for TAR and GR, respectively. Though the identification of risks varied between study groups using GAR (68.4% for StudyGroup A, 64.7% for StudyGroup B, and 92.6% for StudyGroup C), the overall identification rate is higher for GAR. This shows that risks can be identified using a graphical interface and the style of graph that we used. A response from a participant, “I can see the difference of the duration directly from the graph. The table one need to scroll up and down to see all the information, which sometimes kind of annoying”, also suggests that our visualization is effective. These findings support affirmative answers to our research questions RQ1 and RQ2: it is possible to identify the daily routine of individuals using a smartphone application, and it is possible to detect potential risks in such a routine based on the visualization provided.
Using GAR, participants faced the lowest burden of 28 s, compared to 38 s with the tabular interface (TAR). Similarly, none of the participants claimed that the application demanded a lot of time and effort from them. Regarding notifications, only one of the participants found them distracting, and 88.8% mentioned that they would prefer to receive activity-based notifications for monitoring purposes. Similarly, all of the participants (77.8% strongly agree, 22.2% agree) responded that the use of traffic-light colors was useful for identifying the state of the activities quickly. Therefore, we can verify RQ3: constant notifications and using the application were not troublesome for the users.
We received a total of 1680 responses from participants over the experiment period. We can claim that such interaction is a result of their willingness to use the application. When the interface of the participants changed from graph to table, there was a reduction in the number of reports obtained (45.6% for StudyGroup A in phase 2, and 9.8% for StudyGroup B in phase 3). Similarly, when the interface changed from table to graph, we obtained an increase in the number of reports of 96.7% for StudyGroup A in phase 3 and a reduction of 11.5% for StudyGroup C in phase 2. Overall, engagement with the application was high, which, along with the lower interface-analyzing time, verifies RQ3: using the application is not a burden for the monitoring person.
At the end of the experiment, we asked the participants which representation of activities they preferred: table or graph. All of them agreed that the graphical representation was better. Some of the responses we received, such as “Got on a quick glance the exact duration of past activities and could check exact time of the day” and “With graph, it’s easy for me to compare the length of activity at the glance.”, further strengthen our proposal that the graphical interface we proposed can help to identify a daily routine in a clear and intuitive manner, and further support RQ1: a smartphone application can be a good tool for identifying daily activities.
5.2. Limitations
Our system evaluation requires that there be certain risky situations in the activities of the elderly. We did not conduct real-time activity recognition of the elderly but instead used a pre-existing activity dataset because, in real-time scenarios, there is no guarantee of encountering such risky situations, and we would need to request someone to deliberately change their activity pattern so that others could detect it. Such a situation can invoke unfavorable reactions. Similarly, since activity recognition systems are not perfectly accurate, sometimes the activities may not be correctly identified, or may be falsely identified, which would hamper our evaluation. Moreover, we recruited students for the experiment, who are often busy with their academic work and/or personal lives, which might have affected the number and timing of the reports received.
6. Conclusions
In this study, we proposed a system, PATROL, that can be used to anonymously track the everyday activities of the elderly and identify any potential risks in their daily routine using a smartphone application. Our system is intended to be deployed in elderly residential areas or communities and does not disclose any private information, such as age, location, etc., to the monitoring person, in order to maintain the privacy and security of elderly residents. The monitoring person receives, from the service server, recurring notifications every two hours and activity-based notifications whenever an elderly person completes an activity, and assesses the elderly person's condition through a smartphone application that visualizes their activity history. We designed our application with features such as a single-interface design, an intuitive graphical user interface for activity and anomaly detection, and color and textual information on the state of activities. Together, these features help not only to enable quicker monitoring of the activities of the elderly, but also to place a low burden on the monitoring person, who may be responsible for monitoring one or several elderly people at once.
We added risky situations to an activity dataset obtained from a real-life experiment with elderly residents and conducted a user study using the proposed method and two baseline methods varying in visualization and notification techniques, with nine participants divided into three groups. We found that, with our proposed method, 75.2% of the risks were successfully identified, while 68.5% and 65.8% were identified with the other methods. The proposed method also provided a better result for the timely reception of activity reports: GAR (median = 115.01 min), TAR (median = 118.85 min), and GR (median = 121.12 min). Moreover, the interface-analyzing and reporting time was also lower (28 s) in the proposed method, compared to 38 and 52 s in the other methods. As future work, we will conduct real-time activity recognition and monitoring using our application. To achieve that, we will also work on activity recognition systems using other kinds of sensors that can not only potentially provide better activity recognition in real time but also remove the dependency on the elderly person for data collection. Moreover, we will explore the possibility of assessing the elderly person's activity state and detecting anomalies by using measurements from ambient sensors (temperature, humidity, illumination, etc.). We will also include high-risk situations, such as a fall (and no activities after the incident), and try to determine whether participants are able to detect such emergency situations quickly. We will also aim to increase the number of participants to receive more reports and analyze the results based on age, gender, etc.