research-article
Open access

Leveraging Mobile Sensing and Bayesian Change Point Analysis to Monitor Community-scale Behavioral Interventions: A Case Study on COVID-19

Published: 03 November 2022 Publication History

Abstract

During pandemics, effective interventions require monitoring the problem at different scales and understanding the various tradeoffs between efficacy, privacy, and economic burden. To address these challenges, we propose a framework where we perform Bayesian change-point analysis on aggregate behavior markers extracted from mobile sensing data collected during the COVID-19 pandemic. Results generated by 598 participants for up to four months reveal rich insights: We observe an increase in smartphone usage around February 10th, followed by an increase in email usage around February 27th and, finally, a large reduction in participant’s mobility around March 13th. These behavior changes overlapped with important news events and government directives such as the naming of COVID-19, a spike in the number of reported cases in Europe, and the declaration of national emergency by President Trump. We also show that our detected change points align with changes in large scale external sources, including number of COVID-19 tweets, COVID-19 search traffic, and a large-scale foot traffic data collected by SafeGraph, providing further validation of our method. Our results show promise towards the feasibility of using mobile sensing to understand communities’ responses to public health interventions.

1 Introduction

COVID-19 is a global pandemic that has devastated the world through its toll on human lives and the economy. In the absence of pharmaceutical interventions like vaccines, non-pharmaceutical interventions such as social distancing and stay-at-home orders are key in preventing the healthcare system from becoming overburdened [15]. As evidenced by the 1918 Spanish flu pandemic, early interventions from local, state, and central governments ultimately determine the severity of infections and can reduce casualties by a factor of 1/8th [16]. With new interventions being implemented, there is an urgent need to track the impact of these interventions in real time to respond in a timely manner. Motivated by these challenges, we seek to address the following question: In the absence of widespread and rapid testing, how can we use mobile sensing data to detect COVID-19-related behavior changes and measure effectiveness of public interventions?
While a few papers have studied the impact of government interventions on the transmission rate of COVID-19, these studies often use the number of cases as an indirect proxy for human behavior [12]. While the number of cases is useful for assessing an intervention’s efficacy, it has caveats. Not only does it lag behind other time-series data from social media and mobile sensing, but its quality also depends substantially on the number of testing kits available [23, 28], making it challenging to infer behavior change. To address this limitation, mobile sensing can supplement existing monitoring approaches. The rich behavioral markers extracted from in-the-wild personal sensing using smartphone-embedded sensors can provide a direct glimpse into various aspects of human behavior and how they relate to COVID-19 [20]. We use community aggregates (Figure 1) of mobile sensing features such as total distance traveled and step count to characterize human behavior during a critical period of the COVID-19 pandemic in the U.S. While these behavior features give us a glimpse into human behavior, it is challenging to visually discern when a significant change has occurred because of the high level of noise in real-world data. At the same time, multiple change-point configurations might fit the data well. A robust statistical framework is thus needed to model uncertainty. One approach that has worked well in this setting is Bayesian change-point detection. By constructing a posterior distribution over all possible change-point configurations, we can draw samples from it to quantify uncertainty.
Fig. 1. Overview of our approach: (a) We propose to build a hierarchical time series representation by aggregating information at the level of neighborhoods, cities, and states. By aggregating features at the level of neighborhoods and building a hierarchy of aggregation, we can understand behavior change at different scales without revealing the user’s information. (b) Using this approach, various targeted interventions (e.g., social distancing commercials on TV/Youtube, full lockdown) can be performed at different scales. (c) We test the efficacy of interventions by performing Bayesian changepoint analysis.
Finally, while looking for behavioral changes using mobile sensing is a promising avenue for current research, it is also essential to understand its limitations. Data generated from mobile sensing studies tend to represent only a subset of populations and may not scale to the whole community. We validate our behavioral change points by comparing them with significant changes in behaviors extracted from external data streams generated by millions of participants from social media platforms such as Twitter, Google searches, and foot traffic to millions of Point of Interest locations in the United States.
Our key contributions can be summarized as follows: (1) We propose a framework to extract privacy-preserving behavioral markers (Figure 1) based on mobility, social interactions, and smartphone app usage, to understand community behavior change at different scales during a pandemic. (2) We perform Bayesian change-point analysis on the proposed behavioral markers and identify critical dates when changes were detected in multiple behavior streams simultaneously. Using the proposed framework, for a sample size of N = 598 participants, we show that there were statistically significant behavioral changes at three dates. (3) We also identify statistically significant changes in macroscopic time series from millions of users on Twitter/Google search and SafeGraph within our uncertainty interval [1] and find consistent change points within a three-day margin of error. (4) Finally, for the significant dates identified in (2), we identify clusters of events of national importance that could have given rise to these changes. Our framework allows policymakers to measure the effect of public health interventions on people’s behavior, which in turn governs the infection rates and overall deaths.
The rest of the article is organized as follows: We first describe our dataset, the extracted behavior markers, and the Bayesian change-point detection algorithm. Next, we present our results and how the change points detected by our method correlate with significant events such as the spike in cases in Europe and the declaration of national emergency by the US President. We also present how our results overlap with behavioral change points detected from other data sources, including Twitter, Google searches, and mobility data.

2 Related Work

The enormous social and economic impacts of COVID-19 have spurred research in various areas. Liu et al. [25] studied spikes in unproven COVID-19 therapy searches based on tweets of celebrities. Wang et al. [35] applied text mining and information retrieval methods to help inform policymakers and biomedical experts with effective treatment and management outcomes. Chen et al. [11] collected tweets with COVID-19-specific terms to understand the pandemic’s discussions in social media.
Several studies used self-reported measures to evaluate compliance: Barari et al. [5] performed a national survey in Italy to understand compliance with public health messages. Wise et al. [37] focused on risk perception changes during the first week of the pandemic. Bults et al. [10] looked at the behavioral responses of UK adults during the first phase of the pandemic.
While large-scale human behavior is challenging to quantify without the intrusive use of cameras and other monitoring techniques, passive sensing provides a less cumbersome and more objective way of assessing human behavior. Boukhechba et al. [8] correlated mobility and communication patterns with participants’ social anxiety levels, measured using the Social Interaction Anxiety Scale (SIAS) [9]. Wang et al. introduced StudentLife, a large-scale dataset that used mobile sensing for research on sociability, mental well-being, and students’ academic performance [36]. LiKamWa et al. [24] showed that feeling sad is strongly correlated with decreased mobile phone activity such as SMS messaging, Bluetooth-detected contacts, location entropy, and phone calls. Mobile sensing also opens up possibilities for non-intrusive monitoring of group patterns instead of individualized assessments. A recent publication from Trifan et al. [34] showed that about 30% of the papers in this behavioral health domain had analyzed the participant’s physical activities. Lustrek et al. [26] used physical activity data relying on GPS data, step counts, visible WiFi access points, and accelerometer data to understand lifestyle activities such as eating, exercise, and work.
In the context of COVID-19, there has been research looking at changes pre- and post-pandemic. Andersen et al. [4] used GPS data to demonstrate evidence of compliance with social distancing regulations. Sun et al. [32] compared mobility and Bluetooth markers during the pre- and post-lockdown period. Huckins et al. [19] leveraged ecological momentary assessment (EMA) and passive sensing data to compare metrics of mental health of students during COVID-19 with a similar time during the previous year. Sanudo et al. [29] looked at physical activity, sedentary behavior, and sleep patterns pre- and post-quarantine. Pepin et al. [27] sought to understand the reduction in the step count pre- and post-lockdown. While these approaches are interesting, they require the change points to be specified manually, whereas we infer them automatically. This is particularly important in the context of gauging policy efficacy, where knowing when the change happened is as important as knowing whether a change happened at all.

3 Methods

3.1 Data Acquisition

The data were collected using a cross-OS (Android and iOS) mobile sensing platform called Readisense. It was built on top of Sensus [38], an open-source crowdsensing tool developed at the University of Virginia since 2014. The application captures the most common data streams from smartphone-embedded sensors and applications, though there are some operating system-specific sensors like “Healthkit” data, which is available only in iOS. The application supports both Listening Probes (i.e., when a change in a sensor is noted, the application records those data) and Polling Probes (i.e., data is collected at regular intervals from the sensors). Even though a wide variety of sensors are available as part of the app, we limited our study to the ones that captured human behavior change during the COVID-19 pandemic, including GPS, Step Count, App Usage, Battery Level, Wireless Access Point, and Screen Usage. The data collection was approved by the Institutional Review Board (IRB), and 2,700 participants across the US were recruited to participate in a four-month study to examine the relationship between mobile sensing data and symptoms of infectious diseases. Only those with at least 14 days of GPS, Bluetooth, and Activity data were considered, leaving 598 participants. The participants’ average age is 40.71 (sd = 11.52). Females make up 65.36% of the participants, with Whites accounting for 66.2%, African Americans 19.3%, Asians 7.3%, people of other races 3.3%, Hispanics 3.5%, and others 0.4%. The selected 598 participants were those who were active during the COVID-19 pandemic. Fifty-two percent of the participants are full-time employees, 20% are full-time students, and 12% are working part-time. The top three states based on the number of participants are North Carolina, Virginia, and Ohio.

3.2 Feature Extraction

We collected a wide range of features characterizing the participant’s mobility, app usage, and physical activity. We primarily focus on mobility features in our study, since it is essential to study the participants’ movement patterns to quantify interventions such as social distancing. We also collected app usage and the percentage of time a participant’s screen was on to quantify how the participants’ interaction patterns with their smartphones have changed as a consequence of COVID-19. Since we were interested in behaviors at a community level, we aggregate all the sensor data for the participants from the different states for the case study. More detail is provided below.

3.2.1 Mobility and Activity Features.

Human behavior is characterized by repetitive patterns of mobility and location traces [3]. Mobility features from location traces have been shown to be predictive of a participant’s mental health and well-being [8]. It has also been used to study compliance with public health policies [6]. We extract the following mobility features:
Daily Step Count: The total steps taken by a participant per day. This provides a proxy to how much physical activity the participant is performing per day. Larger daily step counts have been shown to be correlated with lower depression, anger, and mood distress in previous studies. [39]
Total Distance Moved: We calculate the daily total distance traveled by each participant. Since we are working with anonymized GPS coordinates, it is impossible to use the Haversine distance. Thus, we calculate the Euclidean distance between latitude and longitude and scale it up by a factor of 60 miles to obtain distance in meters as an approximation of the Haversine distance. The approximation is reasonable, considering that most participants travel within local neighborhoods on a day-to-day basis.
Time Spent Outside Home: We use the DBSCAN [14] clustering algorithm to extract clusters that represent different places. We choose a radius of 10 meters and 10 points per cluster as hyper-parameters to the DBSCAN algorithm and denote home as the cluster with the largest number of entries between 12 am and 6 am.
Radius of Gyration: This is the standard deviation of GPS positions of each participant relative to the centroid of GPS coordinates. A small radius of gyration indicates sedentary behavior as opposed to a large radius, which is characteristic of long-distance commuters [18].
Access Point Change: We count the number of times a participant’s WiFi connection changes. Large values could indicate the participant switching between WiFi hotspots frequently, which is likely to happen in public and business areas. A small value could indicate that the participant is staying in one place, such as home, and remains on a fixed connection [21, 22, 30].
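As a rough illustration of two of the mobility features above, the following sketch computes the total distance moved and the radius of gyration from a day of decimal-only coordinates. The function names are hypothetical, the scale factor of 60 miles per degree is the one quoted in the text, and the mile-to-meter conversion is an assumption added here. The sketch also ignores the jump that appears when a participant crosses an integer-degree boundary, so it should be read as an approximation under those assumptions rather than the pipeline used in the study.

```python
import math

MILES_PER_DEGREE = 60.0     # scale factor quoted in the text
METERS_PER_MILE = 1609.344  # assumption: standard mile-to-meter conversion


def total_distance_m(points):
    """Sum of scaled Euclidean steps between consecutive (lat, lon) fixes."""
    total_deg = 0.0
    for (lat0, lon0), (lat1, lon1) in zip(points, points[1:]):
        total_deg += math.hypot(lat1 - lat0, lon1 - lon0)
    return total_deg * MILES_PER_DEGREE * METERS_PER_MILE


def radius_of_gyration_deg(points):
    """Root-mean-square distance of the fixes from their centroid, in degrees."""
    n = len(points)
    c_lat = sum(p[0] for p in points) / n
    c_lon = sum(p[1] for p in points) / n
    mean_sq = sum((p[0] - c_lat) ** 2 + (p[1] - c_lon) ** 2 for p in points) / n
    return math.sqrt(mean_sq)


# Hypothetical day of fixes (fractional degrees only).
day = [(0.0, 0.0), (0.0, 0.01), (0.01, 0.01)]
print(total_distance_m(day), radius_of_gyration_deg(day))
```

Daily community aggregates would then be obtained by averaging these per-participant values across participants for each day.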

3.2.2 Smartphone Usage-based Features.

Increased smartphone usage has been linked to poorer self-reported measures of anxiety, sleep quality, and stress [31, 33]. We extract the following features to investigate changes in the participants’ smartphone usage patterns.
Battery Level: We use the minimum and maximum battery levels of each participant’s device, averaged across all participants, to quantify how frequently they were charging their phones.
App Usage: We primarily focus on most frequently used apps from various categories. This includes social apps such as Facebook, Twitter, and Instagram, engagement apps such as health and fitness, transportation apps, as well as finance apps.
Screen-on Fraction: The fraction of queries performed by the app that found the screen to be in an active state. This provides a proxy for how much the participants were using their smartphones.

3.3 External Data Sources

We validated the changepoints detected from mobile sensing by comparing them to the following external sources:
Twitter: We use publicly available data from TweetBinder [2] on the number of tweets with the following hashtags #covid19, #coronavirus, or #covid-19, between Jan 1st and Apr 15th.
Google searches: We extract people’s search volume index from Google Trends over the course of the pandemic.
Foot traffic: We use data collected by SafeGraph based on visits to commercial and non-commercial locations [1]. SafeGraph aggregates point-of-sale data from over 6 million point-of-interest locations in the US.

3.4 Privacy Considerations

Relying on multiple sensors to understand human behavior is the cornerstone of mobile sensing research. At the same time, since mobile phones come with various sensors, privacy is a crucial concern for researchers and research participants alike. We addressed this in three critical ways: (1) We anonymized all the collected data by only storing generic participant IDs for all our experiments, since we were interested in participants’ aggregated and group behavior. (2) GPS data was anonymized by omitting the integer parts. We only used the decimals for computing location-based movements. (3) For other sensors, data was anonymized (e.g., Bluetooth MAC addresses were hashed) on the device before being synced to the cloud. The data were stored on a HIPAA-compliant secure server and were strictly available to researchers under the IRB only.
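The two on-device anonymization steps described above can be sketched as follows. This is only an illustration: the function names, the choice of SHA-256, and the salt are assumptions made here, since the text does not specify which hash was used.

```python
import hashlib
import math


def strip_integer_part(coord):
    """Keep only the fractional part of a coordinate (the sign is preserved)."""
    frac, _whole = math.modf(coord)
    return frac


def hash_identifier(mac, salt):
    """One-way hash of a device identifier.

    `salt` is a hypothetical per-study secret; SHA-256 is an assumed choice.
    """
    return hashlib.sha256((salt + mac.lower()).encode("utf-8")).hexdigest()


# Hypothetical reading: only the decimals of each coordinate leave the device.
lat, lon = 38.0336, -78.5080
print(strip_integer_part(lat), strip_integer_part(lon))
print(hash_identifier("AA:BB:CC:DD:EE:FF", salt="study-secret"))
```

Because only the fractional degrees are retained, relative movement within a degree cell can still be computed while the absolute location is not stored.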

3.5 Bayesian Change-point Analysis

We formulate the problem of detecting changes in the time series as a Bayesian change-point detection problem. Using a Bayesian methodology allows easy quantification of uncertainty and integration of priors (for instance, feeding in government directives as metadata instead of assuming a uniform distribution of change-points over locations). We use Barry and Hartigan’s [7] Bayesian change-point model for our analysis. Although such a formulation supports identifying mean, variance, and intercept changes, it is practical to choose models that are simple and easy to interpret. Furthermore, increasing evidence shows that large-scale behaviors in communities such as Twitter and financial markets, while being driven by external events, are characterized by discrete shifts and bursts. Since we are working with time series of means of features on a community level, which are driven by news events and government policies, it is appropriate to identify similar discrete changes. We thus primarily analyze changes in levels.
The adopted Bayesian change-point model assumes that there is an unknown partition \(\rho\) of the data into contiguous blocks, such that within each block, the mean remains the same. The model also assumes an independent normal distribution for each block.
Let us assume we have n data points \(\lbrace X: X_1,\ldots ,X_n\rbrace\) . Let \(\rho = (U_1,\ldots ,U_n)\) indicate a partition of the time series into non-overlapping partitions. We use a Boolean array of change-points to denote the partitions. At each timestep, if \(U_i\) takes a value 1, then we have a new block; else, we remain in the same block.
We are interested in the posterior density \(f(\rho |X)\) . By Bayes’ theorem, this can be written as
\begin{equation} f(\rho |X) \propto f(X|\rho) f(\rho). \end{equation}
(1)
Prior cohesion density: Let p denote the probability of getting a change point at each location. We assume this probability to be the same at each location. If we assume that there are b partitions, then the prior cohesion density can be written as
\begin{equation} f(\rho |p) = p^{b-1}(1-p)^{n-b} . \end{equation}
(2)
The joint density of observations and parameters given \(\rho\) is a product of densities of different blocks over the blocks in \(\rho\) . Let us consider a single block. If we assume that the data in this block is generated by a Gaussian with mean \(\theta\) and variance \(\sigma ^2\) , then let the prior density of \(\theta\) be a Gaussian with mean \(\mu _0\) and variance \(\sigma _0^2\)
\begin{equation} \begin{split} f(X_{ij}, \theta) &= \prod _{k=i+1}^{j} f(X_k|\theta) \, f(\theta), \\ f(X_{ij}) &= \int \prod _{k=i+1}^{j} f(X_k|\theta) \, f(\theta) \, d\theta . \end{split} \end{equation}
(3)
The above integral can be simplified to the expression below
\begin{equation} f(X_{ij}) = \left(\frac{1}{2\pi \sigma ^2}\right)^{(j-i)/2} \left(\frac{\sigma ^2}{\sigma _0^2 + \sigma ^2}\right)^{1/2} \exp (V_{ij}), \end{equation}
(4)
where
\begin{equation} V_{ij} = -\frac{\sum _{l=i+1}^{l=j} (X_l - \hat{X}_{ij})^2}{2\sigma ^2} - \frac{(j-i) (\hat{X}_{ij} - \mu _0)^2}{2(\sigma ^2 + \sigma _0^2)} \end{equation}
(5)
and \(\hat{X}_{ij}\) is the mean of the observations in the partition. However, \(f(X_{ij})\) still depends on the parameters \(\mu _0, \sigma ^2, \sigma _0^2\) . Defining \(w=\frac{\sigma ^2}{\sigma _0^2 + \sigma ^2}\) and choosing the following priors for the parameters:
\begin{align} f(\mu _0) &= 1, & -\infty &\le \mu _0 \le \infty \nonumber \\ f(p) &= 1/p_0, & 0 &\le p \le p_0 \nonumber \\ f(\sigma ^2) &= 1/\sigma ^2, & 0 &\le \sigma ^2 \le \infty \\ f(w) &= 1/w_0, & 0 &\le w \le w_0 \nonumber \end{align}
(6)
\begin{equation} f(X|\rho , \mu _0, w) = \int _{0}^{\infty } \frac{1}{\sigma ^2} \prod _{ij \in P} f(X_{ij}) \, d\sigma ^2 . \end{equation}
(7)
After integrating out \(\mu _0\) and w, this can be simplified to the definite integral below. We refer readers to Reference [7] for the full derivation.
\begin{equation} f(X|\rho) \propto \int _{0}^{w_0} \frac{w^{(b-1)/2}}{(W + Bw)^{(n-1)/2}}dw, \end{equation}
(8)
where
\begin{equation} \begin{split} \hat{X} &= \sum _{i=1}^{n} X_i/n, \quad B = \sum _{ij \in P} (j-i) (\hat{X}_{ij} - \hat{X})^2, \\ W &= \sum _{ij \in P} \sum _{l=i+1}^{j}(X_l - \hat{X}_{ij})^2 . \end{split} \end{equation}
(9)
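To make the quantities in Equation (9) concrete, the sketch below computes the within-block sum of squares W, the between-block sum of squares B, and the number of blocks b for a given partition. The function name and the 0-based indexing are conventions chosen here for illustration: `u[k] = 1` is taken to mean that a new block starts at position k, and `u[0]` is ignored because the first block always starts there.

```python
def within_between(x, u):
    """Return (W, B, b) of Equation (9) for data x and change-point flags u."""
    n = len(x)
    grand_mean = sum(x) / n
    # Block start positions; the first block always starts at index 0.
    starts = [0] + [k for k in range(1, n) if u[k]]
    ends = starts[1:] + [n]
    W = 0.0
    B = 0.0
    for i, j in zip(starts, ends):
        block = x[i:j]
        block_mean = sum(block) / (j - i)
        W += sum((v - block_mean) ** 2 for v in block)
        B += (j - i) * (block_mean - grand_mean) ** 2
    return W, B, len(starts)


x = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
print(within_between(x, [0, 0, 0, 1, 0, 0]))  # split at the level shift
print(within_between(x, [0, 0, 0, 0, 0, 0]))  # a single block
```

For any partition, W + B equals the total sum of squares around the overall mean, so a partition that explains the data well moves mass from W to B.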
Similarly, after integrating out the change probability p, the prior cohesion density thus can be written as
\begin{equation} f(\rho) \propto \int _{0}^{p_0} p^{b-1}(1-p)^{n-b} dp. \end{equation}
(10)
To calculate the posterior distribution over partitions, we use Markov Chain Monte Carlo (MCMC) [17]. We define a Markov chain with the following transition rule: With probability \(p_i\) , a new change point is introduced at location i. Here, \(B_1, W_1\) and \(B_0, W_0\) refer to the expressions in Equation (9) with and without the change point at location i, and b is the number of blocks without it.
\begin{equation} \begin{split}\frac{p_i}{1-p_i} &= \frac{p(U_i=1|X,U_j,j \ne i)}{p(U_i=0|X,U_j,j \ne i)} \\ &= \frac{\int _{0}^{p_0} p^{b}(1-p)^{n-b-1} dp}{\int _{0}^{p_0} p^{b-1}(1-p)^{n-b} dp} \times \frac{\int _{0}^{w_0} \frac{w^{b/2}}{(W_1 + B_1w)^{(n-1)/2}}dw}{\int _{0}^{w_0} \frac{w^{(b-1)/2}}{(W_0 + B_0w)^{(n-1)/2}}dw} \end{split} \end{equation}
(11)
We use the package bcp [13] in R to implement our change point analysis.
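For readers who want to see the sampler of Equation (11) spelled out, the following is a minimal sketch in plain Python. It is not the bcp implementation: the function names are hypothetical, the integrals in Equations (8) and (10) are approximated with a simple midpoint rule in log space, and the values of p0 and w0 are illustrative. It assumes the data are not all identical, so that W + Bw stays positive.

```python
import math
import random


def log_integral(log_f, upper, steps=200, span=30.0):
    """Log of the integral of exp(log_f(v)) over (0, upper].

    Substitutes v = exp(t) and applies a midpoint rule in t, summing in log
    space; the mass below upper * exp(-span) is ignored.
    """
    h = span / steps
    t0 = math.log(upper) - span
    logs = []
    for k in range(steps):
        t = t0 + (k + 0.5) * h
        logs.append(log_f(math.exp(t)) + t)  # "+ t" is the Jacobian dv = v dt
    peak = max(logs)
    return peak + math.log(h * sum(math.exp(v - peak) for v in logs))


def sums_of_squares(x, ends):
    """Within-block (W) and between-block (B) sums of Equation (9)."""
    grand = sum(x) / len(x)
    W = B = 0.0
    start = 0
    for end in ends:
        block = x[start:end]
        mean = sum(block) / len(block)
        W += sum((v - mean) ** 2 for v in block)
        B += len(block) * (mean - grand) ** 2
        start = end
    return W, B


def log_score(x, cuts, p0, w0):
    """log f(rho) + log f(X | rho), up to a constant (Equations (8) and (10))."""
    n = len(x)
    ends = sorted(cuts) + [n]  # a cut at i means a new block starts at x[i]
    b = len(ends)
    W, B = sums_of_squares(x, ends)
    log_prior = log_integral(
        lambda p: (b - 1) * math.log(p) + (n - b) * math.log(1.0 - p), p0)
    log_like = log_integral(
        lambda w: 0.5 * (b - 1) * math.log(w) - 0.5 * (n - 1) * math.log(W + B * w), w0)
    return log_prior + log_like


def gibbs_sweep(x, cuts, rng, p0=0.2, w0=0.2):
    """Resample every candidate location once using the odds of Equation (11)."""
    for i in range(1, len(x)):
        diff = log_score(x, cuts | {i}, p0, w0) - log_score(x, cuts - {i}, p0, w0)
        # Numerically stable logistic transform of the log odds.
        if diff >= 0:
            p_i = 1.0 / (1.0 + math.exp(-diff))
        else:
            p_i = math.exp(diff) / (1.0 + math.exp(diff))
        cuts = cuts | {i} if rng.random() < p_i else cuts - {i}
    return cuts


def posterior_change_prob(x, sweeps=40, burn_in=10, seed=0):
    """Fraction of retained sweeps in which each location is a change point."""
    rng = random.Random(seed)
    cuts = set()
    counts = [0] * len(x)
    for s in range(sweeps):
        cuts = gibbs_sweep(x, cuts, rng)
        if s >= burn_in:
            for i in cuts:
                counts[i] += 1
    return [c / (sweeps - burn_in) for c in counts]
```

Calling `posterior_change_prob` on a daily community-level feature returns, for each day, an estimate of the posterior probability that a new block starts on that day. Recomputing both scores from scratch at every location keeps the sketch short but is slow for long series.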

4 Results

From Figure 2, on February 10th, we observe a significant change-point in the screen-on sensor, which is indicative of the amount of time the participant’s screen was on. On February 9th, we see an increase in the minimum battery level averaged across participants. The increase in active screen time indicates that the participants used their phones more. Similarly, the increase in the minimum battery level could indicate participants charging their phones more often. Both of these indicate an increase in smartphone usage. We also notice an increase in the variance across participants for both active screen time and minimum battery level. We also found a reduction in the number of times a participant switches WiFi networks around this date. These events overlap with February 11th, when the disease caused by the novel coronavirus was named COVID-19.
Fig. 2. Top: We observe (a) an increase in the minimum battery level of the participants around February 9th; (b) an increase in the percentage of time the participant’s screen is on around February 10th. Bottom: We observe (a) an increase in mean email usage around February 27th; (b) a reduction in total steps taken by the participant per day around February 26th; (c) a drop in the number of times a participant switched their WiFi network around February 26th.
From Figure 2, around February 27th, we notice a significant increase in email usage and a decrease in step counts. We also observed a reduction in the number of times a participant switched WiFi connections. The increase in email usage and the reduction in the number of times a participant switched WiFi networks could indicate that a fraction of the study participants were shifting to remote work. As we can see from the event timeline in Figure 4, these events overlap with the spike in the number of COVID-19-infected cases in Europe (February 28th) as well as the first COVID-19 press conference with President Trump and Dr. Anthony Fauci (February 27th). Also, on February 29th, the U.S. reported its first death. We also obtain a change in the number of tweets mentioning coronavirus around February 24th (Day 56), three days before the February 27th change-point (Figure 3).
Fig. 3. Top and Middle: We observe (a) a decrease in time spent outside by a participant around March 13th; (b) a decrease in total distance walked by the participant around March 16th; (c) a decrease in radius of gyration of the participant around March 15th; (d) a decrease in the number of times the participant’s WiFi connection switches around March 13th; (e) an increase in Instagram usage around March 14th; (f) a weak increase in Facebook usage around March 12th. Bottom: We observe spikes in social media and search traffic around February 25th (Day 56) and March 11th. From the mobility data, we notice a sharp drop in foot traffic to social places such as bars and restaurants, starting around March 11th. These overlap with our two observed change-points on February 27th and March 13th.
From Figure 3, around March 13th, we obtain changes in multiple behavioral streams. We notice a reduction in total distance traveled during the day, indicating that people travel much less on average. We also see a decrease in the radius of gyration, indicating that the participants’ mobility was considerably less spread out geographically than before. We also record that time spent outside home decreased from 6 hours to less than 2 hours, providing evidence that a substantial fraction of the participants complied with the social distancing guidance.
Some other interesting changes include a decrease in the number of times participants’ WiFi network changed, indicating that more people were shifting to work-from-home settings. We also observe a spike in Instagram usage and a weak increase in Facebook usage around the same time. We also spot some interesting nearby changes in external sources around this time. In the external sources, we obtain change-points on March 11th in both Twitter and COVID-19 searches. We also see a reduction in mobility, as shown in the SafeGraph external plot. The change-points obtained from Twitter, Google search, and foot traffic data from SafeGraph lie within two days of the events we observe in our mobile sensing data.
The results are consistent with some important events that happened around the discovered change points. As we can see from the event timeline in Figure 4, on March 11th, President Trump blocked travel from most of Europe, followed by a national emergency declaration on March 13th. On March 15th, the CDC recommended not gathering in groups larger than 50. The following day, President Trump suggested avoiding gatherings of more than 10 people. We also observed a reduction in the standard deviation of mobility features, implying compliance with social distancing regulations. Interestingly, most of the behavior change in our model happened around President Trump’s declaration of national emergency. This is unusual, because most of our participants come from Virginia, North Carolina, and Ohio, where stay-at-home orders were issued much later. Also, there is a slight upward trend towards the last day of our data. This could indicate that most social distancing measures eventually lose their effectiveness with time, and it is necessary to implement multiple interventions to ensure compliance. Finally, the number of changes in participants’ WiFi access points captured all three change-points discovered by the other behavior markers. This hints at the usefulness of this feature as a marker for understanding behavior during large-scale emergencies.
Fig. 4. We mined events of national significance from the NYTimes COVID timeline and looked at clusters of events that happened within our changepoint uncertainty interval. The event timeline is overlaid with detected changepoints (red) and the uncertainty interval for each changepoint. For WifiChange and Mobility, we notice changes in macroscale sources (Twitter/SafeGraph foot traffic/Google search for COVID (blue)). Interestingly, stay-at-home orders for Virginia, North Carolina, and Ohio were issued at least 10 days after our last changepoint.
As can be seen in Figure 3, the change-points in our mobile sensing streams occur within three days of change-points from external data streams including Twitter, Google searches, and SafeGraph data. This indicates significant events had happened within those three days. Near our mobility change-point on March 13th, we detected two change-points on March 11 in both Twitter traffic in terms of the number of tweets with the hashtag COVID-19, as well as the Google searches for COVID-19, indicating a large-scale shift in behavior. We also see the start of a drop in visits to bars, restaurants, and other locations starting on March 13th. Similarly, around February 27th, we see a change in Twitter traffic on February 24th. Our data is also consistent with the foot traffic data we obtain from SafeGraph where we see a sharp drop around March 13th. This is a clear indication that behavioral change-points from mobile sensing can capture behavioral changes from external sources.
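The comparison described above amounts to finding, for each mobile sensing change-point, the nearest change-point in the external streams. A small sketch of that check follows, using the dates quoted in this section; the function name is hypothetical, and the year 2020 is taken from the data collection window.

```python
from datetime import date


def nearest_gaps(sensing_dates, external_dates):
    """Days between each sensing change-point and its closest external one."""
    return {
        d: min(abs((d - e).days) for e in external_dates)
        for d in sensing_dates
    }


# Change-point dates as reported in the text.
sensing = [date(2020, 2, 27), date(2020, 3, 13)]
external = [date(2020, 2, 24), date(2020, 3, 11)]

for day, gap in nearest_gaps(sensing, external).items():
    print(day.isoformat(), "nearest external change-point is", gap, "day(s) away")
```

A tolerance such as three days can then be applied to the resulting gaps to decide whether two streams are counted as agreeing.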

5 Discussion

Our case study’s changepoints lie within three days of behavioral change-points obtained from large-scale external sources such as Twitter, Google search trends, and foot traffic data from SafeGraph. Mobile sensing behavior effectively captured large-scale human behavior changes, and unlike relying on infection rates and COVID positivity, this method is cheap. It can quickly help local and state governments understand social distancing measures’ effectiveness. High smartphone adoption makes this an effective intervention-understanding strategy. Coupled with actual tests, this can help prepare local governments for possible outbreaks and create effective intervention measures. This approach can also be used to critically explore reopening strategies, since anonymized mobility data helps track group behavior effectively in a locality.
It is important to note that one-size-fits-all approaches might not work well in the context of interventions. Different communities might have different responses to the same intervention, and it might be necessary to try out multiple interventions to ensure compliance. For instance, for some communities, a simple intervention such as YouTube advertisements or text messages reminding people to maintain social distancing might suffice. At the same time, strenuous interventions like a total lockdown might put a financial strain on the economy, causing huge monetary and job losses. Another aspect to keep in mind is privacy, since users are unlikely to adopt a technology if it risks leaking their personal information. Our hierarchical behavioral change framework allows us to simultaneously probe policy efficacy at the level of neighborhoods, cities, and states.
Despite the promising contributions of this work, several limitations remain to be addressed. For instance, while we tried to investigate seasonality effect in this article, the limited time window in which the data were collected (January 1, 2020, to April 30, 2020) made it challenging to analyze the multi-year seasonality effect. Another way to show that the data collected does not have seasonal attributes is to study correlations with external datasets that are unlikely to show seasonality. In our case, while mobility can show seasonal trends, it is unlikely for COVID-19-related tweets or Google search trends to show similar correlations. The co-occurrence of our change points with ones in COVID-related social media sources indicates that these change points might be more than seasonal variations.
Also, the SafeGraph data (collected by an independent external organization and drawn from a larger, more heterogeneous population) showed the same foot traffic behavior as the mobility markers from our dataset. The current framework is based purely on correlations with real-world events and does not claim to establish causality. To validate the event correspondence, we show the relation to other external datasets that exhibit similar change points. In the future, we plan to explore more fine-grained data (e.g., surveys, Twitter/local search trends) from the users to understand event-specific change points.
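The agreement check described in this section (detected change points lying within three days of change points from an external source) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name `match_change_points` and the tolerance parameter are hypothetical, the detected dates are the approximate ones reported in the abstract, and the external dates are placeholders invented purely for illustration.

```python
from datetime import date


def match_change_points(detected, external, tolerance_days=3):
    """Pair each detected change point with the nearest external one.

    Returns a list of (detected, nearest_external, lag_days, within_tolerance).
    lag_days > 0 means the detected change came after the external one.
    """
    matches = []
    for d in detected:
        # Nearest external change point by absolute distance in days.
        nearest = min(external, key=lambda e: abs((d - e).days))
        lag = (d - nearest).days
        matches.append((d, nearest, lag, abs(lag) <= tolerance_days))
    return matches


# Approximate change points reported in the abstract (smartphone usage,
# email usage, mobility), all in 2020.
detected = [date(2020, 2, 10), date(2020, 2, 27), date(2020, 3, 13)]

# HYPOTHETICAL external change points, made up for this example only.
external = [date(2020, 2, 11), date(2020, 2, 25), date(2020, 3, 14)]

for d, e, lag, ok in match_change_points(detected, external):
    print(f"{d} vs {e}: lag {lag:+d} days, within tolerance: {ok}")
```

The signed lag also gives a simple handle on ordering, for example whether an online source tends to shift before a mobility marker does. As with the framework itself, such a check only shows temporal correspondence and says nothing about causality.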

6 Conclusion

In this work, we presented a framework to understand behavior change at different scales and evaluate public policy effectiveness amid the COVID-19 pandemic by extracting behavioral markers from a large mobile sensing dataset and verifying the change points against large-scale sources such as Twitter and SafeGraph. While our analysis shows that mobile sensing holds promise for monitoring behavior and understanding public policy effectiveness in large-scale emergencies such as COVID-19, some questions remain unanswered. The lag between social media sources and our discovered behavior change points could indicate that not all aspects of human behavior change simultaneously. For example, online behavior change might precede a change in mobility. We plan to investigate these questions in future work. We would also like to identify change points in the infection dynamics and see how they relate to the ones obtained from our data. It may also be useful to investigate the order in which behavior changes occurred. In the current study, we first identify an increase in participants’ smartphone usage, followed by a change in their work environment (e.g., work from home), followed by a large reduction in mobility. We want to explore whether the same order of behavior changes generalizes to other large-scale emergencies. Our work shows that mobile sensing is a promising approach to monitor community behaviors, understand their dynamics, and detect significant changes in response to external interventions.

Acknowledgments

We thank Akshit Goyal and Vishaka Datta for helpful comments during the editing of the draft.

References

[1]
2020. SafeGraph, a data company that aggregates anonymized location data from numerous applications in order to provide insights about physical places. To enhance privacy, SafeGraph excludes census block group information if fewer than five devices visited an establishment in a month from a given census block group. (2020).
[2]
2020. TweetBinder. The most complete hashtag tracking tool for Twitter. (2020).
[3]
Saeed Abdullah and Tanzeem Choudhury. 2018. Sensing technologies for monitoring serious mental illnesses. IEEE MultiMedia 25, 1 (2018), 61–75.
[4]
Martin Andersen. 2020. Early evidence on social distancing in response to COVID-19 in the United States. Available at SSRN 3569368 (2020).
[5]
Soubhik Barari, Stefano Caria, Antonio Davola, Paolo Falco, Thiemo Fetzer, Stefano Fiorin, Lukas Hensel, Andriy Ivchenko, Jon Jachimowicz, Gary King, et al. 2020. Evaluating COVID-19 public health messaging in Italy: Self-reported compliance and growing mental health concerns. medRxiv (2020).
[6]
Olivier Bargain and Ulugbek Aminjonov. 2020. Trust and compliance to public health policies in times of COVID-19. J. Pub. Econ. 192 (2020), 104316.
[7]
Daniel Barry and John A. Hartigan. 1993. A Bayesian analysis for change point problems. J. Amer. Statist. Assoc. 88, 421 (1993), 309–319.
[8]
Mehdi Boukhechba, Yu Huang, Philip Chow, Karl Fua, Bethany A. Teachman, and Laura E. Barnes. 2017. Monitoring social anxiety from mobility and communication patterns. In Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing and the ACM International Symposium on Wearable Computers. 749–753.
[9]
Elissa J. Brown, Julia Turovsky, Richard G. Heimberg, Harlan R. Juster, Timothy A. Brown, and David H. Barlow. 1997. Validation of the social interaction anxiety scale and the social phobia scale across the anxiety disorders.
[10]
Marloes Bults, Desiree J. M. A. Beaujean, Jan Hendrik Richardus, and Helene A. C. M. Voeten. 2015. Perceptions and behavioral responses of the general public during the 2009 influenza A (H1N1) pandemic: A systematic review. Disast. Med. Pub. Health Prepared. 9, 2 (2015), 207–219.
[11]
Emily Chen, Kristina Lerman, and Emilio Ferrara. 2020. Covid-19: The first public coronavirus Twitter dataset. arXiv preprint arXiv:2003.07372 (2020).
[12]
Jonas Dehning, Johannes Zierenberg, F. Paul Spitzner, Michael Wibral, Joao Pinheiro Neto, Michael Wilczek, and Viola Priesemann. 2020. Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Science (2020).
[13]
Chandra Erdman, John W. Emerson, et al. 2007. bcp: An R package for performing a Bayesian analysis of change point problems. J. Statist. Softw. 23, 3 (2007), 1–13.
[14]
Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, et al. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining. 226–231.
[15]
Neil Ferguson, Daniel Laydon, Gemma Nedjati Gilani, Natsuko Imai, Kylie Ainslie, Marc Baguelin, Sangeeta Bhatia, Adhiratha Boonyasiri, Zulma Cucunuba Perez, Gina Cuomo-Dannenburg, et al. 2020. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. (2020).
[16]
Dale Fisher and Annelies Wilder-Smith. 2020. The global community needs to swiftly ramp up the response to contain COVID-19. Lancet 395, 10230 (2020), 1109–1110.
[17]
Walter R. Gilks. 2005. Markov Chain Monte Carlo. Encycl. Biostatist. 4 (2005).
[18]
Sahar Hoteit, Stefano Secci, Stanislav Sobolevsky, Guy Pujolle, and Carlo Ratti. 2013. Estimating real human trajectories through mobile phone data. In Proceedings of the IEEE 14th International Conference on Mobile Data Management. IEEE, 148–153.
[19]
Jeremy Huckins, Elin L. Hedlund, Courtney Rogers, Subigya K. Nepal, Jialing Wu, Mikio Obuchi, Eilis I. Murphy, Meghan L. Meyer, Dylan D. Wagner, Paul E. Holtzheimer, et al. 2020. Mental health and behavior during the early phases of the COVID-19 pandemic: A longitudinal mobile smartphone and ecological momentary assessment study in college students. (2020).
[20]
Sedevizo Kielienyu, Burak Kantarci, Damla Turgut, and Shahzad Khan. 2020. Bridging predictive analytics and mobile crowdsensing for future risk maps of communities against COVID-19. In Proceedings of the 18th ACM Symposium on Mobility Management and Wireless Access. 37–45.
[21]
Jeeyoung Kim and Ahmed Helmy. 2011. The evolution of WLAN user mobility and its effect on prediction. In Proceedings of the 7th International Wireless Communications and Mobile Computing Conference. IEEE, 226–231.
[22]
Minkyong Kim and David Kotz. 2007. Periodic properties of user mobility and access-point popularity. Person. Ubiq. Comput. 11, 6 (2007), 465–479.
[23]
Alexander Lachmann. 2020. Correcting under-reported COVID-19 case numbers. MedRxiv (2020).
[24]
Robert LiKamWa, Yunxin Liu, Nicholas D. Lane, and Lin Zhong. 2013. MoodScope: Building a mood sensor from smartphone usage patterns. In Proceedings of the MobiSys Conference.
[25]
Michael Liu, Theodore L. Caputi, Mark Dredze, Aaron S. Kesselheim, and John W. Ayers. Internet searches for unproven COVID-19 therapies in the United States. JAMA Internal Medicine.
[26]
Mitja Lustrek, Bozidara Cvetkovic, Violeta Mirchevska, Özgür Kafali, Alfonso E. Romero, and Kostas Stathis. 2015. Recognising lifestyle activities of diabetic patients with a smartphone. In Proceedings of the 9th International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth). 317–324.
[27]
Jean Louis Pépin, Rosa Maria Bruno, Rui-Yi Yang, Vincent Vercamer, Paul Jouhaud, Pierre Escourrou, and Pierre Boutouyrie. 2020. Wearable activity trackers for monitoring adherence to home confinement during the COVID-19 pandemic worldwide: Data aggregation and analysis. J. Med. Internet Res. 22, 6 (2020), e19787.
[28]
Peter Richterich. 2020. Severe underestimation of COVID-19 case numbers: Effect of epidemic growth rate and test restrictions. medRxiv (2020).
[29]
Borja Sañudo, Curtis Fennell, and Antonio J. Sánchez-Oliver. 2020. Objectively-assessed physical activity, sedentary behavior, smartphone use, and sleep patterns pre-and during-COVID-19 quarantine in young adults from Spain. Sustainability 12, 15 (2020), 5890.
[30]
Piotr Sapiezynski, Arkadiusz Stopczynski, Radu Gatej, and Sune Lehmann. 2015. Tracking human mobility using WIFI signals. PLoS One 10, 7 (2015), e0130824.
[31]
Samantha Sohn, Phillipa Rees, Bethany Wildridge, Nicola J. Kalk, and Ben Carter. 2019. Prevalence of problematic smartphone usage and associated mental health outcomes amongst children and young people: A systematic review, meta-analysis and GRADE of the evidence. BMC Psychiat. 19, 1 (2019), 1–10.
[32]
Shaoxiong Sun, Amos Folarin, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Nicholas Cummins, Faith Matcham, Gloria Dalla Costa, Letizia Leocani, Per Soelberg Sørensen, et al. 2020. Using smartphones and wearable devices to monitor behavioural changes during COVID-19. arXiv preprint arXiv:2004.14331 (2020).
[33]
Sara Thomée, Annika Härenstam, and Mats Hagberg. 2011. Mobile phone use and stress, sleep disturbances, and symptoms of depression among young adults—A prospective cohort study. BMC Pub. Health 11, 1 (2011), 66.
[34]
Alina Trifan, Maryse Oliveira, and José Luís Oliveira. 2019. Passive sensing of health outcomes through smartphones: Systematic review of current solutions and possible limitations. JMIR mHealth uHealth 7, 8 (2019), e12649.
[35]
Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, Russell Reas, Jiangjiang Yang, Darrin Eide, Kathryn Funk, Rodney Michael Kinney, Ziyang Liu, William Merrill, Paul Mooney, Dewey A. Murdick, Devvret Rishi, Jerry Sheehan, Zhihong Shen, Brandon Stilson, Alex D. Wade, Kuansan Wang, Christopher Wilhelm, Boya Xie, Douglas M. Raymond, Daniel S. Weld, Oren Etzioni, and Sebastian Kohlmeier. 2020. CORD-19: The Covid-19 open research dataset. ArXiv: abs/2004.10706 (2020).
[36]
Rui Wang, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella M. Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T. Campbell. 2014. StudentLife: Assessing mental health, academic performance and behavioral trends of college students using smartphones. In Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing.
[37]
Toby Wise, Tomislav Damir Zbozinek, Giorgia Michelini, Cindy C. Hagan, et al. 2020. Changes in risk perception and protective behavior during the first week of the COVID-19 pandemic in the United States. (2020).
[38]
Haoyi Xiong, Yu Huang, Laura E. Barnes, and Matthew S. Gerber. 2016. Sensus: A cross-platform, general-purpose system for mobile crowdsensing in human-subject studies. In Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing. 415–426.
[39]
Kornanong Yuenyongchaiwat. 2016. Effects of 10,000 steps a day on physical and mental health in overweight participants in a community setting: A preliminary study. Braz. J. Phys. Ther. (2016).

    Published In

    ACM Transactions on Computing for Healthcare  Volume 3, Issue 4
    October 2022
    331 pages
    EISSN:2637-8051
    DOI:10.1145/3544003
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 November 2022
    Online AM: 20 July 2022
    Accepted: 09 March 2022
    Revised: 29 December 2021
    Received: 13 October 2020
    Published in HEALTH Volume 3, Issue 4

    Author Tags

    1. Datasets
    2. change-point detection
    3. behavior change
    4. mobile sensing

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • DARPA Warfighter Analytics using Smartphones for Health (WASH) program
