Leveraging Mobile Sensing and Bayesian Change Point Analysis to Monitor Community-scale Behavioral Interventions: A Case Study on COVID-19

Authors:

Laura Barnes

ACM Transactions on Computing for Healthcare, Volume 3, Issue 4

Article No.: 37, Pages 1 - 13

https://doi.org/10.1145/3524886

Published: 03 November 2022

Abstract

During pandemics, effective interventions require monitoring the problem at different scales and understanding the various tradeoffs between efficacy, privacy, and economic burden. To address these challenges, we propose a framework in which we perform Bayesian change-point analysis on aggregate behavior markers extracted from mobile sensing data collected during the COVID-19 pandemic. Results generated by 598 participants for up to four months reveal rich insights: We observe an increase in smartphone usage around February 10th, followed by an increase in email usage around February 27th and, finally, a large reduction in participants’ mobility around March 13th. These behavior changes overlapped with important news events and government directives such as the naming of COVID-19, a spike in the number of reported cases in Europe, and the declaration of a national emergency by President Trump. We also show that our detected change points align with changes in large-scale external sources, including the number of COVID-19 tweets, COVID-19 search traffic, and large-scale foot traffic data collected by SafeGraph, providing further validation of our method. Our results show promise for using mobile sensing to understand communities’ responses to public health interventions.

1 Introduction

COVID-19 is a global pandemic that has devastated the world through its toll on human lives and the economy. In the absence of pharmaceutical interventions like vaccines, non-pharmaceutical interventions such as social distancing and stay-at-home orders are key in preventing the healthcare system from becoming overburdened [15]. As evidenced by the 1918 Spanish flu pandemic, early interventions from local, state, and central governments ultimately determine the severity of infections and can reduce casualties by a factor of 1/8th [16]. With new interventions being implemented, there is an urgent need to track the impact of these interventions in real time to respond in a timely manner. Motivated by these challenges, we seek to address the following question: In the absence of widespread and rapid testing, how can we use mobile sensing data to detect COVID-19-related behavior changes and measure effectiveness of public interventions?

While a few papers have studied the impact of government interventions on the transmission rate of COVID-19, these studies often use the number of cases as an indirect proxy for human behavior [12]. The number of cases is useful for assessing an intervention’s efficacy, but it has caveats. Not only does it lag behind other time-series data from social media and mobile sensing, but its quality also depends substantially on the number of testing kits available [23, 28], making it challenging to infer behavior change. To address this limitation, mobile sensing can supplement existing monitoring approaches. The rich behavioral markers extracted from in-the-wild personal sensing using smartphone-embedded sensors can provide a direct glimpse into various aspects of human behavior and how they relate to COVID-19 [20]. We use community aggregates (1) of mobile sensing features such as total distance traveled and step count to characterize human behavior during a critical period of the COVID-19 pandemic in the U.S. While these behavior features give us a glimpse into human behavior, it is challenging to visually discern when a significant change has occurred because real-world data are noisy. Moreover, several different change-point configurations might fit the data comparably well. A robust statistical framework is thus needed to model this uncertainty. One approach that works well in this setting is Bayesian change-point detection: By constructing a posterior distribution over all possible change-point configurations, we can draw samples from the posterior to quantify uncertainty.

Fig. 1.

Finally, while looking for behavioral changes using mobile sensing is a promising avenue for current research, it is also essential to understand its limitations. Data generated from mobile sensing studies tend to represent only a subset of populations and may not scale to the whole community. We validate our behavioral change points by comparing them with significant changes in behaviors extracted from external data streams generated by millions of participants from social media platforms such as Twitter, Google searches, and foot traffic to millions of Point of Interest locations in the United States.

Our key contributions can be summarized as follows: (1) We propose a framework to extract privacy-preserving behavioral markers (Figure 1) based on mobility, social interactions, and smartphone app usage to understand community behavior change at different scales during a pandemic. (2) We perform Bayesian change-point analysis on the proposed behavioral markers and identify critical dates when changes were detected in multiple behavior streams simultaneously. Using the proposed framework, for a sample of N = 598 participants, we show that there were statistically significant behavioral changes on three dates. (3) We also identify statistically significant changes in macroscopic time series from millions of users on Twitter, Google search, and SafeGraph [1], and find change points consistent with ours within a three-day margin of error. (4) Finally, for the significant dates identified in (2), we identify clusters of events of national importance that could have given rise to these changes. Our framework allows policymakers to measure the effect of public health interventions on people’s behavior, which in turn governs infection rates and overall deaths.

The rest of the article is organized as follows: We first describe our dataset, the extracted behavior markers, and the Bayesian change-point detection algorithm. Next, we present our results and how the change points detected by our method correlate with significant events such as the spike in cases in Europe and the declaration of a national emergency by the US President. We also present how our results overlap with behavioral change points detected from other data sources, including Twitter, Google searches, and mobility data.

2 Related Work

The enormous social and economic impacts of COVID-19 have spurred research in various areas. Liu et al. [25] studied spikes in unproven COVID-19 therapy searches based on tweets of celebrities. Wang et al. [35] applied text mining and information retrieval methods to help inform policymakers and biomedical experts with effective treatment and management outcomes. Chen et al. [11] collected tweets with COVID-19-specific terms to understand the pandemic’s discussions in social media.

Several studies used self-reported measures to evaluate compliance: Barari et al. [5] performed a national survey in Italy to understand compliance with public health messages. Wise et al. [37] focused on risk perception changes during the first week of the pandemic. Bults et al. [10] looked at the behavioral responses of UK adults during the first phase of the pandemic.

While large-scale human behavior is challenging to quantify without the intrusive use of cameras and other monitoring techniques, passive sensing provides a less cumbersome and more objective way of assessing human behavior. Boukhechba et al. [8] correlated mobility and communication patterns with participants’ social anxiety levels, measured using the Social Interaction Anxiety Scale (SIAS) [9]. Wang et al. introduced StudentLife, a large-scale dataset that used mobile sensing for research on sociability, mental well-being, and students’ academic performance [36]. LiKamWa et al. [24] showed that feeling sad is strongly correlated with decreased mobile phone activity, including SMS messaging, Bluetooth-detected contacts, location entropy, and phone calls. Mobile sensing also opens up possibilities for non-intrusive monitoring of group patterns instead of individualized assessments. A recent publication from Trifan et al. [34] showed that about 30% of the papers in this behavioral health domain had analyzed participants’ physical activities. Lustrek et al. [26] used physical activity data based on GPS data, step counts, visible WiFi access spots, and accelerometer data to understand lifestyle activities such as eating, exercise, and work.

In the context of COVID-19, there has been research looking at changes pre- and post-pandemic. Andersen et al. [4] used GPS data to demonstrate evidence of compliance with social distancing regulations. Sun et al. [32] compared mobility and Bluetooth markers during the pre- and post-lockdown periods. Huckins et al. [19] leveraged ecological momentary assessment (EMA) and passive sensing data to compare metrics of students’ mental health during COVID-19 with a similar period during the previous year. Sanudo et al. [29] looked at physical activity, sedentary behavior, and sleep patterns pre- and post-quarantine. Pepin et al. [27] sought to understand the reduction in step count pre- and post-lockdown. While these approaches are interesting, they require the change points to be specified manually, whereas we infer them automatically. This is particularly important for gauging policy efficacy, where knowing when a change happened is as important as knowing whether a change happened at all.

3 Methods

3.1 Data Acquisition

The data were collected using a cross-OS (Android and iOS) mobile sensing platform called Readisense. It was built on top of Sensus [38], an open-source crowdsensing tool developed at the University of Virginia since 2014. The application captures the most common data streams from smartphone-embedded sensors and applications, though some sensors are operating-system specific, such as “HealthKit” data, which is available only on iOS. The application supports both Listening Probes (i.e., the application records data when a change in a sensor is noted) and Polling Probes (i.e., data are collected from the sensors at regular intervals). Even though a wide variety of sensors are available in the application, we limited our study to the ones that captured human behavior change during the COVID-19 pandemic, including GPS, Step Count, App Usage, Battery Level, Wireless Access Point, and Screen Usage. The data collection was approved by the Institutional Review Board (IRB), and 2,700 participants across the US were recruited to participate in a four-month study examining the relationship between mobile sensing data and symptoms of infectious diseases. Only those with at least 14 days of GPS, Bluetooth, and Activity data were considered, leaving 598 participants who were active during the COVID-19 pandemic. The participants’ average age is 40.71 (sd = 11.52). Females make up 65.36% of the participants, with Whites accounting for 66.2%, African Americans 19.3%, Asians 7.3%, people of other races 3.3%, Hispanics 3.5%, and others 0.4%. Fifty-two percent of the participants are full-time employees, 20% are full-time students, and 12% are working part-time. The top three states by number of participants are North Carolina, Virginia, and Ohio.

3.2 Feature Extraction

We collected a wide range of features characterizing the participant’s mobility, app usage, and physical activity. We primarily focus on mobility features in our study, since it is essential to study the participants’ movement patterns to quantify interventions such as social distancing. We also collected app usage and the percentage of time a participant’s screen was on to quantify how the participants’ interaction patterns with their smartphones have changed as a consequence of COVID-19. Since we were interested in behaviors at a community level, we aggregate all the sensor data for the participants from the different states for the case study. More detail is provided below.

3.2.1 Mobility and Activity Features.

Human behavior is characterized by repetitive patterns of mobility and location traces [3]. Mobility features from location traces have been shown to be predictive of a participant’s mental health and well-being [8]. They have also been used to study compliance with public health policies [6]. We extract the following mobility features:

• Daily Step Count: The total steps taken by a participant per day. This provides a proxy for how much physical activity the participant performs per day. Larger daily step counts have been shown to be correlated with lower depression, anger, and mood distress in previous studies [39].

• Total Distance Moved: We calculate the daily total distance traveled by each participant. Since we are working with anonymized GPS coordinates (the integer parts of the coordinates are removed; see Section 3.4), it is impossible to use the Haversine distance. Thus, we calculate the Euclidean distance between consecutive latitude and longitude pairs and scale it up by a factor of 60 miles to obtain distance in meters as an approximation of the Haversine distance. The approximation is reasonable, considering that most participants travel within local neighborhoods on a day-to-day basis.

• Time Spent Outside Home: We use the DBSCAN [14] clustering algorithm to extract clusters that represent different places. We choose a radius of 10 meters and 10 points per cluster as hyper-parameters of the DBSCAN algorithm and denote home as the cluster with the largest number of entries between 12 am and 6 am.

• Radius of Gyration: This is the standard deviation of each participant’s GPS positions relative to the centroid of their GPS coordinates. A small radius of gyration indicates sedentary behavior, as opposed to a large radius, which is characteristic of long-distance commuters [18].

• Access Point Change: We count the number of times a participant’s WiFi connection changes. Large values could indicate the participant switching between WiFi hotspots frequently, which is likely to happen in public and business areas. A small value could indicate being at a place like the participant’s home, where the participant stays on a fixed connection [21, 22, 30].
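Three of the features above (total distance, radius of gyration, and access point changes) can be sketched in a few lines. This is a minimal illustration rather than the study's implementation: the function names and the sample track are hypothetical, and the `scale` default of 111,000 meters per degree is a common rule of thumb assumed here, not the constant used in the article.

```python
import math

def total_distance(track, scale=111_000.0):
    """Sum of scaled Euclidean distances between consecutive (lat, lon) fixes."""
    return sum(
        scale * math.hypot(lat2 - lat1, lon2 - lon1)
        for (lat1, lon1), (lat2, lon2) in zip(track, track[1:])
    )

def radius_of_gyration(track, scale=111_000.0):
    """Root-mean-square distance of the fixes from their centroid."""
    n = len(track)
    c_lat = sum(lat for lat, _ in track) / n
    c_lon = sum(lon for _, lon in track) / n
    mean_sq = sum((lat - c_lat) ** 2 + (lon - c_lon) ** 2 for lat, lon in track) / n
    return scale * math.sqrt(mean_sq)

def access_point_changes(access_points):
    """Number of times consecutive WiFi readings differ."""
    return sum(1 for a, b in zip(access_points, access_points[1:]) if a != b)

# Hypothetical one-day track using only the decimal parts of the coordinates.
track = [(0.0336, 0.5080), (0.0340, 0.5081), (0.0352, 0.5090), (0.0337, 0.5079)]
print(round(total_distance(track)), "m travelled")
print(round(radius_of_gyration(track)), "m radius of gyration")
print(access_point_changes(["home", "home", "cafe", "office", "office"]), "AP changes")
```

Because only differences between nearby fixes enter these formulas, they remain usable on decimal-only coordinates as long as a participant does not cross an integer degree boundary within a day, which is an additional assumption of this sketch.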

3.2.2 Smartphone Usage-based Features.

Increased smartphone usage has been linked to poorer self-reported measures of anxiety, sleep quality, and stress [31, 33]. We extract the following features to investigate changes in the participants’ smartphone usage patterns.

• Battery Level: We use the minimum and maximum battery levels of each participant’s device, averaged across all participants, to quantify how frequently participants were charging their phones.

• App Usage: We primarily focus on the most frequently used apps from various categories. These include social apps such as Facebook, Twitter, and Instagram, engagement apps such as health and fitness, transportation apps, as well as finance apps.

• Screen-on Fraction: The fraction of queries performed by the application that found the screen in an active state. This provides a proxy for how much the participants were using their smartphones.
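The screen-on fraction and the community-level aggregation can be illustrated as follows. This is a hedged sketch: the data layout (one list of Boolean poll results per participant, and a nested dictionary of daily values) and both function names are assumptions made for the example, and the aggregate shown is a plain mean across participants per day.

```python
from statistics import mean

def screen_on_fraction(polls):
    """Fraction of polls that found the screen active; polls is a list of booleans."""
    if not polls:
        return float("nan")
    return sum(polls) / len(polls)

def community_daily_mean(per_participant):
    """Collapse {participant_id: {day: value}} into {day: mean over participants}."""
    by_day = {}
    for series in per_participant.values():
        for day, value in series.items():
            by_day.setdefault(day, []).append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}

# Hypothetical data: two participants, two days.
fractions = {
    "p1": {1: screen_on_fraction([True, False, False, False]),
           2: screen_on_fraction([True, True, False, False])},
    "p2": {1: screen_on_fraction([True, True, True, False])},
}
print(community_daily_mean(fractions))
```

A participant with no data on a given day simply does not contribute to that day's mean, which is one simple way to handle missing data and is not necessarily the choice made in the study.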

3.3 External Data Sources

We validated the changepoints detected from mobile sensing by comparing them to the following external sources:

• Twitter: We use publicly available data from TweetBinder [2] on the number of tweets with the hashtags #covid19, #coronavirus, or #covid-19 between Jan 1st and Apr 15th.

• Google searches: We extract the search volume index from Google Trends over the course of the pandemic.

• SafeGraph: We use foot traffic data collected by SafeGraph based on visits to commercial and non-commercial locations [1]. SafeGraph aggregates point-of-sale data from over 6 million point-of-interest locations in the US.

3.4 Privacy Considerations

Relying on multiple sensors to understand human behavior is the cornerstone of mobile sensing research. At the same time, since mobile phones come with various sensors, privacy is a crucial concern for researchers and research participants alike. We addressed this in three ways: (1) We anonymized all the collected data by storing only generic participant IDs for all our experiments, since we were interested in participants’ aggregated and group behavior. (2) GPS data were anonymized by omitting the integer parts of the coordinates; we used only the decimals for computing location-based movements. (3) For other sensors, data were anonymized (e.g., Bluetooth MAC addresses were hashed) on the device before being synced to the cloud. The data were stored on a HIPAA-compliant secure server and were strictly available only to researchers under the IRB.
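Steps (2) and (3) can be illustrated with a short sketch. The article does not state how negative coordinates are handled or which hash function is used, so the sign-preserving fractional part, SHA-256, and the optional `salt` parameter below are all assumptions made for illustration; the function names are hypothetical.

```python
import hashlib
import math

def drop_integer_part(coordinate):
    """Keep only the fractional part of a coordinate (sign preserved)."""
    fractional, _integer = math.modf(coordinate)
    return fractional

def hash_mac(mac, salt=""):
    """One-way hash of a MAC address after normalizing case and separators."""
    normalized = mac.strip().lower().replace("-", ":")
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()

# Local displacements survive the anonymization even though absolute position is lost.
lat_a, lat_b = 38.0336, 38.0352
print(lat_b - lat_a, drop_integer_part(lat_b) - drop_integer_part(lat_a))
print(hash_mac("AA-BB-CC-DD-EE-FF")[:16])
```

The design point is that displacement between two nearby fixes is unchanged by removing the integer parts, which is why movement features can still be computed from the anonymized coordinates.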

3.5 Bayesian Change-point Analysis

We formulate the problem of detecting changes in the time series as a Bayesian change-point detection problem. Using a Bayesian methodology allows easy quantification of uncertainty and integration of priors (for instance, feeding in government directives as metadata instead of assuming a uniform distribution of change-points over locations). We use Barry and Hartigan’s [7] Bayesian change-point model for our analysis. Although such a formulation supports identifying mean, variance, and intercept changes, it is practical to choose models that are simple and easy to interpret. Furthermore, increasing evidence showed that large-scale behaviors in communities such as Twitter and financial markets, while being driven by external events, are characterized by discrete shifts and bursts. Since we are working with time series of means of features on a community level, which are being driven by news events and government policies, it is appropriate to identify similar discrete changes. We thus primarily analyze changes in levels.

The adopted Bayesian change-point model assumes that there is an unknown partition \(\rho\) of the data into contiguous blocks, such that within each block, the mean remains the same. The model also assumes an independent normal distribution for each block.

Let us assume we have n data points \(X = \lbrace X_1,\ldots ,X_n\rbrace\) . Let \(\rho = (U_1,\ldots ,U_n)\) indicate a partition of the time series into non-overlapping blocks. We use a Boolean array of change-point indicators to denote the partition: At each timestep i, if \(U_i\) takes the value 1, then a new block begins; otherwise, we remain in the same block.
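As a small illustration of this encoding, the sketch below converts an indicator array into explicit blocks. It assumes zero-based indexing, that an indicator of 1 at position i means a new block starts at observation i, and that the first observation always opens the first block regardless of its indicator; the function name is hypothetical.

```python
def blocks_from_indicators(u):
    """Return half-open (start, end) index pairs, one per block of the partition."""
    starts = [0] + [i for i in range(1, len(u)) if u[i] == 1]
    ends = starts[1:] + [len(u)]
    return list(zip(starts, ends))

u = [0, 0, 0, 1, 0, 1, 0]           # seven observations, two change points
blocks = blocks_from_indicators(u)  # three contiguous blocks
print(blocks, "b =", len(blocks))
```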

We are interested in the posterior density \(f(\rho |X)\) . By Bayes’ theorem, this can be written as

\begin{equation} f(\rho |X) \propto f(X|\rho) f(\rho). \end{equation}

(1)

Prior cohesion density: Let p denote the probability of a change point at each location. We assume this probability to be the same at every location. If the partition contains b blocks, then the prior cohesion density can be written as

\begin{equation} f(\rho |p) = p^{b-1}(1-p)^{n-b} . \end{equation}

(2)

The joint density of observations and parameters given \(\rho\) is a product of densities of different blocks over the blocks in \(\rho\) . Let us consider a single block. If we assume that the data in this block is generated by a Gaussian with mean \(\theta\) and variance \(\sigma ^2\) , then let the prior density of \(\theta\) be a Gaussian with mean \(\mu _0\) and variance \(\sigma _0^2\)

\begin{equation} \begin{split} f(X_{ij}, \theta) &= \prod _{k=i+1}^{j} f(X_k|\theta)\, f(\theta), \\ f(X_{ij}) &= \int \prod _{k=i+1}^{j} f(X_k|\theta)\, f(\theta)\, d\theta . \end{split} \end{equation}

(3)

The above integral can be simplified to the expression below

\begin{equation} f(X_{ij}) = \left(\frac{1}{2\pi \sigma ^2}\right)^{(j-i)/2} \left(\frac{\sigma ^2}{\sigma _0^2 + \sigma ^2}\right)^{1/2} \exp (V_{ij}), \end{equation}

(4)

where

\begin{equation} V_{ij} = -\frac{\sum _{l=i+1}^{l=j} (X_l - \hat{X}_{ij})^2}{2\sigma ^2} - \frac{(j-i) (\hat{X}_{ij} - \mu _0)^2}{2(\sigma ^2 + \sigma _0^2)} \end{equation}

(5)

and \(\hat{X}_{ij}\) is the mean of the observations in the partition. However, \(f(X_{ij})\) still depends on the parameters \(\mu _0, \sigma ^2, \sigma _0^2\) . Defining \(w=\frac{\sigma ^2}{\sigma _0^2 + \sigma ^2}\) and choosing the following priors for the parameters:

\begin{align} f(\mu _0) &= 1, \quad -\infty \le \mu _0 \le \infty \nonumber \\ f(p) &= 1/p_0, \quad 0 \le p \le p_0 \nonumber \\ f(\sigma ^2) &= 1/\sigma ^2, \quad 0\le \sigma ^2 \le \infty \\ f(w) &= 1/w_0, \quad 0 \le w \le w_0 \nonumber \end{align}

(6)

\begin{equation} f(X|\rho , \mu _0, w) = \int _{0}^{\infty } \frac{1}{\sigma ^2} \prod _{ij \in P} f(X_{ij})\, d\sigma ^2 . \end{equation}

(7)

After integrating out \(\mu _0\) and w, this can be simplified to the definite integral below. We refer readers to Reference [7] for the full derivation.

\begin{equation} f(X|\rho) \propto \int _{0}^{w_0} \frac{w^{(b-1)/2}}{(W + Bw)^{(n-1)/2}}dw, \end{equation}

(8)

where

\begin{equation} \begin{split}\hat{X} = \sum _{i=1}^{n} X_i/n, \; B = \sum _{ij \in P} (j-i) (\hat{X}_{ij} - \hat{X})^2, \\ W = \sum _{ij \in P} \sum _{l=i+1}^{j}(X_l - \hat{X}_{ij})^2 . \end{split} \end{equation}

(9)
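To make Equations (8) and (9) concrete, the sketch below computes the within-block and between-block sums of squares for a given partition and evaluates the integral in Equation (8) numerically. It is an illustration under stated assumptions, not the method used by any particular package: the trapezoid rule on a log-spaced grid, the truncation of the lower limit at `1e-9`, the value `w0 = 0.2`, and all function names are choices made for this example. Blocks are given as zero-based half-open `(start, end)` pairs.

```python
import math

def within_between(x, blocks):
    """W and B of Equation (9) for half-open (start, end) blocks."""
    grand_mean = sum(x) / len(x)
    w_ss = 0.0
    b_ss = 0.0
    for start, end in blocks:
        seg = x[start:end]
        seg_mean = sum(seg) / len(seg)
        w_ss += sum((v - seg_mean) ** 2 for v in seg)
        b_ss += len(seg) * (seg_mean - grand_mean) ** 2
    return w_ss, b_ss

def log_likelihood_integral(w_ss, b_ss, n, b, w0=0.2, steps=400, lower=1e-9):
    """log of the integral in Equation (8); requires w_ss + b_ss * w > 0."""
    t0, t1 = math.log(lower), math.log(w0)
    dt = (t1 - t0) / steps
    logs = []
    for k in range(steps + 1):
        t = t0 + k * dt
        w = math.exp(t)
        log_f = 0.5 * (b - 1) * math.log(w) - 0.5 * (n - 1) * math.log(w_ss + b_ss * w)
        half = math.log(0.5) if k in (0, steps) else 0.0  # trapezoid end weights
        logs.append(log_f + t + half)                     # "+ t" is the Jacobian of w = e^t
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs) * dt)

x = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
w_two, b_two = within_between(x, [(0, 3), (3, 6)])   # split at the level shift
w_one, b_one = within_between(x, [(0, 6)])           # no change point
print(log_likelihood_integral(w_two, b_two, len(x), 2))
print(log_likelihood_integral(w_one, b_one, len(x), 1))
```

For this toy series the partition that splits at the level shift yields the larger value, since almost all of the variation moves from W into B.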

Similarly, after integrating out the change probability p, the prior cohesion density thus can be written as

\begin{equation} f(\rho) \propto \int _{0}^{p_0} p^{b-1}(1-p)^{n-b} dp. \end{equation}

(10)

To calculate the posterior distribution over partitions, we use Markov Chain Monte Carlo (MCMC) [17]. We define a Markov chain with the following transition rule: With probability \(p_i\) , a change point is introduced at location i, given the data and the current values of all other indicators. Here, \(B_1, W_1\) and \(B_0, W_0\) refer to the expressions in Equation (9) evaluated with and without the change point at location i, and b is the number of blocks without it.

\begin{equation} \begin{split}\frac{p_i}{1-p_i} &= \frac{p(U_i=1|X,U_j,j \ne i)}{p(U_i=0|X,U_j,j \ne i)} \\ &= \frac{\int _{0}^{p_0} p^{b}(1-p)^{n-b-1} dp}{\int _{0}^{p_0} p^{b-1}(1-p)^{n-b} dp} \times \frac{\int _{0}^{w_0} \frac{w^{b/2}}{(W_1 + B_1w)^{(n-1)/2}}dw}{\int _{0}^{w_0} \frac{w^{(b-1)/2}}{(W_0 + B_0w)^{(n-1)/2}}dw} \end{split} \end{equation}

(11)
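The transition rule in Equation (11) can be sketched as a simple Gibbs-style sampler. This is a minimal illustration and not the implementation used in the study or in any package: the integrals are approximated with a trapezoid rule on a log-spaced grid (lower limit truncated at `1e-9`), `p0 = w0 = 0.2` are assumed values, an indicator of 1 at zero-based position i is taken to mean that a new block starts there, and every function name is hypothetical. The synthetic series with a single level shift is generated only for demonstration.

```python
import math
import random
from functools import lru_cache

def log_integral(log_f, upper, steps=200, lower=1e-9):
    """log of the integral of exp(log_f(v)) over (lower, upper], log-spaced trapezoid."""
    t0, t1 = math.log(lower), math.log(upper)
    dt = (t1 - t0) / steps
    logs = []
    for k in range(steps + 1):
        t = t0 + k * dt
        half = math.log(0.5) if k in (0, steps) else 0.0
        logs.append(log_f(math.exp(t)) + t + half)  # "+ t" is the Jacobian of v = e^t
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs) * dt)

@lru_cache(maxsize=None)
def log_prior(b, n, p0):
    """log of Equation (10) for a partition with b blocks."""
    return log_integral(lambda p: (b - 1) * math.log(p) + (n - b) * math.log1p(-p), p0)

def within_between(x, u):
    """W and B of Equation (9), plus the block count b, for indicator list u."""
    n = len(x)
    grand = sum(x) / n
    starts = [0] + [i for i in range(1, n) if u[i]]
    w_ss = b_ss = 0.0
    for s, e in zip(starts, starts[1:] + [n]):
        seg = x[s:e]
        m = sum(seg) / len(seg)
        w_ss += sum((v - m) ** 2 for v in seg)
        b_ss += len(seg) * (m - grand) ** 2
    return w_ss, b_ss, len(starts)

def log_lik(x, u, w0):
    """log of Equation (8); assumes the series is not constant."""
    w_ss, b_ss, b = within_between(x, u)
    n = len(x)
    return log_integral(
        lambda w: 0.5 * (b - 1) * math.log(w) - 0.5 * (n - 1) * math.log(w_ss + b_ss * w),
        w0,
    )

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gibbs_sweep(x, u, rng, p0=0.2, w0=0.2):
    """Resample every indicator once using the odds in Equation (11)."""
    n = len(x)
    for i in range(1, n):
        u[i] = 0
        b0 = 1 + sum(u[1:])              # blocks without a change point at i
        ll0 = log_lik(x, u, w0)
        u[i] = 1
        ll1 = log_lik(x, u, w0)
        log_odds = log_prior(b0 + 1, n, p0) - log_prior(b0, n, p0) + ll1 - ll0
        u[i] = 1 if rng.random() < sigmoid(log_odds) else 0
    return u

def posterior_change_probability(x, sweeps=30, burn_in=10, seed=0):
    """Fraction of post-burn-in sweeps in which each position starts a new block."""
    rng = random.Random(seed)
    u = [0] * len(x)
    counts = [0] * len(x)
    for s in range(sweeps):
        gibbs_sweep(x, u, rng)
        if s >= burn_in:
            counts = [c + flag for c, flag in zip(counts, u)]
    return [c / (sweeps - burn_in) for c in counts]

# Synthetic series: 12 points around 0 followed by 12 points around 5.
noise = random.Random(1)
series = [noise.gauss(0.0, 0.1) for _ in range(12)] + [noise.gauss(5.0, 0.1) for _ in range(12)]
probs = posterior_change_probability(series)
print(max(range(len(probs)), key=probs.__getitem__))  # position with the highest probability
```

Averaging the sampled indicators over sweeps gives a per-location posterior probability of a change, which is the quantity typically plotted alongside the series to convey uncertainty about when a change occurred.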

We use the package bcp [13] in R to implement our change point analysis.

4 Results

From Figure 2, on February 10th, we observe a significant change-point in the screen on sensor, which is indicative of the amount of time the participant’s screen was on. On February 9th, we see an increase in the minimum battery level averaged across participants. The increase in active screen time indicates that the participants used their phones more. Similarly, the increase in the minimum battery level could indicate participants charging their phones more often. Both of these indicate an increase in smartphone usage. We also notice an increase in the variance across participants for both active screen time and minimum battery level. We also found a reduction in the number of times a participant switches WiFi networks around this date. These events overlap with February 11th, when the novel coronavirus strain first got its name as COVID-19.

Fig. 2.

From Figure 2, around February 27th, we notice a significant increase in email usage and a decrease in step counts. We also observed a reduction in the number of times a participant switched WiFi connections. The increase in email usage and the reduction in WiFi network switches could indicate a fraction of the study participants shifting to remote work. As we can see from the event timeline in Figure 4, these events overlap with the spike in the number of COVID-19-infected cases in Europe (February 28th) as well as the first COVID-19 press conference with President Trump and Dr. Anthony Fauci (February 27th). Also, on February 29th, the U.S. reported its first death. We also obtain a change in the number of tweets mentioning coronavirus around February 24th (Day 56), three days before the February 27th change-point (Figure 3).

Fig. 3.

From Figure 3, around March 13th, we obtain changes in multiple behavioral streams. We notice a reduction in total distance traveled during the day, indicating that people travel much less on average. We also see a decrease in the radius of gyration, indicating that the participants’ mobility was considerably less spread out geographically than before. We also record that time spent outside home decreased from 6 hours to less than 2 hours, providing evidence that a substantial fraction of the participants complied with the social distancing guidance.

Some other interesting changes include a decrease in the number of times participants’ WiFi network changed, indicating that more people were shifting to work-from-home settings. We also observe a spike in Instagram usage and a weak increase in Facebook usage around the same time. We also spot nearby changes in external sources around this time. Using external sources, we obtain change-points on March 11th in both Twitter activity and COVID-19 searches. We also see a reduction in mobility, as shown in the SafeGraph external plot. The change-points obtained from Twitter, Google search, and foot traffic data from SafeGraph lie within two days of the change we observe in our mobile sensing data.

The results are consistent with some important events that happened around the discovered change points. As we can see from the event timeline in Figure 4, on March 11th, President Trump blocked travel from most of Europe, followed by a national emergency declaration on March 13th. On March 15th, the CDC recommended not gathering in groups larger than 50. The following day, President Trump suggested avoiding gatherings of more than 10 people. We also observed a reduction in the standard deviation of mobility features, implying compliance with social distancing regulations. Interestingly, most of the behavior change in our model happened around President Trump’s declaration of national emergency. This is unusual, because most of our participants come from Virginia, North Carolina, and Ohio, where stay-at-home orders were issued much later. Also, there is a slight upward trend towards the last day of our data. This could indicate that social distancing measures lose effectiveness over time and that multiple interventions may be necessary to ensure compliance. Finally, the number of changes in participants’ WiFi access points captured all three change-points discovered by the other behavior markers. This hints at its usefulness as a marker for understanding behavior during large-scale emergencies.

Fig. 4.

As can be seen in Figure 3, the change-points in our mobile sensing streams occur within three days of change-points from external data streams including Twitter, Google searches, and SafeGraph data. This indicates significant events had happened within those three days. Near our mobility change-point on March 13th, we detected two change-points on March 11 in both Twitter traffic in terms of the number of tweets with the hashtag COVID-19, as well as the Google searches for COVID-19, indicating a large-scale shift in behavior. We also see the start of a drop in visits to bars, restaurants, and other locations starting on March 13th. Similarly, around February 27th, we see a change in Twitter traffic on February 24th. Our data is also consistent with the foot traffic data we obtain from SafeGraph where we see a sharp drop around March 13th. This is a clear indication that behavioral change-points from mobile sensing can capture behavioral changes from external sources.
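The alignment described above can be checked with simple date arithmetic. The sketch below uses only the change-point dates reported in this section, with the year 2020 taken from the data collection window; the data structure and function name are illustrative choices.

```python
from datetime import date

# (mobile-sensing stream, its change point, external source, external change point)
pairs = [
    ("email usage and step count", date(2020, 2, 27), "Twitter", date(2020, 2, 24)),
    ("mobility", date(2020, 3, 13), "Twitter", date(2020, 3, 11)),
    ("mobility", date(2020, 3, 13), "Google searches", date(2020, 3, 11)),
    ("mobility", date(2020, 3, 13), "SafeGraph", date(2020, 3, 13)),
]

def lag_days(mobile_day, external_day):
    """Signed lag in days; negative means the external source changed first."""
    return (external_day - mobile_day).days

lags = {(stream, source): lag_days(m, e) for stream, m, source, e in pairs}
max_abs_lag = max(abs(v) for v in lags.values())

for (stream, source), lag in lags.items():
    print(f"{stream} vs {source}: {lag:+d} days")
print("largest absolute lag:", max_abs_lag)
```

Negative lags for Twitter and Google searches reflect that the online signals shifted slightly before the mobile sensing streams did.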

5 Discussion

Our case study’s changepoints lie within three days of behavioral change-points obtained from large-scale external sources such as Twitter, Google search trends, and foot traffic data from SafeGraph. Mobile sensing behavior effectively captured large-scale human behavior changes, and unlike relying on infection rates and COVID positivity, this method is cheap. It can quickly help local and state governments understand social distancing measures’ effectiveness. High smartphone adoption makes this an effective intervention-understanding strategy. Coupled with actual tests, this can help prepare local governments for possible outbreaks and create effective intervention measures. This approach can also be used to critically explore reopening strategies, since anonymized mobility data helps track group behavior effectively in a locality.

It is important to note that one-size-fits-all approaches might not work well in the context of interventions. Different communities might respond differently to the same intervention, and it might be necessary to try out multiple interventions to ensure compliance. For instance, for some communities, a simple intervention such as YouTube advertisements or text messages reminding people to maintain social distancing might suffice. At the same time, stringent interventions like a total lockdown might put a financial strain on the economy, causing huge monetary and job losses. Another aspect to keep in mind is privacy, since users are unlikely to adopt a technology if it risks leaking their personal information. Our hierarchical behavioral change framework allows us to simultaneously probe policy efficacy at the level of neighborhoods, cities, and states.

Despite the promising contributions of this work, several limitations remain to be addressed. For instance, while we tried to investigate seasonality effect in this article, the limited time window in which the data were collected (January 1, 2020, to April 30, 2020) made it challenging to analyze the multi-year seasonality effect. Another way to show that the data collected does not have seasonal attributes is to study correlations with external datasets that are unlikely to show seasonality. In our case, while mobility can show seasonal trends, it is unlikely for COVID-19-related tweets or Google search trends to show similar correlations. The co-occurrence of our change points with ones in COVID-related social media sources indicates that these change points might be more than seasonal variations.

Also, the SafeGraph data (collected by an independent external organization and covering a larger, more heterogeneous population) showed the same foot traffic behavior as the mobility markers from our dataset. The current framework is based purely on correlations with real-world events and does not claim to establish any causality. To validate the event correspondence, we show the relation to other external datasets that exhibit similar change points. In the future, we plan to explore more fine-grained data (e.g., surveys, Twitter/local search trends) from the users to understand event-specific change points.

6 Conclusion

In this work, we presented a framework to understand behavior change at different scales and evaluate public policy effectiveness amid the COVID-19 pandemic by extracting behavioral markers from a large mobile sensing dataset and verifying the changepoints against large-scale sources such as Twitter and SafeGraph. While our analysis shows promise towards mobile sensing usage for behavior monitoring and understanding public policy effectiveness in large-scale emergencies such as COVID-19, some questions remain unanswered. The lag between social media sources and our discovered behavior change-points could indicate that not all aspects of human behavior change simultaneously. For example, online behavior change might precede a change in mobility. We plan to investigate these questions in future work. We would also like to identify change-points in the infection dynamics and see how they relate to the ones obtained from our data. It may be useful to investigate the order of behavior changes that happened. In the current study, we first identify an increase in participants’ smartphone usage, followed by a change in their work environment (e.g., work from home), followed by a large reduction in mobility. We want to explore whether the same order in behavior changes will generalize to other large-scale emergencies. Our work shows that mobile sensing is a promising approach to monitor community behaviors, understand their dynamics, and detect any significant changes to external interventions.

Acknowledgments

We thank Akshit Goyal and Vishaka Datta for helpful comments during the editing of the draft.

References

[1] 2020. SafeGraph, a data company that aggregates anonymized location data from numerous applications in order to provide insights about physical places. To enhance privacy, SafeGraph excludes census block group information if fewer than five devices visited an establishment in a month from a given census block group. (2020).
