1. Introduction
Monitoring both “person-to-person” and “person-to-place” interactions is a critical issue for post COVID-19 reopenings [1,2]. Although person-to-person contact is a major factor in virus spread, recent studies have shown that a person can be infected even after the infected person has left the room [2,3]. When people share the same indoor space, close contact can cause viruses to spread via air, objects, or the floor, even after two to three days, if the recommended protective equipment is not used or disinfection is not carried out [4]. Geospatial information integrated into unified Internet of COVID-19 solutions plays an important role in monitoring the pattern of COVID-19 spread, considering both infected “people” and “places” as well as the duration of contact. Such unified geospatial-enabled IoT solutions can be leveraged to understand the impact of virus spread for handling outbreaks, as well as for timely resource planning and allocation [5] on a cross-organizational scale [6,7,8,9].
There are many ad hoc Internet of COVID-19 solutions for combating the COVID-19 pandemic that use various sensor-based technologies [7,10,11,12,13,14]. An important way to evaluate and limit the spread of COVID-19 using the IoT is through digital contact tracing solutions [14,15]. Digital contact tracing uses various combinations of close-range, proximity-based sensing technologies, such as smartphones, wearables [16], Bluetooth Low Energy (BLE) beacons [14], and positioning-based solutions [17] that use anonymous or randomly coded locations [11]. Regardless of the choice of technology, they all share the same goal: to identify and inform those who may have been exposed to the COVID-19 virus, or those who are in the high-risk category, so that they can take appropriate actions such as isolation, care, and treatment [18]. In addition to contact tracing apps, ongoing efforts are being made to monitor post COVID-19 measures using the IoT [10,11,12,13]. However, these ad hoc IoT solutions are unable to interoperate with each other because they are developed using different sensors, data models, communication protocols, and applications, with no interoperable way to interconnect these heterogeneous systems and exchange data.
The major goal of this research is to design, implement, and evaluate an interoperable, standard-based, scalable IoT architecture for integrating the disparate Internet of COVID-19 Things (IoCT). This paper proposes an effective post COVID-19 information system for evaluating transmission risk for both people and places using disparate IoT systems, e.g., proximity-based beacons or Global Navigation Satellite System (GNSS)-based tracking, camera-based COVID-19 risky behavior detection, and contextual indoor geospatial information. A low-cost, multi-sensor, real-time IoCT was deployed that can be rapidly applied to different COVID-19 workplace reopening scenarios such as schools, office management systems, and smart cities. The proposed IoCT was employed to identify and limit the risk pattern of COVID-19 transmission, especially within enclosed buildings. The risk of COVID-19 spread inside buildings from person-to-person and person-to-place interactions, taking into consideration different distances, durations, and types of activities (e.g., disinfecting activities), was modelled using the IndoorGML graph data model [19,20]. This research presents the innovative use of the Open Geospatial Consortium (OGC) [21] SensorThings Application Programming Interface (API) [22,23], as well as IndoorGML, which uses Poincaré duality to geo-reference IoT sensor observations for both 3D spaces and Node-Relation graphs in topology space. Our paper also argues that the integration of IndoorGML and the SensorThings API is critical for effective COVID-19 risk analysis and visualization. To the best of our knowledge, this paper is the first real-world implementation combining the SensorThings API (STA) and IndoorGML.
In order to validate the IoCT, an integrated COVID-19 solution was deployed and evaluated to monitor and analyze the risks of COVID-19 transmission in workplace reopening. For example, the following conditions may increase the risk of COVID-19 spread in an office room: the room was used and the density of people was not regulated; a sick person was present; or people were not following social distancing rules. The proposed IoCT is able to access the risk history of each room using BLE proximity, deep learning-enabled cameras, and smart audio sensors. If the risk of spread in some rooms is high, appropriate alerts are sent so that the actionable list of contaminated places can be shut down and disinfected in order to prevent further transmission. The proposed IoCT was deployed using hybrid edge and cloud computing. The Calgary Centre for Innovative Technology (CCIT) building (with an area of 9530 m²) located on the University of Calgary campus [24] was used for a real-life testing scenario. The outcome of this solution will be useful for the protection of building staff and visitors as it integrates information-based solutions for real-time situational awareness and early warnings. The IoCT improves both the quality and speed of pandemic emergency response by enabling IoT system interoperability and unlocking necessary information for real-time decision making. The use of open-source software, as well as the standards-based nature of this research, boosts its usability as an international tool during the COVID-19 pandemic.
In summary, the main contributions of this work are: (1) The innovative implementation of the SensorThings API and IndoorGML for analyzing indoor COVID-19 spreading risk patterns; (2) Deploying and validating a low-cost, standard-based, real-time IoCT for COVID-19 situational awareness that adheres to open IoT paradigms with interoperable agile access to individual COVID-19 sensor data; and (3) Evaluating person-to-place COVID-19 workplace reopening scenarios for the first time using an open geospatial-based IoT.
The remainder of this paper is organized as follows:
Section 2 presents background information on IoCT conceptual modelling using new trends in geospatial open standards;
Section 3 presents the architecture proposal for the IoCT platform;
Section 4 details the proof of concept of our architecture proposal using a workplace reopening scenario;
Section 5 discusses the experimental results of the IoCT with the use of various sensors; and finally, this paper finishes with conclusions and an overview of future work in
Section 6.
3. Proposed Interoperable IoCT System Architecture
The following architecture was proposed for the IoCT in order to design, implement, and evaluate a scalable, interoperable design for incorporating various sensors, geospatial data infrastructures, and healthcare information for post COVID-19 reopening applications.
Figure 5 shows the proposed IoCT architecture for interconnecting an Internet of heterogeneous COVID-19 system of systems with interoperable geospatial IoT technologies using OGC standards. The following sections summarize this architecture in three parts: (1) Sensor and Data Extract, Transform, and Load; (2) OGC-Based Cloud Data Management and Storage; and (3) the Application layer.
The first section describes the “Extract, Transform, Load” (ETL) architecture for geospatial sensor data and resource datasets. Disparate geospatial and IoT data sources are available for monitoring and studying COVID-19 spread. The coordination of a diverse range of data requires a comprehensive communication, integration, and interoperability model. Existing IoT systems operate within silos of information, APIs, and proprietary data formats. Firstly, the proposed architecture aimed to aggregate heterogeneous and real-time COVID-19 data streams by extracting data from heterogeneous data sources. There were two types of location-based information used for the IoCT: Positioning and Proximity. GNSS-based positioning accurately (within two to five metres on average) estimates the outdoor location of a wearable device. Most proximity sensors only provide closeness information with a range of no more than five metres from a position that is usually represented by a Bluetooth beacon. Location information was integrated into a smartphone app in an edge gateway device for computation and the transference of data onto the cloud. The other data source for monitoring workplaces came from available data streams from smart camera and audio sensors. Smart cameras and audio sensors were attached to a Jetson Xavier NX development kit [
49], which served as the edge computation device for deep learning (DL) computation and as the IoT gateway. Various sensor data streams were transformed by data cleaning and preparation for contact tracing queries and analytics. This vast amount of spatiotemporal data was then inserted into a Data Stream Management System (DSMS) in near real-time. After ETL, the sensor data loading modules that stream the disparate data sources into the cloud module can be developed using the OGC STA, an open geospatial IoT exchange standard. In the cloud data storage, these datasets need to be aggregated into a unified geospatial data model and encoding, namely the OGC IndoorGML hierarchy of indoor cell spaces.
The OGC-based cloud data management and storage section in
Figure 5 presents a cloud-native OGC standard-based IoT platform for people and place data management. A cloud-native architecture (a container-based environment) was designed and developed to enable distributed, scalable, and flexible management of and access to the IoCT datastores. Building a cloud-native architecture with open geospatial standards enables interoperability and scalability. The proposed cloud architecture was based on Amazon Web Services (AWS) and is capable of scaling out and up to handle high-volume, high-velocity, single or multiple real-time data streams and user access. The proposed IoCT architecture is geographically scalable and considers spatial indexing technology. This scalable IoT data cloud architecture was designed to be distributed and load balanced, without a single point of failure. Kubernetes, a container orchestration framework, and AWS Managed Services were used as the building blocks. To get real-time insights into data streams and prepare them for analytics, we designed enrichment functionalities using AWS Lambda functions that added location, semantics, metadata, collection method, or contextual information. To create an interoperable common operating picture for spatial data, we used OGC standards, data models, and encodings, in addition to the OGC STA, to connect not only different IoT platforms but also external geospatial applications and visualization tools. The OGC standard-based data records published in the AWS IoT Core were stored in Amazon DynamoDB, which functioned as a fully managed, scalable NoSQL database. The proposed cloud-native platform is able to support a flexible security model, thus allowing for a range of policies to be implemented. A security layer was also implemented in the cloud to support a centralized security model in order to integrate different design choices and cryptographic models as dictated by public health response. This integrated security layer worked with different systems and increased the system’s security with a cryptographic design which was not decoded for the cloud. Then, a publish/subscribe model for data delivery was developed, allowing for different levels of data access.
The last section discusses how two prototype applications were built based on the open geospatial architecture. Firstly, we demonstrated the interoperation of the Internet of disparate COVID-19 solutions and then contextualized them using open geospatial standards such as the OGC IndoorGML. Secondly, a new and unique geospatial algorithm was examined by building a person-to-place risk model for cleaning indoor spaces based on the colocation or co-movement patterns of people and places (e.g., a room) in an effective and interoperable way. For the visualization purpose of this research, the SensorUp Explorer developed by SensorUp Inc. was used and further developed as a spatiotemporal Web dashboard.
4. Experimental Design
This section discusses the experimental design related to a cleaning risk use case as the most important prevention activity in post COVID-19 workplace reopening with the use of the IoCT as a multi-sensor platform. To effectively integrate the multi-sensor system for cleaning risk analysis, a multi-criteria evaluation [
50] was applied to identify, and rank COVID-19 risky behaviors based on an available multi-sensor system in the CCIT building at the University of Calgary campus. Since 80 percent of the data used by the proposed IoCT system was geospatially related, Spatial Multi-Criteria Decision Analysis (SMCDA) provided a superior framework for a variety of decision-making situations [
51,
52,
53].
4.1. COVID-19 Risk Assessment Using IndoorGML
The SMCDA simultaneously represents decision spaces as well as criteria values based on attribute and geographic topology [
50]. For this research, topological relationships from the OGC IndoorGML dual graph were used for risk aggregation for the multi-sensor system. A scientific SMCDA process can be put in place using the different steps shown in
Figure 6. In order to initialize the decision-making process for this paper, equal weights for various risk criteria map layers were considered. This helped ensure fast implementation and quick proof of concept.
The main step for risk criteria assessment was determining the factors affecting the risk of COVID-19 spread based on information from existing studies from both the World Health Organization (WHO) [
27] and the Government of Canada website [
28]. The number of active virus particles present in a place was considered the most important factor for determining the risk of infection [
27]. SARS-CoV-2 transmission routes include airborne transmission via small droplets and transmission via larger droplets (which can survive for up to several days on different surfaces) [
54]. The term “viral load” will also be used to refer to the number of active virus particles present in a space. Virus particles live for different lengths of time, depending on a number of factors, the most significant one being surface material. Risk of infection for any particular IndoorGML cell space was modelled as the viral load within the space.
Assuming that a proportion of any average group of people is infected, the viral load within a space increases along with the number of people occupying it, the amount of time they spend in the space, and their actions within the space. Talking loudly, exercising, and coughing expel more droplets into the environment than other activities, and thus increase the viral load within the space. The virus also passes from surface to surface through touch, so touching surfaces without cleaning hands in between also increases the viral load within the space. To evaluate the different factors, the viral load was broken down into a hierarchy of smaller causes, similar to a root cause analysis. The following layers represent the respective criterion maps. Effective parameters were identified based on the sensors and data available in the implemented IoCT multi-sensor system. The viral load risk criteria are listed as follows:
Risk from Cleaning: Cleaning schedule reported on a smartphone app, based on the time elapsed since the previous cleaning. For this paper, each room was cleaned every 6 h, meaning that the risk is 0 immediately after cleaning and reaches its maximum of 1 after six hours. This layer is a spatiotemporal map layer comprised of OGC IndoorGML cells with values between 0 and 1.
Risk from Contact Tracing: Proximity tracing map extracted from beacons, which includes a trajectory map of traced people on an OGC IndoorGML graph. These trajectories show the location of the cleaner and the number of people in a place. If a person identifies himself or herself as a COVID-19 infected person, the historical trajectories can be used for the contact tracing map layer calculation.
Risk from People Density: Gathering restriction map from smart cameras, which includes the number of people over each IndoorGML node. This value ranges from 0 to 1 and is the number of people divided by the capacity of the room (which can be assigned or generated from the area property of an IndoorGML cell node). This information is reported online and aggregated once the room is cleaned.
COVID-19 Risky Behaviors: Risky behavior violation map, which includes the number of incidents or violations (number of people violating social distancing, hugging, people touching common surfaces and objects, talking loudly, exercising, coughing, and sneezing). This value is a weighted average of the risky behavior factors (number of violations) over the cleaning interval (normally six hours for each room). This layer was generated using smart cameras and audio sensors, based on the number of risky events detected by deep learning algorithms as described in the following sections.
As the COVID-19 pandemic has progressed, various criteria have been introduced and evaluated for COVID-19 spread risk [54,55]. The transmission routes of the coronavirus and their relative importance are still under debate, and more studies are underway to understand them better [56]. Although the risk calculation could be very complicated, depending on time, room volume, air circulation, etc. [57,58], the scope of our research is limited to a general risk assessment function, which is a simple weighted average. The COVID-19 viral load risk function was modelled using a weighted average of the risk criteria (as mentioned above) over each IndoorGML node (e.g., a meeting room). For the initial prototype calculation of risk, Equation (1) was used:
For simplicity's sake, we assigned each of the above map layers a similar weight and computed the risk factor over each IndoorGML node using an aggregation function.
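The equal-weight aggregation described above can be illustrated with a minimal sketch. It assumes that every criterion layer has already been normalized to the range 0 to 1 for a given IndoorGML node; the function names, the dictionary keys, and the linear ramp used for the cleaning layer are illustrative assumptions and are not taken from the deployed system.

```python
from typing import Dict, Optional

# Hypothetical keys for the four criterion layers listed above.
CRITERIA = ("cleaning", "contact_tracing", "people_density", "risky_behaviors")


def cleaning_risk(hours_since_cleaning: float, interval_h: float = 6.0) -> float:
    """0 right after cleaning, 1 once the cleaning interval has elapsed.

    The linear ramp between the two end points is an assumption.
    """
    return min(max(hours_since_cleaning / interval_h, 0.0), 1.0)


def node_risk(layers: Dict[str, float],
              weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of the criterion layers for one IndoorGML node."""
    if weights is None:
        weights = {name: 1.0 for name in CRITERIA}  # equal weights by default
    total = sum(weights[name] for name in CRITERIA)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * layers[name] for name in CRITERIA) / total


# Illustrative values for a single meeting room, three hours after cleaning.
room = {
    "cleaning": cleaning_risk(3.0),
    "contact_tracing": 0.0,
    "people_density": 0.75,
    "risky_behaviors": 0.25,
}
print(node_risk(room))  # 0.375
```

Passing a different `weights` dictionary reproduces the client-side reconfiguration of the risk function without changing the aggregation itself.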
However, this risk function can easily be configured by users on the client side. We therefore defined a set of different weights and evaluated them in Section 5.7. In this new risk model, the weights are as follows:
Risk from Cleaning: since cleaning is a measure of potentially unknown risks of COVID-19 transmission, such as airborne particles passing through the ventilation.
Risk from Contact Tracing: it is one of the strongest risks for the place.
Risk from People Density: people using the space or engaging in risky behaviors in the space are stronger indicators and receive a higher weight.
Risky Behaviors: we assumed the number of people to be the strongest factor because the prevalence of airborne virus is multiplicative in the number of people in the room, whereas risky behaviors are less frequent and therefore harder to weight.
Moreover, the above Equation (1) does not take into account the duration of time spent occupying the space, the actions taken, and the decay of the virus particles over time. The algorithm restarts the people count and cough numbers after the space is cleaned. A slightly more complicated set of equations (Equation (2)) expands on the simple risk calculation by taking into account the amount of time that people remain in a particular space, and the decay of the virus particles over time (assuming the worst-case scenario of 72 h for all of the particles deposited to become inactive).
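To make the idea of time-dependent risk concrete, the following sketch accumulates hypothetical "deposits" of viral load and lets each one decay to zero over 72 h, with cleaning resetting the space. It is not Equation (2): the linear decay shape, the `Deposit` record, and the unit of `amount` are all assumptions chosen only to illustrate the mechanism described above.

```python
from dataclasses import dataclass
from typing import Iterable

DECAY_HOURS = 72.0  # worst-case time for deposited particles to become inactive


@dataclass
class Deposit:
    time_h: float   # when the load was deposited (hours on a shared clock)
    amount: float   # hypothetical contribution, e.g. people x hours of occupancy


def remaining_fraction(age_h: float, decay_h: float = DECAY_HOURS) -> float:
    """Fraction of a deposit still active after age_h hours (linear decay assumed)."""
    if age_h < 0:
        return 0.0
    return max(0.0, 1.0 - age_h / decay_h)


def viral_load(deposits: Iterable[Deposit], now_h: float,
               last_cleaning_h: float = float("-inf")) -> float:
    """Sum the decayed deposits made after the last cleaning and up to now."""
    return sum(
        d.amount * remaining_fraction(now_h - d.time_h)
        for d in deposits
        if last_cleaning_h < d.time_h <= now_h
    )


history = [Deposit(time_h=0.0, amount=1.0), Deposit(time_h=36.0, amount=2.0)]
print(viral_load(history, now_h=54.0))                        # 1.75
print(viral_load(history, now_h=54.0, last_cleaning_h=10.0))  # 1.5
```

In this sketch a cleaning event simply discards earlier deposits, which mirrors the reset of people counts and cough numbers after a space is cleaned.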
For future research, we will consider different risk profiles (i.e., optimistic and pessimistic) for various user groups (e.g., pessimistic risk profile for COVID-19 vulnerable people).
4.2. Interoperable IoCT Using STA
In order to integrate multiple COVID-19 sensor systems, the OGC STA was used to support the interoperability between the sensing layer, cloud data management, and cleaning risk assessment application.
Figure 7 shows an example of the OGC STA data model being used in the cleaning scenario for a specific Thing—an IndoorGML cell.
Every IndoorGML cell has a Location in space and time. This geospatial encoding was performed by GeoJSON (Geographical JavaScript Object Notation) [
59]. Every sensor was referenced by the IndoorGML cell in which it was installed. Each Thing can have multiple Datastreams, which are collections of Observation entities grouped by the same ObservedProperty. For the cleaning use case, a different Datastream was used for each sensor’s phenomenon. Each Datastream contained a Sensor and an ObservedProperty. The Sensor refers to the instrument that can observe a phenomenon. For this paper, eight different Datastreams were defined, including proximity, density, and coughs. An ObservedProperty specifies the phenomenon, while the unit of measurement is recorded on the Datastream. A Datastream can have several Observations, which hold the values of the phenomenon and are encoded according to OGC Observations and Measurements (O&M). For our example, these are the values taken from sensor measurements. The FeatureOfInterest identifies the characteristics of the Thing: the Thing entity is an IndoorGML cell, and the FeatureOfInterest entity describes the characteristics of this cell. For example,
Figure 7 shows that the “Duration” spent by a smartphone user in a room, recorded within the proximity of a BLE beacon, is considered a Datastream entity that keeps the “Time” duration in seconds as an ObservedProperty. This Datastream entity used “Smartphones” as the Sensor entity to record Observations, which are the durations, in seconds, that users spend in each cell.
Table 2 lists all Datastream entities that were used together with their sensing profile, whereby each property indicates the type of format that was encoded. For this research, all of the observations were sent to the Amazon IoT Core using smartphones and the Jetson Xavier NX development kit [
49]. The next step was to map observations to an instance of the
OGC STA endpoint using the Amazon Lambda functions. Interested readers can see and test the JSON payloads that were used to send all eight types of observations in
Supplementary Materials.
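As an illustration of the entity model described in this section, the sketch below builds an STA-style JSON body for one IndoorGML cell (a Thing with a Location and the “Duration” Datastream) and one Observation. The entity and property names follow the OGC SensorThings API data model; the cell identifier, coordinates, `example.org` URIs, and `@iot.id` value are hypothetical placeholders, and the payloads actually used are those in the Supplementary Materials.

```python
import json
from datetime import datetime, timezone

GEOJSON = "application/vnd.geo+json"  # Location encodingType defined by STA v1.0
OM_MEASUREMENT = "http://www.opengis.net/def/observationType/OGC-OM/2.0/OM_Measurement"


def make_cell_thing(cell_id: str, lon: float, lat: float) -> dict:
    """Thing for one IndoorGML cell, with nested Location and Datastream."""
    return {
        "name": cell_id,
        "description": "IndoorGML cell space",
        "properties": {"indoorgml_cell_id": cell_id},  # hypothetical property
        "Locations": [{
            "name": f"{cell_id} centroid",
            "description": "Centroid of the cell geometry",
            "encodingType": GEOJSON,
            "location": {"type": "Point", "coordinates": [lon, lat]},
        }],
        "Datastreams": [{
            "name": "Duration",
            "description": "Time spent by a user near the cell's BLE beacon",
            "observationType": OM_MEASUREMENT,
            "unitOfMeasurement": {
                "name": "second", "symbol": "s",
                "definition": "https://example.org/units/second",  # placeholder
            },
            "ObservedProperty": {
                "name": "Time",
                "description": "Duration of presence in the cell",
                "definition": "https://example.org/properties/duration",  # placeholder
            },
            "Sensor": {
                "name": "Smartphones",
                "description": "Smartphone app detecting BLE beacons",
                "encodingType": "application/pdf",
                "metadata": "https://example.org/sensors/smartphone.pdf",  # placeholder
            },
        }],
    }


def make_observation(datastream_id: int, seconds: float, when: datetime) -> dict:
    """Observation linked to an existing Datastream by its @iot.id."""
    return {
        "phenomenonTime": when.isoformat(),
        "result": seconds,
        "Datastream": {"@iot.id": datastream_id},
    }


thing = make_cell_thing("Room326", -114.13, 51.08)  # illustrative coordinates
obs = make_observation(1, 930, datetime(2021, 1, 15, 10, 30, tzinfo=timezone.utc))
print(json.dumps(obs))
```

Nesting the Location and Datastream inside the Thing lets a single request create all related entities, while Observations are posted afterwards against the Datastream identifier returned by the service.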
5. Results and Discussions
5.1. Smartphone Cleaning App
Cleaning activities play an important role in reducing the risk of being exposed to COVID-19. Three types of user activities were defined for the purposes of this research: Working (i.e., the user is busy working), not working (i.e., the user is either a visitor or having time off), and cleaning (i.e., the user is a staff member who is either cleaning or disinfecting the room). As seen in
Figure 8a, these three different activities were taken into consideration by the mobile application and the user-selected types of activities were internally stored in their mobile phones.
We assumed that after the cleaning activity was carried out, the risk of any COVID-19 viral load being present returned to zero. Over time, interactions between users and the space, such as coughing, talking, and touching surfaces, would again increase each room’s risk (Equation (2)). If a cleaner specifies in the mobile app that cleaning is done, the room is marked as “cleaned” and the risk goes down to zero. Cleaning staff are trained according to the COVID-19 disinfecting rules and regulations enforced by the facilities, and they clean the room using advanced cleaning equipment (e.g., electrostatic sprayers) that kills 99% of viruses. This cleaning activity ensures that the virus is killed and that there is no chance of cross-contamination. It is reasonable to assume that the facilities will take as many precautions with cleaning as possible. However, if this assumption is not valid, the risk will increase over time, which complicates the calculations and increases virus spread and true-positive alarms. Taking cleaning activities into account resets the risk calculations for the final risk map and reduces false-positive COVID-19 notification alerts. In the future, we are going to evaluate standard-level cleaning activities for COVID-19 automatically using smart cameras. Furthermore, cleaning should include enhanced space ventilation, as airborne particles are remarkably decreased by adequate ventilation.
For this research, the virus transmission interval is assumed to be 15 min. In other words, if user A was interacting with a room that had been used by a COVID-19 positive person, user B, the system would notify user A of probable exposure to the virus. If cleaning took place after user B left the room, however, the risk of exposure from the infected place would be zero, and such a notification for user A would be a false positive. As a result, the proposed system can considerably reduce false positive notifications by taking different types of activities into account. A demo scenario of a cleaning person is presented in Supplementary Materials, and the trajectories of both building cleaners and visitors are shown in Supplementary Materials.
5.2. Proximity-Based Contact Tracing
For the purposes of this research, the third floor of the CCIT building was selected for an experiment. After extracting the related metadata, such as room names, from the IndoorGML, 12 Estimote Proximity beacons were spatially distributed among 12 different cell spaces. The contact tracing technique applied for this research was designed in a way that protects user privacy. The application detects the presence of users within the proximity zone of each beacon by considering the value of the Received Signal Strength Indicator (RSSI) of the signal broadcast by the beacons. The duration of the user’s presence in the proximity zone defined for each beacon and the corresponding date and time are the only information stored in the internal storage of mobile phones.
Figure 8b shows a screenshot of the developed mobile application for collecting different types of observations including BeaconID, time, date, and the duration that the target user spent in the proximal zone of each beacon. Assuming that the incubation period of COVID-19 is two weeks, the application will work as a background service that saves data internally for a two-week period.
In situations in which the user becomes a positive COVID-19 case, he/she can voluntarily share data captured within the past two weeks with the backend database management system. Amazon Cognito, an AWS product, was used to control user authentication and access to data storage. As shown in
Figure 8c, users are required to sign in/up for an Amazon Cognito account in order to share their information. After signing in as an authorized client, users can publish their internal information to the Amazon cloud as shown in
Figure 8d. All of the data related to the COVID-19 cases will be stored and managed in the DynamoDB database in the Amazon cloud. Our developed application was connected to the DynamoDB using another AWS product, the IoT Core. When new data is added to cloud storage, the contact tracing application will look for any matches between the backend data and the data stored internally in the user device. If it finds any matches that show that a confirmed COVID-19 positive case and the target user were close to each other for more than 15 min, the application will then notify the target user about potential exposure to COVID-19 and alert cleaning staff to disinfect the place. This process is shown in
Figure 9. A demo of people trajectories is shown in
Supplementary Materials.
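The matching step described above can be sketched as an interval-overlap check between the visits stored on the device and the visits shared by confirmed cases. This is a simplified illustration under stated assumptions: the `Visit` record and its field names are hypothetical, only co-presence at the same beacon is considered, and the 15 min threshold is the only value taken from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Visit:
    beacon_id: str     # beacon (IndoorGML cell) where the user was detected
    start: datetime    # beginning of the presence
    duration_s: int    # length of the presence in seconds

    @property
    def end(self) -> datetime:
        return self.start + timedelta(seconds=self.duration_s)


def overlap(a: Visit, b: Visit) -> timedelta:
    """Time both visits spent near the same beacon simultaneously."""
    if a.beacon_id != b.beacon_id:
        return timedelta(0)
    shared = min(a.end, b.end) - max(a.start, b.start)
    return max(shared, timedelta(0))


def exposed_beacons(local: Iterable[Visit], positive: Iterable[Visit],
                    threshold: timedelta = timedelta(minutes=15)) -> List[str]:
    """Beacons where the user overlapped a confirmed case for longer than the threshold."""
    positive = list(positive)
    return sorted({m.beacon_id for m in local for p in positive
                   if overlap(m, p) > threshold})


day = datetime(2021, 1, 15)
mine = [Visit("B326", day.replace(hour=10), 3600),
        Visit("B330", day.replace(hour=13), 1200)]
cases = [Visit("B326", day.replace(hour=10, minute=30), 1800),
         Visit("B330", day.replace(hour=13, minute=10), 1800)]
print(exposed_beacons(mine, cases))  # ['B326']
```

In the example the overlap at B326 is 30 min and triggers a notification, whereas the 10 min overlap at B330 stays below the threshold.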
There are various methods for indoor positioning, such as WiFi, BLE beacons, or dead reckoning. BLE technology is cost-effective compared to other indoor positioning techniques, which incur maintenance, installation, and cabling costs. Generally, Bluetooth devices cost ~20× less than WiFi devices and offer accuracy similar to WiFi [60].
In this paper, we focused on BLE proximity detection for contact tracing instead of precise positioning. Three categories of user location are of importance for this paper: immediate (less than 60 cm), near (1–6 m), and far (>10 m) distance of the Bluetooth receiver from the active BLE beacon. However, working with BLE signals remained a challenge because they are interfered with by structures: indoor setting and layout have direct effects on the radio waves used in Bluetooth technology. Another challenge was that different beacon types and battery states produce different signal strengths, so using one beacon library for all types of beacons was problematic.
In this paper, an active BLE beacon is placed in each IndoorGML cell (e.g., room). Moreover, we focus on proximity detection (i.e., immediate (within 0.6 m), near (within about 1–8 m), and far (beyond 10 m) distances from the active BLE beacon) to build indoor spatiotemporal trajectories using IndoorGML cell connectivity. We avoided having to determine the exact range by placing beacons carefully to prevent overlaps. In the context of COVID-19 spread, being within the immediate or near distance of an infected host is dangerous for coronavirus transmission (through droplet transmission). Accordingly, health organizations such as the WHO recommend keeping a distance of two meters from others. As a result, proximity detection is of more importance than precise positioning in the COVID-19 context; precise positioning would only increase the computation cost in this specific application. Describing an indoor location using an IndoorGML graph cell also helps with privacy. Considering privacy concerns about individual tracking, especially in indoor environments, we believe that proximity positioning respects user privacy more than precise positioning.
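For readers unfamiliar with RSSI-based proximity, the following sketch shows one common way to map a received signal strength to the immediate, near, and far zones. It uses the generic log-distance path-loss model rather than any vendor SDK; the calibrated power at 1 m (`tx_power_dbm`), the path-loss exponent, and the zone boundaries are assumed values for illustration and would need calibration per beacon type and building.

```python
def estimate_distance_m(rssi_dbm: float,
                        tx_power_dbm: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Log-distance path-loss estimate of the distance to a beacon.

    tx_power_dbm is the RSSI expected at 1 m (assumed calibration value);
    path_loss_exponent is about 2 in free space and higher indoors.
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def proximity_zone(distance_m: float,
                   immediate_max_m: float = 0.6,
                   near_max_m: float = 8.0) -> str:
    """Classify a distance into a proximity zone (boundaries are assumptions)."""
    if distance_m <= immediate_max_m:
        return "immediate"
    if distance_m <= near_max_m:
        return "near"
    return "far"


for rssi in (-39.0, -59.0, -89.0):
    d = estimate_distance_m(rssi)
    print(rssi, round(d, 2), proximity_zone(d))
```

Because walls and furniture distort the signal, such estimates are only suitable for coarse zone detection, which is consistent with using proximity rather than precise positioning.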
Depending on the size of the data, type of beacons, and network bandwidth, mobile proximity detection performance may differ. In our experiment, various beacons such as Estimote (
https://estimote.com/), Accent Systems (
https://accent-systems.com/) and Radius Networks (
https://www.radiusnetworks.com/) have been evaluated using the developed app on a Samsung Galaxy S9 smartphone. Our results demonstrated that the app could capture a beacon’s proximity in less than 60 milliseconds, which is sufficient for our case study. The complexity of the position determination depends on the beacon software development kit; however, the complexity is O(n) in the worst-case scenario. Concerning the duration spent in a room, we detected and recorded durations of less than five seconds when walking past beacons in a corridor. Durations of less than 15 min were not considered significant for COVID-19 risk, which was standard practice. Our sampling and recording intervals were therefore much finer than required for COVID-19 risk evaluation.
The mobile application publishes a JSON payload to the AWS IoT Core cloud data management system in one of two ways:
Online service: A single record showing the presence of a user in the proximity of an active BLE beacon is published to the AWS IoT Core.
Offline service: An array of records showing the user’s presence over a time window is published to the AWS cloud.
A JSON payload showing a single enriched proximity location captured by the developed smartphone app is shown in
Supplementary Materials. More information regarding the contact tracing app can be found in [
61].
5.3. Video-Based People Density
This section discusses the experimental design for camera surveillance used for counting people, that is, People Density or the number of people who entered or left a geofence polygon area. For indoor spaces, Physical Distancing rules result in restrictions on the number of people occupying a space. The input for the DL models was online video feeds from fixed cameras focused on the regions of interest defined as IndoorGML cells (e.g., rooms, corridors, lobbies, elevators, stairs, and coffee places). Some cameras might even be able to cover multiple regions of interest (IndoorGML cells), depending on where they are installed and whether the spaces are separated by glass walls or windows. An alarm can be triggered based on the number of people entering or exiting a region (identified in the camera image) if the number of people exceeds the capacity of that area. Moreover, the number of people violating physical distancing rules can be identified and reported to the IoCT.
For our cleaning use case demo (
Supplementary Materials), we considered a meeting room as an IndoorGML node (Room 326) with a four-person capacity. For this demo, the OGC IndoorGML was used as it offered the following advantages: IndoorGML cells were defined as the geofences; the geometry and area of each cell (geofence) were calculated; and the location of each IndoorGML cell (the centroid of the geofence) was used for the enrichment of the camera data. The number of people entering or exiting each cell was monitored. People in each frame were detected in real time using a pre-trained You Only Look Once (YOLO) model [
62] and the results were then published as an MQTT message to the AWS IoT Core. On the backend, the maximum number of people allowed in a cell, or cell capacity, was either assigned by the building management or calculated by dividing the cell area into squares with sides of six feet two inches. The “Gathering Restriction” (the number of people relative to the capacity of each IndoorGML node) was then calculated. This value ranges from 0 to 1 and is based on the number of people divided by the capacity of the room. Should the number of people exceed the cell capacity, a Gathering Restriction alarm would be generated for the cell. The following figure (
Figure 10) shows a frame of the meeting room, detected people, and Gathering Restriction alarm. The video demo of this scene is attached in
Supplementary Materials, which shows the online people count as people enter or exit the room.
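The capacity and Gathering Restriction calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the deployed backend: the function names are hypothetical, the cell area is assumed to be available in square metres, and clamping the ratio at 1 is an assumption made to keep the value in the stated 0–1 range.

```python
import math

# Side of one distancing square: six feet two inches (74 inches), converted to metres.
SQUARE_SIDE_M = 74 * 0.0254


def cell_capacity(area_m2, assigned=None, side_m=SQUARE_SIDE_M):
    """Return the capacity assigned by building management if given; otherwise
    divide the cell area into squares of side `side_m` (at least one person)."""
    if assigned is not None:
        return assigned
    return max(1, math.floor(area_m2 / (side_m ** 2)))


def gathering_restriction(people, capacity):
    """Return (value, alarm): people/capacity clamped to [0, 1] (clamping is an
    assumption) and an alarm flag raised when the count exceeds the capacity."""
    value = min(people / capacity, 1.0)
    alarm = people > capacity
    return value, alarm


if __name__ == "__main__":
    capacity = cell_capacity(area_m2=20.0)
    print(capacity, gathering_restriction(3, capacity))
    print(gathering_restriction(5, cell_capacity(0.0, assigned=4)))
```

An assigned capacity always takes precedence over the area-based estimate, mirroring the two options described for the backend.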
5.4. Video-Based Physical Distancing
Physical Distancing was monitored for each cell using a pre-trained YOLO model for detecting people in that cell. Relative distance was then calculated as follows: The pairwise distance between two people is the distance between the corresponding corners of their bounding boxes. In order to minimize the camera’s vanishing point effect, this distance was then compared to their bounding box diameters. If the distance was less than the longest diameter, it was assumed that the relative distance between those people violated the Physical Distancing rule. For the following example, the view from a fixed camera was divided into several polygons (geofences). This can result in the creation of separate geofences (indicated by the IndoorGML nodes if they were in the building) from the camera’s viewpoint. The number of people per geofence polygon and the number of times that people were closer than two metres were reported to the IoCT. The following figure (
Figure 11) shows a frame of multiple geofences in an outdoor area, the detected people, and the Physical Distancing violations. The video demo of this scene is attached in
Supplementary Materials which shows the people count online when they entered or exited the geofences, as well as the physical distancing violations. Outdoor geofences can be connected to the IndoorGML graph nodes.
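The relative-distance heuristic described in this section can be sketched as below. It is an illustrative reading of the text, not the authors' code: boxes are assumed to be `(x1, y1, x2, y2)` pixel tuples produced by a person detector, the top-left corner is assumed as the "corresponding corner", and the bounding box "diameter" is interpreted as its diagonal.

```python
import math
from itertools import combinations


def diagonal(box):
    """Length of the bounding box diagonal (interpreting 'diameter' as diagonal)."""
    x1, y1, x2, y2 = box
    return math.hypot(x2 - x1, y2 - y1)


def corner_distance(a, b):
    """Pixel distance between the top-left corners of two boxes (assumed corner)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def distancing_violations(boxes):
    """Return index pairs whose corner distance is below the longer diagonal.

    Comparing against the box size, rather than a fixed pixel threshold, makes
    the test scale with how far the people are from the camera.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        if corner_distance(a, b) < max(diagonal(a), diagonal(b)):
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    people = [(0, 0, 30, 40), (20, 0, 50, 40), (200, 0, 230, 40)]
    print(distancing_violations(people))
```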
5.5. Video-Based Risky Behavior Detection
Camera stream processing is a popular and quick way to detect objects. Human behaviors and actions can be detected as objects from the video frames using a trained deep learning model. For the detection of risky behaviors such as coughing, hugging, handshaking, and doorknob touching, You Only Look Once version 3 (YOLOv3), which is suitable for real-time behavior detection in online video streams, was trained and applied [
63,
64]. This model classifies and localizes detected objects in one step at a speed of more than 40 frames per second (FPS). We considered two main types of risky behaviors for COVID-19 indoor transmission: Group risky behaviors (e.g., hugging) and individual risky behaviors (e.g., coughing).
Figure 12 illustrates how to train a model for COVID-19 transmission risky behavior detection using YOLOv3.
In total, 603 images for coughing, 634 images for hugging, 608 images for handshaking, and 623 images for door touching were used from the COCO dataset [
62] for transfer learning with the pre-trained model (YOLOv3). These images were taken from free sources found through Google image searches. For labelling objects, a semi-automatic method was applied, and the Darknet library was used for training. For individual behaviors, all of the people in the images were detected and labelled in a text file, whilst the algorithm aggregated intersecting bounding boxes of people into a single bounding box. As wrong labels might be generated, the images should be manually checked to correct misclassified objects. For this step, 80 percent of the images were selected for training and 20 percent for testing. To increase the accuracy of this model, the configuration in
Table 3 was used.
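The aggregation of intersecting person boxes mentioned in the labelling step can be sketched as follows. This is one plausible reading under stated assumptions, not the authors' labelling tool: boxes are assumed to be `(x1, y1, x2, y2)` tuples, and intersecting boxes are merged repeatedly into their enclosing box until no two boxes overlap.

```python
def intersects(a, b):
    """True if two axis-aligned boxes overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def enclosing(a, b):
    """Smallest box that contains both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_intersecting(boxes):
    """Merge intersecting boxes until all remaining boxes are pairwise disjoint."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if intersects(merged[i], merged[j]):
                    merged[i] = enclosing(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the merged box may now touch others
    return merged


if __name__ == "__main__":
    print(merge_intersecting([(0, 0, 10, 10), (5, 5, 15, 15), (100, 100, 110, 110)]))
```

Labels produced this way would still need the manual check described above, since two nearby people who are not interacting can also have overlapping boxes.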
To increase training accuracy and speed, a transfer learning process was applied. The base layer is a pre-trained YOLOv3 that uses the COCO dataset for all of the layers of our model except the last. Transfer learning helps with training by exploiting the knowledge of a pre-trained supervised model to address the problems of small training datasets for COVID-19 risky behaviors [
65]. To evaluate the accuracy of the model, we checked the results on different video datasets by exporting all of the frames for detection under various circumstances, using the metrics listed in
Table 4.
After studying the outcomes, we found that the “hugging” and “handshaking” classes experienced the highest false negative results compared to coughing as the larger dataset was being prepared for training. It appeared that hugging and handshaking (group actions) were more varied in form. Therefore, training precision could be improved by preparing more varied data. Moreover, the false positive results for coughing showed that, in most cases, moving a hand near the face was detected as coughing, regardless of whether a cough had actually taken place. Furthermore, the number of false negatives increased in more populated areas. The results for detected touching behavior showed a high number of false negative cases. About 75 percent of false negative cases occurred when the predictor incorrectly detected small objects. Therefore, specifying limits on box sizes and on the confidence level of the predictor can reduce false negatives. The precision, recall, F-score, and number of samples for each behavior class are listed in
Table 5.
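For reference, the per-class metrics reported here follow the standard definitions, which can be computed from true positive, false positive, and false negative counts as sketched below. The function name and the example counts are illustrative only and are not taken from the paper's results.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard detection metrics from per-class counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F-score = harmonic mean of precision and recall.
    Zero denominators yield 0.0 rather than raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only.
    print(precision_recall_f1(tp=8, fp=2, fn=4))
```

A class dominated by false negatives, as reported for hugging and handshaking, shows up as low recall even when precision remains high.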
5.6. Audio-Based Risky Behavior Detection
This section examines an audio classification algorithm that recognizes coughing and sneezing using an audio sensor with an embedded DL engine. The methodology for audio detection is shown in
Figure 13. This figure shows the four main steps of the audio DL process. The recording first needs to be preprocessed for noise before being used for extracting sound features. The most commonly known time-frequency features are the short-time Fourier transform (STFT) [
67], Mel spectrogram [
68], and wavelet spectrogram [
69]. The Mel spectrogram is based on a nonlinear frequency scale motivated by human auditory perception and provides a more compact spectral representation of sounds when compared to the STFT [
3]. To compute a Mel spectrogram, we first convert the sample audio files into time series. Next, the magnitude spectrogram is computed and then mapped onto the Mel scale with power 2. The end result is a Mel spectrogram [
70]. The last preprocessing step is to convert the Mel spectrograms into log Mel spectrograms. The resulting images are then used as input to the deep learning modelling process.
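The preprocessing chain described above can be written compactly as follows. This is a generic formulation under stated assumptions, not necessarily the exact parameterization used in this study: $w$ is an analysis window of length $N$, $H$ is the hop size, $M$ is a Mel filterbank matrix, $\epsilon$ is a small constant that avoids taking the logarithm of zero, and the Hz-to-Mel mapping shown is one common definition among several.

```latex
\begin{align}
X(k, t) &= \sum_{n=0}^{N-1} x[n + tH]\, w[n]\, e^{-j 2\pi k n / N}
  && \text{(short-time Fourier transform)} \\
P(k, t) &= \lvert X(k, t) \rvert^{2}
  && \text{(magnitude spectrogram with power 2)} \\
m(f) &= 2595 \log_{10}\!\left(1 + \frac{f}{700}\right)
  && \text{(Hz-to-Mel mapping, one common form)} \\
S_{\mathrm{mel}}(b, t) &= \sum_{k} M_{b,k}\, P(k, t)
  && \text{(Mel spectrogram)} \\
S_{\mathrm{log}}(b, t) &= 10 \log_{10}\!\left(S_{\mathrm{mel}}(b, t) + \epsilon\right)
  && \text{(log Mel spectrogram)}
\end{align}
```

Here $k$ indexes frequency bins, $t$ indexes frames, and $b$ indexes Mel bands; the rows of $M$ are filters spaced uniformly on the Mel scale $m(f)$.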
Convolutional neural network (CNN) architectures use multiple blocks of successive convolution and pooling operations for feature learning and down sampling along the time and feature dimensions, respectively [
71]. The VGG16 is a pre-trained CNN [
72] used as a base model for transfer learning (
Table 6) [
73]. VGG16 is a well-known CNN architecture that uses multiple stacks of small kernel filters (3 by 3) instead of the shallow architecture of two or three layers with large kernel filters [
74]. Using multiple stacks of small kernel filters increases the network’s depth, which improves complex feature learning while decreasing computation costs. The VGG16 architecture includes 13 convolutional and three fully connected layers, i.e., 16 weight layers in total. Audio-based risky behavior detection is based on complex features and distinguishable behaviors (e.g., coughing, sneezing, background noise), which requires a deeper CNN model than a shallow architecture (i.e., a two- or three-layer architecture) offers [
75]. VGG16 has been adopted for audio event detection and has demonstrated strong results in the literature [
71]. The feature maps were flattened to obtain the fully connected layer after the last convolutional layer. For most CNN-based architectures, only the last convolutional layer activations are connected to the final classification layer [
76].
The ESC-50 [
77] and AudioSet [
78] datasets were used to extract cough and sneezing training samples. The ESC-50 dataset is a labelled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification. AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labelled, 10 s sound clips taken from YouTube videos. Over 5000 samples were extracted for the transfer learning CNN model and then divided into training and test datasets. We examined the performance of the trained CNN models on coughing and sneezing. The results are shown in
Table 7.
5.7. Risk Calculation and Visualization
To demonstrate risk calculation using Equation (2), we evaluated the proposed IoCT using the following cleaning use case scenario. In meeting room 326 of the CCIT building, the number of people increased as people entered the room, and this event was detected by a smart camera in the room. The number of people was shown online in the video frame and in the map visualization browser in green until the room capacity was reached. When the fourth person came in (the room capacity was assumed to be three), the alarm notification “Room exceeded capacity” was shown. After that, a person coughed in the meeting room, and this event was detected by both the smart camera and the audio sensors. A “Cough detected” notification was shown. Then, the person who coughed opened the door, and this event was detected by the smart camera. A “High-risk behavior detected” notification was shown. The risk profile at that moment exceeded the threshold of 0.7, and a notification was sent to the people in the room and to a cleaner. The color of the room polygon turned red, indicating high risk, and the room polygon was extruded (i.e., the polygon height increased) in proportion to the risk value. People then started to leave the room, which lowered the risk from People Density; however, the total risk value of the meeting room remained higher than before the risky behavior (i.e., the cough) took place. The cleaner closest to the room changed his activity status to cleaning (shown by an icon on the map) and moved towards the room (from the elevator to the room). The cleaner’s trajectory and the other people’s trajectories, extracted from the BLE beacons, were also visualized. After the cleaning activity, the room’s total risk level went back down to zero and the color of the room polygon changed back to green. The video demo of this scene is attached in the
Supplementary Materials, which shows the risk profile of the room. A sample screenshot of the
Supplementary Materials demo video is presented in
Figure 14.
To evaluate the impact of various weights assigned to different map layers, we used two sets of weights for map layer aggregations on the client side, as mentioned in
Section 4.1.
Figure 15 shows two risk profiles for room 326 over 40 min, from 20:00 to 20:40 on 11 June 2020.
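To illustrate how different weight sets can change a room's risk profile, the sketch below aggregates per-layer risk values with a normalized weighted sum and compares the result with the 0.7 notification threshold used in the scenario. Equation (2) and the weight values from Section 4.1 are not reproduced in this section, so the weighted-sum form, the layer names, and the numbers used here are assumptions for illustration only.

```python
RISK_THRESHOLD = 0.7  # notification threshold quoted in the use case scenario


def aggregate_risk(layer_risks, weights):
    """Normalized weighted sum of per-layer risks in [0, 1].

    The weighted-sum form is an assumption; the paper's Equation (2) may differ.
    """
    total_weight = sum(weights[name] for name in layer_risks)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * risk for name, risk in layer_risks.items())
    return weighted / total_weight


def needs_notification(risk, threshold=RISK_THRESHOLD):
    """True when the aggregated risk exceeds the threshold."""
    return risk > threshold


if __name__ == "__main__":
    # Hypothetical layer values after a cough in a room that is over capacity.
    layers = {"people_density": 1.0, "risky_behavior": 1.0, "physical_distancing": 0.0}
    equal = {"people_density": 1, "risky_behavior": 1, "physical_distancing": 1}
    behavior_heavy = {"people_density": 1, "risky_behavior": 2, "physical_distancing": 1}
    for w in (equal, behavior_heavy):
        risk = aggregate_risk(layers, w)
        print(round(risk, 3), needs_notification(risk))
```

With the same observations, the equal weights stay below the threshold while the behavior-heavy weights exceed it, which is the kind of sensitivity that comparing two weight sets is meant to expose.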
The precision, recall, and F-score of the video-based and audio-based risky behavior detection are listed in
Table 5 and
Table 7, respectively.
Table 8 reports the time performance of the developed functionalities (i.e., video-based people density, video-based physical distancing, video-based risky behavior detection, and audio-based risky behavior detection) on various platforms such as the Jetson NX, a laptop, and an Android smartphone. The performance of a deep learning engine is highly dependent on the graphics and computing processors. Therefore, the functionalities were also evaluated on a laptop with more powerful processing units: an NVIDIA GeForce RTX 2070 GPU (compute capability 7.5) and a Core i7 CPU. As expected, the performance on the Jetson NX was lower than on the laptop. Video-based risky behavior detection achieved the best performance values because it only involves the object detection task. Audio-based risky behavior detection segments the audio into specific time frames and converts them into spectrogram images, in which sound patterns are then detected using the VGG model; therefore, the processing time for audio is higher than for video object detection. Video-based people density and video-based physical distancing perform worse than simple object detection because of the added complexity of the tracking functions.
6. Conclusions
This paper presents an Internet of COVID-19 Things platform called IoCT, which offers two main contributions: (1) The design and development of a low-cost, real-time, comprehensive situational awareness platform for workplace reopenings after COVID-19; and (2) Interoperability through the open geospatial standards for indoor COVID-19 person-to-place risk assessment. In addition, the proposed platform can be applied to any kind of sensor and used with different applications.
The proposed IoCT platform offers an easy connection between software and hardware, which is necessary to achieve global-level situational awareness of the COVID-19 pandemic. At the software level, a cloud architecture was developed for the IoCT, and the Sensors incorporated in this study can be included in it with minimal effort. At the hardware level, it offers a plug-and-play connection, which will be explored in future research. It offers the possibility of scaling and of access to a large number of low-cost sensors (manufactured by different companies) with an interoperable IoT design using the OGC STA as a conceptual modelling layer on top of the AWS. Furthermore, it provides the option of expansion because of the many compatible components, the schematics of which are fully available.
In order to validate the proposed architecture, the IoCT sensor network was created and tested using multiple Things, Sensors, and Datastreams. Using the case of a scalable and connected COVID-19 IoT system, we deployed an interoperable sensorized platform to create a comprehensive picture for a post COVID-19 workplace reopening. A cleaning use case was developed for the University of Calgary campus for this purpose. The platform was developed using an Android smartphone and a Jetson NX, and used various sensors, including BLE, camera, and microphone. A network with 2 IoCT Things and 20 Sensors was successfully deployed. Each IoCT Thing was designed based on the IoT paradigm and can be considered a smart object that is permanently connected using the Internet Protocol.
Another benefit of using open standards is that they offer interoperable applications that facilitate access to data and reusability. A Web client was deployed to consume data through the OGC STA provided by the IoCT platform. The OGC STA offers easy and agile access to sensor data using IoT paradigms. Moreover, the OGC IndoorGML allows for the aggregation of various cameras and contact tracing systems that can work together in a common indoor risk model and exchange various data within the space model for risk calculations. The OGC IndoorGML model can also be used for various kinds of trajectory mining. The IoCT can be used to monitor person-to-place interactions in order to identify those who may have been in close contact with an infected person or with a virus-contaminated place. Moreover, the proposed system informs people so that they can take appropriate actions such as cleaning, social distancing, testing, isolation, or choosing safe pathways and locations. The proposed platform can improve both the quality and speed of pandemic emergency response by enabling IoT system interoperability and unlocking necessary information for real-time decision making, as well as accelerating new application development that is interoperable, scalable, and extensible.
Our future work will explore the interoperability between various BLE systems and standards to achieve plug and play contact tracing apps with various contextual information [
79]. Another area for future research would be applying different data analyses to the indoor trajectory data provided by the IoCT platform [
80]. For that, we would attempt to obtain different metrics for person-to-place scenarios using an aggregation of the camera and BLE sensors for trajectory estimation [
61]. This analysis will include spatial-temporal methodologies for real-time event detection using the deep learning module.