1 Introduction
Recent years have witnessed the advent of indoor localization systems harnessing the capabilities of smartphones [
40,
49,
53,
54,
55,
67]. Among these, WiFi-based systems build on the ubiquitous deployment of WiFi technology. To reach the intended accuracy level, fingerprint-based approaches have been proposed, as they have been shown to provide accurate, fine-grained positioning [
1,
2,
4,
6,
7,
13,
27,
28,
30,
39,
42,
45,
51,
62,
63,
68,
70]. Fingerprinting works in two phases: offline and online. During the offline phase, the received signal strengths (RSS) from the
access points (APs) installed in the area of interest are recorded at specific reference points by a cell phone carried by the site surveyor. The surveyor has to remain at each exact point for a relatively long time and manually tag the recorded data using a data collection application. The collected fingerprint is then leveraged to build a localization model that can be either probabilistic, e.g., Reference [
68], or machine learning-based, e.g., References [
1,
18,
47]. During the online phase, this model can subsequently be used to obtain a real-time estimate of the user’s location.
The data collection required for fingerprinting is time-consuming, laborious, vulnerable to environmental change, and unscalable, especially in large testbeds. To tackle this problem, several techniques have been proposed, including robots, additional sensors, computer vision, crowd-sensing, propagation models, and/or data interpolation techniques [
14,
22,
34,
57]. These approaches either do not account for the effect of humans, reduce the ubiquity and/or accuracy of the localization system, or have other limitations (e.g., they require a complex calibration process, raise privacy concerns, or require interaction from the users).
In this article, we propose
LiPhi++, a system for seamlessly enabling fingerprinting-based indoor localization without the associated data collection overhead inherent in the traditional fingerprinting method. The idea is to opportunistically leverage transportable
laser-range scanners (LRSs) (or LiDARs) in a user-transparent way to tag WiFi scans collected during the normal movement of building users without human intervention.
LiPhi++ can also construct the required fingerprint database and build the localization model with as few as one LRS. In addition, the LRSs can be temporarily deployed and then reused in other buildings (Figure
1), significantly reducing the overhead and cost of deploying a WiFi localization system.
Nevertheless, LiPhi++ must address a number of challenges, including handling instantaneous changes of the WiFi signals that could yield spurious points in the estimated traces, matching anonymous LRS traces (i.e., sequences of point labels) to the WiFi scans collected by the identified user, and constructing robust deep learning localization models. To this end, we introduce a novel iterative user trace refinement approach that uses temporal and spatial location smoothing and outlier detection to ensure the validity of the initial trace shape. This estimated user trace is then matched to the LRS trace, among those collected at the same time, that has the lowest cumulative pointwise distance. These automatically labeled user traces constitute a large (i.e., dense) fingerprint database that enables the effective training of deep learning models. Additionally, LiPhi++ includes provisions to ensure the generalization and robustness of the trained model against overfitting.
We implemented LiPhi++ on Android devices in two different testbeds. Our results show that LiPhi++ outperforms the state-of-the-art indoor positioning techniques in both testbeds by at least 284%. This accuracy is obtained without any fingerprinting overhead or user intervention along with robustness to temporal variations of the signals and infrastructure.
This article extends our earlier work in Reference [
48]. Specifically, we propose a new localization model (multimodal deep recurrent neural network) to provide more robust performance. This performance is achieved through training the localization model on a sequence of input scans rather than a single scan as in Reference [
48]. Therefore, the proposed localization model is designed to learn the underlying relationship between the signals received from WiFi access points as well as the temporal correlation (i.e., historical changes) between successive scans, leading to better localization performance. Furthermore,
LiPhi++ compensates for the temporal variations of the WiFi signals and the class imbalance problems through the use of both spatial discretization and augmentation techniques. The proposed model and its associated modules enhance the accuracy by 18.5% and 37.3% compared to our earlier method in Reference [
48] when tested with data collected at calibration time or a few months later, respectively.
The rest of this article is structured as follows: Section
2 gives a background on laser range finders and discusses practical issues facing WiFi-based fingerprinting localization. In Section
3, we provide an overview of
LiPhi++. Section
4 presents in detail the methodology proposed by
LiPhi++. In Section
5, we describe the data collection process and provide a detailed evaluation of the system. In Section
6, we discuss the literature most relevant to
LiPhi++. Finally, we conclude the article in Section
7.
2 Background and Motivation
In this section, we start with a background on laser range scanners. Then, we discuss issues that need to be addressed by traditional fingerprinting techniques.
2.1 Laser Range Scanners
LRSs are devices that detect surrounding objects using eye-safe lasers. A laser beam scans the scene in one or two dimensions and obtains an accurate distance at each angle with sub-decimeter error and at high frequency (e.g., 20 scans per second at each angle). As LRS units become less expensive and more popular [
23,
31,
65], they are increasingly practical in large indoor environments such as malls and museums. Our team has developed a LiDAR-based tracking system (Figure
1) and has deployed it in a shopping mall for the purpose of developing
LiPhi++. As shown in the figure, it is battery operated with a place-and-play feature, using a Raspberry Pi 3 and an LTE module. Therefore, the setup overhead of the LiDAR-based tracking system, which consists of placing the LiDAR in a specific location and marking that location on the map, is negligible. Using LiDARs enables the collection of large amounts of dense data, which facilitates the effective application of accurate, though data-hungry, solutions (e.g., deep learning).
2.2 Fingerprint Construction Overhead
In this section, we quantify the site survey overhead (in terms of time) that is incurred by a typical manual fingerprinting process. Given \(n\) reference points in the area of interest, a collection time of \(\tau\) minutes at each point, and a constant setup time \(\alpha\) for moving to a new point and tagging the location the surveyor is standing at, the site survey overhead is computed as \(n \times (\tau + \alpha)\) . For instance, constructing a fingerprint database for a shopping mall at 1,000 points with \(\tau = 5\) min and \(\alpha = 1\) min would require 100 hours of data collection, highlighting the massive overhead inherent in manual fingerprinting. Furthermore, this time cost has to be paid with every change in the testbed (e.g., moving an AP to a new location). In contrast, LiPhi++ requires zero extra overhead, as the fingerprint database is constructed opportunistically and transparently from the users of the building during their normal movements in the environment. Additionally, the LiDAR-based tracking system covers a large area and is easily transportable with negligible overhead.
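The overhead formula above can be checked with a short calculation. The sketch below is illustrative only; the function name survey_overhead_hours is an assumption introduced here and is not part of LiPhi++.

```python
def survey_overhead_hours(n_points: int, tau_min: float, alpha_min: float) -> float:
    """Manual site-survey time n * (tau + alpha), converted from minutes to hours."""
    return n_points * (tau_min + alpha_min) / 60.0


# Shopping-mall example from the text: 1,000 points, tau = 5 min, alpha = 1 min.
print(survey_overhead_hours(1000, 5, 1))  # 100.0
```

Because the cost is linear in \(n\), halving the grid spacing in a 2D area roughly quadruples the number of reference points and, with it, the survey time.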
2.3 Temporal Variations
The RSS values from an arbitrary AP at a given location have two types of variations:
Short-duration variations and
Long-duration variations. Short-duration variations are due to occasional environment variations, e.g., user’s movements, and lead to drastic variations in the RSS measurements received from some APs over a short time. These fluctuations usually lead to location estimates that deviate markedly from the preceding or subsequent estimates and can be considered outliers. In contrast, long-duration variations refer to attenuation of the signals due to lasting changes in the environment that permanently affect the quality of the fingerprint database on which the localization model is built. Figure
2 shows the RSS distributions of an arbitrary AP at the same location in two different months (July and November), illustrating how the distribution changes over time. This change can be seen as an inadvertent covariate shift [
58] and leads to a remarkable drop in the accuracy of the trained machine learning models [
16,
48,
58]. This effect is empirically quantified in Section
5. To tackle this issue, the fingerprint database would need to be rebuilt, which would involve the associated overhead cost estimated in the previous section. By leveraging the laser range scanners,
LiPhi++ can keep the fingerprint up-to-date, significantly reducing the effects of the temporal variation with zero extra overhead.
Moreover, the installed APs and/or their parameters (e.g., transmission power) may change over time due to maintenance, replacement, or the addition of new APs. These changes lead to variation in the AP density in the area of interest. For example, during the development of this work, the APs in the building of our lab were maintained/replaced, reducing the number of originally installed APs by 30%. This, in turn, negatively affects an already-trained traditional localization model, as it loses significant information and cannot take the new APs into account. This highlights the importance of continuously updating the fingerprint as in LiPhi++.
3 Problem Statement and System Overview
3.1 Problem Statement
Without loss of generality, we assume the user’s phone is tracked in a two-dimensional (2D) indoor environment \(\mathbb {L}\) containing \(m\) access points and \(q\) LRS devices. The user is at an unknown location \(l \in \mathbb {L}\) carrying a device that scans for the nearby APs. Let an arbitrary WiFi scan be represented as \(x_i=\lbrace x_{i1},\ldots , x_{in}\rbrace\) , where \(n \le m\) due to noise and AP fluctuation, and the \(j{\rm th}\) entry is the RSS measurement from the \(j{\rm th}\) AP in the \(i{\rm th}\) scan. In the offline phase, the problem is formally expressed as follows: Given a signal strength vector \(x_i=\lbrace x_{i1},\ldots , x_{in}\rbrace\) and the coordinates of a subset \(n_z \in n\) of APs with known locations (called reference APs), LiPhi++ seeks a rough estimate of the sequence of user locations, \(r = \lbrace r_1,r_2,\ldots , r_k\rbrace\) of length \(k\) , with coarse-grained (i.e., relatively low) accuracy. These coarse-grained estimates are then refined using a set of LRSs that are temporarily installed in the area of interest. However, the accurate LRS-based location estimates \(l = \lbrace l_1,l_2,\ldots , l_k\rbrace\) do not include information about which location belongs to which user. Therefore, LiPhi++ must correct the WiFi-based per-user locations with the accurate but user-anonymous LRS locations. This process yields the targeted automatically constructed fingerprint database. In the online phase, the problem becomes the following: Given an RSS vector \(x_i=\lbrace x_{i1},\ldots , x_{in}\rbrace\) , we aim at finding the location \(l_i\) that maximizes the probability \(P(l_i|x_i)\) . To answer this query, LiPhi++ trains a deep localization model with the constructed fingerprint database.
In the next sections, we discuss the details of how LiPhi++ builds the fingerprint database and permits robust localization in continuous space.
3.2 System Overview
Figure
3 shows the architecture of the proposed system.
LiPhi++ has two phases: an offline phase and an online tracking phase. In the offline phase,
LiPhi++ aims to
automatically construct a fingerprint database and a deep localization model. Toward achieving these goals, the system designer initializes the offline stage by feeding the floorplan layout of the environment with APs’ locations to the system.
To construct the required fingerprint database, the building users scan the WiFi measurements from the deployed APs in the area of interest using the
WiFi Scan Collector module in a crowdsourcing manner. This module is an application installed on the user’s phone. Note that these scans are collected without any manual intervention from the users. Hence, it does not require user feedback, unlike previous systems, e.g., Reference [
32]. The location of the collected scans can nevertheless be coarsely estimated by the
WiFi Trace Estimator module based on a propagation model as verified in Reference [
13]. Simultaneously, the deployed laser range scanners detect moving objects (i.e., users) in the considered environment. The LRS scans are further processed by the
LRS Trace Estimator module to obtain the sequence of pedestrian positions forming a trace. Then, the
Trace Matcher module is responsible for matching the estimated WiFi-based trace to the most similar LRS-based trace along the available walking paths and correcting it with that trace’s location tags. As a result, a fingerprint database is constructed from timestamped WiFi scans annotated with their corresponding LRS-based labels. This database is leveraged by the
Localization Model Builder to train a deep neural network, which is used later in the online phase.
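The matching step performed by the Trace Matcher can be illustrated with a minimal sketch. It assumes that the WiFi-based trace and every candidate LRS trace have already been time-aligned and resampled to the same number of points, and it uses the lowest cumulative pointwise Euclidean distance as the similarity criterion. The function names are hypothetical, and constraints such as the available walking paths are omitted.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cumulative_distance(trace_a: Sequence[Point], trace_b: Sequence[Point]) -> float:
    """Sum of Euclidean distances between time-aligned points of two traces."""
    return sum(math.dist(p, q) for p, q in zip(trace_a, trace_b))


def match_trace(wifi_trace: Sequence[Point],
                lrs_traces: List[Sequence[Point]]) -> Tuple[int, Sequence[Point]]:
    """Return the index and points of the LRS trace closest to the WiFi trace.

    Assumption: all traces cover the same time span and have equal length.
    The returned LRS points serve as corrected labels for the WiFi scans.
    """
    best = min(range(len(lrs_traces)),
               key=lambda i: cumulative_distance(wifi_trace, lrs_traces[i]))
    return best, lrs_traces[best]


# A coarse WiFi trace and two anonymous LRS traces observed at the same time.
wifi = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
lrs = [[(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)],
       [(0.0, 0.5), (1.0, 0.4), (2.0, 0.6)]]
index, labels = match_trace(wifi, lrs)
print(index, labels)
```

In this toy case the second LRS trace is selected, and its points replace the coarse WiFi-based locations as labels for the corresponding scans.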
During the online phase, the user carrying her phone at an unknown location scans for WiFi information from the detectable APs in the area of interest. These scans are then forwarded to the LiPhi++ server. The Location Estimator module feeds the data to the localization model constructed in the offline phase to estimate the current user location.
5 Evaluation
In this section, we evaluate the performance of
LiPhi++ in two real-world indoor testbeds whose details are presented in Table
1. The first one (denoted as Office) is a big office of 240 m
\(^2\) area at a service building in our university (Figure
12). It is a cluttered indoor environment that contains desks, whiteboards, and bookcases. The second one (denoted as Floor), shown in Figure
13, is a larger indoor testbed spanning a whole floor in another university campus with a 629 m
\(^2\) area containing several labs of different sizes and furniture placements, meeting rooms, offices as well as corridors.
First, we describe how the data are collected and the software used. Next, we study the effect of the different system parameters on LiPhi++’s accuracy. Finally, we compare the performance of LiPhi++ to three state-of-the-art localization systems.
5.1 Data Collection Setup and Tools
The data are collected with an Android application designed especially for this task. This application continuously scans for the nearby APs in the area of interest and records the information of each one including the current time, the MAC address (ID), and the corresponding signal strength (i.e., timestamp, ID, RSS). The scanning rate is set to 1 scan per second. Although a total of 136 and 52 APs are detected in the Floor and the Office testbeds, respectively, we use only five and eight reference APs, respectively, since those are the ones with known
a priori locations. LRS units are uniformly distributed over the area of interest and they cover around six rooms of the Floor environment.
Four LRSs are deployed along the periphery of the walls in the Office testbed. All LRSs are installed at the same height of 1.40 m, and user traces can be detected as visualized in Figure
5.
Test points were collected on a uniform grid with 1-m spacing using the traditional fingerprinting approach, for evaluation only. Note that LiPhi++ does not require any calibration or collection of data in the traditional fingerprinting manner to build the fingerprint database. The data were collected using several Android phones, including the Samsung Note8, HTC One X9, and Motorola Moto G5, among others, to capture the device-dependent characteristics of the WiFi measurements. The total number of samples that are transparently collected via crowdsourcing and automatically labeled at the Office and the Floor testbeds is 13 K and 208 K, respectively. The data augmentation module increases this number threefold. Then, 80% of the samples are used for training, and 20% are reserved for validation. Holdout test scans are collected on different days to show how the system performs in the presence of environmental changes over time. The test data were collected at 128 and 567 fingerprint points in the Office and the Floor testbeds, respectively, with 100 and 120 test samples per fingerprint point, respectively. We implemented our deep localization model using the Keras library, a high-level neural network API running on top of the Google TensorFlow framework.
5.2 Effect of Changing LiPhi++ Parameters
In this section, we evaluate the effect of the different parameters and factors that affect
LiPhi++ performance. In the following subsections, we show the effects of changing these parameters only on the Floor testbed for clarity of presentation. We report the optimal obtained parameters in Table
2. However, we report how
LiPhi++ performs in both testbeds in Subsection
5.2.8.
5.2.1 Effect of Virtual Grid Spacing.
Figure
14 shows the effect of changing the virtual cell spacing (i.e., reference point density) on the overall accuracy of the system and the corresponding time required by the
WiFi-based Trace Estimator module to provide a location estimate. The run-time is measured on a Lenovo Thinkpad X1 laptop with a 2.2-GHz Intel i7-8750H processor and 64 GB of RAM. The figure shows that, as expected, a smaller spacing between reference points yields higher localization accuracy with a negligible delay in the calculation time. Moreover, this computation is performed only during the offline phase and does not affect the real-time performance of
LiPhi++. Additionally, the figure shows that a reference point spacing of up to 0.5 m is enough to maintain the high accuracy of
LiPhi++. It is worth mentioning that at a grid spacing of 1.5 m, a marked relative drop in the system accuracy is observed, because at this spacing some virtual reference points fall on non-accessible locations.
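To make the role of the virtual grid concrete, the following sketch shows one possible way a propagation-model-based estimator could use it. It is not the estimator of LiPhi++: it assumes a log-distance path-loss model with hypothetical parameters p0 (RSS at 1 m) and gamma (path-loss exponent), and it picks the virtual reference point whose predicted RSS from the reference APs best matches the scan in the least-squares sense. All names are introduced here for illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def predicted_rss(ap: Point, point: Point, p0: float = -40.0, gamma: float = 3.0) -> float:
    """Log-distance path-loss model (assumed): RSS = p0 - 10 * gamma * log10(d)."""
    d = max(math.dist(ap, point), 1.0)  # clamp to the 1-m reference distance
    return p0 - 10.0 * gamma * math.log10(d)


def virtual_grid(width: float, height: float, spacing: float) -> List[Point]:
    """Virtual reference points on a uniform grid covering a width x height area."""
    nx = int(width / spacing) + 1
    ny = int(height / spacing) + 1
    return [(i * spacing, j * spacing) for i in range(nx) for j in range(ny)]


def coarse_estimate(scan: Dict[str, float], aps: Dict[str, Point],
                    grid: List[Point]) -> Point:
    """Grid point minimizing the squared error between measured and predicted RSS."""
    def mismatch(point: Point) -> float:
        return sum((rss - predicted_rss(aps[ap_id], point)) ** 2
                   for ap_id, rss in scan.items() if ap_id in aps)
    return min(grid, key=mismatch)


# Three hypothetical reference APs and a noise-free scan taken at (4, 6).
aps = {"ap1": (0.0, 0.0), "ap2": (10.0, 0.0), "ap3": (0.0, 10.0)}
scan = {ap_id: predicted_rss(loc, (4.0, 6.0)) for ap_id, loc in aps.items()}
print(coarse_estimate(scan, aps, virtual_grid(10.0, 10.0, 0.5)))
```

The number of grid points grows quadratically as the spacing shrinks, which is why a finer grid costs more computation per estimate. A practical grid would also exclude non-accessible cells, which this sketch does not do.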
5.2.2 Effect of Varying Density of Reference APs.
Figure
15 shows the effect of changing the AP density on the system accuracy. For this, we uniformly and incrementally removed APs from the eight reference APs present in the area of interest. The figure shows that the accuracy of the WiFi trace estimator module degrades as the number of available APs is reduced. However, even with as few as five reference APs,
LiPhi++ maintains a steady localization error of around 1 m. This high accuracy with a relatively low number of APs can be explained by two factors: first, the location resetting using the accurate LRS trace estimator; second, the data augmentation techniques, which help the model maintain its localization accuracy even at low AP densities. This highlights the robustness of
LiPhi++.
5.2.3 Effect of LRS-based Labeling.
In this section, we study the influence of training the localization model of
LiPhi++ using WiFi scans labeled by LRS as compared to labeling using the
WiFi-based Trace Estimator module only. Figure
16 shows boxplots of the localization error of the system in both cases. As expected, leveraging LRS gives a drastic improvement in median error (273.8%) compared to relying only on the coarse-grained WiFi-based labels. This can be attributed to the LRSs’ refinement of the training data, which significantly enhances the learning of the localization model and demonstrates the value of using the place-and-play LRSs in the
LiPhi++ system.
5.2.4 Number of Layers in the Network.
Deep learning is designed to provide a hierarchical learning ability that can be achieved through cascading different layers. Therefore, the number of layers of the deep network is one of the most influential hyperparameters for system performance. Figure
17 shows the effect of changing the number of layers on
LiPhi++ accuracy. Empirically, the figure shows that increasing the number of layers increases the accuracy, as deeper models have more parameters and better learning ability. The figure also shows that, beyond an optimal value of five layers, the model tends to overfit the training data, leading to an accuracy drop.
5.2.5 WiFi Estimator Smoothing Window.
Figure
18 shows the effect of varying the number of WiFi scans used to estimate a location by the WiFi-based estimator module (Section
4.2). The figure shows that the more scans are fed to the module, the better the localization accuracy, until it reaches an optimum at
\(v= 5\) beyond which degradation occurs. This can be explained by two opposing factors. (1) Increasing
\(v\) results in more information for location smoothing and outlier avoidance. (2) However, as
\(v\) increases, more time is spent collecting these samples, which may lead to locating the user at a previous location (i.e., latency in response). A balance is achieved at a window size of five scans, which leads to the best performance.
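The tradeoff governed by \(v\) can be illustrated with a simple sliding-window smoother. The sketch below is an assumption made for illustration, not the smoothing procedure of LiPhi++: it keeps the last \(v\) per-scan location estimates and returns their coordinate-wise median, which suppresses isolated outliers but lags behind a moving user as \(v\) grows. The class name WindowSmoother is hypothetical.

```python
from collections import deque
from statistics import median
from typing import Deque, Tuple

Point = Tuple[float, float]


class WindowSmoother:
    """Coordinate-wise median over the last v location estimates (assumed scheme)."""

    def __init__(self, v: int = 5) -> None:
        self.window: Deque[Point] = deque(maxlen=v)

    def update(self, estimate: Point) -> Point:
        """Add a new per-scan estimate and return the smoothed location."""
        self.window.append(estimate)
        xs = [p[0] for p in self.window]
        ys = [p[1] for p in self.window]
        return (median(xs), median(ys))


# One spurious estimate at (9, 9) among otherwise consistent estimates.
smoother = WindowSmoother(v=5)
for est in [(1.0, 1.0), (1.1, 1.0), (9.0, 9.0), (1.2, 1.1), (1.3, 1.0)]:
    smoothed = smoother.update(est)
print(smoothed)  # (1.2, 1.0)
```

With one scan per second, a window of five scans spans about 5 seconds of movement, which is the source of the latency described above.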
5.2.6 Effect of Data Augmentation.
Initially, 13 K and 208 K samples are automatically collected and labeled at the Office and the Floor testbeds, respectively. Then, the data augmentation module increases the amount of training data multiple times, enabling efficient utilization of deep learning models. Figure
19 shows the effect of leveraging the augmented data on the localization performance. The figure shows that data augmentation improves the
LiPhi++ performance compared to augmentation-free training by 66.2%. The figure also confirms that generating more training samples with the augmentation technique improves performance in real-world deployments, since the augmented samples implicitly simulate the inherent variation of the noisy wireless channel. Beyond a threefold increase over the original data,
LiPhi++ performance tends to saturate and then deteriorate as the noisy synthetic data become dominant relative to the original samples.
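The two augmentation methods named in this article, AP dropping and signal-shifting, can be sketched as follows. The parameter values (drop probability, maximum shift in dB, and the placeholder RSS for an undetected AP) are assumptions chosen for illustration and are not the values used by LiPhi++. Likewise, the factor argument reads "threefold" as the original samples plus two synthetic copies of each.

```python
import random
from typing import List, Sequence, Tuple

MISSING_RSS = -100.0  # assumed placeholder value for an undetected AP

Sample = Tuple[List[float], Tuple[float, float]]  # (RSS vector, location label)


def augment_scan(scan: Sequence[float], rng: random.Random,
                 drop_prob: float = 0.1, max_shift_db: float = 3.0) -> List[float]:
    """Apply one common signal shift to the scan and randomly drop detected APs."""
    shift = rng.uniform(-max_shift_db, max_shift_db)
    out = []
    for rss in scan:
        if rss == MISSING_RSS or rng.random() < drop_prob:
            out.append(MISSING_RSS)  # AP dropping (or AP already undetected)
        else:
            out.append(rss + shift)  # signal-shifting
    return out


def augment_dataset(samples: Sequence[Sample], factor: int = 3, seed: int = 0) -> List[Sample]:
    """Return the originals plus (factor - 1) synthetic copies of each sample."""
    rng = random.Random(seed)
    augmented = list(samples)
    for scan, label in samples:
        for _ in range(factor - 1):
            augmented.append((augment_scan(scan, rng), label))
    return augmented


data = [([-50.0, -100.0, -70.0], (1.0, 2.0)), ([-60.0, -65.0, -100.0], (3.0, 4.0))]
print(len(augment_dataset(data, factor=3)))  # 6
```

Each synthetic scan keeps the location label of the scan it was derived from, so the fingerprint database grows without any additional labeling.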
5.2.7 Sequence Length.
Figure
20 shows the performance of
LiPhi++ when varying the number of timesteps of the sequence that is fed to the RNN as an input. The figure depicts that as the input sequence gets longer, the positioning accuracy improves. This is because the localization model has more information (multiple scans) over time, which helps the model to avoid spurious samples generated due to temporal signal variations. The model saturates at an optimal value of five timesteps (i.e., 5 seconds), which yields the best performance. Note that since
LiPhi++ works with overlapping sequences, it provides an estimate every second (the scanning rate), enabling real-time tracking. Sequences longer than this may degrade the system performance, as the sequence will cover multiple user locations.
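The overlapping-sequence construction can be sketched as a sliding window with a stride of one scan. The helper name below is hypothetical, and the sketch only shows how consecutive scans are grouped into model inputs.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def overlapping_sequences(scans: Sequence[T], length: int = 5) -> List[Sequence[T]]:
    """Sliding windows of `length` consecutive scans with a stride of one scan."""
    return [scans[i:i + length] for i in range(len(scans) - length + 1)]


# Seven scans collected at 1 scan per second yield three 5-step input sequences.
print(overlapping_sequences(list(range(7)), length=5))
# [[0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6]]
```

After the first five scans have arrived, every new scan completes a new sequence, so one location estimate can be produced per scan.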
5.2.8 Performance in Different Testbeds.
In this section, we evaluate how the system would perform in two different testbeds: the Floor testbed and the Office testbed (Figures
12 and
13). The former is larger (629 m
\(^2\) ) with many rooms; therefore, it requires a wider coverage by LRSs, while the latter is smaller (240 m
\(^2\) ) and is an open area without inner walls. In the Floor testbed, six LRSs are leveraged to cover the whole area of 12 rooms incrementally, and the Office testbed has four LRSs, covering the entire area of interest. Figure
21 shows that
LiPhi++ obtains better performance in the Floor testbed than in the Office testbed. This can be explained by two reasons: (1) the number of considered APs (i.e., the input vector size) is larger in the Floor testbed (136 versus 52 in the Office testbed), which favors its performance as the model learns from more information, and (2) WiFi signatures are more location-discriminative in the Floor testbed due to the presence of walls and the richer multi-path environment.
5.2.9 Performance of Fine-tuning.
Figure
22 shows the localization performance of
LiPhi++ when training the localization model from scratch with the whole dataset compared to fine-tuning the already trained model with a few new samples collected later after some changes in the environment (typically 1,000 samples). The figure shows that
LiPhi++ achieves the same localization accuracy in the two cases. However, fine-tuning provides tremendous savings of training time in the online phase, as it takes as few as 13 epochs to converge, whereas training the model from scratch requires 1,225 epochs. This is because fine-tuning starts from good initial parameter values rather than from a random set of parameters.
5.3 Comparative Evaluation
In this section, we compare the accuracy of
LiPhi++ to one traditional fingerprinting technique that builds a denoising autoencoder for localization (WiDeep [
1]) and another probabilistic WiFi-based localization technique that automatically constructs the fingerprint database using labels obtained from BLE-devices (iBeacons), HybridLoc [
52]. For a fair comparison, all techniques are deployed in the same environment and trained on the same data (i.e., RSS data) collected from a total of 136 APs. Additionally, 10 iBeacons are installed in the environment, as required for HybridLoc’s operation and as reported in Reference [
52].
5.3.1 Location Accuracy.
Figures
23 and
24 show the CDF of distance error for the three techniques with temporal variations. Figure
23 shows the accuracy of the three systems when tested with a fresh fingerprint. Specifically,
LiPhi++ and WiDeep [
1] are matched in performance even though WiDeep [
1] is trained with manually fingerprinted data while
LiPhi++ is trained with automatically constructed fingerprints. This can be attributed to the accurate labeling obtained from the LRS-based trace estimator module and the well-designed deep localization model that considers the evolution of the data among consecutive WiFi scans. HybridLoc [
52] obtains the lowest performance, as BLE-based tracking provides coarse-grained labeling, which is not accurate enough for annotating WiFi scans. Additionally, it uses a probabilistic method that does not benefit from the advantages of deep learning methods.
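For readers who wish to reproduce this kind of plot, the empirical CDF of the distance error can be computed as in the sketch below. The helper names are hypothetical, and the percentile definition (the smallest error not exceeded by at least a fraction \(q\) of the samples) is one common convention rather than necessarily the one used in our figures.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def distance_errors(estimates: Sequence[Point], truths: Sequence[Point]) -> List[float]:
    """Euclidean distance between each estimated and ground-truth location."""
    return [math.dist(e, t) for e, t in zip(estimates, truths)]


def empirical_cdf(errors: Sequence[float]) -> List[Tuple[float, float]]:
    """Pairs (error, fraction of samples with error <= that value), sorted by error."""
    ordered = sorted(errors)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]


def percentile_error(errors: Sequence[float], q: float) -> float:
    """Smallest error such that at least a fraction q of samples do not exceed it."""
    ordered = sorted(errors)
    idx = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[idx]


truth = [(0.0, 0.0)] * 4
estimates = [(0.0, 0.0), (3.0, 4.0), (1.0, 0.0), (0.0, 2.0)]
errors = distance_errors(estimates, truth)  # [0.0, 5.0, 1.0, 2.0]
print(percentile_error(errors, 0.5), percentile_error(errors, 0.75))  # 1.0 2.0
```

A curve that rises faster toward 1 indicates that a larger share of the location estimates fall within a given error.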
Figure
24 shows how all systems would perform when tested four months later. The figure illustrates that our
LiPhi++ system improves localization performance under this condition by 284.7% and 418% compared to HybridLoc and WiDeep, respectively. This can be explained by the combination of the data augmentation methods (AP dropping and signal-shifting) in the training data and the adoption of different regularization techniques, which give
LiPhi++ greater flexibility and generalization ability than the other systems. Additionally,
LiPhi++ and HybridLoc [
52] obtained better performance as compared to WiDeep [
1] as they have provisions to update the fingerprint database and the localization model. Despite the high overhead spent to collect data for WiDeep [
1] as a traditional fingerprinting technique, its accuracy cannot be maintained without re-doing the arduous calibration process.
LiPhi++ outperforms the earlier work in Reference [
48], when tested with fresh and time-variant test data, by 18.5% and 37.3%, respectively. This can be attributed to the ability of the new localization model (multimodal deep recurrent neural network) to estimate the user location from a sequence of input scans rather than a single scan as in Reference [
48]. As a result, the localization model learns the underlying relationship between the RSSs received from APs and the temporal correlation (i.e., signal evolution) between successive scans, leading to better localization performance.
In summary, as shown in Table
3,
LiPhi++ is robust to variation over time, surpassing the other techniques. This highlights the promise of
LiPhi++ in enabling an accurate localization model with zero calibration overhead.
5.3.2 Time per Location Estimate.
We used a Lenovo Thinkpad X1 laptop running a 2.2-GHz Intel i7-8750H processor with 64 GB RAM for evaluating the end-to-end running time of the different techniques. Figure
25 shows the results. The figure shows that as
LiPhi++, Liphi (the earlier work in Reference [
48]), and WiDeep [
1] are all deep neural network-based systems, they need to pass the data through all the layers of the network. This takes more time than the traditional probabilistic technique proposed in HybridLoc [
52].
LiPhi++ needs less location-inference time than Liphi and WiDeep, as
LiPhi++ has fewer layers and neurons and, by extension, requires fewer calculations. Nevertheless, since the scanning rate is set to 1 scan per second, the running time of all techniques allows them to provide real-time location tracking.
5.3.3 Device Heterogeneity.
In this section, we evaluate the robustness of all systems to device heterogeneity. Initially, all systems are trained and tested with data collected by the same set of devices (i.e., Samsung Note 8, HTC One X9, Motorola Moto G5), which is shown in Figure
26. The figure also shows the performance of all systems when tested with data collected by a Google Pixel XL (i.e., a device not included in the training set), which has a completely different form factor and WiFi chip. HybridLoc [
52] provides acceptable adaptability to the device heterogeneity as it utilizes probabilistic techniques, which are known to perform well in the presence of uncertainty. WiDeep [
1] leverages denoising autoencoders and models the effect of device heterogeneity as additive noise, leading to remarkable robustness. The figure confirms that
LiPhi++ provides superior robustness to the device heterogeneity problem (approximately the same accuracy on the unseen test device as on the training devices). This is due to the combination of data augmentation (i.e., signal-shifting and the spatial discretizer) in the training data and the adoption of a recurrent neural network. The RNN implicitly learns the location from relative RSS values in the input sequence rather than the absolute RSS amplitudes in a single scan, which gives
LiPhi++ greater flexibility than the other systems.
6 Related Work
In this section, we discuss the most relevant literature.
6.1 Fingerprinting Systems
Fingerprinting systems [
9,
68] are the most popular localization techniques due to their high accuracy. In particular, the system in Reference [
9] employs deterministic matching using K-nearest neighbors, so that the unknown user location is assigned to the fingerprint location whose average RSS signature is closest to the measured one. However, deterministic techniques cannot handle the inherent noise and variations in the WiFi signal. In contrast, probabilistic techniques such as Reference [
68] adapt better to noise, as noise is usually modeled as an uncertainty phenomenon. In this case, the recorded fingerprints are the RSS histogram of each AP at each reference location and the user location is estimated based on Bayesian inference. In these techniques, the signals from different APs are considered to be independent to avoid the curse of dimensionality problem. This discards useful information and results in coarse-grained localization accuracy.
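Deterministic matching of this kind can be sketched in a few lines. The code below is a generic K-nearest-neighbor illustration under stated assumptions (Euclidean distance in signal space and an unweighted average of the K nearest fingerprint locations); it is not the implementation of Reference [9], and the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Fingerprint = Tuple[Point, Sequence[float]]  # (location, average RSS signature)


def knn_locate(scan: Sequence[float], fingerprints: List[Fingerprint], k: int = 3) -> Point:
    """Average the locations of the k fingerprints closest to the scan in RSS space."""
    nearest = sorted(fingerprints, key=lambda fp: math.dist(scan, fp[1]))[:k]
    x = sum(loc[0] for loc, _ in nearest) / len(nearest)
    y = sum(loc[1] for loc, _ in nearest) / len(nearest)
    return (x, y)


radio_map = [((0.0, 0.0), [-40.0, -70.0]),
             ((5.0, 0.0), [-70.0, -40.0]),
             ((2.0, 0.0), [-50.0, -60.0])]
print(knn_locate([-42.0, -69.0], radio_map, k=1))  # (0.0, 0.0)
```

Because the match relies on a single averaged signature per location, a noisy scan can easily select the wrong neighbors, which is the weakness noted above.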
In another direction, cameras are used to improve WiFi fingerprinting-based indoor positioning in Reference [
34]. However, this solution usually requires a complicated calibration process to adjust for camera scaling and perspective and may necessitate many permanently installed cameras to cover the whole area. Unlike camera-based solutions, a LiDAR-based solution is transportable, so it can be placed and operated seamlessly without tedious calibration. Additionally, camera-based solutions are not suitable for privacy-sensitive environments, which is not a concern with LiDARs.
Recently, different deep learning-based localization systems, e.g., References [
1,
5,
17,
36,
37,
38,
46,
49,
50,
63,
64] have shown better localization performance due to their ability to learn complex patterns and automatically extract discriminative features. Several deep learning architectures have been proposed in indoor positioning including Restricted Boltzmann Machines in DeepFi [
63], a deep convolutional neural network for CSI-based localization in Reference [
64] and stacked denoising autoencoders for each fingerprint reference point in Reference [
1]. The commonality between these techniques is that they depend on traditional fingerprinting and do not have provisions to reduce the data collection overhead. This is a major problem in deep learning-based systems, as they require large amounts of data to be properly trained, which directly translates to extra fingerprinting overhead to satisfy this requirement.
In contrast, LiPhi++ builds a deep learning-based localization model relying on a fingerprint database transparently constructed without explicit user participation. Additionally, LiPhi++ has provisions to boost the model’s robustness to noise.
6.2 Crowdsourcing Systems
Another line of research uses crowdsourcing to mitigate the calibration overhead required for constructing fingerprint databases. This can be done explicitly with user intervention [
33] or implicitly without user intervention [
3,
35,
61]. The system in Reference [
33] increases the fingerprint coverage by periodically asking the user to provide her current location. Although this method can, in theory, provide accurate fingerprints, it is intrusive to users and therefore impractical. Hence, the systems in References [
3,
35,
61] implicitly estimate the user location coarsely using dead-reckoning based on the inertial sensors of the users' smartphones. The corresponding WiFi scan is then tagged with the estimated location to form a fingerprint. Thereafter, the location estimate can be opportunistically refined using sensor-based landmarks or map-matching. However, inertial sensors in smartphones are noisy, leading to an error that grows over time and to missed opportunities to correct the estimates. To avoid the noisy inertial sensors, the system in Reference [
52] proposes a method to tag WiFi scans with locations obtained from BLE-enabled high-end smartphones. However, the method requires the area to be well covered with BLE beacons (e.g., iBeacons), which must be sensed by high-end phones to estimate the user location and thereby build a WiFi fingerprint database. While this method is feasible, it depends on site surveyors equipped with high-end phones and on dense BLE beacon coverage. Therefore, it cannot be considered a ubiquitous solution for every environment.
LiPhi++, in contrast, requires neither user intervention nor high-end devices nor permanently installed infrastructure. It only uses temporarily installed LRSs for constructing the fingerprint database.
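The growth of dead-reckoning error over time, which limits the implicit crowdsourcing systems discussed in this subsection, can be illustrated with a small sketch. It is not taken from any of the cited systems: the fixed step length, the straight-line ground truth, and the constant per-step heading drift are assumptions chosen only to make the effect visible.

```python
import math

def dead_reckon(start, headings_rad, step_len_m=0.7):
    """Integrate a path from per-step headings and an assumed fixed step length."""
    x, y = start
    path = [(x, y)]
    for heading in headings_rad:
        x += step_len_m * math.cos(heading)
        y += step_len_m * math.sin(heading)
        path.append((x, y))
    return path

n_steps = 100
# Ground truth: the user walks straight along the x-axis.
true_headings = [0.0] * n_steps
# Assumed sensor flaw: the measured heading drifts by 0.5 degrees per step.
drift_per_step = math.radians(0.5)
measured_headings = [h + drift_per_step * (i + 1)
                     for i, h in enumerate(true_headings)]

true_path = dead_reckon((0.0, 0.0), true_headings)
estimated_path = dead_reckon((0.0, 0.0), measured_headings)

# Position error (in meters) after each step.
errors = [math.dist(p, q) for p, q in zip(true_path, estimated_path)]
```

Because each step is added on top of the previous estimate, even a small heading error compounds: the entries of `errors` keep growing until an external correction (e.g., a landmark or map-matching) resets the estimate.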
6.3 Propagation-based Systems
The basic idea of propagation models is the use of signal strength measurements received from the APs at the user device to calculate the distance between those APs and the device [
12]. In particular, the stronger the RSS overheard from an arbitrary AP, the shorter the distance between the device and that AP. For example, the system in Reference [
9] proposes a free-space propagation model that is then extended by calculating the signal attenuation in complex indoor environments caused by different objects such as walls and furniture. The systems in References [
14,
22] synthetically build the radio maps of the different locations in 2D and 3D areas, respectively. To do that, these systems use some WiFi scans collected from the environment to calibrate the propagation model. This process usually incurs a high computational cost and cannot generalize well, since the model parameters are tightly coupled to the phone used for measurements. To handle the hardware dependency problem, IncVoronoi [
13] constructs a Voronoi diagram of the area of interest relative to the different AP locations. IncVoronoi then incrementally increases the confidence in the user's region by refining the Voronoi tessellation of the area of interest, while also handling hardware diversity.
Although propagation-based techniques do not, in general, require a site survey, they provide coarse-grained accuracy as compared to fingerprinting-based techniques.
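The RSS-to-distance relationship underlying propagation models can be sketched with the standard log-distance path-loss model. This is a generic textbook model, not necessarily the one used by the cited systems, and the reference power at 1 m and the path-loss exponent below are assumed values that would have to be calibrated per environment and device.

```python
import math

def rss_at_distance(d_m, rss_d0_dbm=-40.0, n=3.0, d0_m=1.0):
    """Predicted RSS (dBm) at distance d_m under the log-distance model.

    rss_d0_dbm: assumed RSS at the reference distance d0_m.
    n: assumed path-loss exponent (2 in free space, larger indoors).
    """
    return rss_d0_dbm - 10.0 * n * math.log10(d_m / d0_m)

def distance_from_rss(rss_dbm, rss_d0_dbm=-40.0, n=3.0, d0_m=1.0):
    """Invert the model: estimate the AP-to-device distance from a measured RSS."""
    return d0_m * 10 ** ((rss_d0_dbm - rss_dbm) / (10.0 * n))

# A stronger signal maps to a shorter estimated distance.
near = distance_from_rss(-50.0)
far = distance_from_rss(-70.0)
```

The dependence on `rss_d0_dbm` also shows why such models are tied to the measuring phone: a device that reports every RSS a few dB lower shifts all distance estimates unless the model is recalibrated.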
6.4 LRS-based Systems
Several systems have been proposed that leverage LRSs in indoor applications. In Reference [
11], an indoor navigational system for a robot is built based on WiFi for localization in addition to an embedded LRS for enhancing the position estimates and avoiding obstacles.
LiPhi++, in contrast, does not assume that every user's phone is equipped with LRSs. Another research direction aims to track pedestrians in indoor environments based on LRSs, as proposed in References [
19,
60]. Although LiDAR is a promising technology for accurate user tracking, it cannot identify the tracked person. To handle this issue, the system in Reference [
59] leverages mobile phone inertial sensors to estimate the user trajectory using the
dead-reckoning (DR) approach. Then, the system matches the LRS-based trajectory to the DR-based trajectory to identify the user. However, relying on the noisy onboard inertial sensors in consumer phones leads to large position errors and erratic estimated trajectories that cannot be easily matched. Additionally, the system leverages LRSs for tracking purposes and requires inertial sensors (which exist only in high-end smartphones) for identification purposes. This incurs high deployment costs and limits its ubiquitous adoption.
In contrast, LiPhi++ requires neither noisy inertial sensors nor permanently deployed LRSs. LiPhi++ only uses LRSs temporarily during fingerprint database construction/maintenance, leading to substantial cost savings. Additionally, it provides an accurate WiFi localization system with accuracy similar to that of traditional (i.e., manual) fingerprinting techniques with virtually zero data collection overhead.