GL-Tree: A Hierarchical Tree Structure for Efficient Retrieval of Massive Geographic Locations
Abstract
1. Introduction
1.1. Overview
1.2. Research Objectives and Motivation
1.3. Proposal Overview
1.4. Paper Outline
2. Related Work
2.1. Research Related to Location Privacy Protection
2.2. Spatial Indexing Techniques in Location-Based Applications
2.3. Application of Spatial Indexing Techniques in Location Privacy Protection
2.4. Comparative Analysis
3. Geohash Encoding
3.1. The Idea of Geohash Encoding
3.2. Example of the Geohash Encoding Process
- (1)
- The latitude interval (−90, 90) is iteratively bisected, and each round produces one bit that records which half contains the value. After 20 rounds, the latitude value 39.965391 is encoded as the 20-bit binary string “10111000110101101111”, as shown in Table 1. Similarly, bisecting the longitude interval (−180, 180) for 20 rounds encodes the longitude value 116.364017 as the 20-bit binary string “11010010101111110110”.
- (2)
- We cross-merge the latitude and longitude binary strings from (1) bit by bit, from left to right. With the leftmost position of the merged string indexed as 0, the even-indexed bits hold the longitude bits and the odd-indexed bits hold the latitude bits. This yields the combined 40-bit binary string “1110011101001000110110111011111001111101”.
- (3)
- We split the combined 40-bit binary string into eight groups of five consecutive bits and convert each group into a decimal number. The result of the conversion is 28 29 4 13 23 15 19 29. Mapping these decimal numbers to characters with base32 encoding gives the final Geohash code, “wx4ergmx”.
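Step (3) can be checked with a few lines of Python. This is a minimal sketch, not the authors' implementation; the helper name `bits_to_geohash` is hypothetical, and the alphabet is the standard Geohash Base32 character set.

```python
# Standard Geohash Base32 alphabet (digits, then letters without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def bits_to_geohash(bits: str) -> str:
    """Split a bit string into 5-bit groups and map each group to a Base32 character."""
    if len(bits) % 5 != 0:
        raise ValueError("bit string length must be a multiple of 5")
    groups = [bits[i:i + 5] for i in range(0, len(bits), 5)]
    return "".join(BASE32[int(group, 2)] for group in groups)


# The combined 40-bit string from step (2).
merged = "1110011101001000110110111011111001111101"

# Decimal value of each 5-bit group.
decimals = [int(merged[i:i + 5], 2) for i in range(0, len(merged), 5)]
print(decimals)                  # [28, 29, 4, 13, 23, 15, 19, 29]
print(bits_to_geohash(merged))   # wx4ergmx
```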
3.3. Geohash Encoding Algorithm Pseudocode
Algorithm 1: Geohash Encoding Algorithm |
1: Input: , range of geographic coordinates of the earth; , an array of all characters encoded in BASE32; , the length of the Geohash-code output by the algorithm; , the Location that need to be converted to Geohash-code, is latitude, is longitude. |
2: Output: , Geohash code of loc. |
3: |
4: |
5: |
6: if is an odd number do: |
7: |
8: end if |
9: |
10: |
11: |
12: while do: |
13: |
14: if do: |
15: |
16: |
17: else |
18: |
19: |
20: end if |
21: ++ |
22: end while |
23: |
24: while do: |
25: |
26: if do: |
27: |
28: |
29: else |
30: |
31: |
32: end if |
33: ++ |
34: end while |
35: |
36: |
37: for do: |
38: if i is an odd number do: |
39: |
40: else |
41: |
42: end if |
43: end for |
44: |
45: |
46: |
47: for do: |
48: |
49: end for |
50: return |
- (1)
- In lines 3–8, the algorithm initializes the lengths of the latitude and longitude binary strings according to the required length of the Geohash code.
- (2)
- In lines 9–34, the algorithm defines two empty sets that record the binary characters obtained from the latitude and the longitude of the location, respectively, according to the Geohash encoding conversion method.
- (3)
- In lines 35–43, the algorithm cross-merges the latitude and longitude binary strings bit by bit, from left to right. Consistent with the example in Section 3.2, where the leftmost position has index 0, a character ‘1’ or ‘0’ is taken in order from the longitude binary string when i is even (including 0), and from the latitude binary string when i is odd.
- (4)
- In lines 44–49, the algorithm splits the merged binary string, from left to right, into segments of five bits. Each segment is converted into a decimal integer, which is used as an index to look up the corresponding character in the base32 character array. The concatenated characters form the Geohash code.
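The four stages described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a transcription of Algorithm 1: the function names `bisect_bits` and `geohash_encode` are hypothetical, positions in the merged string are 0-based with longitude at the even positions (the convention of the worked example in Section 3.2), and longitude receives the extra bit when the total number of bits is odd.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def bisect_bits(value: float, low: float, high: float, n_bits: int) -> str:
    """Encode a coordinate as n_bits bits by repeatedly halving [low, high]."""
    bits = []
    for _ in range(n_bits):
        mid = (low + high) / 2
        if value >= mid:
            bits.append("1")   # value lies in the upper half
            low = mid
        else:
            bits.append("0")   # value lies in the lower half
            high = mid
    return "".join(bits)


def geohash_encode(lat: float, lon: float, length: int = 8) -> str:
    """Return the Geohash code of (lat, lon) with the given number of characters."""
    total = 5 * length
    n_lon = (total + 1) // 2   # assumption: longitude gets the extra bit if total is odd
    n_lat = total // 2
    lon_bits = bisect_bits(lon, -180.0, 180.0, n_lon)
    lat_bits = bisect_bits(lat, -90.0, 90.0, n_lat)

    # Cross-merge: even positions take longitude bits, odd positions latitude bits.
    merged = "".join(
        lon_bits[i // 2] if i % 2 == 0 else lat_bits[i // 2] for i in range(total)
    )

    # Every five bits index one Base32 character.
    return "".join(BASE32[int(merged[i:i + 5], 2)] for i in range(0, total, 5))


print(geohash_encode(39.965391, 116.364017, 8))  # wx4ergmx
```

Because each bisection round only refines the previous one, a shorter code for the same coordinates is a prefix of the longer one, for example `geohash_encode(39.965391, 116.364017, 6)` returns `"wx4erg"`.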
3.4. Analysis of the Geohash Encoding Process
- (1)
- Geohash encoding is performed by iteratively dichotomizing the latitude and longitude intervals on the Earth’s surface, gradually narrowing down the intervals to keep approximating a geographic coordinate position. Therefore, a specific two-dimensional spatial location coordinate can be encoded into a unique Geohash-encoded string. However, a Geohash-encoded string is a representation of a rectangular location region, not a specific location point. Different encoding lengths correspond to different sizes of segmentation areas, and the shorter the encoding length, the larger the area represented. Table 2 shows the variation of latitude, longitude, and geographical extent of the spatial rectangular interval represented by the code as the length of the Geohash code increases.
- (2)
- If two different Geohash codes share a string prefix, the geographical intervals corresponding to these two codes are two different subintervals of the interval corresponding to their common prefix. For example, the following nine Geohash codes: “wx4ergmq”, “wx4ergmw”, “wx4ergmy”, “wx4ergmr”, “wx4ergmx”, “wx4ergmz”, “wx4ergt2”, “wx4ergt8”, and “wx4ergtb”, which share the six-character prefix “wx4erg”, all belong to Geohash subintervals of the region encoded as “wx4erg”.
- (3)
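- Each time the length of the Geohash code increases by 1, the geographic interval corresponding to the code is divided into 32 grid intervals of the same size. Because each character carries five bits, the 32 cells are arranged as 8 divisions along longitude by 4 along latitude for characters at odd positions (counting from 1), and as 4 divisions along longitude by 8 along latitude for characters at even positions, as shown in Figure 1a,b.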
- (4)
- The coded characters fill the space along a “Z”-shaped curve, so characters that are adjacent in the encoding generally correspond to grid cells that are spatially close. However, there are abrupt changes in position. As shown in Figure 1, for the character pairs “7” and “8”, “g” and “h”, and “r” and “s”, the characters are adjacent in the encoding but the corresponding cells are far apart.
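The relationship between code length, cell size, and shared prefixes can be illustrated with a short sketch. The function names are hypothetical, and cell sizes are given in degrees only; the sketch assumes the standard Geohash bit allocation in which longitude takes the first bit of the merged string.

```python
from typing import List, Tuple


def bit_split(length: int) -> Tuple[int, int]:
    """Number of (longitude, latitude) bits in a Geohash code of the given length."""
    total = 5 * length
    return (total + 1) // 2, total // 2


def cell_size_degrees(length: int) -> Tuple[float, float]:
    """Width (longitude) and height (latitude), in degrees, of one cell."""
    n_lon, n_lat = bit_split(length)
    return 360.0 / 2 ** n_lon, 180.0 / 2 ** n_lat


def split_per_character(position: int) -> Tuple[int, int]:
    """Divisions along (longitude, latitude) added by the character at a 1-based position."""
    lon_now, lat_now = bit_split(position)
    lon_prev, lat_prev = bit_split(position - 1)
    return 2 ** (lon_now - lon_prev), 2 ** (lat_now - lat_prev)


def common_prefix(codes: List[str]) -> str:
    """Longest prefix shared by all codes, i.e. the code of the enclosing cell."""
    prefix = []
    for chars in zip(*codes):
        if len(set(chars)) != 1:
            break
        prefix.append(chars[0])
    return "".join(prefix)


print(cell_size_degrees(1))     # (45.0, 45.0)
print(cell_size_degrees(2))     # (11.25, 5.625)
print(split_per_character(1))   # (8, 4)
print(split_per_character(2))   # (4, 8)
print(common_prefix(["wx4ergmq", "wx4ergmx", "wx4ergt2", "wx4ergtb"]))  # wx4erg
```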
4. GL-Tree
4.1. Structure of GL-Tree
- (1)
- The GL-Tree is composed of many L-Trees. The L-Tree is a four-layer structured tree, the structure of which is shown in the double-dotted rectangular box in Figure 2. The L-Tree is labeled as L-level according to the level located in the GL-Tree. If an L-Tree is considered a node, the GL-Tree is a multi-branch tree structure composed of these nodes.
- (2)
- Each L-Tree consists of four layers. From the top down, these are the root node, the first-layer middle node, the second-layer middle node, and the leaf node.
- (3)
- Each L-Tree has only one root node. The root node is a four-tuple consisting of the level of the L-Tree in the GL-Tree, the prefix string of the Geohash code corresponding to this L-Tree, a pointer to the middle node at the top layer of the L-Tree, and a pointer to the root node of the previous L-Tree in the GL-Tree.
- (4)
- The middle node of the L-Tree is divided into two layers. Each node sets the keywords for branch retrieval. Each keyword corresponds to one branch. The number of keywords in the first layer of nodes is no more than four, and the number of keywords in the second layer of nodes is no more than two. A keyword is a character in Base32 encoding. That is, each node has at most four or two keywords represented by Base32 characters. These characters are arranged left to right on the middle node in the order they are in the Base32 encoding.
- (5)
- The branch corresponding to a keyword is connected by a pointer p, which points to the next middle node or leaf node in the L-Tree.
- (6)
- In a middle node of the L-Tree, each keyword is the largest character in the next node pointed to by its pointer p.
- (7)
- The leaf nodes of the L-Tree are used to store the list of geographic locations. All locations can only be stored in the leaf nodes of the lowest level L-Tree of the GL-Tree, or in the leaf nodes of the L-Tree at any level, according to the actual application requirements and user location privacy needs.
- (8)
- Each leaf node consists of at most four data items and two pointers, one to the preceding leaf node and one to the following leaf node. These pointers connect all the leaf nodes in each L-Tree into a doubly linked list sorted in keyword order. A head node is set as the first node of this list.
- (9)
- Each data item of the L-Tree is a three-tuple consisting of a keyword, a list of geographical locations, and a pointer to the L-Tree at the next level of the GL-Tree.
- (10)
- Geographic locations are stored in the location list of a data item, and a given location is stored only once in the same data item.
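The node types listed above could be modelled as follows. This is only a sketch: all class and field names are hypothetical, since the paper's own symbols are not reproduced here, and the "previous L-Tree" pointer in the root is assumed to refer to the upper-level L-Tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

Location = Tuple[float, float]  # (latitude, longitude)


@dataclass(eq=False)
class DataItem:
    keyword: str                                          # one Base32 character
    locations: List[Location] = field(default_factory=list)
    next_ltree: Optional["LTreeRoot"] = field(default=None, repr=False)

    def add(self, loc: Location) -> None:
        """Store a location at most once in this data item."""
        if loc not in self.locations:
            self.locations.append(loc)


@dataclass(eq=False)
class LeafNode:
    items: List[DataItem] = field(default_factory=list)   # at most four data items
    prev: Optional["LeafNode"] = field(default=None, repr=False)
    next: Optional["LeafNode"] = field(default=None, repr=False)


@dataclass(eq=False)
class MiddleNode:
    # keywords[i] is the largest character reachable through children[i]
    keywords: List[str] = field(default_factory=list)
    children: List[Union["MiddleNode", LeafNode]] = field(default_factory=list)


@dataclass(eq=False)
class LTreeRoot:
    level: int                                            # level in the GL-Tree
    prefix: str                                           # Geohash prefix of this L-Tree
    top: Optional[MiddleNode] = None                      # first-layer middle node
    previous: Optional["LTreeRoot"] = field(default=None, repr=False)


# A level-6 L-Tree for the prefix "wx4er" holding one data item with keyword "7".
item = DataItem(keyword="7")
item.add((39.965391, 116.364017))
item.add((39.965391, 116.364017))   # duplicate, ignored
leaf = LeafNode(items=[item])
second = MiddleNode(keywords=["7"], children=[leaf])
root = LTreeRoot(level=6, prefix="wx4er", top=MiddleNode(keywords=["7"], children=[second]))
print(root.prefix + item.keyword, len(item.locations))   # wx4er7 1
```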
4.2. The Physical Meaning of the GL-Tree
- (1)
- The deeper the hierarchy of the GL-Tree extends, the smaller the grid intervals into which it divides the geographic location space, and the longer the Geohash codes corresponding to those grid intervals. The L-Trees at the ith level of the GL-Tree correspond to the different choices of the ith character of the Geohash code. According to the base32 encoding and Geohash encoding rules, there are 32 possible values for each character of the Geohash code. Therefore, at each level of the GL-Tree, the keywords in the data items of each L-Tree also have 32 possible values. Since each data item is connected to a next-level L-Tree by a pointer, and level 1 contains a single L-Tree, there are at most 32^(i−1) L-Trees at level i of the GL-Tree.
- (2)
- Each L-Tree in a GL-Tree corresponds to a geographic location interval. For example, if the prefix string in the root node of an L-Tree is “wx4er”, the L-Tree is located at the 6th level of the GL-Tree and corresponds to the rectangular geographic interval whose Geohash code is “wx4er”. At this level, the interval is further gridded into 32 same-size subintervals.
- (3)
- A data item in an L-Tree corresponds to a Geohash code, which in turn corresponds to a rectangular geolocation interval. This interval is one of the 32 same-size subintervals obtained by further gridding the rectangular interval corresponding to the L-Tree. The length of the Geohash code corresponding to a data item of an L-Tree at level i of the GL-Tree is also i. For example, if the example L-Tree in (2) has a data item with the keyword “7”, that data item corresponds to the subinterval with the Geohash code “wx4er7”.
- (4)
- The deeper an L-Tree is in the GL-Tree, the smaller the rectangular geographic interval corresponding to its data items. The geographic interval corresponding to a data item of an upper L-Tree is further divided into 32 smaller subintervals in the lower L-Tree pointed to by the next-level pointer of that data item.
- (5)
- The L-Tree structure is designed around the spatial proximity of the location intervals produced by Geohash encoding. As can be seen in Figure 1, the intervals have the following characteristics: consecutively numbered grid intervals are generally also spatially adjacent; after every group of four consecutively numbered intervals, the position of the next interval makes a small jump; and after every group of eight consecutively numbered intervals, the position of the next interval makes a larger jump. To make the L-Tree reflect these positional relationships, we design it as a “4-2-4” structure, as shown in Figure 2. The first-layer intermediate node contains at most four branch items, which together cover all 32 consecutive numbers and manage the division and location storage of the whole interval corresponding to the L-Tree. Each second-layer intermediate node contains at most two branch items, which together cover eight consecutive numbers and manage the division and location storage of eight adjacent intervals. Each leaf node has up to four data items, corresponding to four consecutive numbers, and manages the division and location storage of four adjacent intervals.
- (6)
- In an L-Tree, if the next-level pointer of a data item is not null, it points to an L-Tree in the next level of the GL-Tree, indicating that the geographic interval corresponding to the data item is further divided into smaller subintervals.
- (7)
- The prefix string in the root node of a lower L-Tree equals the Geohash code of the geographic interval represented by the upper L-Tree data item that points to this root node.
- (8)
- For an L-Tree at level i of the GL-Tree, traversing from the root node to a data item of the L-Tree is the process of retrieving and matching the ith character of a Geohash code.
- (9)
- The Geohash code of the geographic interval corresponds to the data item in the L-Tree. The Geohash code corresponds to a retrieval route. This retrieval route starts from the root node of the L-Tree at the first level of the GL-Tree and goes down until the data item is retrieved. In retrieval order, the keywords of all the data items passing through the retrieval route are connected by a string. The string is the Geohash code of the geographic interval corresponding to the data items.
- (10)
- Character matching at an intermediate node of the L-Tree scans the keywords from left to right and follows the first keyword that is not smaller than the retrieved character. For example, the four keywords of the first intermediate node in Figure 2 are “7”, “g”, “r”, and “z”. When searching for a character of a Geohash code, if that character is any character from “0” to “7”, the search proceeds down the pointer p of the keyword “7”; if it is any character from “8” to “g”, the search proceeds down the pointer p of the keyword “g”. The keywords “r” and “z” are handled in the same way.
- (11)
- To store a geographic location in the GL-Tree, a keyword search is performed on the GL-Tree in top–down order according to the Geohash code of the location coordinates, in order to find the L-Tree corresponding to the Geohash code and then the corresponding data item in this L-Tree. The location is then stored in the location list of this data item. A location search follows the same process to find the location stored in the location list of the corresponding data item.
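The keyword matching rule in (10) and the “4-2-4” layout in (5) can be sketched as follows. The function names are hypothetical, and `full_path` assumes a fully populated L-Tree in which all 32 characters are present; a dynamically built L-Tree may hold fewer keywords per node.

```python
from typing import List, Optional, Tuple

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
RANK = {ch: i for i, ch in enumerate(BASE32)}   # position of each character in Base32 order


def pick_branch(keywords: List[str], ch: str) -> Optional[int]:
    """Index of the first keyword (left to right) that is not smaller than ch, or None."""
    for idx, keyword in enumerate(keywords):
        if RANK[keyword] >= RANK[ch]:
            return idx
    return None


def full_path(ch: str) -> Tuple[int, int, int]:
    """Branch taken at the first layer, the second layer, and the slot in the leaf."""
    k = RANK[ch]
    return k // 8, (k % 8) // 4, k % 4


first_layer = ["7", "g", "r", "z"]
print(pick_branch(first_layer, "5"))   # 0  -> follow keyword "7"
print(pick_branch(first_layer, "c"))   # 1  -> follow keyword "g"
print(full_path("w"))                  # (3, 1, 0)
```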
4.3. Example of Retrieval and Maintenance Process of GL-Tree
- (1)
- The Geohash encoding algorithm is first executed to convert the geographic coordinates (latitude: 40.222012, longitude: 116.248283) into the corresponding 8-char Geohash code “wx4sv61q”. According to the Geohash encoding precision, the string code corresponds to a rectangular geolocation interval of size km2.
- (2)
- For location retrieval, the retrieval algorithm reads each character of the Geohash code in turn and searches the L-Tree at the corresponding level of the GL-Tree to find the data item corresponding to that character. The algorithm first reads the first character “w” of the Geohash code, then starts from the root node of the L-Tree at level 1 of the GL-Tree, searches top–down along the tree structure, and finds the data item whose keyword is “w” in the leaf nodes of the L-Tree. If the next-level pointer of this data item is not null, the algorithm follows it to the root node of the second-level L-Tree and continues by retrieving the data item corresponding to the character “x” there. Proceeding in the same way, the data items corresponding to the characters “4”, “s”, “v”, “6”, “1”, and “q” are retrieved in the L-Trees at levels 3–8 in turn.
- (3)
- If the algorithm finds the data item corresponding to the character “q” in the 8th-level L-Tree, it can read the coordinates of the location from the location list of that data item. In the case of location storage, the location coordinates are stored in this location list.
- (4)
- To improve the retrieval efficiency of the algorithm and reduce memory consumption, the GL-Tree is constructed and maintained dynamically. If the data item corresponding to a keyword is not found at some level of L-Tree, or if the next-level pointer in the data item is null, then this data item or its subsequent L-Tree does not exist, and the location being retrieved is not stored in the GL-Tree. For example: (a) if the data item corresponding to the keyword “v” is not found in the fifth-level L-Tree, then the L-Tree corresponding to the Geohash code “wx4s” exists in the GL-Tree, but the L-Tree corresponding to the Geohash code “wx4sv” and its subsequent L-Trees do not; (b) if the data item corresponding to the keyword “v” is found in the fifth-level L-Tree but its next-level pointer is null, then the data item with the Geohash code “wx4sv” exists in the GL-Tree but has no subsequent L-Tree. In either case, a location retrieval operation returns null, indicating that it failed to retrieve the location coordinates. For a location storage operation, case (a) requires inserting a new data item with the keyword “v” into the fifth-level L-Tree and creating the structure and data items of the L-Trees at the subsequent levels, while case (b) requires recursively creating the structure and data items of the L-Trees at levels 6–8 below the data item with the keyword “v”. In the end, the location is stored in the location list of the data item with the keyword “q” in the 8th-level L-Tree.
- (5)
- The process of searching at the intermediate nodes of the L-Tree has been described in (10) of the previous section and is not repeated here.
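The level-by-level walk and the two failure cases (a) and (b) can be sketched as below. This is a deliberate simplification and not the authors' structure: each L-Tree is modelled as a plain dictionary from keyword to data item, so the 4-2-4 internal nodes are replaced by a dictionary lookup, locations are stored only at the last level, and the names `Item`, `store`, and `lookup` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Location = Tuple[float, float]


class Item:
    """Simplified data item: a location list and a pointer to the next-level L-Tree."""

    def __init__(self) -> None:
        self.locations: List[Location] = []
        self.next: Optional[Dict[str, "Item"]] = None


def store(root: Dict[str, Item], code: str, loc: Location) -> None:
    """Store loc under its Geohash code, creating missing items and L-Trees on the way."""
    ltree = root
    for level, ch in enumerate(code, start=1):
        item = ltree.get(ch)
        if item is None:                  # case (a): the data item is missing
            item = ltree[ch] = Item()
        if level == len(code):
            if loc not in item.locations:
                item.locations.append(loc)
        else:
            if item.next is None:         # case (b): the lower L-Tree is missing
                item.next = {}
            ltree = item.next


def lookup(root: Dict[str, Item], code: str):
    """Return ('found', locations), ('missing_item', level) or ('no_subtree', level)."""
    ltree = root
    for level, ch in enumerate(code, start=1):
        item = ltree.get(ch)
        if item is None:
            return ("missing_item", level)
        if level == len(code):
            return ("found", list(item.locations))
        if item.next is None:
            return ("no_subtree", level)
        ltree = item.next
    return ("missing_item", 0)            # only reached for an empty code


tree: Dict[str, Item] = {}
store(tree, "wx4sv61q", (40.222012, 116.248283))
print(lookup(tree, "wx4sv61q"))   # ('found', [(40.222012, 116.248283)])
print(lookup(tree, "wx4sz61q"))   # ('missing_item', 5)

shallow: Dict[str, Item] = {}
store(shallow, "wx4sv", (40.2, 116.2))
print(lookup(shallow, "wx4sv61q"))  # ('no_subtree', 5)
```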
5. Algorithm and Analysis
5.1. Construction Algorithm for GL-Tree
Algorithm 2: Construction Algorithm for GL-Tree |
1: Input: , a location dataset; , the maximum hierarchical value of the GL-Tree. |
2: Output: , a GL-Tree structure which has stored the locations in the location dataset to the corresponding data items. |
3: /* Initialize an empty GL-Tree with only the topmost L-Tree root node, without any location data */ |
4: |
5: /* Read all the locations in the location dataset and store them in the GL-Tree */ |
6: for do: |
7: |
8: // Insert a location into the GL-Tree, this method will call Algorithm 3 within |
9: end for |
10: return |
- (1)
- In Algorithm 3, line 3 converts the location to its corresponding Geohash code.
- (2)
- Line 4 reads from the Geohash code the character at the position corresponding to the current level.
- (3)
- Line 5 finds the leaf node corresponding to character c in the current L-Tree.
- (4)
- In lines 6–17, if the level of the current L-Tree is the lowest level of the GL-Tree, the algorithm checks whether the data item corresponding to c exists. If it does, the location is inserted into the location list of that data item. Otherwise, a new data item with the keyword c is created in that L-Tree, and the location is added to its location list.
- (5)
- In lines 18–34, if the level of the current L-Tree is not the lowest level of the GL-Tree, the algorithm checks whether the data item corresponding to c exists. If it does, the algorithm inserts the location into the location list of that data item and continues the insertion at the next-level L-Tree pointed to by the data item. Otherwise, it creates a new data item with the keyword c in that L-Tree, inserts the location into its location list, and then creates a new next-level L-Tree and inserts the location there.
- (6)
- As a side note, the algorithm described here stores the locations in the leaf nodes of the L-Tree at every level. The purpose is to verify the performance of storing and querying locations at different levels of the L-Tree. In a specific application scenario, however, the location can be inserted only into the location list of the L-Tree at the level required by the application.
Algorithm 3: Insert a location into L-Tree |
1: Input: , a location in location dataset; , the hierarchical value of the current L-Tree in the GL-Tree for which the algorithm is called; , the maximum hierarchical value of the GL-Tree. |
2: Output: |
3: |
4: |
5: |
6: if is equal to do: |
7: if not is do: |
8: |
9: if not is do: |
10: |
11: else |
12: |
13: end if |
14: else |
15: |
16: |
17: end if |
18: else |
19: if not is do: |
20: |
21: if not is do: |
22: |
23: |
24: |
25: else |
26: |
27: |
28: end if |
29: else |
30: |
31: |
32: |
33: end if |
34: end if |
35: return |
5.2. Location Retrieval Algorithm
Algorithm 4: Location retrieval algorithm |
1: Input: , a target location; , possible hierarchy of target locations in the GL-Tree; , a GL-Tree structure which has stored the locations in the location set to the corresponding data items. |
2: Output: , a boolean variable indicating whether the target location was retrieved or not |
3: |
4: |
5: |
6: While not is and do: |
7: |
8: |
9: if not is do: |
10: |
11: if not is do: |
12: if is equal to do: |
13: |
14: else |
15: |
16: ++ |
17: end if |
18: end if |
19: end if |
20: end while |
21: return |
- (1)
- Line 3 converts the target location to its corresponding Geohash code.
- (2)
- Line 4 locates the top-level L-Tree of the GL-Tree; the retrieval operation starts from this tree.
- (3)
- Line 5 sets the level of the currently queried L-Tree in the GL-Tree.
- (4)
- In lines 6–20, while the currently retrieved L-Tree is not null, the algorithm extracts the character c at the position of the Geohash code that corresponds to the current level and checks whether the leaf node corresponding to c exists. If the leaf node exists, the algorithm checks whether it contains a data item with the keyword c. If the data item exists and the current level is the target level, the algorithm checks whether the target location is in the location list of the data item and returns true or false accordingly. If the level of the retrieved L-Tree is smaller than the target level, the algorithm moves to the next-level L-Tree along the pointer in the data item with the keyword c and continues retrieving. If the leaf node or the data item corresponding to c does not exist in the retrieved L-Tree, the algorithm returns false directly.
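The construction, insertion, and retrieval procedures described in this section can be approximated by the following sketch. It is not a transcription of Algorithms 2–4, whose symbols are not reproduced here: each L-Tree is simplified to a dictionary from keyword to data item, locations are stored at every level as the side note in Section 5.1 describes, and all names (`Item`, `insert_location`, `build_gl_tree`, `retrieve`) are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Location = Tuple[float, float]          # (latitude, longitude)
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, length: int) -> str:
    """Compact Geohash encoder: alternate longitude/latitude bisection, 5 bits per character."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, nbits, use_lon = [], 0, 0, True
    while len(chars) < length:
        rng, coord = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value, rng[0] = value * 2 + 1, mid
        else:
            value, rng[1] = value * 2, mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)


class Item:
    """Simplified data item: a set of locations and a pointer to the next-level L-Tree."""

    def __init__(self) -> None:
        self.locations: Set[Location] = set()
        self.next: Optional[Dict[str, "Item"]] = None


def insert_location(ltree: Dict[str, Item], loc: Location, level: int,
                    max_level: int, code: Optional[str] = None) -> None:
    """Insert loc at this level and, recursively, at every lower level."""
    code = code or geohash_encode(loc[0], loc[1], max_level)
    item = ltree.setdefault(code[level - 1], Item())   # create the data item if missing
    item.locations.add(loc)
    if level < max_level:
        if item.next is None:                          # create the next-level L-Tree if missing
            item.next = {}
        insert_location(item.next, loc, level + 1, max_level, code)


def build_gl_tree(locations: Iterable[Location], max_level: int) -> Dict[str, Item]:
    """Start from an empty top-level L-Tree and insert every location."""
    root: Dict[str, Item] = {}
    for loc in locations:
        insert_location(root, loc, 1, max_level)
    return root


def retrieve(root: Dict[str, Item], target: Location,
             target_level: int, max_level: int) -> bool:
    """Return True if target is stored in the data item at target_level."""
    code = geohash_encode(target[0], target[1], max_level)
    ltree, level = root, 1
    while ltree is not None and level <= target_level:
        item = ltree.get(code[level - 1])
        if item is None:
            return False
        if level == target_level:
            return target in item.locations
        ltree = item.next
        level += 1
    return False


points = [(39.965391, 116.364017), (40.222012, 116.248283)]
gl_tree = build_gl_tree(points, max_level=8)
print(retrieve(gl_tree, points[0], target_level=8, max_level=8))    # True
print(retrieve(gl_tree, (0.0, 0.0), target_level=8, max_level=8))   # False
```

Storing a location at every level costs extra memory but lets a query stop at a coarser level, which is what the `target_level` parameter models here.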
5.3. Algorithm Analysis
5.4. Analysis of Algorithm Performance
5.5. Analysis of Algorithm Application
6. Experiments and Results Analysis
6.1. Comparison of the Impact of the Amount of Location Data
6.2. The Effect of Location Order on the Performance of GL-Tree in Location Dataset
6.3. Analysis of Experimental Results
7. Conclusions
- Design a GL-Tree-based location privacy protection scheme with the nearest neighbor query and the spatial range query as application scenarios, and test its performance. This is the primary future work.
- Improve the GL-Tree so that it can easily record users’ trajectories and location semantics, and achieve efficient retrieval and similarity comparison of trajectories.
- Combine the non-repudiation of the Merkle tree with the storage and retrieval performance of the GL-Tree. Add service authentication to location privacy protection using blockchain technology. Implement a multi-chain location privacy protection scheme using public and federated chains to decentralize user queries and location storage based on user location privacy levels.
- Combine the GL-Tree with differential privacy techniques to design a location privacy protection scheme based on differential privacy. According to the user’s location sensitivity requirement, k intervals with an access frequency similar to that of the user’s location interval are selected, and Laplace noise is added to the location data of the corresponding nodes.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Salahdine, F.; Kaabouch, N. Social engineering attacks: A survey. Future Internet 2019, 11, 89. [Google Scholar] [CrossRef] [Green Version]
- Al Hwaitat, A.K.; Almaiah, M.A.; Almomani, O.; Al-Zahrani, M.; Al-Sayed, R.M.; Asaifi, R.M.; Adhim, K.K.; Althunibat, A.; Alsaaidah, A. Improved security particle swarm optimization (PSO) algorithm to detect radio jamming attacks in mobile networks. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 614–625. [Google Scholar] [CrossRef]
- Jayasinghe, K.; Poravi, G. A survey of attack instances of cryptojacking targeting cloud infrastructure. In Proceedings of the 2020 2nd Asia Pacific Information Technology Conference, New York, NY, USA, 17–19 January 2020; pp. 100–107. [Google Scholar]
- Almaiah, M.A.; Al-Zahrani, A.; Almomani, O.; Alhwaitat, A.K. Classification of cyber security threats on mobile devices and applications. In Artificial Intelligence and Blockchain for Future Cybersecurity Applications; Springer: Cham, Switzerland, 2021; pp. 107–123. [Google Scholar]
- Wu, L.; Chen, C.H.; Zhang, Q. A mobile positioning method based on deep learning techniques. Electronics 2019, 8, 59. [Google Scholar] [CrossRef] [Green Version]
- He, D.; Chan, S.; Guizani, M. Handover authentication for mobile networks: Security and efficiency aspects. IEEE Netw. 2015, 29, 96–103. [Google Scholar] [CrossRef]
- Almaiah, M.A.; Dawahdeh, Z.; Almomani, O.; Alsaaidah, A.; Al-Khasawneh, A.; Khawatreh, S. A new hybrid text encryption approach over mobile ad hoc network. Int. J. Electr. Comput. Eng. IJECE 2020, 10, 6461–6471. [Google Scholar] [CrossRef]
- Almaiah, M.A. Almaiah, M.A. A new scheme for detecting malicious attacks in wireless sensor networks based on blockchain technology. In Artificial Intelligence and Blockchain for Future Cybersecurity Applications; Springer: Cham, Switzerland, 2021; pp. 217–234. [Google Scholar]
- Lam, J.; Abbas, R. Machine learning based anomaly detection for 5g networks. arXiv 2020, arXiv:2003.03474. [Google Scholar]
- Huang, H.; Cheng, Y.; Weibel, R. Transport mode detection based on mobile phone network data: A systematic review. Transp. Res. Part C Emerg. Technol. 2019, 101, 297–312. [Google Scholar] [CrossRef]
- Leong, C.M.; Tan, K.L.; Puah, C.H.; Chong, S.M. Predicting mobile network operators users m-payment intention. Eur. Bus. Rev. 2021, 33. [Google Scholar] [CrossRef]
- Gruteser, M.; Grunwald, D. Anonymous usage of location-based services through spatial and temporal cloaking. In Proceedings of the 1st International Conference on Mobile Systems, Applications and Services, San Francisco, CA, USA, 5–8 May 2003; pp. 31–42. [Google Scholar]
- Niu, B.; Li, Q.; Zhu, X.; Cao, G.; Li, H. Achieving k-anonymity in privacy-aware location-based services. In Proceedings of the IEEE INFOCOM 2014-IEEE Conference on Computer Communications, Toronto, ON, Canada, 27 April–2 May 2014; pp. 754–762. [Google Scholar]
- Almusaylim, Z.A.; Jhanjhi, N. Comprehensive review: Privacy protection of user in location-aware services of mobile cloud computing. Wirel. Pers. Commun. 2020, 111, 541–564. [Google Scholar] [CrossRef]
- Gedik, B.; Liu, L. Location privacy in mobile systems: A personalized anonymization model. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS’05), Columbus, OH, USA, 6–10 June 2005; pp. 620–629. [Google Scholar]
- Mokbel, M.F.; Chow, C.Y.; Aref, W.G. The new casper: Query processing for location services without compromising privacy. VLDB 2006, 6, 763–774. [Google Scholar]
- Kido, H.; Yanagisawa, Y.; Satoh, T. An anonymous communication technique using dummies for location-based services. In Proceedings of the ICPS’05. Proceedings. International Conference on Pervasive Services, Santorini, Greece, 11–14 July 2005; pp. 88–97. [Google Scholar]
- Lu, H.; Jensen, C.S.; Yiu, M.L. Pad: Privacy-area aware, dummy-based location privacy in mobile services. In Proceedings of the Seventh ACM International Workshop on Data Engineering for Wireless and Mobile Access, Vancouver, BC, Canada, 13 June 2008; pp. 16–23. [Google Scholar]
- Wu, D.; Zhang, Y.; Liu, Y. Dummy location selection scheme for k-anonymity in location based services. In Proceedings of the 2017 IEEE Trustcom/BigDataSE/ICESS, Sydney, NSW, Australia, 1–4 August 2017; pp. 441–448. [Google Scholar]
- Liao, D.; Huang, X.; Anand, V.; Sun, G.; Yu, H. k-DLCA: An efficient approach for location privacy preservation in location-based services. In Proceedings of the 2016 IEEE International Conference on Communications (ICC), Kuala Lumpur, Malaysia, 23–27 May 2016; pp. 1–6. [Google Scholar]
- Wang, Y.; Li, M.; Luo, S.; Xin, Y.; Zhu, H.; Chen, Y.; Yang, G.; Yang, Y. LRM: A location recombination mechanism for achieving trajectory k-anonymity privacy protection. IEEE Access 2019, 7, 182886–182905. [Google Scholar] [CrossRef]
- Diao, Y.; Ye, A.; Cheng, B.; Zhang, J.; Zhang, Q. A Dummy-Based Privacy Protection Scheme for Location-Based Services under Spatiotemporal Correlation. In Proceedings of the 2020 International Conference on Networking and Network Applications (NaNA), Haikou City, China, 11–14 December 2020; pp. 443–447. [Google Scholar]
- Kim, J.W.; Edemacu, K.; Kim, J.S.; Chung, Y.D.; Jang, B. A survey of differential privacy-based techniques and their applicability to location-based services. Comput. Secur. 2021, 111, 102464. [Google Scholar] [CrossRef]
- Birchfield, S.T.; Rangarajan, S. Spatial histograms for region-based tracking. ETRI J. 2007, 29, 697–699. [Google Scholar] [CrossRef]
- Neimeyer, G. Geohash Tips & Tricks. 2019. Available online: http://geohash.org/site/tips.html (accessed on 19 November 2022).
- Xiang, W. An efficient location privacy preserving model based on Geohash. In Proceedings of the 2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC), Beijing, China, 28–30 October 2019; pp. 1–5. [Google Scholar]
- Kai, L.; Yiliang, H.; Jingjing, W.; Kaiyang, G. Location Privacy Protection Method Based on Geohash Coding and Pseudo-Random Sequence. In Proceedings of the 2022 3rd Information Communication Technologies Conference (ICTC), Nanjing, China, 6–8 May 2022; pp. 178–183. [Google Scholar]
- Ye, A.; Chen, Q.; Xu, L. Private and Flexible Proximity Detection Based on Geohash. In Proceedings of the 2017 IEEE 85th Vehicular Technology Conference (VTC Spring), Sydney, NSW, Australia, 4–7 June 2017; pp. 1–5. [Google Scholar]
- Brown, R.A. Building a balanced kd tree in o (kn log n) time. arXiv 2014, arXiv:1410.5420. [Google Scholar]
- Greenspan, M.; Yurick, M. Approximate kd tree search for efficient ICP. In Proceedings of the Fourth International Conference on 3-D Digital Imaging and Modeling, 2003: 3DIM 2003, Banff, AB, Canada, 6–10 October 2003; pp. 442–448. [Google Scholar]
- Robinson, J.T. The KDB-tree: A search structure for large multidimensional dynamic indexes. In Proceedings of the 1981 ACM SIGMOD International Conference on Management of Data, Ann Arbor, MI, USA, 29 April–1 May 1981; pp. 10–18. [Google Scholar]
- Orlandic, R.; Yu, B. Implementing KDB-trees to support high-dimensional data. In Proceedings of the 2001 International Database Engineering and Applications Symposium, Grenoble, France, 16–18 July 2001; pp. 58–67. [Google Scholar]
- Yu, B.; Bailey, T.; Orlandic, R.; Somavaram, J. KDB/sub KD/-tree: A compact KDB-tree structure for indexing multidimensional data. In Proceedings of the ITCC 2003 International Conference on Information Technology: Coding and Computing, Las Vegas, NV, USA, 28–30 April 2003; pp. 676–680. [Google Scholar]
- Henrich, A.; Six, H.W.; Hagen, F.; Widmayer, P. The LSD tree: Spatial access to multidimensional point and non-saint objects. In Proceedings of the the 5th Very Large Databases Conference, Los Angeles, CA, USA, 6–10 February 1989; pp. 45–54. [Google Scholar]
- Cui, N.; Yang, X.; Wang, B. A novel spatial cloaking scheme using hierarchical hilbert curve for location-based services. In Proceedings of the International Conference on Web-Age Information Management, Nanchang, China, 3–5 June 2016; Springer: Cham, Switzerland, 2016; pp. 15–27. [Google Scholar]
- Lee, H.J.; Hong, S.T.; Yoon, M.; Um, J.H.; Chang, J.W. A new cloaking algorithm using Hilbert curves for privacy protection. In Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Security and Privacy in GIS and LBS, San Jose, CA, USA, 2 November 2010; pp. 42–46. [Google Scholar]
- Boston, M. A dynamic index structure for spatial searching. In Proceedings of the ACM-SIGMOD, Boston, MA, USA, 18–21 June 1984; pp. 547–557. [Google Scholar]
- Vu, T.; Eldawy, A. R-Grove: Growing a family of R-trees in the big-data forest. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 6–9 November 2018; pp. 532–535. [Google Scholar]
- Arpitha, Y.; Bhargavi, A.; Mahima, K.; Nagavi, K.; Meenakshi, H. A Navigation Supporting System Using R-Tree. In Proceedings of the 3rd National Conference on Image Processing, Computing, Communication, Networking and Data Analytics, Gangtok, India, 2–3 November 2018; p. 489. [Google Scholar]
- Kamel, I.; Faloutsos, C. Hilbert R-Tree: An Improved R-Tree Using Fractals. In Proceedings of the Twentieth International Conference on Very Large Data Bases, Santiago, Chile, 12–15 September 1994; pp. 500–509. [Google Scholar]
- Beckmann, N.; Kriegel, H.P.; Schneider, R.; Seeger, B. The R*-tree: An efficient and robust access method for points and rectangles. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, Atlantic City, NJ, USA, 23–26 May 1990; pp. 322–331. [Google Scholar]
- Mehta, D.P.; Sahni, S. Handbook of Data Structures and Applications; Chapman and Hall/CRC: Boca Raton, FL, USA, 2004. [Google Scholar]
- Das, J.; Majumder, S.; Gupta, P.; Mali, K. Collaborative recommendations using hierarchical clustering based on Kd trees and quadtrees. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 2019, 27, 637–668. [Google Scholar] [CrossRef]
- To, Q.C.; Dang, T.K.; Küng, J. Bob-tree: An efficient B+-tree based index structure for geographic-aware obfuscation. In Proceedings of the Intelligent Information and Database Systems: Third International Conference, ACIIDS 2011, Daegu, Republic of Korea, 20–22 April 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 109–118. [Google Scholar]
- Zhang, J.; Xiao, X.; Xie, X. Privtree: A differentially private algorithm for hierarchical decompositions. In Proceedings of the 2016 International Conference on Management of Data, San Francisco, CA, USA, 26 June–1 July 2016; pp. 155–170. [Google Scholar]
- Hu, H.; Chen, Q.; Xu, J. VERDICT: Privacy-preserving authentication of range queries in location-based services. In Proceedings of the 2013 IEEE 29th International Conference on Data Engineering (ICDE), Brisbane, Australia, 8–12 April 2013; pp. 1312–1315. [Google Scholar]
- Zhao, X.; Dong, Y.; Pi, D. Novel trajectory data publishing method under differential privacy. Expert Syst. Appl. 2019, 138, 112791. [Google Scholar] [CrossRef]
- Yuan, S.; Pi, D.; Zhao, X.; Xu, M. Differential privacy trajectory data protection scheme based on R-tree. Expert Syst. Appl. 2021, 182, 115215. [Google Scholar] [CrossRef]
- Shao, Z.; Taniar, D.; Adhinugraha, K.M. Range-kNN queries with privacy protection in a mobile environment. Pervasive Mob. Comput. 2015, 24, 30–49. [Google Scholar] [CrossRef]
- Gao, M.; Xiang, L.; Gong, J. Organizing large-scale trajectories with adaptive Geohash-tree based on secondo database. In Proceedings of the 2017 25th International Conference on Geoinformatics, Buffalo, NY, USA, 2–4 August 2017; pp. 1–6. [Google Scholar]
- Guo, N.; Xiong, W.; Wu, Y.; Chen, L.; Jing, N. A geographic meshing and coding method based on adaptive Hilbert-Geohash. IEEE Access 2019, 7, 39815–39825. [Google Scholar] [CrossRef]
- Zhou, Y.; Li, G.; Yang, Y.; Shi, W. Location privacy protection method for nearest neighbor query based on GeoHash. Comput. Sci. 2019, 8, 212–216. [Google Scholar]
Latitude | Round | Interval | Lower Half (Code 0) | Upper Half (Code 1) | Code |
---|---|---|---|---|---|
39.965391 | 1 | (−90, 90) | (−90, 0) | (0, 90) | 1 |
39.965391 | 2 | (0, 90) | (0, 45) | (45, 90) | 0 |
39.965391 | 3 | (0, 45) | (0, 22.5) | (22.5, 45) | 1 |
39.965391 | 4 | (22.5, 45) | (22.5, 33.75) | (33.75, 45) | 1 |
39.965391 | 5 | (33.75, 45) | (33.75, 39.375) | (39.375, 45) | 1 |
39.965391 | 6 | (39.375, 45) | (39.375, 42.1875) | (42.1875, 45) | 0 |
39.965391 | 7 | (39.375, 42.1875) | (39.375, 40.78125) | (40.78125, 42.1875) | 0 |
... | ... | ... | ... | ... | ... |
39.965391 | 20 | (39.96517, 39.96552) | (39.96517, 39.96534) | (39.96534, 39.96552) | 1 |
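The bisection rounds in the table above can be reproduced with a short sketch. The function name `bisect_encode` is hypothetical, and sending a value that falls exactly on a midpoint to the upper half (code 1) is an assumption, since the table uses open intervals and does not state a tie rule.

```python
def bisect_encode(value, low, high, rounds):
    """Encode `value` by repeatedly halving the interval (low, high).

    Each round emits 1 if the value lies in the upper half and 0 if it
    lies in the lower half, then narrows the interval to that half.
    Assumption: a value exactly on the midpoint goes to the upper half.
    """
    bits = []
    for _ in range(rounds):
        mid = (low + high) / 2.0
        if value >= mid:
            bits.append("1")
            low = mid
        else:
            bits.append("0")
            high = mid
    return "".join(bits), (low, high)


# First seven rounds for the latitude used in the table.
lat_bits, lat_cell = bisect_encode(39.965391, -90.0, 90.0, 7)
print(lat_bits)  # 1011100
print(lat_cell)  # (39.375, 40.78125)
```

Calling the same function with `rounds=20` yields the full 20-bit latitude code, and calling it with the interval (−180, 180) encodes a longitude in the same way.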
Geohash Code Length | Latitude Bits | Longitude Bits | Latitude Error (°) | Longitude Error (°) | Cell Height | Cell Width |
---|---|---|---|---|---|---|
1 | 2 | 3 | ±23 | ±23 | 4992.6 km | 5009.4 km |
2 | 5 | 5 | ±2.8 | ±5.6 | 624.1 km | 1252.3 km |
3 | 7 | 8 | ±0.70 | ±0.70 | 156 km | 156.5 km |
4 | 10 | 10 | ±0.087 | ±0.18 | 19.5 km | 39.1 km |
5 | 12 | 13 | ±0.022 | ±0.022 | 4.9 km | 4.9 km |
6 | 15 | 15 | ±0.0027 | ±0.0055 | 609.4 m | 1.2 km |
7 | 17 | 18 | ±0.00068 | ±0.00068 | 152.5 m | 152.9 m |
8 | 20 | 20 | ±0.000086 | ±0.000172 | 19 m | 38.2 m |
9 | 22 | 23 | ±0.000021 | ±0.000021 | 4.8 m | 4.8 m |
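The bit counts and angular errors in the table above follow from the code length alone. The sketch below is a minimal illustration, assuming 5 bits per base32 character and that longitude receives the extra bit when the total is odd, which matches the bit columns of the table. The name `geohash_precision` is hypothetical.

```python
def geohash_precision(length):
    """Return (lat_bits, lon_bits, lat_error_deg, lon_error_deg).

    Assumptions: each Geohash character carries 5 bits, and longitude
    takes the extra bit when the total bit count is odd.
    The error is half the cell size along each axis, in degrees.
    """
    total_bits = 5 * length
    lon_bits = (total_bits + 1) // 2
    lat_bits = total_bits // 2
    lat_error = 180.0 / 2 ** lat_bits / 2
    lon_error = 360.0 / 2 ** lon_bits / 2
    return lat_bits, lon_bits, lat_error, lon_error


print(geohash_precision(1))  # (2, 3, 22.5, 22.5)

for n in range(1, 10):
    lat_bits, lon_bits, lat_err, lon_err = geohash_precision(n)
    print(n, lat_bits, lon_bits, f"{lat_err:.2g}", f"{lon_err:.2g}")
```

The cell height and width in kilometres depend on the latitude at which the cell is measured, so they are left out of this sketch.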
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, B.; Zhang, C.; Xin, Y. GL-Tree: A Hierarchical Tree Structure for Efficient Retrieval of Massive Geographic Locations. Sensors 2023, 23, 2245. https://doi.org/10.3390/s23042245