
Edge-assisted Collaborative Image Recognition for Mobile Augmented Reality

Published: 05 October 2021
Abstract

    Mobile Augmented Reality (AR), which overlays digital content on the real-world scenes surrounding a user, is bringing immersive interactive experiences where the real and virtual worlds are tightly coupled. To enable seamless and precise AR experiences, an image recognition system that can accurately recognize the object in the camera view with low system latency is required. However, due to the pervasiveness and severity of image distortions, an effective and robust image recognition solution for “in the wild” mobile AR is still elusive. In this article, we present CollabAR, an edge-assisted system that provides distortion-tolerant image recognition for mobile AR with imperceptible system latency. CollabAR incorporates both distortion-tolerant and collaborative image recognition modules in its design. The former enables distortion-adaptive image recognition to improve the robustness against image distortions, while the latter exploits the spatial-temporal correlation among mobile AR users to improve recognition accuracy. Moreover, as it is difficult to collect a large-scale image distortion dataset, we propose a Cycle-Consistent Generative Adversarial Network-based data augmentation method to synthesize realistic image distortion. Our evaluation demonstrates that CollabAR achieves over 85% recognition accuracy for “in the wild” images with severe distortions, while reducing the end-to-end system latency to as low as 18.2 ms.
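
    CollabAR's collaborative module combines recognition results from spatially and temporally correlated users rather than relying on a single, possibly distorted, view. As a rough illustration only, not the paper's actual aggregation algorithm, the Python sketch below fuses per-user class-probability vectors weighted by a hypothetical per-image quality score, so that heavily distorted images contribute less to the fused prediction:

    import numpy as np

    def aggregate_predictions(softmax_outputs, quality_scores):
        # Fuse class-probability vectors from correlated AR users into one
        # prediction. `quality_scores` is a hypothetical per-image weight in
        # [0, 1]; distorted images get lower weights and less influence.
        probs = np.vstack(softmax_outputs)
        weights = np.asarray(quality_scores, dtype=float)
        weights = weights / weights.sum()            # normalize to sum to 1
        fused = np.average(probs, axis=0, weights=weights)
        return int(np.argmax(fused)), fused          # winning class, fused vector

    # Example: three users photograph the same object; the second image is blurry.
    outputs = [np.array([0.70, 0.20, 0.10]),
               np.array([0.30, 0.40, 0.30]),
               np.array([0.80, 0.10, 0.10])]
    label, fused = aggregate_predictions(outputs, quality_scores=[0.9, 0.3, 0.8])
    print(label, fused)   # the cleaner views dominate, yielding class 0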



    Published In

ACM Transactions on Sensor Networks, Volume 18, Issue 1
February 2022, 434 pages
ISSN: 1550-4859
EISSN: 1550-4867
DOI: 10.1145/3484935
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 05 October 2021
    Accepted: 01 May 2021
    Revised: 01 May 2021
    Received: 01 December 2020
    Published in TOSN Volume 18, Issue 1


    Author Tags

    1. Edge computing
    2. collaborative augmented reality
    3. mobile image recognition
    4. cycle-consistent generative adversarial networks

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • Lord Foundation of North Carolina
    • NSF


    Article Metrics

• Downloads (Last 12 months): 245
• Downloads (Last 6 weeks): 32
    Reflects downloads up to 26 Jul 2024


Cited By
• (2024) Hybrid Improved Concave Matching Algorithm and ResNet Image Recognition Model. IEEE Access, 12, 39847–39861. DOI: 10.1109/ACCESS.2024.3375928. Online publication date: 2024.
• (2024) Hybrid YOLOv3 and ReID intelligent identification statistical model for people flow in public places. Scientific Reports, 14(1). DOI: 10.1038/s41598-024-64905-9. Online publication date: 25-Jun-2024.
• (2024) Delay-guaranteed Mobile Augmented Reality Task Offloading in Edge-assisted Environment. Ad Hoc Networks, 161, 103539. DOI: 10.1016/j.adhoc.2024.103539. Online publication date: Aug-2024.
• (n.d.) A Collaborative Learning-based Urban Low-light Small-target Face Image Enhancement Method. ACM Transactions on Sensor Networks. DOI: 10.1145/3616013.
