Open access

Anatomizing Deep Learning Inference in Web Browsers

Authors:

Xuanzhe Liu

ACM Transactions on Software Engineering and Methodology, Volume 34, Issue 2

Article No.: 47, Pages 1 - 43

https://doi.org/10.1145/3688843

Published: 21 January 2025

Abstract

Web applications have increasingly adopted Deep Learning (DL) through in-browser inference, wherein DL inference is performed directly within Web browsers. The actual performance of in-browser inference and its impact on the Quality of Experience (QoE) remain unexplored, and they call for new QoE measurements beyond traditional ones, which mainly focus on page load time. To bridge this gap, we present the first comprehensive performance measurement of in-browser inference to date. We propose new metrics to measure in-browser inference: responsiveness, smoothness, and inference accuracy. Our extensive analysis involves 9 representative DL models across Web browsers on 50 popular PC devices and 20 mobile devices. The results reveal that in-browser inference exhibits a substantial latency gap compared to native inference: on PC devices, it is on average 16.9 times slower on CPU and 4.9 times slower on GPU, and on mobile devices, the gap is 15.8 times on CPU and 7.8 times on GPU. Furthermore, we identify factors that contribute to this latency gap, including underutilized hardware instruction sets, inherent overhead in the runtime environment, resource contention within the browser, and inefficiencies in software libraries and GPU abstractions. Additionally, in-browser inference imposes significant memory demands, at times exceeding 334.6 times the size of the DL models themselves, partly attributable to suboptimal memory management. We also observe that in-browser inference leads to a 67.2% increase in the time it takes for GUI components to render within Web browsers, significantly affecting the overall user QoE of Web applications that rely on this technology.
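The three kinds of figures quoted in the abstract (a latency gap in "times slower", a memory footprint as a multiple of model size, and a percentage increase in render time) are all simple ratios. The sketch below shows one plausible way to compute them. It is only an illustration: the function names are hypothetical, the input values are made up, and averaging per-model ratios is an assumption, since the abstract does not state how the reported averages were aggregated.

```python
from statistics import mean


def latency_gap(browser_ms: list[float], native_ms: list[float]) -> float:
    """Mean per-model slowdown of in-browser over native inference.

    Assumption: the gap is the average of per-model latency ratios.
    """
    if len(browser_ms) != len(native_ms) or not browser_ms:
        raise ValueError("need one browser and one native latency per model")
    return mean(b / n for b, n in zip(browser_ms, native_ms))


def memory_ratio(peak_bytes: float, model_bytes: float) -> float:
    """Peak memory footprint expressed as a multiple of the model size."""
    return peak_bytes / model_bytes


def relative_increase_pct(baseline: float, observed: float) -> float:
    """Percentage increase of an observed value over its baseline."""
    return (observed - baseline) / baseline * 100.0


# Made-up measurements for two hypothetical models (not data from the article).
gap = latency_gap(browser_ms=[120.0, 300.0], native_ms=[10.0, 20.0])
footprint = memory_ratio(peak_bytes=500.0, model_bytes=10.0)
render_slowdown = relative_increase_pct(baseline=10.0, observed=15.0)
print(gap, footprint, render_slowdown)
```

Averaging per-model ratios weights every model equally. Dividing total in-browser latency by total native latency would instead let the slowest models dominate, so the two aggregations can give noticeably different "times slower" figures for the same measurements.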

Index Terms

Anatomizing Deep Learning Inference in Web Browsers
1. Computing methodologies
  1. Artificial intelligence
2. Human-centered computing
  1. Ubiquitous and mobile computing
    1. Empirical studies in ubiquitous and mobile computing

Published In

ACM Transactions on Software Engineering and Methodology Volume 34, Issue 2

February 2025

904 pages

EISSN:1557-7392

DOI:10.1145/3703017

Editor:
Abhik Roychoudhury
National University of Singapore, Singapore

Copyright © 2025 Copyright held by the owner/author(s).

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 January 2025

Online AM: 14 August 2024

Accepted: 22 July 2024

Revised: 09 July 2024

Received: 20 February 2024

Published in TOSEM Volume 34, Issue 2

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
