research-article
Open access

Anatomizing Deep Learning Inference in Web Browsers

Published: 21 January 2025

Abstract

Web applications have increasingly adopted Deep Learning (DL) through in-browser inference, wherein DL inference is performed directly within Web browsers. The actual performance of in-browser inference and its impact on the Quality of Experience (QoE) remain unexplored, and assessing them requires new QoE measurements beyond traditional ones, which focus mainly on page load time. To bridge this gap, we conduct the first comprehensive performance measurement of in-browser inference to date. We propose new metrics to measure in-browser inference: responsiveness, smoothness, and inference accuracy. Our analysis covers 9 representative DL models across the Web browsers of 50 popular PC devices and 20 mobile devices. The results reveal that in-browser inference exhibits a substantial latency gap, averaging 16.9 times slower on CPU and 4.9 times slower on GPU compared to native inference on PC devices. The gap on mobile CPU and mobile GPU is 15.8 times and 7.8 times, respectively. Furthermore, we identify factors contributing to this latency gap, including underutilized hardware instruction sets, inherent overhead in the runtime environment, resource contention within the browser, and inefficiencies in software libraries and GPU abstractions. Additionally, in-browser inference imposes significant memory demands, at times exceeding 334.6 times the size of the DL models themselves, partly attributable to suboptimal memory management. We also observe that in-browser inference leads to a 67.2% increase in the time it takes for GUI components to render within Web browsers, significantly affecting the overall user QoE of Web applications reliant on this technology.
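
As a rough illustration of how a latency gap of this kind can be expressed, the following minimal sketch computes a per-model slowdown ratio (in-browser latency divided by native latency) and averages it across models. The function names, the use of an arithmetic mean, and the sample latencies are all assumptions made for illustration; they are not taken from the paper's methodology or measurements.

```python
from statistics import mean


def latency_gap(in_browser_ms: float, native_ms: float) -> float:
    """Return how many times slower in-browser inference is than native."""
    if native_ms <= 0:
        raise ValueError("native latency must be positive")
    return in_browser_ms / native_ms


def average_gap(pairs):
    """Arithmetic mean of per-model gaps (an assumed aggregation method)."""
    return mean(latency_gap(browser, native) for browser, native in pairs)


# Hypothetical (in-browser ms, native ms) latencies for three models.
sample = [(120.0, 10.0), (300.0, 20.0), (90.0, 5.0)]

print(f"{average_gap(sample):.1f}x slower on average")  # prints "15.0x slower on average"
```

Averaging per-model ratios weights every model equally regardless of its absolute latency; a ratio of summed latencies would instead be dominated by the slowest models.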

Published In

ACM Transactions on Software Engineering and Methodology, Volume 34, Issue 2
February 2025
904 pages
EISSN: 1557-7392
DOI: 10.1145/3703017
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 January 2025
Online AM: 14 August 2024
Accepted: 22 July 2024
Revised: 09 July 2024
Received: 20 February 2024
Published in TOSEM Volume 34, Issue 2

Author Tags

  1. Deep learning
  2. Web browser
  3. measurement

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
