Abstract
The proliferation of pre-trained ML models in public Web-based model zoos facilitates the engineering of ML pipelines that answer complex inference queries over datasets and streams of unstructured content. Constructing an optimal plan for such a query is hard, especially when constraints (e.g., accuracy or execution time) must be taken into account and as the complexity of the inference query increases. To address this issue, we propose a method for optimizing ML inference queries that selects the most suitable ML models to use, as well as the order in which those models are executed. We formally define the constraint-based ML inference query optimization problem, formulate it as a Mixed Integer Programming (MIP) problem, and develop an optimizer that maximizes accuracy subject to the given constraints. This optimizer is capable of navigating a large search space to identify optimal query plans on various model zoos.
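To make the problem statement concrete, the following is a minimal sketch of constraint-based plan selection, not the optimizer described in the paper. It replaces the MIP formulation with exhaustive search over a tiny search space. The model names, accuracies, costs, and selectivities are all hypothetical, and two modelling choices are assumptions made only for illustration: plan accuracy is the product of the chosen models' accuracies, and later models run only on items that passed earlier predicates.

```python
from itertools import permutations, product

# Hypothetical model zoo: task -> list of (model name, accuracy, cost per item in ms).
ZOO = {
    "detect_car": [("small_det", 0.80, 10.0), ("large_det", 0.95, 40.0)],
    "is_red": [("fast_color", 0.90, 2.0), ("slow_color", 0.98, 8.0)],
}
# Hypothetical selectivity: fraction of items that pass each predicate.
SELECTIVITY = {"detect_car": 0.3, "is_red": 0.5}


def plan_cost(plan):
    """Expected per-item cost when models run in order and later models
    only see items that passed the earlier predicates."""
    cost, surviving = 0.0, 1.0
    for task, (_, _, model_cost) in plan:
        cost += surviving * model_cost
        surviving *= SELECTIVITY[task]
    return cost


def plan_accuracy(plan):
    """Assumed accuracy model: errors are independent, so accuracies multiply."""
    acc = 1.0
    for _, (_, model_acc, _) in plan:
        acc *= model_acc
    return acc


def best_plan(budget_ms):
    """Try every model choice and execution order; return the most accurate
    plan within the cost budget (ties broken by lower cost), or None."""
    best = None
    tasks = list(ZOO)
    for choice in product(*(ZOO[t] for t in tasks)):
        for order in permutations(zip(tasks, choice)):
            cost, acc = plan_cost(order), plan_accuracy(order)
            if cost > budget_ms:
                continue
            if best is None or acc > best[0] or (acc == best[0] and cost < best[1]):
                best = (acc, cost, [(task, model[0]) for task, model in order])
    return best


if __name__ == "__main__":
    for budget in (5.0, 20.0, 25.0):
        print(budget, best_plan(budget))
```

Under these made-up numbers, a larger budget lets the search pick the more accurate detector, but only when the cheap color filter runs first. This shows why model choice and execution order have to be decided together. Exhaustive search grows exponentially with the number of tasks and models, which is the motivation for a solver-based formulation such as the MIP mentioned in the abstract.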
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Z. et al. (2023). Optimizing ML Inference Queries Under Constraints. In: Garrigós, I., Murillo Rodríguez, J.M., Wimmer, M. (eds) Web Engineering. ICWE 2023. Lecture Notes in Computer Science, vol 13893. Springer, Cham. https://doi.org/10.1007/978-3-031-34444-2_4
DOI: https://doi.org/10.1007/978-3-031-34444-2_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34443-5
Online ISBN: 978-3-031-34444-2
eBook Packages: Computer Science, Computer Science (R0)