research-article

Open access

Low-Bandwidth Self-Improving Transmission of Rare Training Data

Authors:

Padmanabhan Pillai,

Mahadev Satyanarayanan

ACM MobiCom '23: Proceedings of the 29th Annual International Conference on Mobile Computing and Networking

Article No.: 86, Pages 1 - 15

https://doi.org/10.1145/3570361.3613300

Published: 02 October 2023

Abstract

Unmanned probes that collect new training data for machine learning (ML) often face a severe mismatch between the incoming sensor data rate and the wireless backhaul bandwidth. To overcome this mismatch, we describe Hawk, a self-improving ML-based transmission system. Starting from a weak model trained on just a few examples, Hawk seamlessly pipelines semi-supervised learning, active learning, and transfer learning with asynchronous, bandwidth-sensitive transmission of data to a distant human for labeling. Once a significant number of true positives (TPs) have been labeled, Hawk trains an improved model to replace the old one. This iterative workflow, called Live Learning, continues until a sufficient number of TPs have been collected. For very rare events on challenging datasets, and at bandwidths as low as 12 kbps, a team of 7 probes using Hawk discovers up to 87% of the TPs that could have been discovered via full preview, transmission, and labeling of all mission data. Hawk also uses diversity sampling and few-shot learning.
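The iterative workflow described in the abstract can be illustrated with a small simulation. The sketch below is not Hawk's implementation: the function name `live_learning`, its parameters, the Gaussian scoring noise that stands in for model quality, and the rule that each retraining halves that noise are all assumptions made for illustration. It covers only the score, transmit, label, and retrain cycle for a single probe, and leaves out semi-supervised learning, transfer learning, diversity sampling, and few-shot learning.

```python
import heapq
import random


def live_learning(items, items_per_round, sends_per_round, retrain_after,
                  noise=1.0, seed=0):
    """Toy Live Learning loop for a single probe (illustrative only).

    items            -- ground-truth labels (1 = rare positive, 0 = negative)
    items_per_round  -- items the sensor produces per round
    sends_per_round  -- items the backhaul link can carry per round
    retrain_after    -- newly labeled TPs needed before the model is replaced
    noise            -- std-dev of the scoring error; stands in for model quality
    """
    rng = random.Random(seed)
    queue = []           # min-heap of (-score, index), so the best score pops first
    labeled_tps = []     # indices the remote human confirmed as true positives
    sent = 0             # total items transmitted over the link
    new_tps = 0          # TPs labeled since the last model replacement
    model_version = 0

    for start in range(0, len(items), items_per_round):
        # 1. Score freshly sensed items with the current model.
        for i in range(start, min(start + items_per_round, len(items))):
            score = items[i] + rng.gauss(0.0, noise)
            heapq.heappush(queue, (-score, i))

        # 2. Transmit only as many items as the link allows, best first.
        for _ in range(min(sends_per_round, len(queue))):
            _, i = heapq.heappop(queue)
            sent += 1
            # 3. The remote human labels the transmitted item.
            if items[i] == 1:
                labeled_tps.append(i)
                new_tps += 1

        # 4. Enough new TPs: swap in an improved model (assumed: half the noise).
        if new_tps >= retrain_after:
            noise *= 0.5
            model_version += 1
            new_tps = 0

    return labeled_tps, sent, model_version


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Hypothetical mission data with roughly 1% positives.
    items = [1 if data_rng.random() < 0.01 else 0 for _ in range(5000)]
    tps, sent, version = live_learning(items, 100, 5, retrain_after=3)
    print(f"sent {sent}/{len(items)} items, found {len(tps)}/{sum(items)} TPs, "
          f"model v{version}")
```

In this toy, items scored by an older model stay in the queue with their stale scores. Rescoring the backlog after each model replacement would be one natural refinement, at the cost of extra computation on the probe.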

Published In

ACM MobiCom '23: Proceedings of the 29th Annual International Conference on Mobile Computing and Networking

October 2023

1605 pages

ISBN: 9781450399906

DOI: 10.1145/3570361

Chairs: Xavier Costa, Joerg Widmer

Co-chairs: Diego Perino, Domenico Giustiniano

Program Chair: Haitham Al Hassanieh

Program Co-chairs: Arash Asadi, Landon Cox

Copyright © 2023 Owner/Author(s).

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGMOBILE: ACM Special Interest Group on Mobility of Systems, Users, Data and Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Defense Advanced Research Projects Agency
United States Navy
National Science Foundation
Lockheed Martin

Conference

ACM MobiCom '23

Sponsor:

SIGMOBILE

ACM MobiCom '23: 29th Annual International Conference on Mobile Computing and Networking

October 2 - 6, 2023

Madrid, Spain

Acceptance Rates

Overall Acceptance Rate 440 of 2,972 submissions, 15%
