Research article · Open access

Saliency-Aware Class-Agnostic Food Image Segmentation

Published: 15 July 2021

Abstract

Advances in image-based dietary assessment have allowed nutrition professionals and researchers to improve the accuracy of dietary assessment: images of the foods consumed are captured using smartphones or wearable devices and then analyzed with computer vision methods to estimate the energy and nutrient content of the foods. Food image segmentation, which determines the regions of an image where foods are located, plays an important role in this process. Current methods are data dependent and thus cannot generalize well across food types. To address this problem, we propose a class-agnostic food image segmentation method. Our method uses a pair of eating scene images, one captured before eating begins and one after eating is completed. Using information from both the before and after images, we can segment food by finding the salient missing objects, without any prior information about the food class. We model a paradigm of top-down saliency in which the attention of the human visual system is guided by the task of finding the salient missing objects in a pair of images. Our method is validated on food images collected from a dietary study and shows promising results.
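The core idea, finding the "salient missing object" by comparing aligned before- and after-eating captures, can be illustrated with a toy change-detection sketch. This is only an illustrative assumption of how such a comparison might look, not the paper's saliency model: the function name, threshold, and synthetic images below are hypothetical, and a per-pixel color difference stands in for the learned top-down saliency.

```python
import numpy as np

def missing_object_mask(before: np.ndarray, after: np.ndarray,
                        thresh: float = 0.2) -> np.ndarray:
    """Toy class-agnostic mask of the 'missing' (eaten) region from an
    aligned before/after image pair of shape (H, W, 3) in [0, 1].

    Illustrative sketch only: pixels whose appearance changed strongly
    between the two captures are marked as (formerly) food regions.
    """
    # Per-pixel color change magnitude between the two captures.
    diff = np.abs(before.astype(np.float64) - after.astype(np.float64))
    change = diff.mean(axis=2)  # average change over the RGB channels
    # Keep only pixels whose change exceeds the threshold.
    return change > thresh

# Synthetic example: a gray "plate" with a bright 3x3 "food" square
# that disappears in the after-eating image.
before = np.full((8, 8, 3), 0.5)
before[2:5, 2:5] = 1.0            # food present before eating
after = np.full((8, 8, 3), 0.5)   # food gone after eating

mask = missing_object_mask(before, after)
print(mask.sum())  # -> 9 changed pixels: the 3x3 food square
```

In practice the two captures would first need geometric alignment, and a raw color difference is brittle to lighting changes, which is exactly why the paper replaces it with a task-driven saliency model.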




Published In

ACM Transactions on Computing for Healthcare, Volume 2, Issue 3 (July 2021), 226 pages
EISSN: 2637-8051
DOI: 10.1145/3476113
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 July 2021
Accepted: 01 November 2020
Revised: 01 April 2020
Received: 01 July 2019
Published in HEALTH Volume 2, Issue 3


Author Tags

  1. Food segmentation
  2. image-based dietary assessment

Qualifiers

  • Research-article
  • Research
  • Refereed


Article Metrics

  • Downloads (Last 12 months)120
  • Downloads (Last 6 weeks)21
Reflects downloads up to 22 Sep 2024


Cited By

  • (2024) Exploring Deep Learning–Based Models for Sociocultural African Food Recognition System. Human Behavior and Emerging Technologies 2024, 1. https://doi.org/10.1155/2024/4443316
  • (2024) A Review of Image-Based Food Recognition and Volume Estimation Artificial Intelligence Systems. IEEE Reviews in Biomedical Engineering 17, 136–152. https://doi.org/10.1109/RBME.2023.3283149
  • (2024) Salient Object Detection Based on High-level Semantic Guidance and Multi-modal Interaction. Proceedings of the 2024 IEEE 7th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), 1819–1824. https://doi.org/10.1109/IAEAC59436.2024.10504007
  • (2023) Siamese Transformer for Saliency Prediction Based on Multi-Prior Enhancement and Cross-Modal Attention Collaboration. IEICE Transactions on Information and Systems E106.D, 9, 1572–1583. https://doi.org/10.1587/transinf.2022EDP7220
  • (2023) HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 33, 2, 728–742. https://doi.org/10.1109/TCSVT.2022.3202563
  • (2023) A Study on Food Value Estimation From Images: Taxonomies, Datasets, and Techniques. IEEE Access 11, 45910–45935. https://doi.org/10.1109/ACCESS.2023.3274475
  • (2023) HSIFoodIngr-64: A Dataset for Hyperspectral Food-Related Studies and a Benchmark Method on Food Ingredient Retrieval. IEEE Access 11, 13152–13162. https://doi.org/10.1109/ACCESS.2023.3243243
  • (2022) SwinNet: Swin Transformer Drives Edge-Aware RGB-D and RGB-T Salient Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 32, 7, 4486–4497. https://doi.org/10.1109/TCSVT.2021.3127149
  • (2022) Image Based Food Energy Estimation With Depth Domain Adaptation. Proceedings of the 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR), 262–267. https://doi.org/10.1109/MIPR54900.2022.00054
  • (2022) BGRDNet: RGB-D salient object detection with a bidirectional gated recurrent decoding network. Multimedia Tools and Applications 81, 18, 25519–25539. https://doi.org/10.1007/s11042-022-12799-y
