DOI: 10.1145/3647649.3647671
research-article
Open access

A Real-life Chinese Dishes Recognition System Evolved from Full Training to Transfer Learning and Domain Adaptation

Published: 03 May 2024

Abstract

    This study presents a real-life Chinese dishes recognition system. To improve prediction accuracy, the training strategy evolves from full training to transfer learning and domain adaptation. First, a Chinese dishes database with 28 dish types, 16,904 images, and 45,061 instances is collected. Second, five networks pre-trained on Microsoft COCO are transferred to this task, and the network that yields the best results is selected as the backbone of the recognition system. Third, the backbone trained with full training is compared with the same backbone trained with fine-tuning. Fourth, domain adaptation from Japanese dishes (UEC Food100), using contrastive-learning-based unpaired image-to-image translation, is considered as a way to improve backbone performance. Extensive experiments suggest that transfer learning benefits Chinese dishes recognition by fine-tuning hidden parameters, whereas domain adaptation remains challenging because of its high data dependency and heavy time consumption. In addition, at least 200 instances per dish type should be prepared when extending the menu list of the prototype system. In conclusion, transfer learning is promising for improving real-life Chinese dishes recognition, and domain adaptation requires further investigation.
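
    The guideline of at least 200 instances per dish type can be checked before a new dish is added to the menu list. The following is a minimal sketch, not the authors' tooling: it assumes instance annotations are available as a flat list of dish-type labels, and the function name `underrepresented_types` and the example labels are hypothetical.

```python
from collections import Counter

# Threshold taken from the abstract: at least 200 instances per dish type.
MIN_INSTANCES = 200


def underrepresented_types(labels, min_instances=MIN_INSTANCES):
    """Map each dish type below the threshold to its shortfall in instances."""
    counts = Counter(labels)
    return {
        dish: min_instances - n
        for dish, n in counts.items()
        if n < min_instances
    }


# Hypothetical annotation labels, one entry per annotated instance.
labels = ["mapo_tofu"] * 250 + ["kung_pao_chicken"] * 120
shortfalls = underrepresented_types(labels)
for dish, missing in shortfalls.items():
    print(f"{dish}: collect {missing} more instances")
```

    Dish types that appear in the returned mapping would need more annotated instances before being added to the system.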

      Published In

      ICIGP '24: Proceedings of the 2024 7th International Conference on Image and Graphics Processing
      January 2024
      480 pages
      ISBN:9798400716720
      DOI:10.1145/3647649
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Chinese dishes recognition
      2. deep learning
      3. domain adaptation
      4. fine tuning
      5. full training
      6. transfer learning

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ICIGP 2024
