DOI: 10.1145/3503161.3548203

DOMFN: A Divergence-Orientated Multi-Modal Fusion Network for Resume Assessment

Published: 10 October 2022

Abstract

In talent management, resume assessment aims to analyze the quality of a job seeker's resume, which can help recruiters discover suitable candidates and, in return, help job seekers improve their resumes. Recent machine-learning methods trained on large-scale public resume datasets have opened the door to automatic assessment that reduces manual cost. However, most existing approaches remain content-dominated and ignore other valuable information. Inspired by practical resume evaluation, which considers both content and layout, we construct multiple modalities from resumes but face a new challenge: the performance of multi-modal fusion is sometimes worse than that of the best uni-modality. In this paper, we experimentally find that this phenomenon is caused by cross-modal divergence, which raises the question: when is it appropriate to perform multi-modal fusion? To address this problem, we design an instance-aware fusion method, the Divergence-Orientated Multi-Modal Fusion Network (DOMFN), which adaptively fuses the uni-modal predictions and the multi-modal prediction based on cross-modal divergence. Specifically, DOMFN computes a functional penalty score that measures the divergence of the cross-modal predictions. The learned divergence then decides whether to conduct multi-modal fusion, and it is incorporated into an amended loss for reliable training. Consequently, DOMFN rejects the multi-modal prediction when the cross-modal divergence is too large, avoiding overall performance degradation and thereby outperforming the uni-modalities. In experiments, qualitative comparison with baselines on a real-world dataset demonstrates the superiority and explainability of the proposed DOMFN; for example, we find a meaningful phenomenon that multi-modal fusion helps when assessing resumes for UI Designer and Enterprise Service positions, whereas it hurts assessment for Technology and Product Operation positions.
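The gating idea described in the abstract — use the multi-modal prediction only when the modalities agree, otherwise fall back to a uni-modal one — can be sketched as follows. This is an illustrative approximation, not the paper's method: DOMFN's functional penalty score and amended loss are defined in the paper itself, while this sketch substitutes a Jensen-Shannon divergence between the uni-modal predictions and a hypothetical threshold `tau`.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def js_divergence(p, q):
    """Jensen-Shannon divergence between two probability vectors
    (a stand-in for DOMFN's functional penalty score)."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    kl = lambda a, b: sum(ai * math.log(ai / bi)
                          for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

def divergence_gated_fusion(p_text, p_layout, p_fused, tau=0.1):
    """Instance-aware gating: accept the multi-modal prediction only
    when the uni-modal predictions are close; otherwise reject fusion
    and keep the more confident uni-modality. `tau` is a hypothetical
    hyper-parameter, not a value from the paper."""
    if js_divergence(p_text, p_layout) > tau:
        return p_text if max(p_text) >= max(p_layout) else p_layout
    return p_fused
```

For instance, when the content and layout modalities rank the same class first, the fused prediction is kept; when they strongly disagree, the gate falls back to the more confident uni-modal prediction, which is the behavior the abstract attributes to DOMFN.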

Supplementary Material

MP4 File (MM22-fp1869.mp4)
This video introduces a resume assessment method that integrates multi-modal information. Inspired by practical resume evaluation, which considers both content and layout, it is easy to construct multiple modalities from resumes, but this raises a new challenge: the performance of multi-modal fusion is sometimes worse than that of the best uni-modality. The video elaborates an instance-aware fusion method, the Divergence-Orientated Multi-Modal Fusion Network (DOMFN), which adaptively fuses the uni-modal predictions and the multi-modal prediction based on cross-modal divergence. Specifically, DOMFN computes a functional penalty score that measures the divergence of the cross-modal predictions. The learned divergence then decides whether to conduct multi-modal fusion, and it is incorporated into an amended loss for reliable training. Finally, the video presents the superiority and interpretability of DOMFN through experimental results on real-world datasets.




    Published In

    MM '22: Proceedings of the 30th ACM International Conference on Multimedia
    October 2022
    7537 pages
    ISBN:9781450392037
    DOI:10.1145/3503161
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. instance-aware fusion
    2. multi-modal learning
    3. resume assessment

    Qualifiers

    • Research-article

    Funding Sources

    • Jiangsu Shuangchuang (Mass Innovation and Entrepreneurship) Talent Program
    • Natural Science Foundation of Jiangsu Province of China under Grant
    • Young Elite Scientists Sponsorship Program by CAST
    • CAAI-Huawei MindSpore Open Fund
    • NSFC

    Conference

    MM '22

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 99
    • Downloads (last 6 weeks): 15
    Reflects downloads up to 10 Nov 2024

    Cited By

    • (2024) CMAF: Cross-Modal Augmentation via Fusion for Underwater Acoustic Image Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 20, 5 (2024), 1--25. DOI: 10.1145/3636427. Online publication date: 11-Jan-2024.
    • (2023) Contextualized Knowledge Graph Embedding for Explainable Talent Training Course Recommendation. ACM Transactions on Information Systems, Vol. 42, 2 (2023), 1--27. DOI: 10.1145/3597022. Online publication date: 27-Sep-2023.
    • (2023) RecruitPro: A Pretrained Language Model with Skill-Aware Prompt Learning for Intelligent Recruitment. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3991--4002. DOI: 10.1145/3580305.3599894. Online publication date: 6-Aug-2023.
    • (2023) The 4th International Workshop on Talent and Management Computing (TMC'2023). In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5909--5910. DOI: 10.1145/3580305.3599200. Online publication date: 6-Aug-2023.
    • (2023) ResuFormer: Semantic Structure Understanding for Resumes via Multi-Modal Pre-training. In 2023 IEEE 39th International Conference on Data Engineering (ICDE), 3154--3167. DOI: 10.1109/ICDE55515.2023.00242. Online publication date: Apr-2023.
