Authors:
Xiaolong Sun (1, 2); Henrik Holm (2); Sina Molavipour (2); Fitsum Gebre (2); Yash Pawar (2); Kamiar Radnosrati (2) and Serveh Shalmashi (2)
Affiliations:
(1) KTH Royal Institute of Technology, Stockholm, Sweden; (2) Ericsson AB, Stockholm, Sweden
Keyword(s):
Recommendation System, Information Retrieval, Language Models.
Abstract:
During software troubleshooting, historical failure records can reveal whether a similar situation has occurred before. However, in an era of information explosion, the sheer volume of data makes it unrealistic to rely solely on manual inspection of root causes, let alone manual matching of similar records. Building on recent advances in Natural Language Processing (NLP), we propose an end-to-end recommendation system that, given a new raw failure record, instantly generates a list of similar records. The system consists of three stages: 1) general and tailored pre-processing of raw failure records; 2) information retrieval; 3) information re-ranking. For model selection, we thoroughly explore both frequency-based models and language models. To mitigate issues stemming from imbalances in the available labeled data, we propose an updated Recall@K metric that uses an adaptive K. We also develop a multi-stage training pipeline to deal with limited labeled data and investigate how different training strategies affect performance. Our comprehensive experiments demonstrate that our two-stage BERT model, fine-tuned on extra domain data, achieves the best score, outperforming the baseline models.
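The abstract does not say how the adaptive K in the updated Recall@K metric is chosen. As an illustration only, the sketch below assumes K is the larger of a base cutoff and the number of labeled similar records for the query, so that a query with many relevant records is not penalized by a cutoff too small to hold them all. The function name, the `base_k` parameter, and this rule for K are assumptions, not the authors' definition.

```python
from typing import Sequence, Set


def recall_at_adaptive_k(
    ranked_ids: Sequence[str],
    relevant_ids: Set[str],
    base_k: int = 5,
) -> float:
    """Recall over the top-K recommendations with a per-query K.

    ASSUMPTION: K = max(base_k, number of relevant records). The rule
    actually used in the paper is not stated in the abstract.
    """
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    # Grow the cutoff so that every relevant record could fit in the top K.
    k = max(base_k, len(relevant_ids))
    top_k = set(ranked_ids[:k])
    # Fraction of the relevant records that appear in the top K.
    return len(top_k & relevant_ids) / len(relevant_ids)


# Hypothetical ranking returned by a recommender for one failure record.
ranked = ["r7", "r2", "r9", "r4", "r1", "r3"]

few_relevant = recall_at_adaptive_k(ranked, {"r2", "r3"}, base_k=3)
many_relevant = recall_at_adaptive_k(ranked, {"r2", "r9", "r4", "r3"}, base_k=3)
```

In the second call the query has four relevant records, so under this assumed rule K grows from 3 to 4. With a fixed K of 3, a recall of 1.0 would be unreachable for that query no matter how good the ranking is, which is the kind of imbalance effect an adaptive cutoff is meant to reduce.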