Technical Internship
@author: hsthe29
Given a question and related paragraphs from a Wikipedia article (in shuffled order), the task is to find the paragraph that answers the question in each test case.
The F1 measure, based on precision and recall, is used to rank competing submissions.
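As a rough illustration of the metric, the sketch below computes F1 from precision and recall over binary labels. It assumes a label of 1 means "this paragraph answers the question"; the function name f1_score and the toy labels are illustrative only, and the official scoring script of the challenge may differ in its details.

```python
def f1_score(y_true, y_pred):
    """Binary F1 over two parallel lists of 0/1 labels.

    Assumption: 1 marks a paragraph that answers the question.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels, not taken from the challenge data
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(f"F1 = {f1_score(y_true, y_pred):.4f}")
```

Because F1 is the harmonic mean, a model that labels every paragraph as an answer gets perfect recall but is penalized through its low precision.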
For details, visit: https://challenge.zalo.ai/portal/question-answering
I fine-tuned the pretrained BERT model (bert-base-multilingual-cased) with the following hyperparameters:
num_train_epochs = 6.0
max_seq_length = 512
train_batch_size = 16
learning_rate = 2e-5
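To make the role of max_seq_length concrete, here is a minimal sketch of how a question and a paragraph could be packed into one BERT input of at most 512 tokens. It follows the common BERT sentence-pair layout ([CLS] question [SEP] paragraph [SEP]) and is an assumption about the preprocessing, not a copy of this project's code. The names truncate_pair and build_input are hypothetical, and the inputs are assumed to be already tokenized.

```python
def truncate_pair(question_tokens, paragraph_tokens, max_seq_length=512):
    """Trim the longer sequence until the pair fits the token budget."""
    if max_seq_length < 3:
        raise ValueError("max_seq_length must leave room for 3 special tokens")
    budget = max_seq_length - 3  # reserve [CLS], [SEP], [SEP]
    q = list(question_tokens)
    p = list(paragraph_tokens)
    while len(q) + len(p) > budget:
        # Drop one token from the end of whichever sequence is longer
        if len(q) > len(p):
            q.pop()
        else:
            p.pop()
    return q, p


def build_input(question_tokens, paragraph_tokens, max_seq_length=512):
    """Return (tokens, segment_ids) in the usual BERT pair layout."""
    q, p = truncate_pair(question_tokens, paragraph_tokens, max_seq_length)
    tokens = ["[CLS]"] + q + ["[SEP]"] + p + ["[SEP]"]
    # Segment 0 covers [CLS] + question + first [SEP]; segment 1 covers the rest
    segment_ids = [0] * (len(q) + 2) + [1] * (len(p) + 1)
    return tokens, segment_ids


if __name__ == "__main__":
    # Placeholder tokens standing in for real WordPiece output
    tokens, segment_ids = build_input(["q"] * 10, ["p"] * 600)
    print(len(tokens), len(segment_ids))
```

Trimming the longer side first tends to keep short questions intact, so long Wikipedia paragraphs absorb most of the truncation.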
Run the project using flags. For more details on the flags defined, read the main.py file or run this command: python main.py -h
Please download the initial checkpoint here and put it in the checkpoint folder.
Run the command: sh train.sh
If you only want to run prediction, please download the checkpoint of the trained model here.
Run the command: sh predict.sh
Training logs are stored in logs/.
To visualize the logs: $ tensorboard --logdir=logs/tensorboard
On the validation set (data/train/val.csv), this model reaches an F1 score of 83.6%.