# Speedy Gonzales: A Collection of Fast Task-Specific Models for Spanish

## Summary

| Model | Parameters | Speedup | Score |
|-------|-----------:|--------:|------:|
| **Fine-tuning** | | | |
| BETO uncased | 110M | 1.00x | 81.02 |
| BETO cased | 110M | 1.00x | 84.82 |
| DistilBETO | 67M | 2.00x | 76.73 |
| ALBETO tiny | 5M | 18.05x | 74.97 |
| ALBETO base | 12M | 0.99x | 83.25 |
| ALBETO large | 18M | 0.28x | 82.02 |
| ALBETO xlarge | 59M | 0.07x | 84.13 |
| ALBETO xxlarge | 223M | 0.03x | 85.17 |
| BERTIN | 125M | 1.00x | 83.97 |
| RoBERTa BNE base | 125M | 1.00x | 84.83 |
| RoBERTa BNE large | 355M | 0.28x | 68.42 |
| **Task-specific Knowledge Distillation** | | | |
| ALBETO tiny | 5M | 18.05x | 76.49 |
| ALBETO base-2 | 12M | 5.96x | 72.98 |
| ALBETO base-4 | 12M | 2.99x | 80.06 |
| ALBETO base-6 | 12M | 1.99x | 82.70 |
| ALBETO base-8 | 12M | 1.49x | 83.78 |
| ALBETO base-10 | 12M | 1.19x | 84.32 |
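
All fine-tuned and distilled checkpoints are published under the dccuchile organization on the Hugging Face Hub (https://huggingface.co/dccuchile). As a minimal sketch, a model can be loaded for inference with the `transformers` library; the model ID below is illustrative, so check the Hub for the exact checkpoint names:

```python
# Minimal sketch: loading a Spanish model with Hugging Face transformers.
# The model ID is an assumption for illustration; the exact checkpoint
# names are listed at https://huggingface.co/dccuchile.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dccuchile/albert-base-spanish"  # replace with a task-specific checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Este modelo es muy rápido.", return_tensors="pt")
outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()
print(predicted_class)
```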

## All results

| Model | MLDoc (Acc.) | PAWS-X (Acc.) | XNLI (Acc.) | POS (F1) | NER (F1) | MLQA (F1 / EM) | SQAC (F1 / EM) | TAR / XQuAD (F1 / EM) |
|-------|-------------:|--------------:|------------:|---------:|---------:|---------------:|---------------:|----------------------:|
| **Fine-tuning** | | | | | | | | |
| BETO uncased | 96.38 | 84.25 | 77.76 | 97.81 | 80.85 | 64.12 / 40.83 | 72.22 / 53.45 | 74.81 / 54.62 |
| BETO cased | 96.65 | 89.80 | 81.98 | 98.95 | 87.14 | 67.65 / 43.38 | 78.65 / 60.94 | 77.81 / 56.97 |
| DistilBETO | 96.35 | 75.80 | 76.59 | 97.67 | 78.13 | 57.97 / 35.50 | 64.41 / 45.34 | 66.97 / 46.55 |
| ALBETO tiny | 95.82 | 80.20 | 73.43 | 97.34 | 75.42 | 51.84 / 28.28 | 59.28 / 39.16 | 66.43 / 45.71 |
| ALBETO base | 96.07 | 87.95 | 79.88 | 98.21 | 82.89 | 66.12 / 41.10 | 77.71 / 59.84 | 77.18 / 57.05 |
| ALBETO large | 92.22 | 86.05 | 78.94 | 97.98 | 82.36 | 65.56 / 40.98 | 76.36 / 56.54 | 76.72 / 56.21 |
| ALBETO xlarge | 95.70 | 89.05 | 81.68 | 98.20 | 81.42 | 68.26 / 43.76 | 78.64 / 59.26 | 80.15 / 59.66 |
| ALBETO xxlarge | 96.85 | 89.85 | 82.42 | 98.43 | 83.06 | 70.17 / 45.99 | 81.49 / 62.67 | 79.13 / 58.40 |
| BERTIN | 96.47 | 88.65 | 80.50 | 99.02 | 85.66 | 66.06 / 42.16 | 78.42 / 60.05 | 77.05 / 57.14 |
| RoBERTa BNE base | 96.82 | 89.90 | 81.12 | 99.00 | 86.80 | 67.31 / 44.50 | 80.53 / 62.72 | 77.16 / 55.46 |
| RoBERTa BNE large | 97.00 | 90.00 | 51.62 | 61.83 | 21.47 | 67.69 / 44.88 | 80.41 / 62.14 | 77.34 / 56.97 |
| **Task-specific Knowledge Distillation** | | | | | | | | |
| ALBETO tiny | 96.40 | 85.05 | 75.99 | 97.36 | 72.51 | 54.17 / 32.22 | 63.03 / 43.35 | 67.47 / 46.13 |
| ALBETO base-2 | 96.20 | 76.75 | 73.65 | 97.17 | 69.69 | 48.62 / 26.17 | 58.40 / 39.00 | 63.41 / 42.35 |
| ALBETO base-4 | 96.35 | 86.40 | 78.68 | 97.60 | 74.58 | 62.19 / 38.28 | 71.41 / 52.87 | 73.31 / 52.43 |
| ALBETO base-6 | 96.40 | 88.45 | 81.66 | 97.82 | 78.41 | 66.35 / 42.01 | 76.99 / 59.00 | 75.59 / 56.72 |
| ALBETO base-8 | 96.70 | 89.75 | 82.55 | 97.96 | 80.23 | 67.39 / 42.94 | 77.79 / 59.63 | 77.89 / 56.72 |
| ALBETO base-10 | 96.88 | 89.95 | 82.26 | 98.00 | 81.10 | 68.29 / 44.29 | 79.89 / 62.04 | 78.21 / 56.21 |
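
The Speedup column in the summary is relative to BETO (1.00x). The paper's exact benchmarking protocol is not reproduced here, but a minimal sketch of how a relative inference speedup could be estimated might look like the following (model IDs are assumptions; verify them on the Hub):

```python
# Minimal sketch: estimating relative inference speedup between two models.
# This is an assumed measurement protocol for illustration, not the paper's
# exact benchmark setup.
import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def mean_latency(model_id: str, text: str, runs: int = 50) -> float:
    """Average forward-pass latency in seconds for a single input."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Warm-up runs so one-time setup costs are not counted.
        for _ in range(5):
            model(**inputs)
        start = time.perf_counter()
        for _ in range(runs):
            model(**inputs)
    return (time.perf_counter() - start) / runs

text = "Este es un texto de prueba."
baseline = mean_latency("dccuchile/bert-base-spanish-wwm-cased", text)  # BETO cased
candidate = mean_latency("dccuchile/albert-tiny-spanish", text)  # assumed ID
print(f"Speedup vs. BETO: {baseline / candidate:.2f}x")
```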

## Citation

[Speedy Gonzales: A Collection of Fast Task-Specific Models for Spanish](https://aclanthology.org/2024.starsem-1.14) (*SEM 2024)

To cite this resource in a publication, please use the following:

```bibtex
@inproceedings{canete-bravo-marquez-2024-speedy,
    title = "Speedy Gonzales: A Collection of Fast Task-Specific Models for {S}panish",
    author = "Ca{\~n}ete, Jos{\'e}  and
      Bravo-Marquez, Felipe",
    editor = "Bollegala, Danushka  and
      Shwartz, Vered",
    booktitle = "Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.starsem-1.14",
    pages = "176--189",
    abstract = "Large language models (LLM) are now a very common and successful path to approach language and retrieval tasks. While these LLM achieve surprisingly good results it is a challenge to use them on more constrained resources. Techniques to compress these LLM into smaller and faster models have emerged for English or Multilingual settings, but it is still a challenge for other languages. In fact, Spanish is the second language with most native speakers but lacks of these kind of resources. In this work, we evaluate all the models publicly available for Spanish on a set of 6 tasks and then, by leveraging on Knowledge Distillation, we present Speedy Gonzales, a collection of inference-efficient task-specific language models based on the ALBERT architecture. All of our models (fine-tuned and distilled) are publicly available on: https://huggingface.co/dccuchile.",
}
```
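
As the abstract notes, the distilled models are trained with task-specific knowledge distillation. The exact training recipe is described in the paper; as a minimal sketch under standard assumptions, a soft-label distillation loss between a fine-tuned teacher and a smaller student could be written as:

```python
# Minimal sketch of a standard task-specific distillation loss (KL divergence
# between temperature-softened teacher and student distributions). This is a
# generic formulation for illustration, not the paper's exact training recipe.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Soft-label distillation loss, scaled by T^2 as in Hinton et al. (2015)."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Example: a fine-tuned teacher guides a smaller student on the same batch.
teacher_logits = torch.randn(8, 3)  # e.g. XNLI has 3 classes
student_logits = torch.randn(8, 3, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```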
