- Jožef Stefan Institute
- Ljubljana, Slovenia
- @TajaKuzman
- in/taja-kuzman
Pinned
- IPTC-Media-Topic-Classification (Public): Development of a multilingual IPTC Media Topic classifier for single-label classification of texts into the 17 top-level topic labels of the IPTC Media Topic hierarchical schema.
Jupyter Notebook
- AGILE-Automatic-Genre-Identification-Benchmark (Public): A benchmark for evaluating the robustness of automatic genre identification models, to test their usability for the automatic enrichment of large text collections with genre information.
Jupyter Notebook 4
- pandachat-rag-benchmark (Public): PandaChat-RAG benchmark for the evaluation of RAG systems on a non-synthetic Slovenian test dataset.
Python
- NER-recognition (Public): An evaluation of various encoder Transformer-based large language models on the named entity recognition task. The models are compared on 6 datasets, manually annotated with named entities.
Jupyter Notebook
- Parlamint-translation (Public): A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic…
Jupyter Notebook 2