
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072

Text Summarization Using the T5 Transformer Model


Ratan Ravichandran1, Sri Bharath Sharma P2, Shriyans Shriniwas Arkal3, Shubhangee Das4,
Prof. Sasikala Nagarajan5

1-5Department of Artificial Intelligence and Machine Learning, Dayananda Sagar University, Bangalore, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - In our information-filled world, it is crucial to focus on the essential content amidst the overwhelming volume of information available. Unfortunately, people often spend a significant amount of time sifting through irrelevant details, inadvertently overlooking crucial information. To address this issue, we present a project that utilizes the T5 transformer model in natural language processing to develop an abstractive text summarization system. By leveraging advanced language modeling techniques, our project aims to enhance efficiency, comprehension, and decision-making processes across various domains.

Key Words: Abstractive summarization, T5 transformer model, Natural language processing.

1. INTRODUCTION

In our information-filled world, focusing on what truly matters is essential for success. On average, a person spends a significant portion of their lifetime reading unhelpful information, often missing important details by subconsciously dismissing them. To solve this problem, we built a text summarizer that condenses lengthy text into shorter, concise summaries, providing a quick overview of the main information.

Text summarization is a vital tool in today's information-driven world, allowing us to distil the essence of lengthy texts into concise summaries. By employing advanced natural language processing techniques, text summarizers extract key information, enabling readers to grasp the main ideas quickly. In this report, we explore the effectiveness and applications of text summarizers, shedding light on their potential to enhance efficiency, comprehension, and decision-making processes across various domains.

1.1 The T5 Transformer Model

To achieve this, we use the T5 transformer model, a powerful language model that can understand and generate human-like text. Constructing a text summarizer based on T5 is beneficial because it allows for concise and accurate summarization of lengthy documents. T5's ability to capture contextual relationships and generate coherent summaries makes it an ideal choice for text summarization tasks, enabling efficient information extraction and facilitating quick comprehension of complex texts.
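As a brief illustration, a pre-trained T5 checkpoint can produce summaries out of the box via the Hugging Face transformers library. The following is a minimal sketch, separate from the fine-tuned system described later; the t5-small checkpoint and the generation settings are illustrative assumptions:

```python
# Minimal sketch: out-of-the-box summarization with a pre-trained T5
# checkpoint via the Hugging Face transformers library. The t5-small
# checkpoint and the generation settings are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

article = "Long input document goes here..."
# T5 is a text-to-text model: the task is selected with a text prefix.
inputs = tokenizer("summarize: " + article,
                   return_tensors="pt", max_length=512, truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=150,
                             num_beams=4, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```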
2. LITERATURE REVIEW

Adhika Pramita Widyassari et al. [1] provide an overview of various techniques and methods used in automatic text summarization, with a particular focus on the Natural Language Toolkit (NLTK). The authors explore different approaches, including extractive and abstractive summarization, and discuss how NLTK can be utilized in these techniques:

• Preprocessing: NLTK performs essential text preprocessing tasks like tokenization, stemming, and stop-word removal, aiding in information extraction by breaking text into words or sentences and reducing words to their root form (see the sketch after this list).

• Sentence Scoring: NLTK facilitates extractive summarization by offering tools to calculate sentence similarity (e.g., cosine similarity) and assign scores, enabling the selection of relevant sentences based on their importance.

• Feature Extraction: NLTK's part-of-speech tagging and named entity recognition assist in identifying entities and key terms, enhancing summary accuracy and relevance.

• Language Modeling: In abstractive summarization, NLTK helps build language models (e.g., n-gram models) for generating concise and coherent summaries by predicting probable next words or phrases.

• Evaluation: NLTK includes evaluation metrics (e.g., ROUGE, BLEU) to assess summary quality by comparing generated summaries with reference summaries and measuring similarity or effectiveness.
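A minimal Python sketch of the preprocessing and sentence-scoring steps described above, assuming NLTK is installed; the frequency-based scoring heuristic and the helper name extractive_summary are illustrative choices, not the survey's specific method:

```python
# Sketch of the NLTK-based steps described above: preprocessing
# (tokenization, stop-word removal, stemming) and frequency-based
# sentence scoring for extractive summarization.
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import sent_tokenize, word_tokenize

nltk.download("punkt", quiet=True)      # newer NLTK may also need "punkt_tab"
nltk.download("stopwords", quiet=True)

def extractive_summary(text, n_sentences=2):
    stemmer = PorterStemmer()
    stop_words = set(stopwords.words("english"))
    # Preprocessing: word tokens, stop words dropped, words stemmed.
    words = [stemmer.stem(w.lower()) for w in word_tokenize(text)
             if w.isalpha() and w.lower() not in stop_words]
    freq = nltk.FreqDist(words)
    # Sentence scoring: a sentence scores the sum of its word frequencies.
    scored = []
    for sent in sent_tokenize(text):
        tokens = [stemmer.stem(w.lower()) for w in word_tokenize(sent)
                  if w.isalpha()]
        scored.append((sum(freq[t] for t in tokens), sent))
    # Selection: keep the top-n sentences, restored to document order.
    top = sorted(scored, reverse=True)[:n_sentences]
    return " ".join(s for _, s in sorted(top, key=lambda p: text.find(p[1])))
```

Abstractive models such as T5 replace this selection step with free-form generation, as described in Section 1.1.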
Khilji et al. [2] examine abstractive text summarization, a natural language processing (NLP) technique that aims to generate a concise and coherent summary of a given text by understanding its content and generating new sentences. Abstractive summarization involves creating novel sentences that capture the key information and main ideas of the source text in a more human-like manner.


3. PROPOSED ARCHITECTURE

Fig -1: System Architecture

Our system employs the cutting-edge T5 transformer model for effective text summarization. The process starts with data preprocessing, including cleaning and organizing. Tokenization divides the data into smaller units for processing. The T5 model is then trained to understand the input and generate informative summaries.

Once trained, it condenses the key information in new input documents. Evaluation is done using the ROUGE metric, which measures similarity to human-written summaries; higher scores indicate better summarization. This architecture leverages T5's power to process input, generate concise summaries, and assess their quality, streamlining information extraction for quicker comprehension and decision-making.

3.1 Architecture Workflow

The implementation builds a sequence-to-sequence (Seq2Seq) neural network on the T5 model to achieve text summarization. The process begins with data preparation: the required libraries are imported, and the "multi_news" dataset is loaded, split, and organized. Tokenization and preprocessing adapt the data, utilizing the "t5-small" tokenizer and a defined summarization prefix.

The core of the workflow involves model training, where the pre-trained T5 model is fine-tuned for summarization. The Seq2SeqTrainer facilitates this training, optimizing the model's capacity to generate accurate and concise summaries. After training, the model predicts summaries, and ROUGE scores are calculated using the Rouge library to assess the quality of these summaries.
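A condensed sketch of the data-preparation stage described above, assuming the Hugging Face datasets and transformers libraries and the multi_news column names ("document", "summary"); the maximum sequence lengths are illustrative assumptions:

```python
# Sketch of the data-preparation stage: load multi_news, tokenize with the
# t5-small tokenizer, and prepend the summarization task prefix.
# Column names follow the multi_news schema ("document", "summary");
# the maximum lengths below are illustrative assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("multi_news")           # train/validation/test splits
tokenizer = AutoTokenizer.from_pretrained("t5-small")
prefix = "summarize: "                         # T5 selects tasks by prefix

def preprocess(batch):
    inputs = [prefix + doc for doc in batch["document"]]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True)
    # The human-written summaries become the decoder targets (labels).
    labels = tokenizer(text_target=batch["summary"],
                       max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)
```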

4. EXPERIMENTATION

4.1 Dataset

This dataset, multi_news, found on HuggingFace, consists of two columns: a feature column containing the source news articles, separated by "|||||", and a target column with human-written summaries. The target column serves as the reference summary, providing a condensed overview of the news text in the feature column.

4.2 Model Creation

For model creation, we use a T5 transformer architecture tailored for sequence-to-sequence language tasks. The DataCollatorForSeq2Seq ensures proper tokenization and data collation, and the AutoModelForSeq2SeqLM class loads the pre-trained T5 weights used to generate coherent output sequences for tasks such as text summarization.
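A minimal sketch of this step, assuming the t5-small checkpoint used elsewhere in the paper:

```python
# Sketch of model creation: load pre-trained T5 weights and build a
# seq2seq data collator that pads inputs and labels per batch.
# The t5-small checkpoint is an assumption, matching Section 3.1.
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)
```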
4.3 Model Training

The trainer is configured with the necessary components, including the training arguments, tokenizer, data collator, and datasets for training and evaluation. Calling the train function begins the training process, during which the model learns to generate concise summaries from the given input data. Once training is complete, the trained model is saved to a specified path for later use; the trained model and a data file are also downloaded and copied, enabling further analysis or storage of the results. The model is trained for 10 epochs and for 25 epochs, and the results of the two runs are evaluated accordingly.
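A sketch of this training configuration, reusing the objects from the previous sketches; the hyperparameters (batch size, learning rate) are illustrative assumptions, while the epoch counts follow the two runs reported here:

```python
# Sketch of the training step with Seq2SeqTrainer, reusing the tokenized
# multi_news splits, model, and data collator from the sketches above.
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-multinews-summarizer",
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    num_train_epochs=10,            # the second run uses 25
    evaluation_strategy="epoch",
    predict_with_generate=True,     # generate summaries during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,                       # from the model-creation sketch
    args=training_args,
    train_dataset=tokenized["train"],  # from the data-preparation sketch
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()
trainer.save_model("t5-multinews-summarizer/final")  # saved for later use
```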

5. RESULTS AND ANALYSIS

ROUGE, which stands for "Recall-Oriented Understudy for Gisting Evaluation," is a set of metrics used to evaluate the quality of summaries or generated text in natural language processing tasks, and is commonly used in automatic summarization and machine translation evaluation. Summaries are assessed here using ROUGE-1, ROUGE-2, and ROUGE-L. ROUGE metrics provide a way to quantitatively assess the quality of summaries or generated text by comparing them to a reference summary, and they are widely used in research and evaluation of text generation models to measure their effectiveness in capturing the key information or meaning of the source text.
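A small sketch of how such scores can be computed, assuming the rouge_score package (one common implementation; the paper does not specify its exact tooling, and the example texts are invented):

```python
# Sketch of ROUGE evaluation with the "rouge_score" package, covering the
# ROUGE-1, ROUGE-2, and ROUGE-L variants reported in this section.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
reference = "Officials confirmed the bridge will close for repairs next month."
generated = "The bridge will be closed next month for repair work."
scores = scorer.score(reference, generated)
for name, result in scores.items():
    # Each result holds precision, recall, and F-measure.
    print(f"{name}: F1 = {result.fmeasure:.3f}")
```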

Fig -2: ROUGE Scores for 10 epochs

Fig 2 shows the ROUGE scores over the span of 10 epochs; the model's scores improve steadily. Notably, the highest scores are achieved in the order of ROUGE-L, followed by ROUGE-2, ROUGE-1, and ROUGE. This pattern indicates the model's ability to create coherent and fluent summaries while preserving essential information. Despite this progress, the ROUGE scores remain relatively low, which means there is room to improve.

Fig -3: ROUGE Scores for 25 epochs

Fig 3 shows the model's progress when it is trained for 25 epochs. Throughout the 25 epochs, the model's ROUGE scores demonstrate progressive enhancement. The highest scores are consistently observed in the order of ROUGE-L, followed by ROUGE-2, ROUGE-1, and ROUGE. This pattern highlights the model's capability to generate summaries that are not only coherent but also more fluent than the original text, while preserving crucial information.

The model's improvement in ROUGE scores can be attributed to a few key factors. Firstly, longer training exposes the model to a wider range of information, leading to better performance. Additionally, the extended training duration enhances the model's grasp of human language, resulting in improved summaries. Furthermore, as the model learns more, its accuracy in producing summaries that align with human-generated content also increases, improving factual correctness.

6. CONCLUSIONS AND FUTURE WORK

Our project on abstractive text summarization introduces a system powered by the T5 transformer language model. The project highlights the utility of abstractive summarization in automating data extraction and elevating decision-making processes. Notably, a comparative analysis reveals that the abstractive model outperforms its extractive counterpart, capturing more comprehensive details.

Looking forward, this technology bears the potential to revolutionize how humans comprehend and utilize textual content, enhancing its accessibility and efficacy across various domains. Future enhancements could include fine-tuning and domain adaptation to tailor models for specific industries, enabling more precise and contextually relevant summaries. Furthermore, addressing the challenge of multi-document summarization is crucial for accommodating scenarios involving related documents, requiring methods to generate coherent summaries from multiple sources.

ACKNOWLEDGEMENTS

We are deeply grateful to our guide, Prof. Sasikala Nagarajan, for their support and mentorship throughout the course of this project.

REFERENCES

[1] Adhika Pramita Widyassari, Supriadi Rustad, Guruh Fajar Shidik, Edi Noersasongko, Abdul Syukur, Affandy Affandy, De Rosal Ignatius Moses Setiadi, "Review of automatic text summarization techniques & methods", Journal of King Saud University, 2022.

[2] Abdullah Khilji, Utkarsh Sinha, Pintu Singh, Adnan Ali, Partha Pakray, "Abstractive Text Summarization Approaches with Analysis of Evaluation Techniques", Computational Intelligence in Communications and Business Analytics, 2021.

[3] Ilya Sutskever, Oriol Vinyals, Quoc V. Le, "Sequence to Sequence Learning with Neural Networks", arXiv, 2014.

[4] Jakob Uszkoreit, "Transformer: A Novel Neural Network Architecture for Language Understanding", Google Research, 2017.

[5] Abigail Rai, "Study of Various Methods for Tokenization", Applications of Internet of Things, pp. 193-200, 2020.
