
Catch Up On Large Language Models

A practical guide to large language models without the hype

Marco Peixeiro
Towards Data Science


Photo by Gary Bendig on Unsplash

If you are here, it means that, like me, you were overwhelmed by the constant flow of information and hype posts surrounding large language models (LLMs).

This article is my attempt at helping you catch up on the subject of large language models without the hype. After all, it is a transformative technology, and I believe it is important for us to understand it. Hopefully, it will also make you curious to learn even more and build something with it.

In the following sections, we will define what LLMs are and how they work, of course covering the Transformer architecture. We will also explore the different methods of training LLMs and conclude with a hands-on project in which we use Flan-T5 for sentiment analysis in Python.

Let’s get started!

LLMs and generative AI: are they the same thing?

Generative AI is a subset of machine learning that focuses on models whose primary function is to generate something: text, images, video, code, etc.

Generative models train on enormous amounts of human-created data to learn the patterns and structure that allow them to create new data.
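To make this concrete, here is a minimal sketch of what "generating something" looks like in code, using the Hugging Face transformers library and the Flan-T5 model we will work with later in the article. The prompt text is an arbitrary example chosen for illustration.

from transformers import pipeline

# Flan-T5 is an encoder-decoder model, so it uses the
# text2text-generation task rather than plain text-generation
generator = pipeline("text2text-generation", model="google/flan-t5-base")

# Give the model a prompt and let it generate new text
prompt = "Write a short sentence about the weather."
result = generator(prompt, max_new_tokens=30)

# The pipeline returns a list of dictionaries with the generated text
print(result[0]["generated_text"])

The key idea is that the model was never shown this exact sentence during training; it produces new text by predicting likely continuations based on the patterns it learned from its training data.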
