
Large Language Models (LLMs)

1. Definition:

○ Large Language Models (LLMs) are advanced neural network
architectures, typically based on the Transformer model, that are
trained on vast amounts of text data.
2. Key Characteristics:

○ Scale: LLMs have a very large number of parameters, often in the
billions or trillions (e.g., GPT-3 has 175 billion parameters).
○ Pretraining: They are pretrained on a vast amount of text data
from books, articles, websites, and other textual sources.
○ Fine-tuning: After pretraining, LLMs can be fine-tuned on
domain-specific data to improve their performance for specific
tasks.
3. Training Process:

○ Self-supervised Learning: LLMs are typically trained using
self-supervised learning, where the model predicts the next word
or token in a sentence given the context.
○ Massive Datasets: They are trained on diverse datasets,
enabling them to learn a wide range of language patterns, facts,
and relationships.
4. Applications:

○ Text Generation: Used for generating human-like text, writing
essays, articles, and even code.
○ Machine Translation: LLMs can be used to translate text from
one language to another.
○ Summarization: They can summarize long documents or articles.
○ Question Answering: LLMs can answer questions based on
provided text or knowledge databases.
○ Sentiment Analysis: Used to analyze and classify the sentiment
in text (e.g., positive, negative, or neutral).
5. Examples of Popular LLMs:

○ GPT (Generative Pretrained Transformer): Developed by
OpenAI, used for tasks like text generation, summarization, and
conversation.
○ BERT (Bidirectional Encoder Representations from
Transformers): Developed by Google, primarily used for tasks
like sentiment analysis, question answering, and language
understanding.
○ T5 (Text-to-Text Transfer Transformer): A versatile model by
Google that frames all NLP tasks as a text-to-text problem.
○ BLOOM: An open-source LLM developed by BigScience that is
designed for multiple languages.
6. Challenges:

○ Bias and Fairness: LLMs can inherit biases from the data they
are trained on, leading to biased or unethical outputs.
○ Computational Costs: Training large models requires significant
computational power and energy resources.
○ Overfitting: LLMs may overfit to certain patterns in the data,
reducing their generalization ability.
7. Future Directions:

○ Multimodal Models: Models that can handle both text and other
data types (e.g., images, audio) for richer understanding and
generation.
○ Efficiency Improvements: Research is focused on making LLMs
more efficient in terms of computation, training data, and
fine-tuning techniques.
○ Ethical AI: Ensuring LLMs are fair, transparent, and free from
harmful biases.
Transformer Model

The Transformer model is a deep learning architecture introduced in the paper
"Attention is All You Need" by Vaswani et al. in 2017. It has since become
the foundation for many state-of-the-art models in natural language
processing (NLP) and other sequence-based tasks. Unlike previous models
like RNNs and LSTMs, the Transformer relies entirely on self-attention
mechanisms rather than recurrent or convolutional layers.

Key Components of the Transformer Model:

1. Self-Attention Mechanism:
○ Purpose: It allows the model to weigh the importance of different
words in a sentence relative to each other, regardless of their
position.
○ How it works:
■ The input is converted into three vectors for each word:
Query (Q), Key (K), and Value (V).
■ The attention score is computed by taking the dot product of
the Query and Key vectors, followed by a softmax
operation.
■ The attention scores determine how much focus each word
should have on other words when generating a word’s
representation.
○ Advantages: Allows the model to capture long-range
dependencies and relationships between words, unlike
RNNs/LSTMs which struggle with long sequences. (A short code
sketch of this computation appears after this list of components.)
2. Positional Encoding:

○ Purpose: Since the Transformer does not use recurrence, it
needs a way to incorporate word order into its processing.
○ How it works: Positional encodings are added to the input
embeddings to give the model information about the position of
each word in the sequence. This allows the model to distinguish
between different positions of the words.
3. Encoder-Decoder Architecture:

○ The Transformer is made up of an encoder and a decoder, which
are both stacks of identical layers.
○ Encoder: It takes the input sequence and processes it into a
continuous representation.
■ Each encoder layer consists of:
1. Multi-head Self-Attention: Helps the encoder attend
to different words in the sentence simultaneously.
2. Feed-Forward Neural Network: A simple fully
connected network applied after attention.
3. Layer Normalization and Residual Connections for
stability and better gradient flow.
○ Decoder: It generates the output sequence, attending to both the
encoder output and the previously generated tokens (during
training, the target sequence is shifted right so each position only
sees earlier tokens).
■ Each decoder layer has:
1. Masked Multi-head Self-Attention: Prevents
attending to future positions during training
(necessary for autoregressive generation).
2. Multi-head Attention over Encoder Output: Helps
the decoder focus on relevant parts of the input
sequence.
3. Feed-Forward Neural Network and Layer
Normalization.
4. Multi-Head Attention:

○ Purpose: Multi-head attention enables the model to focus on
different parts of the input sequence at the same time, learning
various relationships from multiple "heads."
○ How it works: Instead of performing a single attention operation,
multiple attention operations are performed in parallel (each with
different weight matrices), and their results are concatenated and
linearly transformed.
5. Feed-Forward Networks:

○ After each attention mechanism, a feed-forward network (typically
a fully connected layer) is applied to each position in the
sequence independently.
○ The feed-forward network is typically composed of two layers with
a ReLU activation function in between.
6. Layer Normalization and Residual Connections:

○ Residual Connections: Skip connections are used between the
input and output of each sub-layer (such as attention and
feed-forward networks), which helps with gradient flow.
○ Layer Normalization: Normalizes the output of each layer to
stabilize and speed up training.
7. Final Linear Layer and Softmax:

○ After passing through the decoder, a linear layer followed by
softmax is used to generate the output token probabilities.
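
To make the self-attention computation above concrete, the following is a minimal NumPy sketch of scaled dot-product attention for a single head (illustrative only: real Transformer layers use learned projections of much larger size, multiple heads, masking, and batching):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                                       # weighted sum of the values

# Toy example: 4 tokens, model dimension 8, random (untrained) projection matrices
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))                 # token embeddings
W_q, W_k, W_v = (rng.standard_normal((8, 8)) for _ in range(3))
out = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
print(out.shape)                                # (4, 8): one contextualized vector per token
```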

Architecture Diagram:

The typical Transformer architecture consists of:

● Encoder Stack: A series of identical layers (usually 6 or more).


● Decoder Stack: A series of identical layers (also usually 6 or more).
● Final Output Layer: A linear transformation followed by a softmax to
produce the probabilities for the next token.

Advantages of Transformers:

1. Parallelization: Unlike RNNs or LSTMs, Transformers do not process
sequences sequentially, which allows for parallel processing and faster
training.
2. Long-Range Dependencies: Self-attention allows the model to capture
relationships between distant words in a sequence, which is a limitation
in traditional RNNs and LSTMs.
3. Scalability: The Transformer architecture scales well with large
datasets and large models (e.g., GPT-3), making it suitable for
state-of-the-art models.

Variants of Transformer Models:

1. BERT (Bidirectional Encoder Representations from Transformers):
A Transformer model trained to predict missing words in sentences
using bidirectional context (both left and right).
2. GPT (Generative Pretrained Transformer): A Transformer model
trained autoregressively for text generation.
3. T5 (Text-to-Text Transfer Transformer): Frames all NLP tasks as a
text-to-text problem, where both the input and output are sequences of
text.
4. Transformer-XL: Extends the Transformer to handle longer sequences
by introducing recurrence across segments.
Applications of Transformers:

1. Machine Translation: Popularized by models like Google Translate
using Transformer-based architectures.
2. Text Summarization: Automatically generating concise summaries
from long documents.
3. Question Answering: Models like BERT can read documents and
answer specific questions.
4. Text Generation: Models like GPT generate coherent and contextually
relevant text based on given prompts.
5. Speech Recognition: Transformers are increasingly used in speech
processing tasks.

Summary:

● Transformers revolutionized NLP by enabling efficient, parallelized
training and capturing long-range dependencies in data.
● They form the basis for many state-of-the-art models in tasks such as
text generation, translation, and summarization.

Long Short-Term Memory (LSTM)

Long Short-Term Memory (LSTM) is a type of Recurrent Neural Network
(RNN) architecture designed to address the limitations of standard RNNs,
particularly the problem of learning long-range dependencies in sequences.
LSTMs are widely used in tasks like time-series forecasting, speech
recognition, and natural language processing.

Key Features of LSTMs:

1. Designed to Handle Long-Term Dependencies:

○ Standard RNNs suffer from the vanishing gradient problem,
where gradients become extremely small during backpropagation
through long sequences, making it hard for the model to learn
long-term dependencies. LSTMs solve this by using a more
complex cell structure that helps maintain information over longer
sequences.
2. Memory Cells:

○ The core of an LSTM is its memory cell, which is designed to
store information over time. The cell can maintain its state over
multiple time steps, allowing it to learn from long-range
dependencies.

Components of an LSTM:

An LSTM unit consists of four primary components:

1. Forget Gate:

○ Purpose: Decides what information should be discarded from the
memory cell.
○ How it works: The forget gate looks at the previous hidden state
(h_{t-1}) and the current input (x_t) and outputs a number
between 0 and 1 for each element of the cell state. A value of 0
means "forget" and 1 means "keep."
○ Mathematical Equation: f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)
Where:
■ f_t is the forget gate's output.
■ \sigma is the sigmoid activation function.
■ W_f is the weight matrix for the forget gate.
■ b_f is the bias term.
2. Input Gate:

○ Purpose: Controls what new information should be added to the
memory cell.
○ How it works: It has two parts:
■ A sigmoid layer that decides which values will be updated.
■ A tanh layer that creates a vector of new candidate values
for the memory state.
○ Mathematical Equations:
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)
Where:
■ i_t is the input gate's output.
■ \tilde{C}_t is the candidate memory state.
3. Cell State Update:

○ Purpose: Updates the memory cell with new information.
○ How it works: The forget gate removes unnecessary information
from the cell state, and the input gate adds new information to the
cell state.
○ Mathematical Equation: C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t
Where:
■ C_t is the updated cell state.
■ C_{t-1} is the previous cell state.
4. Output Gate:

○ Purpose: Decides what the next hidden state should be.
○ How it works: The output gate looks at the current cell state and
decides what part of the cell state should be output to the next
layer.
○ Mathematical Equations:
o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
h_t = o_t \cdot \tanh(C_t)
Where:
■ o_t is the output gate's output.
■ h_t is the hidden state.

Workflow of an LSTM:

1. Input Data: The LSTM receives the input data at each time step.
2. Forget Gate: It decides what information to discard from the previous
time step.
3. Input Gate: It updates the cell state with new information.
4. Cell State Update: The memory cell updates its state based on the
forget and input gates.
5. Output Gate: It generates the output for the current time step and
passes the new hidden state to the next time step.
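
As an illustration of this workflow, here is a minimal NumPy sketch of a single LSTM cell step following the gate equations above (weights are random and untrained; this is not a production implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, C_prev, W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o):
    """One LSTM time step, following the gate equations above."""
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)           # forget gate
    i_t = sigmoid(W_i @ z + b_i)           # input gate
    C_tilde = np.tanh(W_C @ z + b_C)       # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde     # cell state update
    o_t = sigmoid(W_o @ z + b_o)           # output gate
    h_t = o_t * np.tanh(C_t)               # new hidden state
    return h_t, C_t

# Toy usage: hidden size 4, input size 3, random (untrained) weights shared across time steps
rng = np.random.default_rng(0)
H, X = 4, 3
W_f, W_i, W_C, W_o = (rng.standard_normal((H, H + X)) * 0.1 for _ in range(4))
b_f = b_i = b_C = b_o = np.zeros(H)
h, C = np.zeros(H), np.zeros(H)
for x in rng.standard_normal((5, X)):      # a sequence of 5 input vectors
    h, C = lstm_step(x, h, C, W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o)
```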
Key Advantages of LSTM:

1. Long-Term Memory: LSTMs are able to capture long-range


dependencies in sequential data, making them ideal for tasks involving
time-series or natural language, where context from earlier steps is
important.

2. Solving the Vanishing Gradient Problem: By maintaining a separate


memory cell and using gates to regulate the flow of information, LSTMs
can maintain information over longer sequences and prevent the
vanishing gradient problem that traditional RNNs face.

3. Effective in Sequence-to-Sequence Tasks: LSTMs have been


particularly successful in tasks like machine translation, speech
recognition, and text generation, where the model must learn complex
sequences and their dependencies.

Applications of LSTM:

1. Natural Language Processing (NLP):

○ Machine translation (e.g., translating sentences from one


language to another).
○ Sentiment analysis.
○ Named entity recognition (NER).
○ Part-of-speech tagging.
2. Time-Series Forecasting:

○ Stock price prediction.


○ Weather prediction.
○ Energy consumption forecasting.
3. Speech and Audio Processing:

○ Speech recognition (e.g., converting spoken words into text).


○ Audio signal processing.
4. Music Generation:

○ LSTM models are used to generate music by learning the


structure of musical sequences.

Limitations of LSTMs:

1. Computational Complexity: LSTMs are computationally more


expensive than simpler models like feedforward neural networks,
especially when processing long sequences.

2. Training Time: Training LSTMs can be time-consuming due to the


complexity of backpropagating through time (BPTT), especially for very
long sequences.

3. Difficulty with Extremely Long Sequences: While LSTMs can handle


long-range dependencies better than RNNs, they still struggle with
extremely long sequences where the memory might decay over time,
especially in highly complex tasks.

Variants of LSTM:

1. Bidirectional LSTM: This variant processes the input sequence in both


forward and backward directions, allowing the model to consider future
context as well as past context.
2. GRU (Gated Recurrent Unit): A simplified version of LSTM with fewer
gates, often used for tasks requiring faster computation or fewer
parameters.

Summary:

● LSTMs are a specialized type of RNN that addresses the vanishing


gradient problem and is particularly useful for sequence-based tasks.
● They contain a memory cell with gates (forget, input, and output) that
help maintain long-term dependencies in data.
● LSTMs are widely used in NLP, time-series prediction, and speech
processing, although they can be computationally expensive and still
struggle with very long sequences.

Recurrent Neural Network (RNN)

A Recurrent Neural Network (RNN) is a type of neural network architecture
designed for sequential data or time-series data. Unlike feedforward neural
networks, which process inputs independently, RNNs maintain a "memory" of
previous inputs, allowing them to capture temporal dependencies in
sequential tasks. RNNs are widely used in applications like natural language
processing (NLP), time-series forecasting, speech recognition, and more.

Key Features of RNNs:

1. Sequential Processing:

○ RNNs are designed to process data sequentially, one element at a
time, while maintaining a hidden state that captures information
from previous time steps. This makes RNNs ideal for tasks where
the order of inputs matters (e.g., in sentences or time-series data).
2. Shared Weights:

○ In RNNs, the same set of weights is used across all time steps,
which helps the network generalize across different parts of the
sequence and reduces the number of parameters compared to
fully connected layers for each time step.
3. Hidden State:

○ The hidden state (h_t) of the RNN carries information from
previous time steps and is updated at each time step based on
the current input and the previous hidden state. This gives the
RNN a form of memory.

Basic RNN Architecture:

1. Input Sequence:

○ An input sequence x_1, x_2, \dots, x_T is fed into
the RNN one element at a time.
2. Hidden State Update:

○ At each time step t, the RNN receives the current input x_t
and the previous hidden state h_{t-1}. The hidden state is
updated using a combination of the current input and the previous
hidden state.
○ The hidden state update is typically computed as:
h_t = \tanh(W_h \cdot [h_{t-1}, x_t] + b_h)
Where:
■ h_t is the hidden state at time step t.
■ W_h is the weight matrix for the hidden layer.
■ b_h is the bias term.
■ \tanh is a common activation function.
3. Output Layer:
○ At each time step, the RNN can produce an output y_t based
on the hidden state: y_t = W_y \cdot h_t + b_y
Where:
■ y_t is the output at time step t.
■ W_y is the weight matrix for the output layer.
■ b_y is the output bias.
4. Backpropagation Through Time (BPTT):

○ RNNs are trained using a variant of backpropagation called
Backpropagation Through Time (BPTT). BPTT unfolds the
RNN across time steps and computes gradients for each time
step's hidden states and weights. The gradients are then used to
update the weights, similar to standard backpropagation in
feedforward networks.
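
The hidden-state and output equations above translate directly into code. Below is a minimal NumPy sketch of a vanilla RNN stepping through a toy sequence (random, untrained weights; illustrative only):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_h, b_h, W_y, b_y):
    """One vanilla RNN time step: update the hidden state, then emit an output."""
    h_t = np.tanh(W_h @ np.concatenate([h_prev, x_t]) + b_h)  # h_t = tanh(W_h . [h_{t-1}, x_t] + b_h)
    y_t = W_y @ h_t + b_y                                      # y_t = W_y . h_t + b_y
    return h_t, y_t

# Toy usage: hidden size 4, input size 3, output size 2, random (untrained) weights
rng = np.random.default_rng(0)
H, X, Y = 4, 3, 2
W_h = rng.standard_normal((H, H + X)) * 0.1
W_y = rng.standard_normal((Y, H)) * 0.1
b_h, b_y = np.zeros(H), np.zeros(Y)
h = np.zeros(H)
for x in rng.standard_normal((5, X)):   # a sequence of 5 input vectors, processed in order
    h, y = rnn_step(x, h, W_h, b_h, W_y, b_y)
```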

Advantages of RNNs:

1. Handling Sequential Data:

○ RNNs are capable of processing sequences of data, such as


sentences, speech, or time-series data, making them ideal for
many NLP and time-series tasks.
2. Parameter Sharing:

○ The same weights are used across all time steps, reducing the
number of parameters compared to other neural network
architectures like fully connected networks, leading to better
generalization and less overfitting.
3. Learning Temporal Dependencies:

○ RNNs can capture temporal dependencies and relationships


between elements in a sequence, which is useful in applications
like language modeling or speech recognition.

Limitations of RNNs:
1. Vanishing and Exploding Gradient Problems:

○ Vanishing Gradients: When training RNNs over long sequences,


the gradients can become very small, making it difficult for the
model to learn long-term dependencies. This is known as the
vanishing gradient problem.
○ Exploding Gradients: On the flip side, the gradients can also
grow exponentially, causing instability during training.
○ Both issues arise due to the repeated multiplication of gradients
during backpropagation through time.
2. Difficulty in Learning Long-Term Dependencies:

○ Although RNNs maintain hidden states over time, they still


struggle to capture dependencies over long sequences due to the
vanishing gradient problem.
3. Computationally Expensive:

○ RNNs are inherently slow to train because of the sequential


processing and the need to update the hidden state at each time
step. This makes parallelization difficult.

Variants of RNNs:

To address some of the limitations of standard RNNs, several advanced RNN


variants have been developed:

1. Long Short-Term Memory (LSTM):

○ LSTMs are a special kind of RNN designed to overcome the


vanishing gradient problem by introducing memory cells and
gates (forget, input, and output gates). LSTMs can capture
long-range dependencies better than traditional RNNs.
2. Gated Recurrent Unit (GRU):

○ GRUs are a simplified version of LSTMs, with fewer gates (update


and reset) but still capable of handling long-range dependencies.
They are computationally more efficient than LSTMs.
3. Bidirectional RNN (Bi-RNN):

○ In a bidirectional RNN, the sequence is processed in both forward


and backward directions. This allows the model to have access to
both past and future context, which is useful for tasks like speech
recognition and machine translation.
4. Deep RNNs:

○ Deep RNNs are networks with multiple stacked RNN layers. By


stacking multiple layers, the model can capture more complex
representations and features.

Applications of RNNs:

1. Natural Language Processing (NLP):

○ Language modeling and text generation.


○ Machine translation (e.g., translating text from one language to
another).
○ Named Entity Recognition (NER).
○ Sentiment analysis.
○ Part-of-speech tagging.
2. Speech Recognition:
○ Converting spoken language into text (e.g., Google
Speech-to-Text).
3. Time-Series Forecasting:

○ Stock price prediction.


○ Weather prediction.
○ Energy consumption forecasting.
4. Video Analysis:

○ Action recognition in video sequences.


○ Activity recognition from video data.
5. Music Generation:
○ RNNs can be used to generate music by learning the patterns in
musical sequences.

Summary:

● RNNs are a class of neural networks designed to handle sequential


data and capture temporal dependencies.
● They process inputs sequentially and maintain a hidden state to carry
information across time steps.
● Despite their ability to handle sequential data, RNNs suffer from the
vanishing gradient and exploding gradient problems, limiting their
ability to learn long-range dependencies.
● Variants like LSTMs and GRUs address these problems and are more
widely used in practical applications such as NLP, speech recognition,
and time-series forecasting.

A Retrieval-Augmented Generation (RAG) system is a combination of
retrieval-based and generation-based models, typically used to improve the
performance of natural language processing (NLP) tasks like question
answering, summarization, and document generation. It integrates external
knowledge or documents into the generation process, enhancing the model's
ability to generate accurate and contextually relevant text.

Here are the key steps involved in a RAG system:

1. Query Encoding

● Input: The system receives an input query or prompt. This could be a


question, a sentence, or a task-specific query.
● Processing: The query is encoded into a vector representation using a
pretrained encoder model, typically based on architectures like BERT
or T5.
● Purpose: This step transforms the input query into a form that can be
compared against a large corpus or knowledge base.

2. Document Retrieval

● Input: The encoded query vector.


● Processing: The query vector is used to retrieve relevant documents or
passages from an external database or corpus. This step often utilizes a
retrieval model like BM25, DPR (Dense Passage Retrieval), or other
similarity-based models.
● Purpose: The goal is to select a subset of documents or passages that
are relevant to the input query. The retrieved documents contain the
necessary context or external knowledge required for answering the
query.

3. Fusion of Retrieved Documents

● Input: A set of retrieved documents or passages.


● Processing: The retrieved documents are combined or "fused" together
with the query to form a more informative input for the generation
model. This step could involve concatenating the query with the
retrieved documents or encoding them together into a joint
representation.
● Purpose: This fusion step helps the system leverage external
knowledge to enhance the response generated in the next step. The
system essentially has a richer context to draw from.

4. Generation of Response

● Input: The query and the fused document representations.


● Processing: The combined input is passed to a generative model like
T5, GPT, or BART. This model generates a response based on the
information available in both the query and the retrieved documents.
● Purpose: The goal is for the generative model to produce a coherent,
informative, and relevant response by using both the query context and
the additional knowledge from the retrieved documents.

5. Output Response

● Output: The final generated response or answer to the query.


● Purpose: The output is typically a natural language sentence,
paragraph, or list of items that provides the answer to the query, drawing
on both internal learned knowledge and external retrieved content.

6. Feedback Loop (Optional)

● Input: The system’s response and any feedback (e.g., from users or
evaluations).
● Processing: This feedback can be used for fine-tuning the retrieval
and generation components of the RAG system. Feedback may involve
adjusting the retrieval model to improve document relevance or
fine-tuning the generative model to improve response quality.
● Purpose: To continually improve the system's performance over time,
making it more accurate and effective in responding to queries.

Summary of Key Steps:

1. Query Encoding: Convert the input query into a vector representation.


2. Document Retrieval: Use the query vector to retrieve relevant
documents from an external knowledge base.
3. Fusion: Combine the query with the retrieved documents to create a
rich context.
4. Response Generation: Generate a response using a generative model
based on the query and retrieved documents.
5. Output: Provide the final generated response.
6. Feedback Loop (Optional): Continuously improve the system based
on feedback.
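
A minimal sketch of these steps, using TF-IDF retrieval from scikit-learn and a placeholder generation call (the call_llm function is hypothetical and stands in for whatever generative model or API is used):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A tiny in-memory "knowledge base"
documents = [
    "The Transformer architecture relies entirely on self-attention.",
    "LSTMs use forget, input, and output gates to control memory.",
    "RAG combines a retriever with a generative language model.",
]

def retrieve(query, docs, k=2):
    """Steps 1-2: encode the query and rank documents by similarity."""
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(docs)
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [docs[i] for i in top]

def generate(query, context):
    """Steps 3-4: fuse query + retrieved context and call a generative model.
    call_llm is a hypothetical placeholder for whatever generation API or model you use."""
    prompt = ("Answer the question using the context.\n\nContext:\n"
              + "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:")
    return call_llm(prompt)  # hypothetical generator call

query = "How does a RAG system work?"
context = retrieve(query, documents)
# answer = generate(query, context)   # Step 5: final output
```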

Applications of RAG Systems:

● Question Answering (QA): Providing answers to specific queries by


retrieving relevant documents and generating a detailed response.
● Summarization: Creating summaries based on retrieved content,
ensuring the summary is contextually accurate.
● Knowledge Augmented Tasks: Enhancing NLP models by infusing
external knowledge into tasks like content creation, machine translation,
etc.

By integrating a retrieval step with a generation model, a RAG system can


significantly improve the quality and relevance of generated responses,
particularly for tasks that require external knowledge or factual accuracy.

Classification and Regression are two fundamental types of supervised
learning tasks in machine learning. Both involve predicting an output
based on input data, but the type of output they predict differs.

1. Classification

● Definition: Classification is a type of machine learning task where the


goal is to predict a discrete label or category for a given input.

● Output: The output of a classification model is a category or class


label. For example, a model might predict whether an email is spam or
not spam (binary classification), or predict the species of a flower based
on its characteristics (multi-class classification).

● Example Tasks:
○ Binary Classification: Predicting whether a patient has a
disease or not (Yes/No).
○ Multi-Class Classification: Predicting the type of an animal from
a set of options like "Cat," "Dog," or "Rabbit."
○ Multi-Label Classification: Predicting multiple categories for a
single input, e.g., categorizing a movie into genres like "Action"
and "Comedy."
● Algorithms: Common algorithms used in classification include:

○ Logistic Regression
○ Decision Trees
○ Random Forests
○ Support Vector Machines (SVM)
○ K-Nearest Neighbors (KNN)
○ Naive Bayes
○ Neural Networks
● Evaluation Metrics:

○ Accuracy: The percentage of correct predictions.


○ Precision, Recall, and F1 Score: Metrics to evaluate model
performance, especially in imbalanced datasets.
○ Confusion Matrix: A table used to describe the performance of a
classification model, showing true positives, false positives, true
negatives, and false negatives.

2. Regression

● Definition: Regression is a type of machine learning task where the


goal is to predict a continuous numerical value based on input
features.
● Output: The output of a regression model is a real-valued number. For
example, predicting the price of a house based on its features (size,
location, etc.), or forecasting the stock price in the future.
● Example Tasks:
○ Linear Regression: Predicting a continuous variable, like
predicting the temperature for the next day based on historical
data.
○ Polynomial Regression: Predicting a continuous variable with a
non-linear relationship, such as predicting sales based on
advertising spend.
○ Time Series Forecasting: Predicting future values based on past
observations (e.g., predicting the number of passengers on a bus
next month).
● Algorithms: Common algorithms used in regression include:
○ Linear Regression
○ Ridge and Lasso Regression
○ Decision Trees
○ Random Forests
○ Support Vector Machines (SVM)
○ K-Nearest Neighbors (KNN)
○ Neural Networks
● Evaluation Metrics:
○ Mean Squared Error (MSE): Measures the average squared
difference between predicted and actual values.
○ Root Mean Squared Error (RMSE): The square root of the MSE,
giving an error metric in the same units as the predicted values.
○ Mean Absolute Error (MAE): Measures the average absolute
difference between predicted and actual values.
○ R-squared (R²): A statistical measure of the proportion of
variance in the dependent variable that is predictable from the
independent variables.
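
A short scikit-learn sketch contrasting the two task types on toy data (default models, no tuning; it only illustrates that a classifier predicts a class label while a regressor predicts a number):

```python
from sklearn.datasets import load_iris, make_regression
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, mean_squared_error

# Classification: predict a discrete class label (iris species)
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))

# Regression: predict a continuous numerical value (synthetic target)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
reg = LinearRegression().fit(X_tr, y_tr)
print("MSE:", mean_squared_error(y_te, reg.predict(X_te)))
```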

Summary of Differences:

Aspect              | Classification                                            | Regression
--------------------|-----------------------------------------------------------|-------------------------------------------------------------------------
Output              | Discrete labels (categories)                              | Continuous numerical value
Type of Prediction  | Categorical class prediction                              | Continuous value prediction
Example Tasks       | Spam detection, image classification, sentiment analysis | Stock price prediction, house price prediction, temperature forecasting
Algorithms          | Logistic Regression, SVM, Decision Trees, Random Forests | Linear Regression, Decision Trees, SVM, Random Forests
Evaluation Metrics  | Accuracy, Precision, Recall, F1 Score                    | MSE, RMSE, MAE, R²

Both classification and regression are essential tasks in machine learning, and
selecting the appropriate one depends on whether the output is categorical or
continuous.

Generative AI Overview:

Generative AI is a subset of artificial intelligence that focuses on creating new
data that mimics or resembles the data it has been trained on. Unlike
traditional AI, which is typically focused on classification or prediction tasks,
generative AI can produce novel content, such as text, images, audio, and
even video, that is contextually relevant and original based on learned
patterns.

Key Concepts:

1. Learning from Data:

○ Generative AI models learn patterns, structures, and distributions


from a given set of training data. By understanding the underlying
features and relationships in this data, the models can generate
new samples that are similar to the original data.
2. Types of Generative Models:

○ Generative Adversarial Networks (GANs):


■ GANs consist of two neural networks: a generator and a
discriminator. The generator creates fake data, and the
discriminator tries to distinguish between real and fake data.
The two networks compete, leading the generator to
improve its ability to create realistic data.
■ Applications: Image generation, style transfer, deepfake
videos, and art generation.
○ Variational Autoencoders (VAEs):
■ VAEs are a type of generative model that learns to encode
input data into a compressed latent space and then
reconstructs it from that space. They focus on capturing the
probability distribution of the data and are often used for
tasks such as image generation and anomaly detection.
■ Applications: Image generation, denoising, and
representation learning.
3. Generative AI for Content Creation:

○ Generative AI can create various forms of content, including:


■ Text Generation: Writing articles, stories, and even code
using models like GPT (Generative Pretrained
Transformers).
■ Image Generation: Creating realistic images from scratch
or from text descriptions (e.g., DALL-E, MidJourney).
■ Music Composition: Composing original music based on a
given genre or style (e.g., OpenAI's Jukebox).
■ Video Creation: Generating realistic video clips or
manipulating video content, often leading to the creation of
deepfake videos.
4. Applications of Generative AI:

○ Art and Design: Creating original pieces of art, animation, and


graphic design.
○ Entertainment: Writing scripts, generating music, and creating
video game environments.
○ Healthcare: Generating synthetic medical data for research and
training AI systems.
○ Business and Marketing: Crafting personalized advertisements,
generating content for social media, and automating report
generation.
○ Deepfakes and Synthetic Media: Creating highly realistic
synthetic media, such as deepfake videos, which can mimic real
individuals or events.
5. Challenges and Ethical Considerations:

○ Data Bias: If the training data is biased, the generated content


may also inherit these biases.
○ Misuse: Generative AI can be used for creating misleading
content, such as fake news or malicious deepfakes.
○ Content Authenticity: As generative AI becomes more
advanced, distinguishing between real and generated content
becomes more difficult, raising questions about authenticity and
trust.

Examples of Generative AI Models:


1. GANs (Generative Adversarial Networks):

○ Used for tasks like generating realistic images or videos by having
the generator create content and the discriminator evaluate its
authenticity. Famous models include StyleGAN and BigGAN. (A
minimal training-loop sketch follows this list.)
2. VAEs (Variational Autoencoders):

○ Used for creating new data points that resemble a given dataset
by learning a probabilistic mapping between input data and a
latent space. Commonly used in image and speech synthesis.
3. Transformers (GPT, T5, BERT):

○ GPT-3 (Generative Pretrained Transformer-3) is a state-of-the-art


language model capable of generating human-like text based on
prompts. It can be used for writing essays, answering questions,
or generating creative content.
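
To make the generator/discriminator interplay concrete, the following is a minimal PyTorch sketch of a GAN training loop on toy 2-D data (tiny networks, no stabilization tricks; purely illustrative):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))               # generator: noise -> fake sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())  # discriminator: sample -> real/fake score
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 2) * 0.5 + 2.0          # toy "real" data clustered around (2, 2)
    fake = G(torch.randn(64, 16))                  # generator output from random noise

    # Train the discriminator: push real samples toward 1 and fakes toward 0
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # Train the generator: try to make the discriminator output 1 on fakes
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()
```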

Conclusion:

Generative AI is transforming multiple industries by enabling the creation of


new, meaningful content from data. While it holds great potential for creative
and practical applications, it also raises important ethical and societal
concerns regarding the authenticity and impact of the generated content.

Prompt Engineering

Prompt engineering refers to the process of designing and crafting effective
input prompts for language models, especially large language models (LLMs)
like GPT, to get optimal responses. Since LLMs rely heavily on the input
provided, prompt engineering is crucial in guiding the model to produce
accurate, relevant, and coherent output.
Key Aspects of Prompt Engineering:

1. Clarity and Specificity:

○ The prompt should be clear and specific about the task. Vague or
ambiguous instructions can lead to imprecise or irrelevant
responses.
○ Example: Instead of asking, "Tell me about climate change," you
might specify, "Explain the causes and effects of climate change
in simple terms."
2. Format and Structure:

○ The structure of the prompt plays a significant role in shaping the


output. Providing context or examples within the prompt helps the
model understand the desired output format.
○ Example: "Write a short poem about nature in the style of
Shakespeare."
3. Using Context or Background Information:

○ Adding relevant background information or context can improve


the quality of the response. This is particularly useful in tasks
requiring specific knowledge or domain expertise.
○ Example: "Considering recent research in AI ethics, explain the
potential societal impacts of machine learning."
4. Temperature and Output Control:

○ When interacting with LLMs, you can control the "temperature"


setting (if available). A lower temperature (e.g., 0.2) produces
more deterministic, conservative outputs, while a higher
temperature (e.g., 0.8) encourages more creative and diverse
outputs.
○ Example: "In 50 words or less, summarize the novel '1984' with
high creativity."
5. Iterative Refinement:
○ Often, the first prompt does not yield the desired result. Iteratively
refining the prompt by adjusting the phrasing or providing
feedback to the model can help improve the output.
○ Example: If the initial prompt “Give me a summary of the book" is
too vague, refining it to “Provide a concise summary of the book's
plot and key themes” might yield a better result.
6. Task Framing and Goal Definition:

○ The way you frame the task or define the goal can greatly
influence how the model responds. Different prompts can lead to
different styles, tones, or forms of response (e.g., conversational,
formal, concise, elaborate).
○ Example: For a task requiring explanation, you could say: "Explain
in simple terms," or for a more academic tone: "Discuss in-depth."

Common Techniques in Prompt Engineering:

1. Few-Shot Learning:

○ Providing a few examples in the prompt can guide the model on


how to respond. This helps in tasks where you want the model to
follow a specific pattern or format.
○ Example: "Translate the following English sentences to French:
■ ‘Hello, how are you?’
■ ‘What time is it?’
■ Now, translate: ‘I love learning new languages.’"
2. Zero-Shot Learning:

○ In some cases, the model can be prompted to perform a task


without any examples or prior training on that specific task. This is
called zero-shot learning and often works with highly general
prompts.
○ Example: "Translate 'Good morning' into Spanish."
3. Chain of Thought Prompting:
○ This technique involves asking the model to think through the
problem step-by-step, which can be particularly useful for tasks
that require reasoning.
○ Example: "What is 25 times 12? Explain the steps."
4. Prompt Tuning:

○ In certain applications, you may fine-tune a model with specific


prompts to adapt it to particular tasks or domains, improving its
performance on specialized tasks.
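
The first three techniques are largely a matter of how the prompt string itself is written. A small illustrative sketch (the call_llm function is a hypothetical placeholder for whatever model or API is used):

```python
# Few-shot prompt: show the model the desired input/output pattern before the real task.
few_shot_prompt = (
    "Translate English to French.\n"
    "English: Hello, how are you?\nFrench: Bonjour, comment allez-vous ?\n"
    "English: What time is it?\nFrench: Quelle heure est-il ?\n"
    "English: I love learning new languages.\nFrench:"
)

# Zero-shot prompt: no examples, just the instruction.
zero_shot_prompt = "Translate 'Good morning' into Spanish."

# Chain-of-thought prompt: ask for intermediate reasoning steps.
cot_prompt = "What is 25 times 12? Think step by step, then give the final answer."

# Sending the prompt depends on your model or API; call_llm is a hypothetical placeholder.
# answer = call_llm(few_shot_prompt, temperature=0.2)
```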

Applications of Prompt Engineering:

1. Text Generation:

○ Crafting prompts that yield specific forms of text generation, such


as stories, articles, or summaries.
2. Chatbots and Virtual Assistants:

○ Creating conversational prompts for generating responses that


are engaging and contextually accurate.
3. Question Answering:

○ Designing prompts that allow the model to extract and synthesize


information from structured or unstructured data.
4. Translation and Paraphrasing:

○ Using prompts to translate text or generate paraphrases based on


a given sentence or paragraph.
5. Content Filtering and Summarization:

○ Crafting prompts that guide the model to produce summarized


content or generate content based on certain constraints or
guidelines.
Summary:

Prompt engineering is an essential skill when working with language models.


It involves formulating precise, structured, and context-rich prompts that guide
the model to generate high-quality and relevant outputs. By refining the
prompt and experimenting with different techniques, users can leverage the
full potential of generative AI models for a wide range of applications.

Convolutional Neural Networks (CNNs)

Convolutional Neural Networks (CNNs) are a class of deep learning models primarily used for
analyzing visual data, such as images and videos. They are designed to automatically and
adaptively learn spatial hierarchies of features through convolutional layers.

Key Components of a CNN:

1. Convolutional Layer (Conv Layer):

○ This is the core building block of a CNN. It performs a convolution operation,
applying a set of filters (also called kernels) over the input image (or the output of
previous layers) to extract features such as edges, textures, and patterns.
○ The filters are small matrices that slide over the input image, performing a dot
product operation at each location to produce a feature map.
○ Stride determines how much the filter moves after each operation. A stride of 1
means the filter moves one pixel at a time, while a stride of 2 means it moves two
pixels at a time.
○ Padding is often used to ensure the spatial dimensions of the output are not reduced
excessively, especially when applying many layers.
2. ReLU Activation (Rectified Linear Unit):

○ After each convolutional operation, the output is passed through a non-linear
activation function like ReLU.
○ ReLU helps introduce non-linearity, enabling the network to learn more complex
patterns by activating only positive values and setting negative ones to zero.
○ Mathematically, \text{ReLU}(x) = \max(0, x).
3. Pooling Layer (Subsampling/Max Pooling):

○ Pooling layers are used to reduce the spatial dimensions (width and height) of the
input, decreasing computational complexity while retaining important information.
○ Max Pooling is the most common type of pooling, where the maximum value is
selected from a patch of the image.
○ Average Pooling is another method, where the average value is computed.
○ Pooling helps the model become more invariant to small translations and
distortions in the input image.
4. Fully Connected Layer (Dense Layer):

○ After passing through multiple convolutional and pooling layers, the output is
flattened into a 1D vector and passed through one or more fully connected layers.
○ These layers connect every neuron to every other neuron in the previous layer,
enabling the model to learn complex relationships between the features.
○ The fully connected layer typically ends with a softmax or sigmoid activation,
depending on the task (classification or regression).
5. Softmax / Sigmoid Output Layer:

○ For classification tasks, the final layer often uses the Softmax activation for
multi-class problems or Sigmoid for binary classification. These layers produce
probabilities that sum to 1 (Softmax) or output a probability between 0 and 1
(Sigmoid) for the respective classes.

Working of a CNN (Step-by-Step):

1. Input Image: The input is typically an image (e.g., 224x224 pixels with RGB channels).
2. Convolution: The input image is passed through several convolutional layers where filters
are applied to extract low-level features (edges, textures, etc.).
3. Activation (ReLU): After each convolution operation, the result is passed through the ReLU
activation function to introduce non-linearity.
4. Pooling: A pooling layer is applied to downsample the feature maps, reducing the spatial
dimensions.
5. Flattening: The 2D feature maps are flattened into a 1D vector to feed into fully connected
layers.
6. Fully Connected Layer: The flattened vector is passed through one or more dense layers to
learn high-level features and patterns.
7. Output: For classification tasks, the final layer uses Softmax or Sigmoid to output class
probabilities.
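
A minimal PyTorch sketch of this step-by-step flow (two convolution + pooling stages followed by a fully connected classifier; the layer sizes are arbitrary and chosen only for illustration):

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Conv -> ReLU -> MaxPool twice, then flatten and classify."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 3 RGB channels -> 16 feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),                               # halve the spatial dimensions
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)  # sized for 224x224 inputs

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)        # flatten feature maps into a 1-D vector per image
        return self.classifier(x)      # raw class scores (apply softmax for probabilities)

model = SimpleCNN()
out = model(torch.randn(1, 3, 224, 224))   # one fake 224x224 RGB image
print(out.shape)                            # torch.Size([1, 10])
```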

Advantages of CNNs:

1. Automatic Feature Extraction:

○ CNNs automatically learn the important features of the data (such as edges,
textures, and shapes) without manual feature engineering.
2. Parameter Sharing:
○ The same filter (kernel) is used across the entire image, reducing the number of
parameters and computational complexity compared to traditional fully connected
neural networks.
3. Translation Invariance:

○ Through pooling layers, CNNs are less sensitive to the exact location of features,
making them robust to translations (shifts) in the image.
4. Scalability:

○ CNNs can scale well with large datasets, making them suitable for real-world tasks
like image and video analysis.

Applications of CNNs:

1. Image Classification:

○ CNNs are widely used for tasks like object classification (e.g., recognizing whether
an image contains a cat or a dog).
2. Object Detection:

○ CNNs can be used to detect specific objects within an image (e.g., finding faces in a
photo).
3. Semantic Segmentation:

○ CNNs are used to assign a class label to each pixel in an image, which is essential
for tasks like medical image analysis.
4. Image Generation (e.g., GANs):

○ In combination with other models like Generative Adversarial Networks (GANs),
CNNs are used to generate realistic images from random noise.
5. Video Analysis:

○ CNNs are used to analyze videos, recognizing objects, detecting motion, or
performing activity recognition.

Summary:

Convolutional Neural Networks (CNNs) are a powerful and efficient class of neural networks
designed to handle visual data by automatically extracting spatial features and reducing the
complexity of the model. Through convolutional layers, pooling layers, and fully connected layers,
CNNs can recognize complex patterns and are widely used in image recognition, object detection,
and video analysis.
The RAG (Retrieval-Augmented Generation) pipeline is a hybrid approach that combines the
power of retrieval-based and generation-based models to answer queries more effectively. It
enhances the capabilities of language models by retrieving relevant information from external
documents or databases and using that information to generate more accurate and contextually
relevant responses.

How the RAG Pipeline Works:

The RAG pipeline consists of several steps:

1. Query Input:

● The process begins when a user submits a query or prompt that requires a response. This
could be a question, a task, or any form of text input.

2. Retrieval Phase:

● Retriever Model: The query is passed through a retrieval model (typically a Dense
Retriever or TF-IDF based retriever) that searches an external database or corpus to find
relevant documents or passages that may contain the information needed to answer the
query.

● The retriever returns a ranked list of documents, passages, or snippets relevant to the query.

Techniques for retrieval:

○ Dense Retrieval (using embeddings): The query and the documents in the database
are converted into vectors using pre-trained models (such as BERT or other
transformers). Cosine similarity or other distance metrics are used to rank documents
based on their similarity to the query.
○ Sparse Retrieval (using keyword matching): Simple keyword matching or vector
space models (like BM25) are used for document retrieval.

3. Document Ranking and Selection:

● After the retrieval phase, the documents are ranked based on their relevance to the query.
● Often, the top-k most relevant documents (e.g., the top 5 or 10) are selected for use in the
next step.
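
A small sketch of the retrieval and ranking phases using dense embeddings and cosine similarity. It assumes the sentence-transformers library and a small pre-trained encoder such as all-MiniLM-L6-v2; any embedding model could be substituted:

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed dependency

encoder = SentenceTransformer("all-MiniLM-L6-v2")       # any sentence encoder works here

corpus = [
    "Quantum error correction improves the reliability of quantum computers.",
    "The Transformer architecture relies on self-attention.",
    "BM25 is a sparse, keyword-based retrieval method.",
]
query = "What is the latest breakthrough in quantum computing?"

# Dense retrieval: embed the query and documents, then rank by cosine similarity
doc_emb = encoder.encode(corpus, normalize_embeddings=True)
q_emb = encoder.encode([query], normalize_embeddings=True)
scores = doc_emb @ q_emb[0]                 # cosine similarity, since the vectors are normalized
top_k = np.argsort(scores)[::-1][:2]        # keep the top-2 most relevant documents
print([corpus[i] for i in top_k])
```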

4. Generation Phase:

● Generator Model: A generative language model (usually a model like GPT or BART)
takes the query and the retrieved documents as context.
● The model uses the relevant passages retrieved by the retriever to generate a more
accurate, fluent, and coherent response. This step involves combining the retrieved
information with the model's internal knowledge (learned during training) to generate a final
response.

Key Steps in Generation:

○ The query is concatenated with the retrieved documents.


○ The generative model generates a response using the retrieved information as
context.
○ If necessary, the response is refined and filtered to ensure relevance and
correctness.

5. Final Output:

● The output from the generator model is the final answer to the query, which incorporates
both the model's knowledge and the relevant external information retrieved during the first
phase.
● This answer is then returned to the user.

Key Components of the RAG Pipeline:

● Retriever: Responsible for fetching relevant documents from an external knowledge base.
● Generator: Generates a natural language response based on the retrieved documents and
the query.
● Knowledge Base: A large collection of documents or a database from which the retriever
fetches information. This could be anything from a search engine index to a more structured
database.
● Fusion: In some RAG implementations, the retriever and generator models can be jointly
trained to optimize the interaction between retrieval and generation, improving the quality of
the final output.

Benefits of the RAG Pipeline:

1. Improved Knowledge Access: By augmenting the language model with external
information, the RAG model can answer questions about topics it may not have seen during
training.
2. Reduced Model Size: Since the retrieval component allows the model to access external
knowledge, the generative model doesn’t need to be as large as traditional LLMs that store
all knowledge internally.
3. Enhanced Accuracy: By retrieving relevant documents, the model can provide more
accurate, up-to-date, and context-specific answers, especially when handling rare or niche
queries.

Example Use Case:

Let’s say you want to ask a question about a recent scientific discovery:

1. Query: "What is the latest breakthrough in quantum computing?"

2. Retrieval:

○ The retriever model searches a large corpus (e.g., research papers, articles, or
scientific journals) to find documents related to quantum computing advancements.
○ It might return articles such as "Quantum Computing Breakthrough in 2023," "New
Quantum Algorithms" and "Recent Advances in Quantum Hardware."
3. Generation:

○ The generator model takes the retrieved documents and the query and generates a
response like: "The latest breakthrough in quantum computing involves the
development of a new quantum error-correcting code that promises to improve the
reliability of quantum computers, demonstrated by researchers at MIT in 2023."
4. Output: The generated response is returned to the user.

RAG Variants and Models:

● RAG-Token: A token-level version where the retrieval process is integrated into the
generation process at the token level. Each token generation step may depend on the
retrieved documents.
● RAG-Sequence: A sequence-level version where entire documents are retrieved and
provided as context for generating a sequence of tokens.

Conclusion:

The RAG pipeline is powerful for tasks that require answering questions or generating content based
on up-to-date or extensive external knowledge. By combining retrieval and generation, it enhances
the ability of language models to provide accurate and relevant answers while reducing the
dependency on vast internal knowledge storage.
Concept of "Context Window" in LLMs

The context window in large language models (LLMs) refers to the portion of the input text that the
model considers at any given time when generating predictions or responses. It represents the span
of text (such as words, tokens, or characters) that the model can "see" or "attend to" during the
process of understanding or generating language.

Key Aspects of the Context Window:

1. Fixed Size: The context window has a fixed size, typically measured in terms of the number
of tokens (words or subwords) the model can process simultaneously. For example, GPT-3
has a context window of 2048 tokens, while GPT-4 has a much larger one (up to 32,768
tokens in some cases). Once this window is exceeded, the model can no longer attend to
earlier tokens unless they are within the current window.

2. Token Representation: Each token in the context window corresponds to a unit of meaning
(e.g., a word or subword), and the model processes these tokens in parallel to understand
and generate text. The context window ensures the model can take in surrounding tokens to
generate coherent responses.

3. Sliding Window: In some models, the context window can slide as new tokens are
processed. Once a certain number of tokens are consumed or generated, older tokens fall
outside the window, and new tokens are incorporated.
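
A simple sketch of this sliding/truncation idea: keep only the most recent messages that fit within a fixed token budget. The token-counting function below is a crude stand-in; a real system would count tokens with the model's own tokenizer:

```python
def fit_to_context_window(messages, max_tokens, count_tokens):
    """Keep only the most recent messages that fit in the model's context window.

    count_tokens is whatever tokenizer-based counting function your model provides;
    it is passed in here because token counting is model-specific.
    """
    kept, used = [], 0
    for message in reversed(messages):          # walk backwards from the newest message
        tokens = count_tokens(message)
        if used + tokens > max_tokens:
            break                               # older messages fall outside the window
        kept.append(message)
        used += tokens
    return list(reversed(kept))                 # restore chronological order

# Toy usage with a crude whitespace "tokenizer"
history = ["Hi!", "Tell me about LSTMs.", "They use gates.", "And context windows?"]
print(fit_to_context_window(history, max_tokens=8, count_tokens=lambda m: len(m.split())))
```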

Significance of the Context Window

1. Model's Memory Limitation:

○ The size of the context window directly limits the amount of information the model
can process at once. If the context window is too small, the model may miss
important dependencies from earlier parts of the text. If the window is large, the
computational cost increases.
○ This limitation becomes evident in tasks that require understanding long documents,
maintaining coherence over long dialogues, or recalling information from earlier in a
conversation or text.
2. Handling Long-Range Dependencies:

○ For LLMs, handling long-range dependencies (i.e., connections between words or


concepts that are far apart) is crucial. A small context window might lead to
difficulties in maintaining these dependencies, especially in tasks such as
summarization, translation, or question-answering where context matters.
○ Larger context windows allow the model to maintain longer-range dependencies and
improve the quality of its output by utilizing information from a broader span of text.
3. Efficiency vs. Performance:

○ Larger context windows generally lead to better performance in tasks requiring


long-term coherence, but they also demand more computational resources. Models
with larger context windows need more memory and processing power.
○ As a result, there's a trade-off between performance (in terms of understanding or
generating long texts) and the computational cost of processing larger context
windows.
4. Fine-tuning:

○ When fine-tuning an LLM on a specific task, the context window can affect how the
model generalizes to tasks requiring long-form reasoning. A task involving multiple
steps or long conversations can benefit from a large context window, ensuring that
the model retains relevant information throughout the process.
5. Context Window in Chatbots and Conversational Models:

○ In conversational models, the context window is vital for keeping track of the
conversation history. A larger context window allows the model to consider earlier
parts of the conversation when generating the next response, leading to more
coherent and contextually relevant interactions.

Challenges with the Context Window:

1. Out-of-Window Information:

○ As a model processes a fixed-size context window, it may lose access to information


outside the window. In cases of long text or multi-turn conversations, earlier parts of
the input may not be accessible, leading to potential gaps in understanding.
2. Attention Mechanisms:

○ Attention mechanisms in models like transformers determine which tokens in the


context window are most important to focus on. In the case of larger windows, the
model needs to allocate resources efficiently to avoid overburdening itself with
irrelevant information.
3. Memory Augmented Models:

○ Researchers are exploring ways to extend the concept of the context window through
memory-augmented networks or retrieval-augmented generation (RAG) models.
These approaches try to address the limitations of fixed context windows by
introducing external memory or external retrieval systems to maintain access to more
extensive information.
Conclusion:

The context window in LLMs plays a critical role in determining how much of the input text the model
can "remember" and use for generating accurate and coherent outputs. Larger context windows
allow models to handle more information at once, improving performance in tasks that require
long-term dependencies, but at the cost of greater computational demands. Balancing window size
with efficiency and accuracy is crucial for optimizing the performance of LLMs.

What is fine-tuning in the context of LLMs, and why is it important?

Fine-tuning in the context of LLMs involves taking a pre-trained model and further training it on a
smaller, task-specific dataset. This process helps the model adapt its general language
understanding to the nuances of the specific application, thereby improving performance.

This is an important technique because it leverages the broad language knowledge acquired during
pre-training while modifying the model to perform well on specific applications, such as sentiment
analysis, text summarization, or question-answering.

How do LLMs handle out-of-vocabulary (OOV) words or tokens?

LLMs handle out-of-vocabulary (OOV) words or tokens using techniques like subword tokenization
(e.g., Byte Pair Encoding or BPE, and WordPiece). These techniques break down unknown words
into smaller, known subword units that the model can process.

This approach ensures that even if a word is not seen during training, the model can still understand
and generate text based on its constituent parts, improving flexibility and robustness.
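
For example, a WordPiece tokenizer loaded through the Hugging Face transformers library (an assumed dependency; any subword tokenizer behaves similarly) splits an unusual word into known pieces rather than discarding it:

```python
from transformers import AutoTokenizer  # assumes the Hugging Face transformers library is installed

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # WordPiece tokenizer

# A rare or unseen word is split into known subword units rather than mapped to an unknown token
print(tokenizer.tokenize("tokenization"))       # e.g. ['token', '##ization'], depending on the vocabulary
print(tokenizer.tokenize("hyperparameterize"))  # broken into several '##'-prefixed pieces
```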

What are embedding layers, and why are they important in LLMs?

Embedding layers are a significant component in LLMs used to convert categorical data, such as
words, into dense vector representations. These embeddings capture semantic relationships
between words by representing them in a continuous vector space where similar words exhibit
stronger proximity. The importance of embedding layers in LLMs includes:

● Dimensionality reduction: They reduce the dimensionality of the input
data, making it more manageable for the model to process.
● Semantic understanding: Embeddings capture nuanced semantic
meanings and relationships between words, enhancing the model's
ability to understand and generate human-like text.
● Transfer learning: Pre-trained embeddings can be used across
different models and tasks, providing a solid foundation of language
understanding that can be fine-tuned for specific applications.
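
A minimal PyTorch sketch of an embedding layer (randomly initialized here, so the similarity score is meaningless until the layer is trained or loaded with pre-trained vectors):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, embedding_dim = 10_000, 128
embedding = nn.Embedding(vocab_size, embedding_dim)   # lookup table: token id -> dense vector

token_ids = torch.tensor([[12, 47, 901]])             # a batch containing one 3-token sequence
vectors = embedding(token_ids)                        # shape: (1, 3, 128)

# Semantic similarity between two (currently untrained) word vectors
sim = F.cosine_similarity(vectors[0, 0], vectors[0, 1], dim=0)
print(vectors.shape, sim.item())
```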

How do you measure the performance of an LLM?

Researchers and practitioners have developed numerous evaluation metrics to gauge the
performance of an LLM. Common metrics include:

● Perplexity: Measures how well the model predicts a sample, commonly
used in language modeling tasks.
● Accuracy: Used for tasks like text classification to measure the
proportion of correct predictions.
● F1 Score: A harmonic mean of precision and recall, used for tasks like
named entity recognition.
● BLEU (Bilingual Evaluation Understudy) score: Measures the quality
of machine-generated text against reference translations, commonly
used in machine translation.
● ROUGE (Recall-Oriented Understudy for Gisting Evaluation): A set
of metrics that evaluate the overlap between generated text and
reference text, often used in summarization tasks. They help quantify
the model's effectiveness and guide further improvements.
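
As a concrete example of the first metric, perplexity can be computed from the probabilities the model assigned to the tokens it actually observed (toy numbers below; real values come from the model's softmax outputs):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(average negative log-probability of the observed tokens).

    token_probs are the probabilities the model assigned to each actual next token;
    in practice these come from the model's softmax outputs over a held-out text.
    """
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

print(perplexity([0.25, 0.25, 0.25, 0.25]))  # uniform over 4 choices -> perplexity 4.0
print(perplexity([0.9, 0.8, 0.95]))          # a confident model -> perplexity close to 1
```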

What are some approaches to reduce the computational cost of LLMs?

To reduce the computational cost of LLMs, we can employ:

● Model pruning: Removing less important weights or neurons from the
model to reduce its size and computational requirements.
● Quantization: Converting the model weights from higher precision
(e.g., 32-bit floating-point) to lower precision (e.g., 8-bit integer), which
reduces memory usage and speeds up inference.
● Distillation: Training a smaller model (student) to mimic the behavior of
a larger, pre-trained model (teacher) to achieve similar performance with
fewer resources.
● Sparse attention: Using techniques like sparse transformers to limit the
attention mechanism to a subset of tokens, reducing the computational load.
● Efficient architectures: Developing and using efficient model
architectures specifically designed to minimize computational demands
while maintaining performance, such as the Reformer or Longformer.
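
As one concrete example, PyTorch's post-training dynamic quantization stores Linear-layer weights as 8-bit integers (a minimal sketch of the quantization approach only; pruning, distillation, and sparse attention require more involved changes):

```python
import torch
import torch.nn as nn

# A toy model standing in for a much larger network
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Post-training dynamic quantization: Linear weights are stored as 8-bit integers
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)   # same interface as before, with smaller weights and faster CPU inference
```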

How can you incorporate external knowledge into an LLM?

Incorporating external knowledge into an LLM can be achieved through several methods:

● Knowledge graph integration: Augmenting the model's input with


information from structured knowledge graphs to provide contextual
information.
● Retrieval-Augmented Generation (RAG): Combines retrieval methods
with generative models to fetch relevant information from external
sources during text generation.
● Fine-tuning with domain-specific data: Training the model on
additional datasets that contain the required knowledge to specialize it
for specific tasks or domains.
● Prompt engineering: Designing prompts that guide the model to utilize
external knowledge effectively during inference.

How do you evaluate the effectiveness of a prompt?

Evaluating the effectiveness of a prompt involves:

● Output quality: Assessing the relevance, coherence, and accuracy of


the model's responses.
● Consistency: Checking if the model consistently produces high-quality
outputs across different inputs.
● Task-specific metrics: Using task-specific evaluation metrics, such as
BLEU for translation or ROUGE for summarization, to measure
performance.
● Human evaluation: Involving human reviewers to provide qualitative
feedback on the model's outputs.
● A/B testing: Comparing different prompts to determine which one yields
better performance.
