Demystifying Large Language Models: Unraveling the Mysteries of Language Transformer Models, Build from Ground up, Pre-train, Fine-tune and Deployment
By James Chen
About this ebook
This book is a comprehensive guide aiming to demystify the world of transformers -- the architecture that powers Large Language Models (LLMs) like GPT and BERT. From PyTorch basics and mathematical foundations to implementing a Transformer from scratch, you'll gain a deep understanding of the inner workings of these models.
1. Introduction
Our world is becoming smarter each day thanks to something called Artificial Intelligence, or AI for short. This field of technology has moved from being a future concept to a tangible reality, and it is infusing and changing many parts of our lives. This book is an invitation to explore the most exciting parts of this bright new world.
It all started with an idea called Machine Learning (ML). It's like teaching a computer to learn from data, just as we learn from our experiences. A lot of the tech magic we see today, like autonomous driving, voice assistants, or email filters, would not be possible without it.
Then came Deep Learning (DL), a special kind of Machine Learning. It imitates how our brain works to help computers recognize patterns and make predictions.
Taking a closer look at Deep Learning, we find something called Language Models. In particular, Generative AI and Large Language Models (LLMs) hold a unique place: they can create text that looks as if it were written by a human, which is really exciting!
At the heart of these changes are Transformer models, designed to work with language in unique and powerful ways. The magic of the Transformer model is its incredible ability to understand language context, which makes it perfect for tasks like language translation, text summarization, sentiment analysis, and conversational chatbots like ChatGPT, where the Transformer works as the backbone. This is the main topic of this book.
To explore this amazing world, from AI to language models, there are some tools that experts love to use; two of them are Python and PyTorch.
Python is a programming language that many people love, because it's easy to read, write and understand. It's like the friendly neighborhood of programming languages. Plus, it has a lot of extra libraries and packages that are specifically designed for Machine Learning, Deep Learning and AI. This makes Python a favorite for many people in these fields.
One of these libraries is PyTorch, which is like a big cabinet filled with useful tools just for Machine Learning and Deep Learning. It makes creating and training models, like Transformer models, much easier and simpler.
When we're working on such complex tasks like training a language model, we want tools that make our work easier and faster. This is exactly what Python and PyTorch offer. They help streamline complex tasks so we can spend more time on achieving our goals and making progress.
Therefore, this book is all about taking this exciting journey from the big world of AI to the specialized area of Transformer models, and this book will use Python and PyTorch to help you learn how to build, train, and fine-tune transformer models.
Welcome aboard and get ready to learn about how these technologies are helping to shape our future.
1.1.
What is AI, ML, DL, Generative AI and Large Language Model
AI, ML, and DL, etc. — you've likely seen these terms thrown around a lot. They shape the core of the rapidly evolving tech industry, but what exactly do they mean and how are they interconnected?
Let's clarify. As a very high-level overview, shown in Figure 1.1, Artificial Intelligence (AI) includes Machine Learning (ML), which in turn includes Deep Learning (DL). Generative AI is a subset of Deep Learning, and the Large Language Model sits inside Generative AI. Generative AI also includes other approaches, such as the Generative Adversarial Network (GAN).

Figure 1.1 AI, ML, DL, Generative AI and Large Language Model
Artificial Intelligence (AI)
Artificial intelligence is about creating machines and applications that can imitate human perceptions and behaviors; it mimics human cognitive functions such as learning, thinking, planning, and problem solving. AI machines and applications learn from data collected from a variety of sources to improve the way they mimic humans. The fundamental objective of AI is to create systems that can perform tasks that usually require human intelligence, including problem-solving, understanding natural human language, recognizing patterns, and making decisions. AI acts as the umbrella term under which ML and DL fall.
Examples of artificial intelligence include autonomous vehicles like Google's Waymo self-driving cars, machine translation like Google Translate, and chatbots like ChatGPT by OpenAI. It is widely used in areas such as image recognition and classification, facial recognition, natural language processing, speech recognition, and computer vision.
Machine Learning (ML)
Machine learning, an approach to achieving artificial intelligence, refers to computer programs that use mathematical algorithms and data analytics to build computational models and make predictions in order to solve business problems.
ML is based on the concept that systems can learn from data, identify patterns, and make decisions with minimal human intervention. ML algorithms are trained on a set of data (called a training set) to create a model. When new data inputs come in, the model then makes predictions or decisions without being explicitly programmed to execute those tasks.
Unlike traditional computer programs, where routines are predefined with specific instructions for specific tasks, machine learning uses mathematical algorithms to analyze and parse large amounts of data, learn patterns from the data, and make predictions and determinations.
Deep Learning (DL)
Deep learning, a subset of machine learning, uses neural networks to learn in the same, or a similar, way as humans. These neural networks, for example the artificial neural network, consist of many neurons that imitate the function of neurons in a biological brain.
Deep learning is more complicated and advanced than classical machine learning: the latter might use algorithms as simple as linear regression to build models and might learn from relatively small sets of data. Deep learning, on the other hand, organizes many neurons in multiple layers; each neuron takes input from other neurons, performs a calculation, and passes its output to the next neurons. Deep learning also requires relatively larger sets of data.
In recent years, hardware has been developed with more and more computational power. In particular, graphics processing units (GPUs), which were originally designed to accelerate graphics rendering, can significantly speed up the computations of deep learning. They are now an essential part of deep learning, and new types of GPUs are developed exclusively for deep learning purposes.
Generative AI
Generative AI is a type of artificial intelligence system that has the capability to generate various forms of content or data that are similar to, but not the same as, the input data it was trained on. Generative AI is a subset of Deep Learning (DL), meaning it uses deep learning techniques to build and train models, understand the input data, and finally generate synthetic data that mimic the training data.
It can generate a variety of content, such as images, videos, text, audio, music, and so on.
My book "Machine Learning and Deep Learning With Python" (ISBN: 978-1-7389084-0-0, 2023), or [3] in the References section at the end of this book, introduced the Generative Adversarial Network (GAN), a typical type of generative AI. A GAN consists of two neural networks, a generator and a discriminator, which are trained simultaneously through adversarial training. The generator produces new synthetic images, while the discriminator evaluates whether an image is real or fake. Through the iterative training process, the generator learns to create synthetic images that are close enough to the original training data. That book also includes a hands-on example of how to implement GANs with Python and the TensorFlow library.
Large Language Model (LLM)
The Large Language Model is a subset of Generative AI; it refers to artificial intelligence systems that are able to understand and generate human-like language. LLMs are trained on vast amounts of textual data to learn the patterns, grammar, and semantics of human language; this huge amount of text may be collected from the internet, books, newspapers, and other sources. In most cases, extensive computational resources are required to perform training on such huge amounts of data, so graphics processing units (GPUs) are widely used for training LLMs.
There are some popular LLMs available as of today, including but not limited to:
GPT-3 and GPT-4: developed by OpenAI; they can perform a wide range of natural language processing tasks.
BERT (Bidirectional Encoder Representations from Transformers): developed by Google.
FLAN-T5 (Fine-tuned LAnguage Net, Text-To-Text Transfer Transformer): also developed by Google.
BloombergGPT: developed by Bloomberg, focused on the language and terminology of the financial industry.
The Large Language Model (LLM) is the focus of this book.
1.2.
Lifecycle of Large Language Models
When an organization decides to implement Large Language Models (LLMs), there is a typical process that spans planning, development, integration, and maintenance throughout the lifecycle of the LLMs. It is a comprehensive process encompassing various stages, each crucial for the successful development, deployment, and utilization of these powerful AI systems, as shown in Figure 1.2.
1. Objective Definition and Feasibility Study:
The organization should define clear goals for what it wants to achieve with the LLMs, identify the requirements, and understand the capabilities LLMs could provide.
The organization should also conduct feasibility research to analyze the technical requirements and the potential return on investment (ROI), examine the available computational resources, data privacy policies, and whether the chosen LLMs can be effectively integrated into current infrastructures.

Figure 1.2 Lifecycle of LLMs
2. Data Acquisition and Preparation:
The organization should collect a large, diverse, and representative dataset and pre-process it, which includes cleaning, annotating, or augmenting the data. This step is very important to ensure the data quality, diversity, and volume needed to train or fine-tune the model.
3a. Choose Existing Models:
The organization should understand the cost structure of using different LLMs and consider the total cost of ownership over the lifespan of the LLMs. Section 4.6 of this book introduces some of the most popular LLMs in the industry; by reviewing its goals and requirements, the organization should be able to select a pre-trained LLM that best suits its specific needs.
3b. Pre-training a Model:
Alternatively, if an organization has very specific requirements and goals that cannot be addressed by existing LLMs, it might decide to pre-train an LLM from scratch on its own. In that case, it should be prepared to invest significant resources and follow a structured process. Completing this process successfully requires careful planning and a significant commitment of resources, not only hardware but also talent.
Chapter 4 of this book goes through the steps of pre-training an LLM on a machine translation task as a hands-on practice.
4. Evaluation:
After pre-training the model, or selecting an existing pre-trained model, the organization should evaluate the model's performance using validation datasets and identify the areas that need improvement.
5. Prompt Engineering, Fine-tuning and Human Feedback
There are a few ways to adapt the model, including Prompt Engineering, Fine-tuning, and Human Feedback; they are used together to make the LLM perform as desired.
Prompt engineering is the practice of crafting input prompts to effectively communicate with the model and derive the desired outputs. It will be introduced later in this book.
Fine-tuning is a process that follows the pre-training of an LLM, in which the model is further trained on task-specific datasets. It is supervised learning and allows the model to specialize in tasks relevant to the organization's needs.
As the model becomes more capable, it is very important to ensure it behaves well and in a way that aligns with human preferences, through reinforcement learning from human feedback.
6. Monitoring and Evaluation
It is important to perform regular evaluations of the model during the fine-tuning phase, monitoring and testing it on various benchmarks and against established metrics to ensure it meets the desired criteria. Chapter 5 will introduce a variety of benchmarks and metrics for evaluating LLMs.
7. Deployment
After the LLMs are confirmed to work as desired, deploy them into production on the corporate infrastructure, where they can be accessed for user acceptance testing. The deployment of LLMs is a complex and multifaceted process that requires careful consideration of various factors; Chapter 6 discusses the considerations and strategies for deployment.
8. Compliance and Ethics Review
In order not to expose the organization to legal or reputational risks, make sure to conduct periodic reviews and assessments to ensure the LLMs comply with all relevant regulations, industry standards, corporate policies and ethical guidelines, especially with regard to data privacy and security. Chapter 6 also discusses this topic.
9. Build LLM powered applications
After implementing an LLM, the organization might consider building LLM-powered applications to leverage its capabilities and enhance products, services, or internal processes. These applications may automate natural-language tasks such as customer service inquiries, enhance productivity by providing tools for summarization and information retrieval, or improve the user experience by providing human-like interactions with personalized, conversational AI. Chapter 6 will discuss this together with some practical examples.
10. User Training and Documentation
Provide comprehensive documentation and train end-users on how to interact effectively with the LLMs and the LLM-powered applications.
In conclusion, the lifecycle of LLMs is a multifaceted and iterative process that requires careful planning, execution, and continuous monitoring. By adhering to best practices and prioritizing a wide array of considerations, organizations can harness the power of LLMs while mitigating potential risks and ensuring responsible and trustworthy AI development.
1.3.
Whom This Book Is For
This book is a treasure for anyone who is interested in learning about language models. It is written for people with different levels of programming experience, whether you're just starting out or already experienced. Whether you're taking your first steps into this fascinating world or looking to deepen your understanding of AI and language models, you will benefit from this book, which is a great resource for everyone on their learning journey.
If you're a beginner, don't worry! This book is designed to guide you from the basics, like Python and PyTorch, all the way to complex topics, like the Transformer models. You will start your journey with the fundamentals of machine learning and deep learning, and gradually explore the more exciting ends of the spectrum.
If you already have some experience, that's great too! Even those with a good understanding of machine learning and deep learning will find a lot to learn here. The book delves into the complexities of the Transformer architecture, making it a good fit for those ready to expand their knowledge.
This book also serves as a companion guide to the mathematical concepts underlying the Large Language Models (LLMs). These background concepts are essential for understanding how models function and their inner workings. As we journey through this book, you'll gain a deeper appreciation of Linear Algebra, Probability, and Statistics, among other key concepts. This book simplifies these concepts and techniques, making them accessible and understandable regardless of your math background.
By humanizing those mathematical expressions and equations used in the Large Language Models, this book will lead you on a path towards mastering the craft of building and using large language models. This makes the book not only a tutorial for Python, PyTorch and LLMs, but also a friendly guide to the intimidating world of mathematical concepts.
So, whether you're math-savvy or just a beginner, this book will meet you within your comfort zone. It's not just about coding models, but about understanding them and, in the process, advancing your knowledge of the theory that empowers ML and AI.
1.4.
How This Book Is Organized
This book is designed to provide a comprehensive guide to understanding and working with large language models (LLMs). It is structured in a way that gradually builds your knowledge and skills, starting from the fundamental concepts and progressing towards more advanced topics and practical implementations.
Before diving into the intricacies of LLMs, Chapter 2 establishes a solid foundation in PyTorch, the popular deep learning framework used throughout the book. It also covers the essential mathematical concepts and operations that underpin the implementation of LLMs. This chapter is the foundation upon which everything else in this book will be built.
Chapter 3 delves into the Transformer architecture -- the heart of LLMs. It explores the various components of the Transformer, such as self-attention mechanisms, feed-forward networks, and positional encoding. This chapter is a practical guide to constructing a Transformer from the ground up, with code examples using PyTorch; you will gain hands-on experience and insights into the mechanics of self-attention and positional encoding, among other fundamental concepts.
Pre-training is a crucial step in the development of LLMs. In Chapter 4 we explore the methodologies used to teach LLMs the subtleties of language, and provide you with the theoretical framework and example code to pre-train a Transformer model. You'll gain hands-on experience by pre-training a Transformer model from scratch using PyTorch.
Once an LLM is pre-trained, the next step is to fine-tune it for specific tasks. Chapter 5 covers traditional full fine-tuning methods, as well as more recent innovative techniques like Parameter Efficient Fine-tuning (PEFT) and Low-Rank Adaptation (LoRA). By the end of this chapter, you can expect to have a toolkit of techniques to implement these fine-tuning approaches using PyTorch code examples.
Bringing theory into reality, Chapter 6 focuses on deploying LLMs effectively and efficiently. You will explore various deployment scenarios, considerations for production environments, and methods to serve your fine-tuned models to end-users. This chapter is about crossing the bridge from experimental to practical, ensuring your LLM can operate robustly in the real world.
As you progress through the chapters of this book, you'll find a balance of theory and application, including code examples, practical exercises, and real-world use cases to reinforce your understanding of LLMs. Whether you're a beginner or an experienced practitioner in the field of natural language processing (NLP), this book aims to provide a comprehensive and practical guide to demystifying large language models (LLMs).
1.5.
Source Code and Resources
This book is more than just an informational guide; it's a hands-on manual designed to offer practical experience. To make this learning journey effective and interactive, we've made all the source code in this book available on GitHub:
https://github.com/jchen8000/DemystifyingLLMs.git
This repository contains a dedicated folder for each chapter, allowing you to easily navigate and access the relevant code examples. This includes PyTorch code examples, implementations of the Transformer architecture, pre-training and fine-tuning scripts, a simple chatbot, and more.
By cloning or downloading this repository, you can easily replicate, experiment, or build upon the examples and exercises provided in this book. The aim is to provide a comprehensive learning experience that brings you closer to the state-of-the-art in large language models.
Within each chapter's folder, you'll find well-documented and organized files that correspond to the code snippets and examples discussed in the book. These files are designed to be self-contained, ensuring that you can run them independently or integrate them into your own projects.
All source code provided with this book is designed to run effortlessly in Google Colab or similar cloud-based Jupyter notebook services. This greatly simplifies the setup process, freeing you from the typical headaches of configuring a local development environment and allowing you to focus your energy on the heart of the book—the Large Language Models. The code examples were tested and working in the Google Colab environment at the time of writing; a free plan with a single GPU is all you need to run the code.
In addition to the source code, this book references a collection of high-quality scholarly articles, white papers, technical blogs, and academic artefacts as its backbone. For ease of reference and to enable further in-depth exploration of specific topics, all these resources are listed in the References section towards the end of the book. These resources serve as extended reading materials for you to deepen your understanding and gain more insights into the exciting world of large language models.
Leverage these resources, explore the references, experiment with the code, and embrace the fantastic journey of unraveling the mysteries of large language models (LLMs)!
2. PyTorch Basics and Math Fundamentals
PyTorch is an open-source machine learning library developed by Facebook's AI Research lab (FAIR) and first officially released in October 2016. The original Torch library was primarily designed for numerical and scientific computing, but it gained popularity in the machine learning community due to its efficient tensor operations and automatic differentiation capabilities, which laid the foundation for PyTorch. PyTorch addressed some limitations of the Torch framework and provided more functionality for machine learning and neural networks. It is now widely used for deep learning and artificial intelligence applications.
In this book PyTorch is used as the primary tool to explore the world of Large Language Models (LLMs). This chapter will introduce some basics of PyTorch, including tensors, operations, optimizers, autograd, and neural networks. PyTorch allows users to perform calculations on Graphics Processing Units (GPUs); this support is important for speeding up deep learning training and inference, especially when dealing with large language models, where huge datasets and complex models are involved. This chapter will cover this aspect as well.
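As a quick illustration of this GPU support, here is a minimal sketch (not taken from the book's repository; the variable names are illustrative) of checking for a GPU and moving a tensor onto it:

import torch

# Use a GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([1.0, 2.0, 3.0])
x = x.to(device)      # move the tensor onto the selected device
print(x.device)       # e.g. cuda:0 on a GPU runtime, cpu otherwise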
The Large Language Models (LLMs) are built on various mathematical fundamentals, including concepts from linear algebra, calculus, and probability theory. Understanding these fundamentals is crucial for developing, training, and fine-tuning large language models, which include complex architectures and sophisticated training procedures. A solid foundation in these mathematical concepts is essential in the field of natural language processing (NLP) and artificial intelligence (AI).
But don't be scared; this chapter will introduce the key mathematical concepts from the very basics and focus on implementing them using PyTorch.
2.1.
Tensor and Vector
In PyTorch, a tensor is a multi-dimensional array, a fundamental data structure for representing and manipulating data. Tensors are similar to NumPy arrays and are the basic building blocks used for constructing neural networks and performing various mathematical operations in PyTorch. Tensors are most often used to represent vectors and matrices.
This section introduces some commonly used PyTorch tensor-related functions together with their mathematical concepts. These are very basic operations for deep learning and Large Language Model (LLM) projects, and they are used throughout this book.
A vector, in linear algebra, represents an object with both magnitude and direction; it can be represented as an ordered list of numbers, for example:

$\mathbf{v} = [2, 3, 4]$

The magnitude (or length) of the vector is calculated as:

$\|\mathbf{v}\| = \sqrt{2^2 + 3^2 + 4^2} = \sqrt{29} \approx 5.385$

In general, an n-dimensional vector has n numbers:

$\mathbf{v} = [v_1, v_2, \dots, v_n]$

In PyTorch, tensors are commonly used to represent vectors with a one-dimensional array:
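A minimal sketch of such a listing (the variable name v is illustrative):

import torch                       # Line 1: import the PyTorch library
v = torch.tensor([2., 3., 4.])     # Line 2: define a one-dimensional tensor (a vector)
print("Vector:", v)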
Line 1 imports the PyTorch library, and Line 2 defines a one-dimensional tensor. The result looks like:
Vector: tensor([2., 3., 4.])
The torch.norm() function is used to calculate the magnitude (or length) of the vector:
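For example, continuing with the tensor v defined above:

magnitude = torch.norm(v)    # Euclidean (L2) norm of the vector
print(magnitude)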
The result is:
tensor(5.3852)
The norm, in linear algebra, is a measure of the magnitude or length of a vector; typically the Euclidean norm is used, defined as:

$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}$

In Python, another library, NumPy, provides similar functionality; both PyTorch tensors and NumPy arrays are powerful tools for numerical computation. NumPy arrays are mostly used for scientific and mathematical applications, although they are also used for machine learning and deep learning; PyTorch tensors are specifically designed for deep learning tasks, with a focus on GPU acceleration and automatic differentiation, which we will discuss later.
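As a small illustrative sketch (not part of the book's listings), converting between the two is straightforward:

import numpy as np
import torch

a = np.array([2., 3., 4.])    # a NumPy array
t = torch.from_numpy(a)       # NumPy array -> PyTorch tensor (shares the same memory)
b = t.numpy()                 # PyTorch tensor -> NumPy array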
Generate a tensor with 6 numbers, which are randomly selected from -100 to 100:
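A sketch of such a call (the exact listing may differ; note that the upper bound of torch.randint is exclusive):

r = torch.randint(low=-100, high=100, size=(6,))   # values in [-100, 100); use high=101 to include 100
print(r)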
The result is something like:
tensor([ 82, -97, 53, -79, -74, -90])
Create an all-zero tensor:
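For example:

z = torch.zeros(8)    # a 1-D tensor of 8 zeros (float32 by default)
print(z)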
The result has 8 zeros in the array:
tensor([0., 0., 0., 0., 0., 0., 0., 0.])
Create an all-one tensor:
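For example:

o = torch.ones(8)     # a 1-D tensor of 8 ones
print(o)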
The result:
tensor([1., 1., 1., 1., 1., 1., 1., 1.])
The default data type for tensors is float32 (32-bit floating-point); when you create a tensor without explicitly specifying a data type, it will be float32. In the above examples, each 0 or 1 is followed by a dot (.), which means it is a floating-point number.
If you want to specify a data type, say int64:
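A minimal sketch (the actual listing may differ) of creating a tensor with an explicit dtype:

i = torch.zeros(8, dtype=torch.int64)   # explicitly request 64-bit integers
print(i)         # tensor([0, 0, 0, 0, 0, 0, 0, 0])
print(i.dtype)   # torch.int64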