Introduction To Git and GitHub

Uploaded by

Vinay Patil

Introduction to Git and GitHub

Version Control System

Git is a version control system: software that helps you manage the different versions of your work. Version control
skills are rapidly becoming a prerequisite for all data scientists. Version control can improve teamwork among data
scientists by encouraging collaboration on projects, simplifying the exchange of work, and assisting other data scientists
in repeating the same or similar processes. It is usually helpful to have the option to undo changes or make modifications
to a branch before merging them into the active project, even if you are a data scientist working alone. This allows you to
ensure that your change won't break anything.

Git is version control software.

GitHub is a Git repository hosting platform.

To summarise what Sajan explained: in the absence of a version control system, we tend to store files in a messy
way. You have probably saved a file with suffixes such as final1, final2, final_final or really_final. This seems like a
quick fix while you are working, but when you revisit the same directory after a few months, it becomes really difficult to
locate the right file.

If you are working in collaboration, it becomes difficult to track who made the change, what change was made and when
it was made. Other than that, it is difficult to retrieve the files in your system in case of any crash. However, if you use a
version control system, such as Git, you can fix these problems.

A version control system, such as Git, helps you by taking care of the following:

1. Tracking changes
2. Ability to revert to any version
3. Finding what changes were made, when they were made and by whom
4. Easy recovery in case of any disaster
5. Improving the efficiency and agility of team projects

Git is a version control system for code files.

Background of Version Control System

There are three main types of version control systems:

1. Local Version Control System

2. Centralized Version Control System (for example, Perforce)

3. Distributed Version Control System (for example, Git, Mercurial and BitKeeper)

The distributed version control system takes up more storage space as every developer has the entire version
database. However, this does not pose a great problem in the versioning of code files, which are mostly text files and
require storage space of only a few kilobytes.
Difference Between Git and GitHub

• Git is a distributed version control system. It is a tool to manage the history of your project's source code. GitHub is
a web-based Git repository hosting service that enables you to showcase or share your projects and files with others.

• A repository is a directory that contains your project work. All the files in the repository can be uploaded to
GitHub and shared with other people either publicly or privately.

• Git was developed in 2005 and became a mainstream software development tool across the world. It has been
adopted by many Fortune 500 companies, such as Google, Facebook, Microsoft, Twitter, LinkedIn, etc.

• As the popularity of Git increased, many repository hosting platforms came into existence, such as GitHub,
GitLab and Bitbucket. GitHub became the most popular of them. GitHub lets you share your repository with
the public and also provides a graphical user interface.

• GitHub, which was launched in 2008, has been widely adopted across companies. As of 2021, it was used by
65+ million developers, 3+ million organizations and 72% of Fortune 50 companies, and hosted 200+ million repositories.

Basic Git Commands-I

The first step is to create a repository named ‘fraud_detection’ on GitHub.

• The README file gives an overview of what the repository contains.

• The .gitignore file lists the files that you do not want to be committed to the repository, and the license file is
used when you have created software and you want to share it for others to use.

• You will learn more about the README file in upcoming videos.

You can read more about .gitignore files here and licensing files here.
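To make the role of the .gitignore file concrete, here is a minimal sketch. The file names (data/transactions.csv, run.log, model.py) and the ignore patterns are hypothetical and chosen only for illustration; the repository is created in a temporary directory.

```shell
set -e
# Hypothetical scratch repository, created in a temporary directory.
repo="$(mktemp -d)/fraud_detection"
mkdir -p "$repo" && cd "$repo"
git init -q

# Illustrative patterns: ignore everything under data/ and all .log files.
printf 'data/\n*.log\n' > .gitignore

mkdir data
echo "id,amount" > data/transactions.csv   # matches the data/ pattern
echo "started" > run.log                   # matches the *.log pattern
echo 'print("hello")' > model.py           # not ignored

# Only the files that are not ignored show up as untracked.
visible="$(git status --short | sort)"
echo "$visible"
```

The ignored files stay on disk; Git simply leaves them out of git status and git add.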
You have configured a Git account with your local system. For the first-time Git configuration, you use the following
commands:

git config --global user.name <yourusername> # Using this, you will enter your GitHub username

git config --global user.email <youremail@example.com> # Using this, you will enter the email address of your GitHub account

These commands are required only for the first-time configuration. For later instances, you do not need to run these
commands.

mkdir <folder name> - makes a directory

ls - lists the files and directories in the current directory. In Windows, dir is used instead of ls

cat <filename> - displays the contents of the file

cd <directory name> - navigates to the directory

pwd - prints the present working directory
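The shell commands above can be tried together in a short session. This is only a sketch: it works in a temporary directory (an assumption, so that no existing folder is touched), and the README.md file is created here purely to have something to list and display.

```shell
set -e
base="$(mktemp -d)"            # scratch location (assumption)
cd "$base"

mkdir fraud_detection          # make a directory
cd fraud_detection             # navigate into it
current="$(pwd)"               # present working directory

echo "# fraud_detection" > README.md
listing="$(ls)"                # list the contents of the current directory
contents="$(cat README.md)"    # display the content of the file

echo "$current"
echo "$listing"
echo "$contents"
```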

Additional Links

1. Refer to this link for more information on basic shell commands. A cheat sheet for basic shell commands can be
found here.

2. Please go through the documentation here to learn basic git commands.

Which command is used to link the local folder to the GitHub repository?

git remote add origin <GitHub_URL>

Using this command, you can add a new remote repository to your local repository.

Which shell command will you use to find the present working directory?

pwd is used to find the present working directory.

What will happen when you execute the command ‘git branch -m main’?

It will rename the current branch to ‘main’; it does not create a new branch. If you
type git branch after executing the above command, you will notice that your current branch has
been changed from master to main.
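To set up the repository, follow these steps:

1. echo "# type content of the file" >> file_name.md

This command creates the file if it does not already exist and appends the quoted text to it. For example,
echo "# fraud detection" >> README.md creates a README file with # fraud detection as its contents.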

2. git init
Initializes a Git repository in the current folder

a. git config user.name <username>

Using this, you will enter your GitHub username

b. git config user.email <youremail@example.com>

Using this, you will enter the email address of your GitHub account

c. git status
Shows the state of the files at that instant

3. git add <file_name>.md

Adds the file to the staging area, which means that Git starts tracking it

4. git commit -m "<comment on the file>"

Commits the staged file with a message describing the change

5. git branch
Lists the local branches and marks the current one

6. git branch -M <name>

Renames the current branch

7. git remote add origin <https://github.com/user_name/repo_name.git>

The 'git remote add origin URL' command

Using this command, you can add a new remote repository to your local repository. To do so, run the
git remote add command on the terminal, in the directory where your repository is stored. The git
remote add command takes two arguments:
• A remote name, for example, origin (it can be any name)
• A remote URL, for example, https://github.com/<user>/<repo>.git (the address of the
repository on your GitHub account that you want to link your local repository to)

8. git push -u origin <branch name>

Pushes the commits on the branch to the remote repository. The -u option sets the remote branch as the
upstream, so that later you can run git push without arguments.

‘git remote add’ is used to connect your local folder with the remote Git repository, while ‘git config’
is used to set the username and email address with which your local machine identifies you in commits.
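The steps above can be sketched as one short script. To keep it runnable without a GitHub account, this sketch makes two assumptions: a local bare repository in a temporary directory stands in for the GitHub URL, and the username and email address are placeholders. With a real GitHub repository, you would pass its https://github.com/<user>/<repo>.git address to git remote add instead.

```shell
set -e
work="$(mktemp -d)"                          # scratch area (assumption)
git init -q --bare "$work/remote.git"        # local stand-in for the GitHub repository (assumption)
mkdir "$work/fraud_detection"
cd "$work/fraud_detection"

echo "# fraud_detection" >> README.md        # step 1: create the README file
git init -q                                  # step 2: initialise the repository
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"
git status                                   # README.md is listed as untracked
git add README.md                            # step 3: stage the file
git commit -q -m "Add README"                # step 4: commit with a message
git branch                                   # step 5: show the current branch
git branch -M main                           # step 6: rename it to main
git remote add origin "$work/remote.git"     # step 7: link the remote
git push -q -u origin main                   # step 8: push and set the upstream
```

After the push, git ls-remote --heads origin lists the main branch on the remote, which confirms that the commit has arrived there.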
Machine Learning - Jargon Busting

In its most basic form, machine learning is a set of instructions given to the computer to learn patterns from the data and
create its version of the pattern.
The diagram below shows the conceptual flow chart of a machine learning process.

As you can see from the image, data represents a sample of reality. From this sample of reality, machine learning code
(also called a model) builds relationships between the data points.
For instance, the Google Maps app on your phone builds a relationship between various data points, such as
the distance you want to travel, the traffic on the road and the vehicle you are using, and the time it will take
for you to arrive at your destination.
In simple words, a machine learning model is just code that can relate some input data points to the desired result.
The process of building any model is iterative, as shown in the image above.
Since building a machine learning model is complicated, ML engineers split the process into smaller steps. Typically, the
ML process can be divided into a few predefined segments.
1. Data Processing:
As the name suggests, this piece of code will deal with all the processing that needs to be done
on the data in its raw form. Later in the program, you will learn about the processes that need to
be done on raw data.

2. Model Building:
This piece of code will have instructions about extracting the patterns hidden in the data.

3. Model Evaluation:
This segment of code will have instructions about evaluating the pattern the model has learnt
from the data.

Apart from these, there are other popular segments, such as feature extraction and model retraining. At this point, you are
not expected to understand all the nuances being discussed. The important takeaways are:
1. The machine learning model is a set of instructions.
2. The machine learning code is usually split into multiple smaller code files.

Additional Reference
StatQuest video on explaining machine learning
Basic Git Commands - II
While you work on any project, a file passes through three main stages:
1. Modified: When you change any code in the file.
2. Staged: When you add the file using ‘git add’, it goes into the staging area.
3. Committed: When you commit all the changes in the files that were in the staging area.
We created a file called data_processing.py in our local repository ‘fraud_detection’. Then, we wrote some dummy code in
the data processing file and ran the following commands:
#Commands used
git add <filename>
git status
git commit -m "message"
git push -u origin master

Let’s understand these commands in detail:

Git Command: 'git add <filename>' or 'git add .'
Description:
1. It is used to add modified files to the staging area. You can add a specific file using the command
'git add <filename>'.
2. If you wish to add all modified and unstaged files present in the workspace to the staging area, you can
use the 'git add .' command.

Git Command: 'git status'
Description: This command displays the state of the working directory and the staging area. In other words,
it lets you see the changes that have been staged and the changes that have not been added to the staging area.

Git Command: git commit -m "New commit message"
Description: It commits all the files sitting in the staging area and records the given commit message.

Git Command: git push -u origin master or git push
Description: It is used to upload the commits that are not yet on the remote to your remote repository on GitHub.
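The three stages can be observed with git status --short, which prints a two-character code before each file name. This is a sketch under stated assumptions: it runs in a temporary repository, the identity is a placeholder, and the push step is left out because no remote is configured here.

```shell
set -e
repo="$(mktemp -d)/fraud_detection"          # scratch repository (assumption)
mkdir -p "$repo" && cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"

echo 'print("processing data")' > data_processing.py
untracked="$(git status --short)"            # "?? data_processing.py": new, not yet tracked

git add data_processing.py
staged="$(git status --short)"               # "A  data_processing.py": in the staging area

git commit -q -m "Add data processing script"
committed="$(git status --short)"            # empty: nothing left to commit

echo 'print("cleaning data")' >> data_processing.py
modified="$(git status --short)"             # " M data_processing.py": modified, not staged
```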

Basic Git commands: click here (These commands will help you throughout the course in case you forget anything)
As shown in the above diagram:
• ‘git add’ puts the file that you are working on in the staging area, where Git keeps track of it.
• ‘git commit’ transfers it to the local version database.
• ‘git push’ transfers it to the remote repository on GitHub.

Track File
Suppose that you create a new file named ‘model.py’ inside the Git repository. Will Git track it automatically?
No.
Git will not automatically track a new file created inside the Git repository. You have to explicitly
ask Git to track a file. You can ask Git to track a newly added file named ‘model.py’ using the
following command:
git add model.py

Resetting and Reverting

To undo the most recent commit, we used the git revert command:
git revert HEAD → Creates a new commit that undoes the changes made in the most recent commit, which returns the project to the previously committed version
To go back to a commit that was 3-4 commits back, we used the following command:
git reset --hard <commit ID> → Resets the project to the specified commit

git log:
This command shows you the commit details. It lists the commits made in the repository in reverse
chronological order, that is, the most recent commits show up first. It shows commits with the following details:
• The commit ID or SHA
• Author’s name (who made the commit)
• Date and time
• The commit message

For a shorter version of git log, you have git log --oneline.

git log --oneline → shows the commits made so far in a concise manner, one commit per line

‘git revert’ is used to go back to the previous version and ‘git reset’ is used to go back to any version.

‘git revert’ will create another commit that shows that you are reverting to the previous commit.

‘git reset --hard’ goes back to the commit ID that is mentioned, and all the commits after that are removed from the branch history.
You need to be very careful about using ‘git reset’, as you will lose all the commits that you have made after the desired
commit.
Although ‘git reset’ and ‘git revert’ are good tools to get back to previous commits, it is not the best practice in the
industry to experiment on stable code.
So, you use branching, wherein each developer can experiment on the code separately and, after successful
experimentation, merge their branch into the master branch.
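The difference between ‘git revert’ and ‘git reset’ is easiest to see by counting commits. The following is a minimal sketch in a temporary repository; the file name model.py, the commit messages and the identity are placeholders chosen for illustration.

```shell
set -e
repo="$(mktemp -d)/demo"                     # scratch repository (assumption)
mkdir -p "$repo" && cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"

# Three commits, each appending one line to the same file.
for n in 1 2 3; do
  echo "version $n" >> model.py
  git add model.py
  git commit -q -m "Commit $n"
done

# git revert adds a NEW commit that undoes the latest one, so the history grows.
git revert --no-edit HEAD
after_revert="$(git rev-list --count HEAD)"

# git reset --hard moves the branch back to the first commit;
# the later commits no longer appear in the log.
first_commit="$(git rev-list --max-parents=0 HEAD)"
git reset -q --hard "$first_commit"
after_reset="$(git rev-list --count HEAD)"

git log --oneline
```

Here the history holds four commits after the revert and a single commit after the reset, which is why the reset needs more care.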

Working with branches

When you come across the term branching, you might relate it to the branches of a tree.
Well, yes! Branching means exactly that. Imagine the branches growing out of the trunk of a tree. This tree
represents a Git repository, as shown in the following illustration.

Branching

The trunk in this image plays the role of a master branch and the branches coming out of the trunk represent the
branches in Git.
In machine learning projects, after preprocessing the data, you proceed to the modelling step. In this process, you may
not want to disturb the master branch with unnecessary experimentation, so you will create another branch named
‘model’.
To create a new branch and switch to it, we used the following command:
git checkout -b <new branch name>

The checkout command can also be used to switch between existing branches. The syntax to switch to another branch is as follows:
git checkout <destination branch name>
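A short sketch of creating the ‘model’ branch and switching back is given below. It assumes a temporary repository with a placeholder identity, and it renames the starting branch to master so that the branch names match the text.

```shell
set -e
repo="$(mktemp -d)/fraud_detection"          # scratch repository (assumption)
mkdir -p "$repo" && cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"

echo 'print("processing data")' > data_processing.py
git add data_processing.py
git commit -q -m "Add data processing script"
git branch -M master                         # make sure the starting branch is called master

git checkout -q -b model                     # create the model branch and switch to it
echo 'print("training model")' > model.py
git add model.py
git commit -q -m "Add model script"
on_model="$(git rev-parse --abbrev-ref HEAD)"

git checkout -q master                       # switch back to master
on_master="$(git rev-parse --abbrev-ref HEAD)"
ls                                           # model.py is absent here: it exists only on the model branch
```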

Pull request
To merge the ‘model’ branch into the main branch, we used the GitHub UI to create a pull request.
A pull request is a request to merge one branch into another. In our case, there were no conflicts and the status was shown as
‘able to merge’.
Once the pull request was created, we went to the pull request tab and merged the two branches.

Once you have merged with the master branch, try the following exercise to understand how to resolve conflicts:
1. Make changes to the existing code of model.py file. Example: change one of the print statements.
2. Then, create a pull request.
3. Does it show ‘able to merge’ now or does it show ‘there is some conflict’?

It will show that there is a conflict, because the same file now has different code on the two branches. You can resolve the
conflicts and then merge the branches. You can learn more about merge conflicts here.
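The same kind of conflict can be reproduced locally. This sketch makes a substitution: it uses the git merge command on the command line instead of a GitHub pull request, and it runs in a temporary repository with a placeholder identity. The print statements are made up for illustration.

```shell
set -e
repo="$(mktemp -d)/fraud_detection"          # scratch repository (assumption)
mkdir -p "$repo" && cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"

echo 'print("model v1")' > model.py
git add model.py
git commit -q -m "Add model"
git branch -M master

# Change the same line differently on two branches.
git checkout -q -b model
echo 'print("model from model branch")' > model.py
git commit -q -am "Change print on model branch"

git checkout -q master
echo 'print("model from master branch")' > model.py
git commit -q -am "Change print on master branch"

# The merge stops because both branches changed the same line.
if git merge model; then merge_result="clean"; else merge_result="conflict"; fi
conflicted="$(git diff --name-only --diff-filter=U)"

# Resolve by writing the content you want to keep, then stage and commit.
echo 'print("model from model branch")' > model.py
git add model.py
git commit -q -m "Merge branch model and resolve conflict"
```

While the conflict is open, model.py contains conflict markers showing both versions; the sketch resolves it by keeping the line from the model branch, which is only one of the possible resolutions.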

To summarise, you learnt that there are situations where you may want to develop an existing project's code in parallel,
without making any changes to your initial/original branch. You can accomplish this goal by creating different branches
based on your need (for example, a branch per team member or a branch for every new feature). Each branch
starts as a copy of the initial/original branch of the project source code.

Summary

In this session on the ‘Introduction to Version Control and Git’, you learnt the answers to the following questions:
• What issues would you face if you did not use version control to track the different changes and versions of your
project files?
o Some of the issues that you encounter are as follows:
- Suppose you are working on some code and, after making some changes, you realise that you have
messed up the code, and now, you would like to revert to the last good version of your project.
- Coordinating the changes that are being made to the project between you and
fellow developers
• How does version control come to your rescue if you have a huge file and you want to keep track of all the changes in
your file?
• About the three types of Version Control Systems (VCS):
o Local
o Centralized
o Distributed
• About Git and why it is preferred over all the other distributed version control systems
• About GitHub and how you can send your changes from your local system to your remote repository on GitHub
• How to use a set of commands on the command line to push all your code from your local system to your GitHub
remote repository
• How to revert to previous stable versions of the code
• How to create a branch for experimentation and merge it back to the master branch

You can refer to the Git cheat sheet for reference.
