Data Analysis and Visualization On Space Race (Spacenalyzer)

Data Analysis and Visualization on Space Race
(Spacenalyzer)
Major Project Report
Submitted to the Centurion University
In partial fulfillment of requirements for the award of degree
Bachelor of Technology
in
Computer Science and Engineering
By
Hari Varma Pamudurthi
211801380018
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CENTURION UNIVERSITY OF TECHNOLOGY AND
MANAGEMENT
(CUTM-AP)VIZIANAGARAM
CENTURION UNIVERSITY OF TECHNOLOGY AND
MANAGEMENT
(CUTM-AP) VIZIANAGARAM
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CERTIFICATE
This is to certify that the report entitled "Data Analysis and Visualization
on Space Race" submitted by Hari Varma Pamudurthi (211801380018) to the
Department of Computer Science and Engineering in partial fulfilment of
the B.Tech. Degree in Computer Science and Engineering is a bona fide
record of the seminar work carried out by him under our guidance and
supervision. This report has not been submitted in any form to any other
University or Institute for any purpose.
(Project Guide)
Dr. Lakshman Rao
Department of Computer Science
And Engineering
Centurion University (CUTM-AP)
External Examiner
Abstract
Through this document I would like to present my project “Spacenalyzer”.

Spacenalyzer is a set of visualizations that portrays data on the space launches of individual countries. Space research has grown rapidly in recent years and, in parallel, the number of space agencies has increased sharply. Keeping track of launches is therefore very useful. The project tracks the launch history of spacecraft launched by various space agencies, and I also aim to compare those agencies with one another. Spacenalyzer is developed with Python as the front end, for analysis and visualization, and Excel as the back end: Excel is used to store, read, and write the data.
INDEX
S. NO. TITLE
ABSTRACT
LIST OF FIGURES
1. INTRODUCTION
2. EXISTING AND PROPOSED SYSTEM
3. SYSTEM REQUIREMENTS
3.1 Hardware Requirements
3.2 Software Requirements
4. METHODOLOGY
4.1 Modular Design
4.1.1 Data Acquisition
4.1.2 Data Processing
4.1.3 Implementation
4.1.4 Analysis
4.1.5 Visualization
4.2 What is Python
4.3 Modules
4.3.1 Pandas
4.3.2 Plotly Express
5. DATA SET
6. IMPLEMENTATION
7. RESULTS
8. CONCLUSION
9. REFERENCES
LIST OF FIGURES
FIG. NO. TITLE
7.1 Data Set - World Rocket
7.2 Pie Chart - Showing the Status of Rockets Launched
7.3 Bar Chart Comparison on Active and Retired Rockets
7.4 3D Plot on Which Country Launched What Rocket
7.5 Bar Chart Analysis on Which Company Launched Which Rocket
7.6 Scatter and Range Plot on Money Spent on Rocket Launches
7.7 Sunburst Plot on World Spacecraft Analysis
7.8 Tree Map Analysis on World Spacecraft
1. Introduction
The world has taken a tremendous interest in travelling beyond the planet ever since the USSR launched Sputnik, the first artificial satellite, in 1957 during the Cold War. Rocket science, cosmology, and astronomy sit at the pinnacle of engineering and science and demand extremely high levels of both theoretical and experimental study.
Deciding when and where a launch should take place so that it reaches its destination with the least resistance and the greatest likelihood of success requires a great deal of calculation. At the same time, extensive engineering effort goes into testing launch vehicles for potential malfunctions and recreating space-like conditions here on Earth. For any space mission to succeed, years of arduous labor, research, and testing are necessary.
A successful launch gives a country great pride. When a mission fails, and millions in funding and national hopes go up in smoke, it is equally disheartening. Still, no notable scientific experiment has ever been conducted without a fair number of failures.
I would like to express my gratitude to the dataset contributor for making the effort to provide a fantastic dataset that can be used to analyze the many accomplishments and failures of the world's space organizations.
The visualizations I built from this dataset help to analyze the rockets launched by various countries. They also show the status of each rocket and the cost of launching it. Python was used to view and analyze the data because it is well suited to data analysis. The raw data available on the internet is hard to interpret by inspection alone, so to make it easier to understand I created interactive, easily comprehensible graphs.
2. Existing and Proposed System
Many other visualizations of this data are available on the web and can meet the basic need for analysis. However, I believe that many of them are outdated, and that the techniques and technology used to create them fall well short of what is available today.
Therefore, taking advantage of current technology and of recent additions to the programming ecosystem, my visualization uses several types of visualization modules and graphs that were not in common use previously. The graphs I have produced are interactive, so users can operate them themselves. I have also exported the graphs to an HTML file so that the user can view the 3D plots with ease.
3. System Requirements
3.1. Hardware Requirements
1. Intel Pentium / Core i3 / i5 / i7 processor
2. 4 GB / 6 GB / 8 GB RAM
3. Graphics card (integrated or dedicated)
4. Hard disk drive or solid-state drive
3.2. Software Requirements
1. Windows 7, 8, 10, or 11
2. macOS
3. Linux
4. Anaconda (Jupyter Notebook) / Python IDLE / PyCharm
5. MS Excel
6. Any web browser
4. Methodology
4.1. Modular Design
4.1.1. Data Acquisition
To collect the raw data I used online sources such as Kaggle, GitHub, and Google Scholar. The raw data was then converted into numerical data in an Excel workbook, which was later used for data processing.
4.1.2. Data Processing
The acquired data is processed in Excel to filter out garbage values. I also derived additional values with calculations in Excel to improve the results. After processing, I imported the data into Python for implementation.
4.1.3. Implementation
After processing, I imported the data into the Python environment for analysis. I loaded it with pandas in CSV format and manipulated it using options and methods such as usecols and head. The data is then ready to be analyzed.
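The import step could look roughly like the sketch below. The CSV content and variable names are made up for illustration, and only the column names are borrowed from the code listing in Section 6. It assumes the file is read with pandas read_csv, restricted with its usecols parameter, and previewed with head.

```python
import io

import pandas as pd

# Hypothetical stand-in for the exported CSV file. The rows are invented
# for illustration and are not taken from the project's dataset.
csv_text = io.StringIO(
    "Com Name,Country,Detail,Status Rocket,Status Mission\n"
    "AgencyA,CountryX,Rocket-1 | Payload-1,StatusActive,Success\n"
    "AgencyB,CountryY,Rocket-2 | Payload-2,StatusRetired,Failure\n"
    "AgencyA,CountryX,Rocket-1 | Payload-3,StatusActive,Success\n"
)

# usecols restricts the load to the columns needed for the analysis.
launches = pd.read_csv(csv_text, usecols=["Com Name", "Country", "Status Mission"])

# head(n) returns the first n rows for a quick look at the data.
preview = launches.head(2)
print(preview)
```

Loading only the needed columns keeps the data frame small, which helps when the source file has many columns that are not plotted.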
4.1.4. Analysis
After implementation, the data has to be analyzed before it is visualized. At this stage the user can inspect the data as rows and columns, which gives a clear understanding of which data to visualize.
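As a rough illustration of this inspection stage, the sketch below looks at the size of a small data frame and at how values are distributed in two columns. The rows are invented, and the choice of shape, nunique, and value_counts is an assumption about which pandas tools suit this step, not a record of the exact calls used in the project.

```python
import pandas as pd

# Made-up rows for illustration only.
launches = pd.DataFrame({
    "Country": ["CountryX", "CountryY", "CountryX", "CountryZ"],
    "Status Mission": ["Success", "Failure", "Success", "Success"],
})

# shape gives (number of rows, number of columns).
n_rows, n_cols = launches.shape

# nunique counts the distinct values in a column.
n_countries = launches["Country"].nunique()

# value_counts tallies how often each value occurs.
status_counts = launches["Status Mission"].value_counts()

print(n_rows, n_cols)
print(n_countries)
print(status_counts)
```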
4.1.5. Visualization
After analysis comes the visualization stage, where I used Python libraries such as plotly to produce interactive graphs for a better representation of the data. I used column names as the x and y axes for these visualizations. The visualizations help users understand the raw data clearly, which in turn supports better analysis.
4.2. What is Python
1. Python is a high-level, interpreted, general-purpose programming language. Its design philosophy emphasizes code readability through the use of significant indentation.
2. Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented, and functional programming. Because of its extensive standard library, it is often described as a "batteries included" language.
3. Guido van Rossum began working on Python in the late 1980s as a successor to the ABC programming language, and it was first released as Python 0.9.0 in 1991.
4. Programmers often fall in love with Python because of the increased productivity it provides. Since there is no compilation step, the edit-test-debug cycle is very fast.
5. Python programs are easy to debug: a bug or bad input in Python code does not normally cause a segmentation fault. Instead, when the interpreter discovers an error, it raises an exception. If the program does not catch the exception, the interpreter prints a stack trace.
6. A source-level debugger allows setting breakpoints, evaluating arbitrary expressions, inspecting local and global variables, stepping through the code one line at a time, and so on. The debugger is written in Python itself, which demonstrates Python's capacity for introspection.
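The exception behavior described in point 5 can be seen in a small sketch. The helper parse_cost is hypothetical and only mirrors the kind of string-to-number cleaning done on the cost column in Section 6.

```python
import traceback


def parse_cost(raw):
    """Convert a cost string such as '1,160.0' to a float (hypothetical helper)."""
    return float(raw.replace(",", ""))


try:
    parse_cost("not a number")
except ValueError as exc:
    # Bad input raises an exception instead of crashing the interpreter.
    message = str(exc)
    # format_exc returns the same stack trace the interpreter would print
    # if the exception were left uncaught.
    trace_text = traceback.format_exc()
    print("Caught:", message)
```

Catching the exception lets the program report the bad value and continue, while the stack trace still shows where the error came from.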
4.3. Modules
For this project I have used the following modules:
1. pandas
2. Plotly Express
4.3.1. Pandas
pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for working with numerical tables and time series. It is free software released under the three-clause BSD license. The name is derived from "panel data", an econometrics term for data sets that include observations over multiple time periods for the same individuals. The name is also a play on the phrase "Python data analysis". Wes McKinney started building what would become pandas while working as a researcher at AQR Capital from 2007 to 2010.
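To make the time-series support concrete, here is a minimal sketch that parses date strings and counts launches per year. The dates and agency names are invented, and grouping by year is just one assumed way of summarizing launch dates.

```python
import pandas as pd

# Made-up launch dates for illustration only.
launches = pd.DataFrame({
    "Launch date": ["2019-03-02", "2019-11-11", "2020-05-30"],
    "Com Name": ["AgencyA", "AgencyB", "AgencyA"],
})

# Convert the text column to real datetime values.
launches["Launch date"] = pd.to_datetime(launches["Launch date"])

# Group by the year component of the date and count the rows in each group.
per_year = launches.groupby(launches["Launch date"].dt.year).size()
print(per_year)
```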
4.3.2. Plotly Express
plotly is an interactive, open-source plotting library for Python that supports over 40 chart types covering a wide range of statistical, financial, geographic, scientific, and three-dimensional use cases. Plotly Express, imported in this project as px, is its high-level interface for creating figures from a data frame in a single function call.
Built on top of the Plotly JavaScript library (plotly.js), the plotly Python library lets users create interactive web-based visualizations that can be displayed in Jupyter notebooks, saved to standalone HTML files, or served as part of pure-Python web applications built with Dash. To distinguish it from the JavaScript library, the plotly Python library is sometimes called "plotly.py".
5. Dataset
The foundation of any visualization is the data, which may be raw, filtered, or selected for a purpose. A data set (sometimes spelled dataset) is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where each row is a specific record and each column is a specific variable. The data set lists a value for each variable, such as the height and weight of an object, for every member of the set. Data sets can also consist of a collection of files or documents. For this project I used an existing dataset obtained from the well-known website Kaggle.
6. Implementation
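Code:
## Importing all the modules ##
import dash
import plotly.express as px
import pandas as pd
## Analysis and filtering the data ##
df = pd.read_csv(r'C:\123456.csv')
# Parse the launch timestamp, then keep only the calendar date
df['Launch date'] = pd.to_datetime(df['Datum'])
df['Launch date'] = df['Launch date'].astype(str)
df['Launch date'] = df['Launch date'].str.split(' ', expand=True)[0]
df['Launch date'] = pd.to_datetime(df['Launch date'])
# The cost column is named ' Rocket' (with a leading space):
# remove thousands separators and convert it to float
df[' Rocket'] = df[' Rocket'].str.replace(',', '')
df[' Rocket'] = df[' Rocket'].astype(float)
# Strip the 'Status' prefix from the rocket status values
df['Status Rocket'] = df['Status Rocket'].str.replace('Status', '')
df.drop('Datum', axis=1, inplace=True)
# Keep only complete rows, then work with the first 20 of them
data = df.dropna()
df = data.head(20)
df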
##Separating the unique values##
print(df.Country.nunique())
print(df.Country.unique())
##Visualization##
##1) Pie Chart##

fig_pie = px.pie(data_frame=df, names='Status Mission', hole=0.4,
template='plotly_dark')
fig_pie.update_layout(title='Rocket Launches Status',title_x=0.5,
annotations=[dict(text='Status',font_size=15,
showarrow=False,height=800,width=900)])
fig_pie.show()
##2) Barchart##
df=df.groupby('Status Rocket').count().reset_index()
df=df.rename(columns={"Detail": "Details"})
fig_bar=px.bar(data_frame=df, x='Status Rocket', y='Details',
template='plotly_dark', title='Status of rockets Carrying Missions')
fig_bar.show()
##3) 3d-Line plot##
df = pd.read_csv(r'C:\123456.csv').head(100)
fig_line = px.line_3d(data_frame=df, x='Com Name', y='Detail', z='Country',
    template='plotly_dark', color_discrete_sequence=['orange', 'green'],
    title='Which Country Launched What Rocket')
fig_line.show()
# Save the interactive 3D plot as a standalone HTML file
fig_line.write_html("hari.html")
##4) Barchart##
fig_bar = px.bar(data_frame=df, x='Detail', y='Com Name',
    template='plotly_dark', color_discrete_sequence=['yellow', 'green'],
    barmode='group', height=2000, title='Which company Launched which Rocket')
fig_bar.show()
##5) Scatter/Range Plot##
fig = px.scatter(data_frame=df, x='Com Name', y=' Rocket', color='Com Name',
    title='Money Spent in billions', template='plotly_dark',
    marginal_y='violin', color_discrete_sequence=['red', 'green', 'blue'])
fig.show()
# Cleaned subset used by the sunburst and treemap plots
data2 = data.head(100)
##6) Sun Burst##
fig_graph = px.sunburst(data_frame=data2,
    path=['Country', 'Com Name', 'Detail'],
    template='plotly_dark',
    hover_data=[' Rocket'],
    title='World Spacecrafts Analysis',
    color=' Rocket',
    maxdepth=-1,
    color_discrete_sequence=['orange', 'red', 'green', 'blue', 'hotpink'])
fig_graph.show()
##7) Tree Plot##
fig_graph = px.treemap(data_frame=data2,
    path=['Country', 'Com Name', 'Detail'],
    template='plotly_dark',
    hover_data=[' Rocket'],
    title='World Spacecrafts Analysis',
    color=' Rocket',
    maxdepth=-1,
    color_discrete_sequence=['orange', 'red', 'green', 'blue', 'hotpink'])
fig_graph.show()
################THE-END#################
7. Results
Fig. 7.1 (Data Set)
Fig. 7.2 (Pie Chart Showing the Status of Rockets Launched)
Fig. 7.3 (Bar Chart Comparison on Active and Retired Rockets)
Fig. 7.4 (3D Plot on Which Country Launched What Rocket)
Fig. 7.5 (Bar Chart Analysis on Which Space Agency Launched Which Rocket)
Fig. 7.6 (Scatter and Range Plot on the Cost of Launching)
Fig. 7.7 (Sunburst Analysis on World Spacecraft)
Fig. 7.8 (Tree Map Analysis on World Spacecraft)
8. Conclusion
Space is a vast field of exploration with no end in sight, so studying it requires proper equipment and instruments for research and development work.
From the foregoing, we can see how crucial data visualization is across industries, along with its advantages and the many methods available for creating visual formats. Without this step, the later stages of analytics cannot proceed effectively. I therefore conclude that data visualization can be used in any industry or profession. It is also necessary because the vast majority of massive, unstructured data cannot be comprehended by the human mind alone; such data sets must be transformed into a format that we can easily understand. Graphs and maps are essential for discovering trends and relationships, gaining understanding, and reaching more accurate conclusions.
Graphs and charts help us convey results so that we can spot patterns and trends and arrive at better conclusions quickly. We must understand the significance of data visualization and what it means to our clients. To help customers visualize their data in an understandable and relevant way, we should offer them appealing and user-friendly visualization capabilities and tools.
9. References
1. Kaggle (https://www.kaggle.com/code/arindambaruah/who-is-leading-the-space-race/notebook)
2. W3Schools (https://www.w3schools.com/python)
3. Stack Overflow (https://stackoverflow.com/)
4. Tutorialspoint (https://www.tutorialspoint.com/python/index.htm)
5. Plotly (https://plotly.com/python/)
6. Pandas (https://pandas.pydata.org/)
7. https://ieeexplore.ieee.org/abstract/document/7284779
8. https://www.mdpi.com/2220-9964/8/7/292
9. https://www.sciencedirect.com/science/article/abs/pii/S0094576519300621
10. https://www.jstage.jst.go.jp/article/jsme2/29/3/29_ME2903rh/_article/-char/ja/