Data Visualization
i. Exploratory Data Analysis (EDA) is an important step that lets us examine the data before making any assumptions.
ii. EDA helps identify obvious errors, understand patterns within the data, detect outliers or anomalous events, and
find interesting relations among the variables.
iii. EDA is an initial step in data analysis, where the analyst takes a bird's-eye view of the data and tries to make
sense of it.
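As a rough illustration of the outlier detection mentioned in point ii, the sketch below flags values that fall outside 1.5 times the interquartile range. The sample values are hypothetical, and the 1.5 x IQR rule is one common convention rather than the only way to define an outlier.

```python
import statistics

# Hypothetical sample: daily order counts with one suspicious value.
values = [12, 15, 14, 13, 16, 15, 14, 98, 13, 15]

# Quartiles from the standard library (default "exclusive" method).
q1, median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Anything beyond 1.5 * IQR from the quartiles is treated as an outlier.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower or v > upper]

print("min / Q1 / median / Q3 / max:", min(values), q1, median, q3, max(values))
print("outliers:", outliers)
```

A quick numeric summary like this is often the first pass of EDA, before any chart is drawn.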
Define data visualization. Why is data visualization important for data analysis? Explain.
Data visualization presents data in a graphical format, since the human brain is said to process visual content
better than plain text.
It gives a clear idea of what the information means by placing it in a visual context, such as maps or graphs.
Data visualization can be useful to :-
identify outliers in data, improve response time, achieve greater simplicity, make patterns easier to see, make
business analysis easier, and enhance collaboration.
Role of Data Visualization    Possible Illustrative Data Visualization Graph
Distribution                  Scatter chart, 3D area chart, histogram
Composition                   Pie chart, waterfall chart, stacked column chart, stacked area chart
Connection                    Matrix chart, node-link diagram, word cloud, alluvial diagram, tube map
Plotly: Provides a vast variety of colorful designs for data visualization. Can use the chart studio to create
web-based reporting templates.
Features: 2D and 3D chart options, open source coding, interactivity, hover tool capabilities.
Explain Data visualization libraries in Python.
matplotlib library :-
matplotlib is the most commonly used Python library for plotting 2D data visualizations.
It was one of the earliest data visualization libraries developed for Python, and many later libraries were built on
top of it.
This library is used to create a variety of visualization graphs such as line plots, pie charts, scatter plots, bar charts,
histograms, stem plots, and spectrograms. It allows easy use of labels, axis titles, grids, legends, and other graphic
elements with customizable values and text.
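A minimal sketch of a matplotlib line plot with labels, axis titles, a grid, and a legend. The data is hypothetical, and the "Agg" backend is an assumption made so that the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical monthly figures.
months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [120, 135, 128, 150, 162]
returns = [8, 11, 9, 14, 12]

fig, ax = plt.subplots()
ax.plot(months, sales, marker="o", label="Sales")
ax.plot(months, returns, marker="s", label="Returns")

# Labels, title, grid, and legend are each a single call.
ax.set_title("Monthly sales and returns")
ax.set_xlabel("Month")
ax.set_ylabel("Units")
ax.grid(True)
ax.legend()

# To view the result, call fig.savefig("line_plot.png"),
# or plt.show() with an interactive backend.
```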
seaborn library :-
The seaborn library builds on the power of the matplotlib library to create artistic charts with very few lines of code.
This library follows creative styles and rich color palettes, which makes the resulting visualization plots more
attractive and modern.
As seaborn is considered to be a higher-level library, it offers certain special visualization tools such as violin plots,
heat maps, and time series plots.
plotly library :-
The plotly library is an online platform for data visualization.
It can be used to make interactive plots that are hard to produce with many other Python libraries.
A few such plots are dendrograms, contour plots, and 3D charts.
Besides these, basic visualization graphs such as area charts, bar charts, box plots, histograms,
polar charts, and bubble charts can also be created using the plotly library.
Basic Data Visualization tools.
The common basic visualization tools that analysts often use for data analysis are histograms, bar charts/graphs,
scatter plots, line charts, area plots, pie charts, and donut charts.
i. A histogram is a graphical display of data using bars of different heights.
A histogram displays the shape and spread of continuous sample data.
ii. A bar chart has rectangular bars in which the lengths are proportional to the values which are represented.
iii. A scatter plot is a two-dimensional plot used to observe and display relationships between two variables.
iv. A line chart is a graph used to represent continuous data points on a number line.
v. An area plot or area chart is similar to a line chart, except that the area between the x axis and the line is filled in
with color or shading.
vi. A pie chart, as the name suggests, looks similar to a pie. It is a circular graphic that is divided into slices.
vii. A donut chart is similar to a pie chart, the main difference being that an area of the center is cut out to give
the look of a doughnut.
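The pie and donut charts described above can be sketched side by side in matplotlib. The shares are hypothetical. The donut is produced here by giving each wedge a width smaller than the radius, which is one of several ways to cut out the center.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical market shares in percent.
labels = ["North", "South", "East", "West"]
shares = [40, 25, 20, 15]

fig, (ax_pie, ax_donut) = plt.subplots(1, 2, figsize=(8, 4))

# Pie chart: a full circle divided into slices.
ax_pie.pie(shares, labels=labels)
ax_pie.set_title("Pie chart")

# Donut chart: the same slices, but each wedge is only 0.4 wide,
# which leaves a hole in the middle.
ax_donut.pie(shares, labels=labels, wedgeprops={"width": 0.4})
ax_donut.set_title("Donut chart")

# To view the result, call fig.savefig("pie_vs_donut.png").
```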
A Venn diagram is a visualization tool used to display all possible logical relations among a small, finite group of sets.
A treemap is a visualization tool mainly used for displaying hierarchical data that can be structured in the form of a
tree.
A 3D scatter plot is one of the most frequently used three-dimensional graphs for comparing three characteristics
of a given dataset.
A word cloud is a visualization tool for understanding and determining patterns and evolving trends in text data.
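A word cloud typically draws each word at a size related to how often it appears in the text. The counting step behind that idea can be sketched with the standard library alone. The sample text and the stop-word list are hypothetical, and the drawing step itself is left out.

```python
import re
from collections import Counter

# Hypothetical text sample.
text = "Data visualization turns data into charts. Charts reveal patterns in data."

# Small, illustrative stop-word list of words to ignore.
stopwords = {"in", "into", "the", "a"}

# Lowercase the text, split it into words, and drop the stop words.
words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]

# Word frequencies: the more frequent a word, the larger it would be drawn.
counts = Counter(words)
for word, n in counts.most_common(3):
    print(word, n)
```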
What are a histogram and a bar chart? How are they created? What is the difference between them?
Difference between Histogram and Bar chart
A bar graph looks similar to a histogram, since both consist of a set of bars based on the data, but there are some
major differences between a bar chart and a histogram.
One is that there are gaps between the bars in a bar chart, whereas in a histogram the bars are placed adjacent to
each other.
While the histogram displays the frequency of numerical data, a bar chart uses bars to compare different categories
of data.
A histogram is a graphical display of data using bars of different heights. A histogram displays the shape and spread
of continuous sample data.
A bar chart has rectangular bars in which the lengths are proportional to the values which are represented.
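One possible way to create both charts is with matplotlib, as sketched below. The data is hypothetical. The histogram groups continuous values into adjacent bins and plots their frequencies, while the bar chart draws one separated bar per category.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical continuous data: heights in cm.
heights = [150, 152, 155, 158, 160, 161, 163, 165, 165, 168, 170, 172, 175, 178, 181]

# Hypothetical categorical data: items sold per fruit.
categories = ["Apples", "Bananas", "Cherries"]
sold = [30, 45, 12]

fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(9, 4))

# Histogram: values are grouped into 4 adjacent bins; bar height = frequency.
counts, bin_edges, _ = ax_hist.hist(heights, bins=4, edgecolor="black")
ax_hist.set_title("Histogram (numerical data)")
ax_hist.set_xlabel("Height (cm)")
ax_hist.set_ylabel("Frequency")

# Bar chart: one bar per category, with gaps between bars (width < 1).
bars = ax_bar.bar(categories, sold, width=0.6)
ax_bar.set_title("Bar chart (categories)")
ax_bar.set_ylabel("Items sold")

# To view the result, call fig.savefig("hist_vs_bar.png").
```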