Creating A Series and Using Matplotlib

Uploaded by

Sarah Mathibe
CHAPTER 6 DATA EXPLORING AND ANALYSIS

Creating a Series

Pandas provides a Series() constructor that is used to create a series structure. A series structure of size n should have an index of length n. By default, Pandas creates indices starting at 0 and ending with n-1.

A Pandas series can be created using the constructor pandas.Series(data, index, dtype, copy), where data could be an array, constant, list, etc. The index labels must be hashable and the index must have length n; unique labels are recommended, although Pandas also accepts duplicates. dtype is a data type that can be declared explicitly or inferred from the received data. Listing 6-1 creates a series with a default index and with a set index.

Listing 6-1. Creating a Series

In [5]: import pandas as pd
        import numpy as np
        data = np.array(['0','S','S','A'])
        S1 = pd.Series(data)                             # without adding index
        S2 = pd.Series(data, index=[100,101,102,103])    # with adding index
        print (S1)
        print ("\n")
        print (S2)

0    0
1    S
2    S
3    A
dtype: object

100    0
101    S
102    S
103    A
dtype: object

In [40]: import pandas as pd
         import numpy as np
         my_series2 = np.random.randn(5, 10)
         print ("\nmy_series2\n", my_series2)

This creates and prints random values arranged in 5 rows and 10 columns. Note that np.random.randn returns a NumPy array rather than a Pandas series.

[Output: a 5 x 10 array of random floating-point values.]

As mentioned earlier, you can create a series from a dictionary; Listing 6-2 demonstrates how to create an index for a data series. Index labels that are missing from the dictionary, such as W in the second example, are filled with NaN.

Listing 6-2. Creating an Indexed Series

In [6]: import pandas as pd
        import numpy as np
        data = {'X' : 0., 'Y' : 1., 'Z' : 2.}
        SERIES1 = pd.Series(data)
        print (SERIES1)

X    0.0
Y    1.0
Z    2.0
dtype: float64

In [7]: import pandas as pd
        import numpy as np
        data = {'X' : 0., 'Y' : 1., 'Z' : 2.}
        SERIES1 = pd.Series(data, index=['Y','Z','W','X'])
        print (SERIES1)

Y    1.0
Z    2.0
W    NaN
X    0.0
dtype: float64

If you create a series from a scalar value, as shown in Listing 6-3, you supply an index, and the scalar value is repeated to match the length of the given index.
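The constructor arguments described above can be tried out in a short sketch. The variable names below are illustrative assumptions and do not come from the chapter's listings; the sketch only shows dtype being inferred or declared explicitly, a dictionary aligned against a given index, and a scalar repeated over an index.

```python
import pandas as pd

# dtype inferred from the data (an integer type for a list of ints)
inferred = pd.Series([1, 2, 3])

# dtype declared explicitly
explicit = pd.Series([1, 2, 3], dtype="float64")

# A dictionary aligned against an index: the label "W" has no entry
# in the dictionary, so its value becomes NaN.
aligned = pd.Series({"X": 0.0, "Y": 1.0, "Z": 2.0}, index=["Y", "Z", "W", "X"])

# A scalar repeated once per index label
repeated = pd.Series(7, index=["a", "b", "c"])

print(inferred.dtype, explicit.dtype)
print(aligned)
print(repeated)
```

Because NaN is a floating-point value, a series that gains a NaN through alignment is stored with a float dtype even if the dictionary held integers.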
Listing 6-3. Creating a Series Using a Scalar

In [9]: # Use a scalar to create a series
        import pandas as pd
        import numpy as np
        Series1 = pd.Series(7, index=[0, 1, 2, 3, 4])
        print (Series1)

0    7
1    7
2    7
3    7
4    7
dtype: int64

Accessing Data from a Series with a Position

Like lists, you can access series data via its position, and you can also use its index labels. The examples in Listing 6-4 demonstrate different methods of accessing series data. The first example retrieves the element at position 0. The second example retrieves positions 0, 1, and 2. The third example retrieves the last three elements, since the slice starts at position -3 and runs through -2 and -1. The fourth and fifth examples retrieve data using the series index labels.

Listing 6-4. Accessing a Data Series

In [18]: import pandas as pd
         Series1 = pd.Series([1,2,3,4,5], index = ['a','b','c','d','e'])
         print ("Example 1:Retrieve the first element")
         print (Series1.iloc[0])      # .iloc makes the positional lookup explicit
         print ("\nExample 2:Retrieve the first three elements")
         print (Series1[:3])
         print ("\nExample 3:Retrieve the last three elements")
         print (Series1[-3:])
         print ("\nExample 4:Retrieve a single element")
         print (Series1['a'])
         print ("\nExample 5:Retrieve multiple elements")
         print (Series1[['a','c','d']])

Example 1:Retrieve the first element
1

Example 2:Retrieve the first three elements
a    1
b    2
c    3
dtype: int64

Example 3:Retrieve the last three elements
c    3
d    4
e    5
dtype: int64

Example 4:Retrieve a single element
1

Example 5:Retrieve multiple elements
a    1
c    3
d    4
dtype: int64

Exploring and Analyzing a Series

Numerous statistical methods can be applied directly to a data series. Listing 6-5 demonstrates the calculation of the mean, max, min, and standard deviation of a data series. Also, the .describe() method can be used to give a data description, including quantiles.

Listing 6-5. Analyzing Series Data

In [10]: import pandas as pd
         import numpy as np
         my_series1 = pd.Series([5, 6, 7, 8, 9, 10])
         print ("my_series1\n", my_series1)
         print ("\n Series Analysis\n ")
         print ("Series mean value : ", my_series1.mean())   # find mean value in a series
         print ("Series max value : ", my_series1.max())     # find max value in a series
         print ("Series min value : ", my_series1.min())     # find min value in a series
         print ("Series standard deviation value : ", my_series1.std())   # find standard deviation

my_series1
 0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64

 Series Analysis

Series mean value :  7.5
Series max value :  10
Series min value :  5
Series standard deviation value :  1.8708286933869707

In [11]: my_series1.describe()

Out[11]: count     6.000000
         mean      7.500000
         std       1.870829
         min       5.000000
         25%       6.250000
         50%       7.500000
         75%       8.750000
         max      10.000000
         dtype: float64

If you assign one series to another name, both names refer to the same object, so any change made through one name is visible through the other. After assigning my_series1 to my_series_11, changing the indices of my_series_11 is reflected in my_series1, as shown in Listing 6-6.

Listing 6-6. Copying a Series to Another with a Reference

In [17]: my_series_11 = my_series1
         print (my_series1)
         my_series_11.index = ['A', 'B', 'C', 'D', 'E', 'F']
         print (my_series_11)
         print (my_series1)

0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64
A     5
B     6
C     7
D     8
E     9
F    10
dtype: int64
A     5
B     6
C     7
D     8
E     9
F    10
dtype: int64

You can use the .copy() method to copy the data set without keeping a reference to the original series. See Listing 6-7.

Listing 6-7. Copying Series Values to Another

In [21]: my_series_11 = my_series1.copy()
         print (my_series1)
         my_series_11.index = ['A', 'B', 'C', 'D', 'E', 'F']
         print (my_series_11)
         print (my_series1)

0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64
A     5
B     6
C     7
D     8
E     9
F    10
dtype: int64
0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64

Operations on a Series

Numerous operations can be implemented on series data.
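As a rough sketch of what such operations look like, the snippet below uses a small hypothetical series named prices (an assumption made for illustration, not one of the chapter's variables). It shows a membership test, a boolean filter, and element-wise arithmetic. One detail worth noting is that the in operator tests the index labels rather than the stored values.

```python
import pandas as pd

# Hypothetical example data; labels and values are chosen for illustration.
prices = pd.Series([5, 6, 7, 8], index=["A", "B", "C", "D"])

# `in` looks at the index labels, not at the values.
has_label = "A" in prices            # "A" is a label
has_value_as_label = 5 in prices     # 5 is a value, not a label

# isin() tests the values themselves.
value_present = bool(prices.isin([5]).any())

# A comparison yields a boolean series that can filter the original.
below_seven = prices[prices < 7]

# Arithmetic works element by element and aligns on index labels.
doubled = prices * 2
total = prices + doubled

print(has_label, has_value_as_label, value_present)
print(below_seven)
print(total)
```

Element-wise expressions such as prices + doubled usually replace an explicit Python loop over positions, which keeps the code shorter and lets Pandas handle label alignment.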
You can check whether an index value is available in a series or not. Also, you can check all series elements against a specific condition, such as whether the series value is less than 8 or not. In addition, you can perform math operations on series data directly or via a defined function, as shown in Listing 6-8.

Listing 6-8. Operations on Series

In [23]: 'F' in my_series_11

Out[23]: True

In [27]: temp = my_series_11 < 8
         temp

Out[27]: A     True
         B     True
         C     True
         D    False
         E    False
         F    False
         dtype: bool

In [35]: len(my_series_11)

Out[35]: 6

In [28]: temp = my_series_11[my_series_11 < 8] * 2
         temp

Out[28]: A    10
         B    12
         C    14
         dtype: int64

Define a function to add two series and call the function, like this:

In [37]: def AddSeries(x,y):
             for i in range (len(x)):
                 print (x.iloc[i] + y.iloc[i])   # .iloc adds by position, whatever the labels are

In [39]: print ("Add two series\n")
         AddSeries (my_series_11, my_series1)

Add two series

10
12
14
16
18
20

You can visualize data series using the different plotting systems that are covered in Chapter 7. However, Figure 6-1 demonstrates how to get an at-a-glance idea of your series data and graphically explore it via visual plotting diagrams. See Listing 6-9.

Listing 6-9. Visualizing Data Series

In [49]: import matplotlib.pyplot as plt
         plt.plot(my_series2)
         plt.ylabel("index")
         plt.show()

Figure 6-1. Line visualization

In [54]: import numpy as np
         import matplotlib.pyplot as plt
         t = np.linspace(0, 2*np.pi, 400)
         a = np.sin(t)
         b = np.cos(t)
         c = a + b

In [50]: plt.plot(t, a, 'r')   # plotting t, a separately
         plt.plot(t, b, 'b')   # plotting t, b separately
         plt.plot(t, c, 'g')   # plotting t, c separately
         plt.show()

We can add multiple plots to the same canvas as shown in Figure 6-2.

Figure 6-2. Multiplots on the same canvas

Data Frame Data Structures

As mentioned earlier, a data frame is a two-dimensional data structure with heterogeneous data types, i.e., tabular data.
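To make the word heterogeneous concrete, here is a minimal sketch of a data frame whose columns hold different data types. The column names and values are invented for illustration and are not taken from the chapter.

```python
import pandas as pd

# Hypothetical tabular data: each column can have its own data type.
frame = pd.DataFrame({
    "name": ["Ann", "Ben", "Cy"],     # text
    "age": [31, 25, 40],              # integers
    "score": [88.5, 92.0, 79.25],     # floating-point numbers
})

print(frame.shape)     # (rows, columns)
print(frame.dtypes)    # one dtype per column

# Selecting a single column returns a series.
ages = frame["age"]
print(type(ages).__name__)
```

Each column of the frame behaves like the series discussed in this section, so the access and analysis methods shown earlier apply column by column.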
