
NumPy - Median
What is Median?
In mathematics, the median is the middle value of a set of numbers when they are arranged in order.
If the set has an odd number of values, the median is the middle one. If it has an even number of values, the median is the average of the two middle values.
The median is useful for finding the central tendency of data, especially when there are outliers.
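To make the point about outliers concrete, the short sketch below compares the mean and the median of a small sample. The salary figures are invented purely for illustration:

```python
import numpy as np

# Hypothetical salaries (in thousands); the last value is an outlier.
salaries = np.array([30, 32, 35, 38, 400])

mean_value = np.mean(salaries)      # pulled upward by the outlier
median_value = np.median(salaries)  # stays at the middle value

print("Mean:", mean_value)
print("Median:", median_value)
```

With this sample the mean is 107.0 while the median is 35.0, so the median stays close to the typical values even though one entry is extreme.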
The NumPy median() Function
The median() function in NumPy calculates the median of an array's elements. It sorts the values and returns the middle value, or the average of the two middle values if the array has an even number of elements.
You can also specify an axis to calculate the median along rows or columns. For example, np.median([1, 3, 2, 4]) returns 2.5.
Following is the basic syntax of the median() function in NumPy −
numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)
Where,
- a: The input array or dataset for which the median is calculated.
- axis: Specifies the axis along which the median is computed. If None (default), the median is computed over the entire array.
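- out: An optional array in which to store the result. It must have the same shape as the expected output. If None (default), the result is returned as a new array.
- overwrite_input: If True, median() may use the memory of the input array for its intermediate calculations. This saves memory, but the input array can be left partially sorted or otherwise rearranged, so use it only when you no longer need the original data.
- keepdims: If True, the reduced axes are kept in the result as dimensions of size one, so the result broadcasts correctly against the input array. If False (default), the reduced axes are removed from the result.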
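The parameters above can be tried out with a small sketch like the following. The array and the variable names (result, scratch) are illustrative assumptions, not part of the NumPy API:

```python
import numpy as np

data = np.array([[1, 3, 5],
                 [2, 4, 6]])

# keepdims=False (default): the reduced axis is dropped.
row_medians = np.median(data, axis=1)
print(row_medians.shape)        # (2,)

# keepdims=True: the reduced axis is kept with length 1.
row_medians_kept = np.median(data, axis=1, keepdims=True)
print(row_medians_kept.shape)   # (2, 1)

# out: write the result into a preallocated array of matching shape.
result = np.empty(3, dtype=np.float64)
np.median(data, axis=0, out=result)
print(result)

# overwrite_input: work on a copy when the original order still matters,
# because the array handed to median() may be rearranged.
scratch = data.astype(np.float64)   # astype returns a copy by default
overall = np.median(scratch, overwrite_input=True)
print(overall)
```

Passing a copy to overwrite_input=True, as done here, gives up the memory saving. It is shown this way only to keep the original array intact for the rest of the example.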
Understanding the Median Calculation
The calculation of the median in a dataset follows these steps −
- Step 1: Sort the array in ascending order.
- Step 2: Find the middle element. If the number of elements is odd, the middle element is the median.
- Step 3: If the number of elements is even, calculate the average of the two middle elements to get the median.
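The steps above can be sketched by hand for a one-dimensional input. The helper manual_median below is a hypothetical illustration of the textbook procedure, not how NumPy implements median() internally:

```python
import numpy as np

def manual_median(values):
    """Hypothetical helper: median of a 1-D sequence via a full sort."""
    ordered = np.sort(np.asarray(values, dtype=float))  # Step 1: sort
    n = ordered.size
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # Step 2: odd count
    return (ordered[mid - 1] + ordered[mid]) / 2  # Step 3: even count

print(manual_median([7, 1, 9, 3, 5]))
print(manual_median([7, 1, 3, 5]))
```

For these two inputs the helper should agree with np.median(), which makes it a convenient way to check your understanding of the odd and even cases.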
Example
Let us understand this concept with an example. The first array has an odd number of elements (five), so its middle element, 5, is returned as the median.
The second array has an even number of elements (four), so the median is calculated by averaging the two middle elements (3 and 5), which gives 4.0 as the result −
    import numpy as np

    data_odd = np.array([1, 3, 5, 7, 9])
    data_even = np.array([1, 3, 5, 7])

    # Calculating the median for both datasets
    median_odd = np.median(data_odd)
    median_even = np.median(data_even)

    print("Median of odd dataset:", median_odd)
    print("Median of even dataset:", median_even)
Following is the output obtained −
    Median of odd dataset: 5.0
    Median of even dataset: 4.0
Computing Median along Different Axes
In NumPy, the axis parameter allows you to compute the median along specific axes of a multi-dimensional array. The axis refers to the direction in which the median should be calculated. For example, in a 2D array −
- axis=0: Calculate the median down each column (vertical axis), giving one value per column.
- axis=1: Calculate the median across each row (horizontal axis), giving one value per row.
Example
In the following example, we are computing the median along both axes of a 2D array −
    import numpy as np

    # Create a 2D array
    data_2d = np.array([[1, 3, 5], [2, 4, 6], [7, 8, 9]])

    # Calculate the median along axis 0 (columns)
    median_axis_0 = np.median(data_2d, axis=0)

    # Calculate the median along axis 1 (rows)
    median_axis_1 = np.median(data_2d, axis=1)

    print("Median along axis 0:", median_axis_0)
    print("Median along axis 1:", median_axis_1)
In the output below, the median along axis 0 is computed by taking the median of each column. The median along axis 1 is calculated by taking the median of each row −
    Median along axis 0: [2. 4. 6.]
    Median along axis 1: [3. 4. 8.]
Median for Higher-Dimensional Arrays
The numpy.median() function also works for arrays with more than two dimensions. You can specify the axis along which to calculate the median, and the function will return the median for that axis while retaining the other dimensions. If no axis is specified, the median is calculated over the entire array.
Example
Following is an example to compute the median of a 3D array −
    import numpy as np

    # Create a 3D array
    data_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

    # Median along axis 0
    median_3d_axis_0 = np.median(data_3d, axis=0)

    # Median along axis 1
    median_3d_axis_1 = np.median(data_3d, axis=1)

    # Median along axis 2
    median_3d_axis_2 = np.median(data_3d, axis=2)

    print("Median along axis 0:", median_3d_axis_0)
    print("Median along axis 1:", median_3d_axis_1)
    print("Median along axis 2:", median_3d_axis_2)
In this case, the median is calculated along each of the axes (0, 1, and 2) for the 3D array. The function returns the median values for each of the specified axes while preserving the other dimensions −
    Median along axis 0: [[3. 4.]
     [5. 6.]]
    Median along axis 1: [[2. 3.]
     [6. 7.]]
    Median along axis 2: [[1.5 3.5]
     [5.5 7.5]]
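As a complement, here is a minimal sketch of the two remaining cases for the same 3D array: omitting the axis entirely, and passing a tuple of axes so that several dimensions are reduced at once. The variable names overall and per_block are illustrative:

```python
import numpy as np

data_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# No axis: the array is treated as flat, giving a single scalar.
overall = np.median(data_3d)

# Tuple of axes: reduce over axes 1 and 2 together, which leaves
# one value for each 2x2 sub-array along axis 0.
per_block = np.median(data_3d, axis=(1, 2))

print("Overall median:", overall)
print("Median per block:", per_block)
```

Here the overall median is 4.5 (the average of 4 and 5), and the per-block medians are 2.5 and 6.5.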
Handling NaN (Not a Number) Values
Sometimes, arrays may contain NaN (Not a Number) values. If NaN is present in the data being reduced, numpy.median() returns NaN instead of a meaningful median. To handle this, NumPy provides the numpy.nanmedian() function, which computes the median while ignoring NaN values.
Example
Following is an example to handle NaN values while calculating median in NumPy −
    import numpy as np

    # Create an array with NaN values
    data_with_nan = np.array([1, 3, np.nan, 5, 7])

    # Calculate the median while ignoring NaN values
    median_without_nan = np.nanmedian(data_with_nan)

    print("Median without NaN:", median_without_nan)
In this example, the np.nanmedian() function ignores the NaN value and computes the median of the remaining numbers, resulting in 4.0.
Median without NaN: 4.0
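To see the difference between the two functions side by side, the following sketch applies both to a small 2D array along axis 1. The data is made up for illustration:

```python
import numpy as np

data = np.array([[1.0, np.nan, 3.0],
                 [4.0, 5.0, 6.0]])

# np.median propagates NaN: a row containing NaN yields NaN.
plain = np.median(data, axis=1)

# np.nanmedian skips the NaN entries within each row.
ignoring_nan = np.nanmedian(data, axis=1)

print("np.median:   ", plain)
print("np.nanmedian:", ignoring_nan)
```

The first row gives NaN with np.median() but 2.0 with np.nanmedian(), while the second row, which has no NaN, gives 5.0 with both.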