
NumPy - Sum



What is Sum?

In mathematics, a sum is the result of adding two or more numbers together. For example, the sum of 2 and 3 is 5.

It is often represented using the plus symbol (+). Summation can also involve adding a sequence of numbers, often using the Greek letter sigma (Σ) to denote the operation.
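In sigma notation, the sum of a sequence of numbers can be written as shown below. The symbols a_i (the i-th number) and n (how many numbers there are) are introduced here only for illustration −

```latex
\sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n,
\qquad \text{for example} \quad
\sum_{i=1}^{5} i = 1 + 2 + 3 + 4 + 5 = 15
```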

The NumPy sum() Function

The sum() function in NumPy calculates the sum of array elements along a specified axis, providing flexibility to sum across rows, columns, or the entire array.

Following is the basic syntax of the sum() function in NumPy −

numpy.sum(a, axis=None, dtype=None, out=None, keepdims=False)

Where,

  • a: The input array containing the elements to sum.
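  • axis: The axis or axes along which to sum, given as an int or a tuple of ints. If None (the default), all the elements of the array are summed. For a 2D array, axis=0 adds down the rows and returns one sum per column, while axis=1 adds across the columns and returns one sum per row.
  • dtype: The data type of the accumulator and of the returned result. If not specified, the data type of the array is used, except that integer types narrower than the default platform integer are summed using the default platform integer.
  • out: An existing array in which the result will be stored. If provided, it must have the same shape as the expected output (not the input array); the type of the result is cast if necessary.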
  • keepdims: If True, the reduced axes are kept in the result as dimensions with size one. This is useful for broadcasting.
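As a minimal sketch of how the axis parameter changes the shape of the result, the example below uses a hypothetical 3D array (arr_3d is an illustrative name, not part of the NumPy API) and sums it over no axis, one axis, and a tuple of axes −

```python
import numpy as np

# Hypothetical 3D array with shape (2, 3, 4), filled with the values 0..23
arr_3d = np.arange(24).reshape(2, 3, 4)

# axis=None (default): every element is summed into a single value
total = np.sum(arr_3d)

# axis=0: the first axis is collapsed, leaving shape (3, 4)
sum_axis0 = np.sum(arr_3d, axis=0)

# axis=(0, 2): a tuple collapses several axes at once, leaving shape (3,)
sum_axes02 = np.sum(arr_3d, axis=(0, 2))

print("Total:", total)
print("Shape after axis=0:", sum_axis0.shape)
print("Shape after axis=(0, 2):", sum_axes02.shape)
```

Each axis named in axis disappears from the shape of the result, and the remaining axes keep their original order.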

Summing All Elements of a 1D Array

If you have a one-dimensional array, you can use the numpy.sum() function to calculate the sum of all its elements. Following is an example −

import numpy as np

# Define a 1D array
arr = np.array([1, 2, 3, 4, 5])

# Calculate the sum of all elements
total_sum = np.sum(arr)

print("Total sum of the array:", total_sum)

Following is the output obtained −

Total sum of the array: 15

Summing Along a Specific Axis in a 2D Array

In a two-dimensional array, you can compute the sum along a specific axis. For example, summing along the rows or columns −

import numpy as np

# Define a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Sum of each row (axis=1 adds across the columns)
sum_rows = np.sum(arr_2d, axis=1)

# Sum of each column (axis=0 adds down the rows)
sum_columns = np.sum(arr_2d, axis=0)

print("Sum along rows:", sum_rows)
print("Sum along columns:", sum_columns)

Following is the output obtained −

Sum along rows: [ 6 15 24]
Sum along columns: [12 15 18]

Summing with a Specified Data Type

You can also specify the data type in which the sum is accumulated and returned. This is useful when the total could overflow a narrow integer type, or when you need the result in a specific precision (such as float64). Here is an example −

import numpy as np

# Define an array of integers
arr_int = np.array([10, 20, 30])

# Calculate the sum with a specified data type (float64)
sum_float = np.sum(arr_int, dtype=np.float64)

print("Sum with dtype float64:", sum_float)

Following is the output obtained −

Sum with dtype float64: 60.0
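
The choice of dtype also matters for integer overflow. The sketch below, which assumes a hypothetical array of 8-bit integers named arr_small, compares the default accumulator with one forced to int8 −

```python
import numpy as np

# Hypothetical array stored as 8-bit integers (valid range -128 to 127)
arr_small = np.array([100, 100, 100], dtype=np.int8)

# Default: narrow integer types are accumulated in the platform integer type
sum_default = np.sum(arr_small)

# Forcing an int8 accumulator makes the total wrap around without an error
sum_int8 = np.sum(arr_small, dtype=np.int8)

print("Default accumulator:", sum_default)
print("int8 accumulator:", sum_int8)
```

Following is the output obtained −

Default accumulator: 300
int8 accumulator: 44

The true total (300) does not fit in int8, so the second result wraps around to 44. Leaving dtype unset, or choosing a wider type, avoids this.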

Summing with the keepdims Parameter

The keepdims parameter preserves the number of dimensions of the original array after the sum operation. If set to True, the result has the same number of dimensions as the input array, but each summed axis has size one, so the result broadcasts correctly against the input. Following is an example −

import numpy as np

# Define a 2D array 
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Sum along columns while keeping dimensions
sum_keepdims = np.sum(arr_2d, axis=0, keepdims=True)

print("Sum with keepdims=True:", sum_keepdims)

Following is the output obtained −

Sum with keepdims=True: [[12 15 18]]
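
To illustrate why keepdims is useful for broadcasting, the sketch below divides every element by the sum of its own row. The names row_sums and row_fractions are illustrative only −

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Row sums with shape (3, 1) instead of (3,)
row_sums = np.sum(arr_2d, axis=1, keepdims=True)

# Broadcasting divides every element by the sum of its own row
row_fractions = arr_2d / row_sums

print("Shape of row sums:", row_sums.shape)
print("Row fractions:")
print(row_fractions)
```

Without keepdims=True the row sums would have shape (3,), and the division would silently align them with the columns of this square array rather than with its rows.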

Applications of NumPy Sum

The numpy.sum() function has a wide range of applications in scientific computing, data analysis, and machine learning. Some common use cases are −

  • Summing over rows or columns in matrices: In data science, you often need to calculate sums along specific axes to summarize data in tables or matrices.
  • Computing total values in an array: Summing elements in an array can help in financial analysis, statistics, and scientific computations, such as calculating the total of measurements or quantities.
  • Data aggregation: When analyzing data, summing values is a common aggregation step, such as finding total sales per product or per period. A running (cumulative) total is computed with the related numpy.cumsum() function.
  • Feature scaling: In machine learning, sums are used in normalization, for example dividing each value by the sum of its row or column so that the scaled values add up to 1.
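
As a small sketch of the aggregation use cases listed above, the example below uses made-up sales figures (the array sales and all of its values are hypothetical) to compute totals per product, per month, and overall −

```python
import numpy as np

# Hypothetical sales figures: rows are products, columns are months
sales = np.array([[120, 135, 150],
                  [ 80,  95,  70],
                  [200, 180, 220]])

# Grand total over the whole table
total_sales = np.sum(sales)

# One total per product (sum of each row)
sales_per_product = np.sum(sales, axis=1)

# One total per month (sum of each column)
sales_per_month = np.sum(sales, axis=0)

# Share of the grand total contributed by each product
product_share = sales_per_product / total_sales

print("Total sales:", total_sales)
print("Sales per product:", sales_per_product)
print("Sales per month:", sales_per_month)
print("Share per product:", product_share)
```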

Optimizing the Sum Calculation

NumPy is optimized for fast array operations, and the numpy.sum() function is highly efficient. However, there are a few ways to further optimize your sum calculations −

  • Using the out parameter: If you want to store the result of the sum in a pre-existing array, you can use the out parameter, which avoids creating a new array and helps save memory.
  • Using axis wisely: Specify the axis only when you need per-row or per-column results. Summing the whole array (axis=None) returns a single value, whereas summing along an axis has to produce a result array, and its speed can depend on the memory layout of the data.
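
The out parameter can be sketched as follows. The buffer name column_sums is illustrative, and its shape is chosen to match the expected output of the reduction, not the input array −

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Preallocate the result buffer once; summing over axis=0 yields shape (3,)
column_sums = np.empty(3, dtype=arr_2d.dtype)

# The sums are written into column_sums instead of a newly allocated array
np.sum(arr_2d, axis=0, out=column_sums)

print("Column sums stored in out:", column_sums)
```

Following is the output obtained −

Column sums stored in out: [12 15 18]

Reusing one buffer in this way is mainly worthwhile when the same reduction runs many times, for example inside a loop.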