NumPy, short for 'Numerical Python,' is one of the core libraries for numerical computing in Python. Known for its powerful array operations, NumPy makes data manipulation and mathematical calculations efficient by replacing slow, explicit Python loops with vectorized operations. It also offers routines for mathematical and logical operations, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistics, and more.
Installing NumPy
pip install numpy
A. Basic NumPy Operations
NumPy provides a variety of built-in functions that simplify math and data manipulation. Before diving into specific functions, it’s crucial to understand how NumPy arrays work and how to manipulate data within them. This knowledge will serve as a foundation for using NumPy’s more advanced features effectively.
1. Create a NumPy ndarray Object
To access the full range of functions in NumPy, you need to create a NumPy array, or ndarray. Arrays can be created in several ways, depending on the type and structure of data you need.
import numpy as np
# Creating a 1D ndarray object
arr = np.array([1, 2, 3, 4, 5])
print("1D array:", arr)
# Creating a 2D ndarray object
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D array:\n", arr_2d)
output:
1D array: [1 2 3 4 5]
2D array:
[[1 2 3]
[4 5 6]]
- np.ones(): Create an array of ones.
# Create a 1D array of ones with 5 elements
arr_ones_1d = np.ones(5)
print("1D array of ones:", arr_ones_1d)
# Create a 2D array of ones with shape (3, 4)
arr_ones_2d = np.ones((3, 4))
print("2D array of ones:\n", arr_ones_2d)
output:
1D array of ones: [1. 1. 1. 1. 1.]
2D array of ones:
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
- np.zeros(): Create an array of zeros.
# Create a 1D array of zeros with 5 elements
arr_zeros_1d = np.zeros(5)
print("1D array of zeros:", arr_zeros_1d)
# Create a 2D array of zeros with shape (2, 3)
arr_zeros_2d = np.zeros((2, 3))
print("2D array of zeros:\n", arr_zeros_2d)
output:
1D array of zeros: [0. 0. 0. 0. 0.]
2D array of zeros:
[[0. 0. 0.]
[0. 0. 0.]]
- np.eye(): Create an identity matrix.
# Create a 3x3 identity matrix
identity_matrix = np.eye(3)
print("3x3 identity matrix:\n", identity_matrix)
output:
3x3 identity matrix:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
- np.full(): Create an array filled with a specified value.
# Create a 2x3 array filled with the value 7
arr_full = np.full((2, 3), 7)
print("Array filled with 7:\n", arr_full)
output:
Array filled with 7:
[[7 7 7]
[7 7 7]]
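Evenly spaced values are another common starting point besides the constructors above. The following is a minimal sketch using np.arange and np.linspace; the variable names are chosen only for illustration.

```python
import numpy as np

# Integers from 0 up to (but not including) 10, in steps of 2
evens = np.arange(0, 10, 2)
print("arange:", evens)        # [0 2 4 6 8]

# Five evenly spaced values from 0.0 to 1.0, both endpoints included
grid = np.linspace(0.0, 1.0, 5)
print("linspace:", grid)

# The element type can be set explicitly at creation time
ones_int = np.ones(3, dtype=np.int64)
print("integer ones:", ones_int, ones_int.dtype)
```

Note that np.ones, np.zeros, and np.eye produce floating-point arrays by default, which is why their output above shows a trailing dot.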
2. Array Slicing
Slicing a one-dimensional NumPy array works much like slicing a Python list. In a two-dimensional array, each index along the first axis refers to a one-dimensional array (a row) rather than a scalar, so you can give one slice per axis, separated by commas.
# Slicing a 1D array
slice_1d = arr[1:4]
print("Slice of 1D array (index 1 to 3):", slice_1d)
# Slicing a 2D array
slice_2d = arr_2d[:, 1:3]
print("Slice of 2D array (all rows, col 1 to 2):\n", slice_2d)
output:
Slice of 1D array (index 1 to 3): [2 3 4]
Slice of 2D array (all rows, col 1 to 2):
[[2 3]
[5 6]]
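One difference from Python lists is worth a short illustration: a basic NumPy slice is a view onto the original data, not a copy. The sketch below uses a made-up array named data to show the effect.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# A basic slice is a view: it shares memory with the original array
view = data[1:4]
view[0] = 99
print("original after editing the view:", data)   # [10 99 30 40 50]

# .copy() gives an independent array, so the original is untouched
independent = data[1:4].copy()
independent[0] = -1
print("original after editing the copy:", data)   # [10 99 30 40 50]
```

Call .copy() whenever a slice will be modified and the original must stay intact.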
3. Array Shape
The shape of an array tells us its dimensions. This is especially helpful in data analysis and machine learning when working with matrices or higher-dimensional data.
print("Shape of 1D array:", arr.shape)
print("Shape of 2D array:", arr_2d.shape)
output:
Shape of 1D array: (5,)
Shape of 2D array: (2, 3)
4. Array Reshape
Reshaping allows you to change the dimensions of an array. This is useful when reorganizing data or preparing it for machine learning models.
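# Reshaping the 6-element 2D array into 3 rows and 2 columns.
# The total number of elements must stay the same, so the
# 5-element arr cannot be reshaped to 3x2 (it raises ValueError).
reshaped_arr = arr_2d.reshape((3, 2))
print("Reshaped array (3x2):\n", reshaped_arr)
output:
Reshaped array (3x2):
[[1 2]
[3 4]
[5 6]]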
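A reshape only succeeds when the new shape holds exactly as many elements as the array. The sketch below, using an illustrative six-element array, shows how -1 lets NumPy infer one dimension and what happens when the sizes do not match.

```python
import numpy as np

values = np.arange(1, 7)            # six elements: 1..6

# -1 asks NumPy to infer that dimension from the array's size
two_cols = values.reshape((-1, 2))
print(two_cols.shape)               # (3, 2)

# A shape whose element count differs from the array's size is rejected
try:
    values.reshape((4, 2))
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("reshape failed:", err)
```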
5. Array Iterating
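You can iterate over a NumPy array just like any other Python iterable. Iterating over a multi-dimensional array walks along its first axis, so looping over a 2D array yields one row at a time. NumPy also provides helpers such as np.nditer for visiting every element directly.
print("Iterating over 1D array:")
for x in arr:
    print(x)
print("Iterating over 2D array:")
for row in arr_2d:
    print(row)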
output:
Iterating over 1D array:
1
2
3
4
5
Iterating over 2D array:
[1 2 3]
[4 5 6]
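For element-by-element traversal of a multi-dimensional array, np.ndenumerate and np.nditer avoid nested loops. This is a small sketch with an illustrative array named grid; for plain arithmetic, a vectorized expression is usually preferable to any loop.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# np.ndenumerate yields (index, value) pairs for every element
cells = [(idx, int(val)) for idx, val in np.ndenumerate(grid)]
print(cells[:2])    # [((0, 0), 1), ((0, 1), 2)]

# np.nditer visits every scalar element regardless of dimensionality
flat_values = [int(v) for v in np.nditer(grid)]
print(flat_values)  # [1, 2, 3, 4, 5, 6]

# For arithmetic, a vectorized expression replaces the loop entirely
doubled = grid * 2
print(doubled)
```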
6. Array Join
If you want to combine two arrays into one, use the concatenate function in NumPy. This is helpful when merging datasets.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
joined_arr = np.concatenate((arr1, arr2))
print("Joined array:", joined_arr)
output:
Joined array: [1 2 3 4 5 6]
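With two-dimensional arrays, the axis argument of np.concatenate decides the direction of the join. The sketch below uses two small illustrative arrays, top and bottom.

```python
import numpy as np

top = np.array([[1, 2], [3, 4]])
bottom = np.array([[5, 6], [7, 8]])

# axis=0 stacks the rows of both arrays (result shape (4, 2))
stacked_rows = np.concatenate((top, bottom), axis=0)

# axis=1 places the arrays side by side (result shape (2, 4))
side_by_side = np.concatenate((top, bottom), axis=1)

print(stacked_rows.shape, side_by_side.shape)   # (4, 2) (2, 4)
```

All arrays must match in every dimension except the one being joined.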
7. Array Split
The np.array_split function divides an array into multiple parts, which can be helpful when working with sections of data. Unlike np.split, it also accepts a number of parts that does not divide the array evenly.
split_arr = np.array_split(joined_arr, 3)
print("Split array into 3 parts:", split_arr)
output:
Split array into 3 parts: [array([1, 2]), array([3, 4]), array([5, 6])]
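The difference between np.array_split and np.split shows up when the array length is not a multiple of the number of parts. Here is a brief sketch with an illustrative seven-element array.

```python
import numpy as np

seven = np.arange(7)

# np.array_split tolerates sizes that do not divide evenly
parts = np.array_split(seven, 3)
print([p.tolist() for p in parts])   # [[0, 1, 2], [3, 4], [5, 6]]

# np.split insists on an equal division and raises otherwise
try:
    np.split(seven, 3)
    split_raised = False
except ValueError:
    split_raised = True
print("np.split raised ValueError:", split_raised)   # True
```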
8. Array Search
You can use np.where to search for values within an array and return the indices where a condition is met.
search_result = np.where(joined_arr == 4)
print("Index where element is 4:", search_result[0])
output:
Index where element is 4: [3]
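np.where has a second form that takes three arguments and chooses between two alternatives element by element. The sketch below shows both forms on an illustrative array named readings.

```python
import numpy as np

readings = np.array([1, 4, 2, 4, 6])

# With one argument, np.where returns a tuple of index arrays
(indices,) = np.where(readings == 4)
print(indices)                       # [1 3]

# With three arguments, it picks element-wise between two alternatives:
# here every value above 3 is replaced by 3
capped = np.where(readings > 3, 3, readings)
print(capped)                        # [1 3 2 3 3]
```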
9. Array Sort
Sorting data is often essential in data analysis to identify trends or patterns. Use np.sort to sort arrays.
unsorted_arr = np.array([3, 1, 5, 2, 4])
sorted_arr = np.sort(unsorted_arr)
print("Sorted array:", sorted_arr)
output:
Sorted array: [1 2 3 4 5]
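Two related operations often come up alongside np.sort: getting the sorting order as indices, and sorting in descending order. This is a minimal sketch with an illustrative array named scores.

```python
import numpy as np

scores = np.array([3, 1, 5, 2, 4])

# np.argsort returns the indices that would sort the array
order = np.argsort(scores)
print(order)                 # [1 3 0 4 2]

# Reversing a sorted copy gives descending order
descending = np.sort(scores)[::-1]
print(descending)            # [5 4 3 2 1]

# np.sort returns a copy; the input keeps its original order
print(scores)                # [3 1 5 2 4]
```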
10. Array Filter
Filtering in NumPy can be done using boolean indexing. This allows you to extract elements based on certain conditions.
filter_condition = joined_arr > 3
filtered_arr = joined_arr[filter_condition]
print("Filtered array (elements > 3):", filtered_arr)
output:
Filtered array (elements > 3): [4 5 6]
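Several conditions can be combined into one boolean mask. The sketch below uses an illustrative array named values and the element-wise operators &, |, and ~.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6])

# Combine conditions with & (and) and | (or); each comparison needs
# its own parentheses because & and | bind tighter than > or <
in_range = values[(values > 2) & (values < 6)]
print(in_range)      # [3 4 5]

extremes = values[(values < 2) | (values > 5)]
print(extremes)      # [1 6]

# ~ negates a boolean mask
odd = values[~(values % 2 == 0)]
print(odd)           # [1 3 5]
```

Python's and/or keywords do not work here, because they expect a single truth value rather than an array of them.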
B. Universal functions (ufuncs)
Universal functions (ufuncs) in NumPy are functions that operate on arrays element by element, covering operations like addition, subtraction, trigonometric functions, and more. Below are examples demonstrating how to use some common universal functions in NumPy:
1. Arithmetic Operations
Arithmetic is the branch of mathematics that deals with numerical operations like addition, subtraction, multiplication, and division. In NumPy, each of these is applied element-wise to arrays.
import numpy as np
# Create two arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
# Addition
add_result = np.add(arr1, arr2)
print("Addition:", add_result)
# Subtraction
sub_result = np.subtract(arr1, arr2)
print("Subtraction:", sub_result)
# Multiplication
mul_result = np.multiply(arr1, arr2)
print("Multiplication:", mul_result)
# Division
div_result = np.divide(arr1, arr2)
print("Division:", div_result)
# Modulus
mod_result = np.mod(arr2, arr1)
print("Modulus:", mod_result)
output:
Addition: [5 7 9]
Subtraction: [-3 -3 -3]
Multiplication: [ 4 10 18]
Division: [0.25 0.4 0.5 ]
Modulus: [0 1 0]
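The arithmetic operators on arrays call the same ufuncs, and operands of different shapes are combined by broadcasting. The sketch below uses illustrative arrays a, b, and column.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# The arithmetic operators give the same results as the named ufuncs
print(np.array_equal(a + b, np.add(a, b)))            # True
print(np.array_equal(a * b, np.multiply(a, b)))       # True

# A scalar is broadcast across every element
print(a * 10)            # [10 20 30]

# A (2, 1) column combined with a (3,) row broadcasts to shape (2, 3)
column = np.array([[0], [100]])
table = column + a
print(table.shape)       # (2, 3)
```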
2. Trigonometric Functions
Trigonometry is the branch of mathematics concerned with specific functions of angles and their application to calculations.
# Create an array of angles in radians
angles = np.array([0, np.pi/2, np.pi, 3*np.pi/2])
# Sine of angles
sin_result = np.sin(angles)
print("Sine:", sin_result)
# Cosine of angles
cos_result = np.cos(angles)
print("Cosine:", cos_result)
# Tangent of angles
tan_result = np.tan(angles)
print("Tangent:", tan_result)
output:
Sine: [ 0. 1. 0. -1.]
Cosine: [ 1.000000e+00 6.123234e-17 -1.000000e+00 -1.836970e-16]
Tangent: [ 0.0000000e+00 1.6331239e+16 -1.2246468e-16 5.4437465e+15]
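The tiny values such as 6.123234e-17 in the output above are floating-point rounding of an exact zero, and the huge tangent values appear where the tangent is mathematically undefined. The sketch below assumes angles given in degrees and compares the results with a tolerance rather than with ==.

```python
import numpy as np

degrees = np.array([0, 90, 180, 270])
radians = np.deg2rad(degrees)     # NumPy's trig functions expect radians

cos_values = np.cos(radians)

# Compare against the exact values with a tolerance instead of ==
expected = np.array([1.0, 0.0, -1.0, 0.0])
print(np.isclose(cos_values, expected))      # [ True  True  True  True]

# Rounding is a simple way to tidy results for display
print(np.round(cos_values, 10))
```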
3. Vector Operations
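NumPy also provides functions for vector operations such as the dot product, cross product, and norm. Strictly speaking these are not ufuncs, since they combine elements rather than acting on each one independently, but they are commonly used alongside them.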
# Create two vectors
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])
# Dot product
dot_product = np.dot(vec1, vec2)
print("Dot product:", dot_product)
# Cross product
cross_product = np.cross(vec1, vec2)
print("Cross product:", cross_product)
# Magnitude of a vector
magnitude = np.linalg.norm(vec1)
print("Magnitude of vec1:", magnitude)
# Angle between vectors (in radians)
cos_theta = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
angle = np.arccos(cos_theta)
print("Angle between vec1 and vec2 (radians):", angle)
output:
Dot product: 32
Cross product: [-3 6 -3]
Magnitude of vec1: 3.7416573867739413
Angle between vec1 and vec2 (radians): 0.2257261285527342
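The angle calculation above can be wrapped in a small reusable function. The helper below, angle_between, is a hypothetical name introduced for illustration; it adds np.clip as a precaution against rounding error pushing the cosine slightly outside the range arccos accepts.

```python
import numpy as np

def angle_between(u, v):
    """Angle between two vectors in radians (hypothetical helper)."""
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Rounding error can push the ratio slightly outside [-1, 1],
    # which would make arccos return nan, so clip it first
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

x_axis = np.array([1.0, 0.0, 0.0])
y_axis = np.array([0.0, 1.0, 0.0])

# Perpendicular vectors: about 90 degrees
right_angle = np.degrees(angle_between(x_axis, y_axis))
print(right_angle)

# Parallel vectors: an angle of zero
parallel = angle_between(x_axis, 3 * x_axis)
print(parallel)
```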
4. Most Used Functions
Some commonly used functions in NumPy include min, max, sum, mean, std, and sort, which are often used in data analysis. Apart from sort, these are reductions that collapse an array to a single value.
# Array of sample values
arr = np.array([3, 1, 5, 4, 2])
# Minimum value
min_value = np.min(arr)
print("Minimum value:", min_value)
# Maximum value
max_value = np.max(arr)
print("Maximum value:", max_value)
# Sum of all elements
sum_value = np.sum(arr)
print("Sum of all elements:", sum_value)
# Mean of the array
mean_value = np.mean(arr)
print("Mean:", mean_value)
# Standard deviation
std_dev = np.std(arr)
print("Standard deviation:", std_dev)
# Sorting an array
sorted_arr = np.sort(arr)
print("Sorted array:", sorted_arr)
output:
Minimum value: 1
Maximum value: 5
Sum of all elements: 15
Mean: 3.0
Standard deviation: 1.4142135623730951
Sorted array: [1 2 3 4 5]
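On multi-dimensional data, these reductions accept an axis argument to summarize per row or per column. This is a minimal sketch with an illustrative 2x3 array named table.

```python
import numpy as np

table = np.array([[3, 1, 5],
                  [4, 2, 6]])

# axis=0 collapses the rows, giving one result per column
col_sums = np.sum(table, axis=0)
print(col_sums)                  # [ 7  3 11]

# axis=1 collapses the columns, giving one result per row
row_means = np.mean(table, axis=1)
print(row_means)                 # [3. 4.]

# Arrays also expose the same reductions as methods
print(table.max(), table.min())  # 6 1
```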
NumPy is an essential tool for any data analyst or Python programmer looking to work with large datasets efficiently. By understanding and mastering these basic and advanced NumPy functions, you’ll be equipped to handle a variety of data processing tasks. Experiment with these functions in your own projects to see the true power of NumPy in action!