One of the essential tools in data science and numerical computing is Numerical Python, more commonly known as NumPy. It is a powerful library for working with arrays and matrices, and it provides a wide range of mathematical operations on them. Whether for data analysis, machine learning, or scientific research, NumPy is the package without which the work becomes considerably slower and more difficult.
So, why is NumPy so important? Python’s standard list is highly flexible, but it is slow for large-scale numerical calculations. A NumPy array in Python is implemented far more efficiently. Unlike a Python list, it stores elements of a single type in contiguous memory, so processing data takes less time and incurs less overhead. In addition, NumPy’s core is written in C, which allows operations to run at very high speed.
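To make the contrast concrete, here is a minimal sketch that computes the same squares with a plain Python loop and with a single vectorised NumPy expression. The variable names and the array size are illustrative assumptions; timings depend on the machine, so none are shown.

```python
import numpy as np

n = 100_000  # illustrative size, chosen arbitrarily

# Pure Python: the interpreter processes one element at a time
values_list = list(range(n))
squares_list = [v * v for v in values_list]

# NumPy: one vectorised expression over a contiguous int64 buffer
values_array = np.arange(n, dtype=np.int64)
squares_array = values_array * values_array

# Both approaches compute the same numbers
print(squares_array.tolist() == squares_list)  # True
```

Wrapping each half in a timer such as time.perf_counter() is one way to compare the two approaches on your own machine.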
Installation and Importing NumPy Library
We can install NumPy using pip, Python’s package installer. Open a terminal or command prompt and type:
pip install numpy
Once installed, importing NumPy into our Python script is straightforward. We typically use the alias np for convenience:
import numpy as np
This small step sets the stage for utilising all the powerful features that NumPy offers.
Creating NumPy Array in Python
Creating arrays in NumPy can be done in several ways. One of the most common methods is converting existing Python lists or tuples into NumPy arrays. This is not only simple but also introduces us to the basic structure of NumPy arrays.
Using Lists
Let’s begin by transforming a Python list into a NumPy array.
import numpy as np
# Creating a list
my_list = [1, 2, 3, 4, 5]
# Converting the list to a NumPy array
np_array = np.array(my_list)
print(np_array)
Output:
[1 2 3 4 5]
Using Tuples
Similarly, we can create a NumPy array in Python from a tuple. This is useful when the source data is kept in a tuple so that it cannot be modified before conversion; note that the resulting array is itself mutable:
import numpy as np
# Creating a tuple
my_tuple = (1, 2, 3, 4, 5)
# Converting the tuple to a NumPy array
np_array = np.array(my_tuple)
print(np_array)
Output:
[1 2 3 4 5]
Different Types of NumPy Array in Python and Uses
NumPy arrays come in various forms, each serving a different purpose. Knowing these types helps us select the most suitable one.
0-D Arrays
A 0-D array, or scalar, contains a single element. It’s the most basic type of array.
import numpy as np
# Creating a 0-D array
zero_d_array = np.array(42)
print(zero_d_array)
Output:
42
1-D Arrays
A 1-D array is a single-dimensional array, essentially a list. It’s commonly used for simple data structures.
import numpy as np
# Creating a 1-D array
one_d_array = np.array([1, 2, 3, 4, 5])
print(one_d_array)
Output:
[1 2 3 4 5]
2-D Arrays
2-D arrays are used to represent matrices or tables of data. Each row of a 2-D array is itself a 1-D array.
import numpy as np
# Creating a 2-D array
two_d_array = np.array([[1, 2, 3], [4, 5, 6]])
print(two_d_array)
Output:
[[1 2 3]
 [4 5 6]]
Higher-Dimensional Arrays
Higher dimensional arrays, like 3-D arrays, are widely used in fields like deep learning and image processing.
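As a minimal sketch, a 3-D array can be built by nesting lists one level deeper than for a 2-D array. The values below are arbitrary placeholders.

```python
import numpy as np

# Creating a 3-D array: 2 blocks, each with 2 rows and 3 columns
three_d_array = np.array([[[1, 2, 3], [4, 5, 6]],
                          [[7, 8, 9], [10, 11, 12]]])

print(three_d_array.shape)  # (2, 2, 3)
print(three_d_array.ndim)   # 3
```

A colour image is a typical case of this layout: height, width, and colour channel form the three axes.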
NumPy Array Attributes
Understanding the attributes of the NumPy array in Python is crucial for effective data manipulation. Let’s explore some of these essential attributes.
Shape, Size, and Dimensions
The shape of a NumPy array in Python describes how many elements are present along each axis. The size attribute gives the total number of elements, and ndim gives the number of dimensions.
import numpy as np
# Creating a 2-D array
array = np.array([[1, 2, 3], [4, 5, 6]])
print("Shape:", array.shape)
print("Size:", array.size)
print("Number of dimensions:", array.ndim)
Output:
Shape: (2, 3)
Size: 6
Number of dimensions: 2
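These three attributes are easiest to see when an array is reshaped. The sketch below, with illustrative variable names, rearranges the same six elements with reshape(): shape and ndim change while size stays the same.

```python
import numpy as np

flat = np.arange(6)           # six elements in one dimension
matrix = flat.reshape(2, 3)   # the same six elements as 2 rows x 3 columns

print(flat.shape, flat.ndim)       # (6,) 1
print(matrix.shape, matrix.ndim)   # (2, 3) 2
print(flat.size == matrix.size)    # True
```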
Data Types in NumPy Array in Python
Every element of a NumPy array in Python has the same data type, which can be set explicitly when the array is created or inferred from the array’s contents. The dtype attribute tells us what kind of elements the array holds.
import numpy as np
# Creating an array with float elements
array = np.array([1.5, 2.3, 3.1])
print("Data type:", array.dtype) # Output: float64
Output:
Data type: float64
Understanding these attributes allows us to manipulate arrays more effectively and ensures that our data is structured and accessed correctly.
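The data type can also be chosen rather than inferred. As a small sketch with illustrative names, the dtype argument sets the type at creation time and astype() converts an existing array.

```python
import numpy as np

# Requesting a data type explicitly at creation time
small_floats = np.array([1, 2, 3], dtype=np.float32)
print(small_floats.dtype)   # float32

# Converting an existing array; astype() returns a new array
float_array = np.array([1.7, 2.2, 3.9])
as_int = float_array.astype(np.int64)
print(as_int.tolist())      # [1, 2, 3]  (the fractional part is dropped, not rounded)
```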
Array Creation Methods in NumPy
Creating arrays in NumPy is versatile and straightforward, offering multiple methods to suit different needs. Let’s dive into some of the most common and useful array creation methods.
Using np.array() Function
The np.array() function is the simplest way to create a NumPy array in Python. It can convert lists, tuples, or any array-like structure into an array. It infers a suitable data type from the items unless we specify one with the dtype argument.
import numpy as np
# Creating an array from a list
my_list = [1, 2, 3, 4, 5]
np_array = np.array(my_list)
print(np_array)
Output:
[1 2 3 4 5]
Using np.arange() for Creating Sequential Arrays
The np.arange() function creates an array of evenly spaced values within a specified range. It works like the built-in range() function but returns an array, and, as with range(), the stop value is excluded.
import numpy as np
# Creating an array with values from 0 to 9
seq_array = np.arange(10)
print(seq_array)
Output:
[0 1 2 3 4 5 6 7 8 9]
We can also define the start, stop, and step values to customise the sequence.
import numpy as np
# Creating an array from 1 to 9 with a step of 2
seq_array = np.arange(1, 10, 2)
print(seq_array)
Output:
[1 3 5 7 9]
Using np.linspace() for Linearly Spaced Arrays
np.linspace() creates an array with a specified number of elements, evenly spaced between a start and an end value; by default, both endpoints are included. This method is useful when we know how many points we need rather than the step between them.
import numpy as np
# Creating an array with 5 values from 0 to 1
lin_array = np.linspace(0, 1, 5)
print(lin_array)
Output:
[0.   0.25 0.5  0.75 1.  ]
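The difference between np.arange() and np.linspace() comes down to what we specify: the step or the number of points. The following sketch, with illustrative names, puts the two side by side and also shows the endpoint parameter of np.linspace().

```python
import numpy as np

# arange: we choose the step; the stop value (1) is excluded
by_step = np.arange(0, 1, 0.25)

# linspace: we choose the count; the stop value is included by default
by_count = np.linspace(0, 1, 5)

# endpoint=False leaves the stop value out
no_endpoint = np.linspace(0, 1, 4, endpoint=False)

print(by_step.tolist())      # [0.0, 0.25, 0.5, 0.75]
print(by_count.tolist())     # [0.0, 0.25, 0.5, 0.75, 1.0]
print(no_endpoint.tolist())  # [0.0, 0.25, 0.5, 0.75]
```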
Using np.zeros() for Zero Arrays
np.zeros() creates an array filled with zeros. It’s handy when we need an array of a specific size but with all elements initialised to zero.
import numpy as np
# Creating a 2x3 array of zeros
zero_array = np.zeros((2, 3))
print(zero_array)
Output:
[[0. 0. 0.]
 [0. 0. 0.]]
Using np.ones() for Arrays of Ones
np.ones() generates an array filled with ones, similar to np.zeros(), but with ones instead.
import numpy as np
# Creating a 3x2 array of ones
ones_array = np.ones((3, 2))
print(ones_array)
Output:
[[1. 1.]
 [1. 1.]
 [1. 1.]]
Using np.empty() for Uninitialised Arrays
np.empty() creates an array without initialising the values. The values in the array will be whatever was in memory at that location. It’s faster but requires caution.
import numpy as np
# Creating a 2x2 uninitialised array
empty_array = np.empty((2, 2))
print(empty_array)
Output (the values are arbitrary and can differ from run to run, because the memory is not initialised):
Using np.full() for Constant Value Arrays
np.full() generates an array filled with a specified constant value.
import numpy as np
# Creating a 2x3 array filled with the value 7
full_array = np.full((2, 3), 7)
print(full_array)
Output:
[[7 7 7]
 [7 7 7]]
Performing Basic Operations on NumPy Array in Python
Once we have created arrays, we can operate on them in several ways. NumPy makes these operations concise, which keeps data manipulation simple.
Arithmetic Operations
NumPy supports element-wise arithmetic operations, making it possible to add, subtract, multiply, and divide arrays.
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Adding arrays
add_result = a + b
print("Addition:", add_result)
# Subtracting arrays
sub_result = a - b
print("Subtraction:", sub_result)
# Multiplying arrays
mul_result = a * b
print("Multiplication:", mul_result)
# Dividing arrays
div_result = a / b
print("Division:", div_result)
Output:
Addition: [5 7 9]
Subtraction: [-3 -3 -3]
Multiplication: [ 4 10 18]
Division: [0.25 0.4  0.5 ]
Aggregation Functions
NumPy provides a range of aggregation functions that compute, for example, the sum, mean, maximum, and minimum of the elements of an array.
import numpy as np
array = np.array([1, 2, 3, 4, 5])
# Sum of array elements
sum_result = array.sum()
print("Sum:", sum_result)
# Mean of array elements
mean_result = array.mean()
print("Mean:", mean_result)
# Maximum element
max_result = array.max()
print("Max:", max_result)
# Minimum element
min_result = array.min()
print("Min:", min_result)
Output:
Sum: 15
Mean: 3.0
Max: 5
Min: 1
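On multi-dimensional arrays, the same functions accept an axis argument to aggregate along rows or columns instead of over the whole array. A minimal sketch with an illustrative 2-D array:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

col_sums = table.sum(axis=0)    # collapse the rows: one sum per column
row_means = table.mean(axis=1)  # collapse the columns: one mean per row

print(col_sums.tolist())    # [5, 7, 9]
print(row_means.tolist())   # [2.0, 5.0]
```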
Advanced Indexing and Slicing Techniques
In NumPy, advanced indexing and slicing enable various ways to access and alter elements in an array.
Boolean Indexing
Boolean indexing lets us filter arrays based on conditions. We can use boolean arrays to select elements that meet certain criteria.
import numpy as np
array = np.array([1, 2, 3, 4, 5])
bool_idx = array > 2
# Selecting elements greater than 2
filtered_array = array[bool_idx]
print("Elements greater than 2:", filtered_array)
Output:
Elements greater than 2: [3 4 5]
Fancy Indexing
Fancy indexing uses arrays of indices to access elements. It’s useful when we need to select multiple elements based on their positions.
import numpy as np
array = np.array([10, 20, 30, 40, 50])
indices = [1, 3]
# Selecting elements at index 1 and 3
selected_elements = array[indices]
print("Selected elements:", selected_elements)
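Both techniques extend naturally. The sketch below, using made-up sample values, combines two conditions in a boolean mask, uses a mask for assignment, and applies fancy indexing to the rows of a 2-D array.

```python
import numpy as np

readings = np.array([3, 12, 7, 25, 18])

# Combine conditions with & (and) or | (or); each condition needs parentheses
in_range = readings[(readings > 5) & (readings < 20)]
print(in_range.tolist())   # [12, 7, 18]

# A boolean mask can also select elements for assignment (done on a copy here)
capped = readings.copy()
capped[capped > 15] = 15
print(capped.tolist())     # [3, 12, 7, 15, 15]

# Fancy indexing on a 2-D array: pick rows 0 and 2
grid = np.array([[1, 2], [3, 4], [5, 6]])
picked_rows = grid[[0, 2]]
print(picked_rows.tolist())  # [[1, 2], [5, 6]]
```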
Broadcasting in NumPy
Broadcasting is a highly useful concept in NumPy that lets us perform arithmetic on arrays of different shapes. When the shapes are compatible, NumPy treats the smaller array as if it were stretched to match the shape of the larger one, without actually copying the data.
NumPy applies broadcasting automatically in arithmetic operations between arrays of different sizes. This keeps operations vectorised and efficient in both time and memory.
Broadcasting simplifies our code and improves performance when a long series of calculations is involved. It removes the need to write explicit loops or to align array shapes by hand.
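A short sketch may help here. The arrays below are illustrative; the comments note each shape so the stretching is visible, and the last step shows what happens when shapes are not compatible.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])       # shape (2, 3)
row = np.array([10, 20, 30])         # shape (3,)
column = np.array([[100], [200]])    # shape (2, 1)

# The row is applied to every row of the matrix
plus_row = matrix + row
print(plus_row.tolist())      # [[11, 22, 33], [14, 25, 36]]

# The column is applied to every column of the matrix
plus_column = matrix + column
print(plus_column.tolist())   # [[101, 102, 103], [204, 205, 206]]

# Incompatible shapes raise a ValueError instead of being stretched
try:
    matrix + np.array([1, 2])        # shape (2,) does not match the last axis (3)
except ValueError as err:
    print("Cannot broadcast:", err)
```

Two axes are compatible when their lengths are equal or when one of them is 1; shapes are compared starting from the last axis.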
Combining and Splitting Arrays
Working with arrays often requires combining multiple arrays into one or splitting a single array into several. NumPy provides efficient ways to handle these tasks, making data manipulation straightforward.
Combining Arrays with np.concatenate()
np.concatenate() is a widely used function for merging arrays. It joins two or more arrays along a given axis.
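In the simplest case, two 1-D arrays are joined end to end. A minimal sketch with illustrative names:

```python
import numpy as np

first = np.array([1, 2, 3])
second = np.array([4, 5, 6])

# With 1-D inputs there is only one axis, so the arrays are joined end to end
joined = np.concatenate((first, second))
print(joined)   # [1 2 3 4 5 6]
```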
We can also combine multi-dimensional arrays along different axes:
import numpy as np
# Creating two 2D arrays
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])
# Combining arrays along axis 1
combined_array = np.concatenate((array1, array2), axis=1)
print("Combined 2D array along axis 1:\n", combined_array)
Output:
Combined 2D array along axis 1:
 [[1 2 5 6]
 [3 4 7 8]]
Splitting Arrays with np.split()
Conversely, we might need to split an array into multiple smaller arrays. The np.split() function allows us to do just that.
import numpy as np
# Creating an array
array = np.array([1, 2, 3, 4, 5, 6])
# Splitting the array into 3 parts
split_array = np.split(array, 3)
print("Split array:", split_array)
Output:
Split array: [array([1, 2]), array([3, 4]), array([5, 6])]
For multi-dimensional arrays, we can specify the axis along which to split.
import numpy as np
# Creating a 2D array
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Splitting the 2D array along axis 0
split_array = np.split(array, 3, axis=0)
print("Split 2D array along axis 0:\n", split_array)
Output:
Split 2D array along axis 0:
 [array([[1, 2, 3]]), array([[4, 5, 6]]), array([[7, 8, 9]])]
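np.split() with an integer requires the array to divide evenly. The sketch below, with illustrative names, shows what happens when it does not, and two alternatives: np.array_split() for uneven parts and np.split() with explicit split indices.

```python
import numpy as np

array = np.arange(1, 8)   # seven elements: 1 to 7

# Seven elements cannot be divided into 3 equal parts
try:
    np.split(array, 3)
except ValueError as err:
    print("np.split failed:", err)

# np.array_split() allows parts of unequal length
parts = np.array_split(array, 3)
print([p.tolist() for p in parts])   # [[1, 2, 3], [4, 5], [6, 7]]

# np.split() also accepts the indices at which to cut
head, middle, tail = np.split(array, [2, 5])
print(head.tolist(), middle.tolist(), tail.tolist())   # [1, 2] [3, 4, 5] [6, 7]
```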
Saving and Loading NumPy Array in Python
Efficient data handling includes common operations such as saving arrays to disk and loading them back again. NumPy makes this easy with functions such as np.save(), np.load(), np.savetxt(), and np.loadtxt().
Saving and Loading Binary Files
The np.save() function saves an array to a binary file in NumPy’s .npy format. We can load the array back using np.load().
import numpy as np
# Creating an array
array = np.array([1, 2, 3, 4, 5])
# Saving the array to a file
np.save('my_array.npy', array)
# Loading the array from the file
loaded_array = np.load('my_array.npy')
print("Loaded array:", loaded_array)
Working with Text Files
Sometimes, we need to save arrays in a human-readable format. np.savetxt() and np.loadtxt() handle text files like CSV.
import numpy as np
# Creating an array
array = np.array([1, 2, 3, 4, 5])
# Saving the array to a text file
np.savetxt('my_array.txt', array)
# Loading the array from the text file
loaded_array = np.loadtxt('my_array.txt')
print("Loaded array from text file:", loaded_array)
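For CSV specifically, both functions take a delimiter argument. The following sketch writes a small 2-D array as comma-separated text and reads it back; the file name is hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

table = np.array([[1.5, 2.0], [3.25, 4.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "table.csv")   # hypothetical file name
    # fmt controls how each number is written; delimiter separates the columns
    np.savetxt(csv_path, table, delimiter=",", fmt="%.2f")
    restored = np.loadtxt(csv_path, delimiter=",")

print(restored.shape)   # (2, 2)
print(restored.dtype)   # float64
```

Note that np.loadtxt() returns floating-point values by default, so integer data saved as text comes back as floats unless a dtype is passed.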
Practical Examples of NumPy Arrays in Data Science
NumPy is a well-established tool in data science, used for data cleaning, transformation, and preparation for analysis, as well as for statistical computations and machine learning.
Data Preprocessing
Before feeding data into a machine learning model, it often needs preprocessing. The NumPy array in Python makes this process efficient.
import numpy as np
# Creating a sample dataset
data = np.array([[1.2, 2.3], [3.4, 4.5], [5.6, 6.7]])
# Normalising data
data_mean = np.mean(data, axis=0)
data_std = np.std(data, axis=0)
normalised_data = (data - data_mean) / data_std
print("Normalised data:\n", normalised_data)
Output:
Statistical Analysis
NumPy simplifies statistical analysis with functions for computing mean, median, variance, and more.
import numpy as np
# Creating a sample dataset
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
# Computing statistical measures
mean = np.mean(data)
median = np.median(data)
variance = np.var(data)
print("Mean:", mean)
print("Median:", median)
print("Variance:", variance)
Output:
Mean: 5.0
Median: 5.0
Variance: 6.666666666666667
Common Errors and Debugging Tips
Even with a powerful tool like NumPy, we might encounter errors. Here are some common issues and how to address them.
| Error Type | Description | Solution |
| --- | --- | --- |
| Shape Mismatch Errors | Occur when performing operations on arrays of incompatible shapes. | Ensure that arrays are compatible in shape before performing operations. Use reshaping if necessary. |
| Type Errors | Happen when incompatible data types are used together. | Check and convert data types as needed. NumPy functions like astype() can help with type conversion. |
| Memory Errors | Large arrays can lead to memory errors, especially when operations require significant memory. | Work with smaller subsets of data or use memory-efficient data structures. |
| Index Errors | Occur when accessing elements outside the array’s bounds. | Always check array dimensions and bounds before accessing elements. |
Debugging Tips
Use print statements to check array shapes and values at different stages of your code.
Use assertions to ensure that array shapes and types are as expected.
Refer to NumPy documentation for detailed explanations of functions and their requirements.
Simplify complex operations into smaller steps to isolate and identify issues.
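The errors and tips above can be sketched in a few lines. The arrays and names here are illustrative assumptions, not taken from a real project.

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((3, 2))

# Tip: an assertion documents the shape we expect and fails early otherwise
assert a.shape == (2, 3), f"unexpected shape: {a.shape}"

# Shape mismatch: (2, 3) and (3, 2) cannot be added element-wise
try:
    a + b
except ValueError as err:
    print("Shape mismatch:", err)

# One possible fix for this case: transpose b so both arrays are (2, 3)
fixed = a + b.T

# Index error: row 5 does not exist in an array with 2 rows
try:
    a[5]
except IndexError as err:
    print("Index error:", err)

# Type conversion: numeric strings must be converted before arithmetic
labels = np.array(["1", "2", "3"])
numbers = labels.astype(np.int64)
print(numbers.sum())   # 6
```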
Conclusion
In this blog, we’ve explored essential NumPy functionalities, from creating arrays and performing operations to combining, splitting, saving, and loading data. We delved into advanced techniques like indexing, reshaping, and broadcasting and discussed practical applications in data science. Knowing these tools and methods lets us work with data more capably and efficiently. Learning NumPy equips us to manipulate many kinds of data and to prepare it for further computation, analysis, and machine learning.
FAQs
What is the difference between a Python list and a NumPy array?
In Python, a list is a versatile, general-purpose container that can hold elements of different types. A NumPy array in Python, on the other hand, is designed specifically for numeric computation and holds elements of a single type. It is faster and more memory-efficient because its elements are stored in contiguous memory.
Can NumPy handle multi-dimensional arrays?
Yes, this is one of NumPy’s major strengths; it handles multi-dimensional arrays very effectively. It allows us to create and manipulate arrays with any number of dimensions: one, two, three, or more.
What are some common operations that can be performed on NumPy arrays?
NumPy supports many kinds of operations: arithmetic such as addition, subtraction, multiplication, and division; aggregations such as mean, maximum, and minimum; reshaping; indexing and slicing; and many more. These operations run efficiently on large data sets.