Top Related Projects
SciPy library main repository
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Tensors and Dynamic neural networks in Python with strong GPU acceleration
An Open Source Machine Learning Framework for Everyone
scikit-learn: machine learning in Python
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
Quick Overview
NumPy is the fundamental package for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a vast collection of high-level mathematical functions to operate on these arrays efficiently. NumPy is the foundation for many other scientific and data analysis libraries in Python.
Pros
- Highly efficient array operations and numerical computations
- Extensive mathematical functions and tools for scientific computing
- Seamless integration with other scientific Python libraries
- Well-documented and widely supported by the community
Cons
- Steep learning curve for beginners
- Limited support for non-numeric data types
- Memory-intensive for very large datasets
- Can be slower than specialized libraries for certain specific tasks
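How much memory an array uses depends largely on its dtype, which is one practical lever for the memory concern listed above. The following is a minimal sketch with illustrative variable names (not taken from the NumPy documentation) that compares the buffer size of the same number of elements at three precisions:

```python
import numpy as np

# One million values stored at three different precisions.
n = 1_000_000
as_float64 = np.ones(n, dtype=np.float64)   # NumPy's default float type
as_float32 = as_float64.astype(np.float32)  # half the bytes per element
as_int8 = np.ones(n, dtype=np.int8)         # one byte per element

# nbytes reports the size of the array's data buffer in bytes.
print(as_float64.nbytes)  # 8000000
print(as_float32.nbytes)  # 4000000
print(as_int8.nbytes)     # 1000000
```

Whether a narrower dtype is acceptable depends on the value range and precision the data needs, so treat this as an illustration rather than a recommendation.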
Code Examples
Creating and manipulating arrays:
import numpy as np
# Create a 2D array
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
# Perform element-wise operations
print(arr * 2)
Basic linear algebra operations:
import numpy as np
# Create two matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Matrix multiplication
C = np.dot(A, B)
print(C)
# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:", eigenvectors)
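The eigenvectors returned by np.linalg.eig are stored as columns, which is easy to misread. As a small sanity check (a sketch, with lhs, rhs, and eig_ok as illustrative names), the defining relation A v = λ v can be verified for all columns at once:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Column j of `eigenvectors` pairs with eigenvalues[j], so A @ V should
# equal V with each column scaled by its eigenvalue.
lhs = A @ eigenvectors
rhs = eigenvectors * eigenvalues  # broadcasting scales column j by eigenvalues[j]

eig_ok = np.allclose(lhs, rhs)
print(eig_ok)  # True
```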
Statistical operations on arrays:
import numpy as np
# Generate random data
data = np.random.randn(1000)
# Compute basic statistics
print("Mean:", np.mean(data))
print("Median:", np.median(data))
print("Standard deviation:", np.std(data))
# Compute histogram
hist, bins = np.histogram(data, bins=20)
print("Histogram:", hist)
print("Bins:", bins)
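The example above draws from NumPy's global random state, so its output changes on every run. If reproducible numbers are needed, one option is a seeded Generator created with np.random.default_rng. This is a sketch; the seed value 42 and the variable names are arbitrary choices for illustration:

```python
import numpy as np

# Two generators built from the same seed produce the same stream.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

data = rng_a.standard_normal(1000)
repeat = rng_b.standard_normal(1000)
same = np.array_equal(data, repeat)
print(same)  # True

# With default settings every sample falls inside the histogram range,
# and 20 bins are described by 21 edges.
hist, bins = np.histogram(data, bins=20)
print(hist.sum())  # 1000
print(len(bins))   # 21
```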
Getting Started
To get started with NumPy, first install it using pip:
pip install numpy
Then, you can import NumPy in your Python script and start using it:
import numpy as np
# Create a simple array
arr = np.array([1, 2, 3, 4, 5])
# Perform operations
print(arr.mean())
print(arr.sum())
print(arr * 2)
# Create a 2D array
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(matrix.shape)
print(matrix.T) # Transpose
This basic example demonstrates creating arrays, performing simple operations, and working with multi-dimensional arrays. NumPy offers many more advanced features for scientific computing and data analysis.
Competitor Comparisons
SciPy library main repository
Pros of SciPy
- Offers a wider range of scientific and engineering algorithms
- Includes specialized modules for optimization, integration, and signal processing
- Built on top of NumPy, providing additional functionality
Cons of SciPy
- Larger package size and potentially slower import times
- May have a steeper learning curve for beginners
- Some functions might be less optimized compared to NumPy equivalents
Code Comparison
NumPy example:
import numpy as np
a = np.array([1, 2, 3, 4])
b = np.sum(a)
SciPy example:
from scipy import stats
data = [1, 2, 3, 4]
mean, std = stats.norm.fit(data)
In this comparison, NumPy is used for basic array operations, while SciPy provides more advanced statistical functions. SciPy builds upon NumPy's foundation, offering a broader range of scientific computing tools. While NumPy excels in array operations and basic mathematical functions, SciPy extends these capabilities with specialized modules for various scientific domains. The choice between the two depends on the specific requirements of your project and the level of complexity needed in scientific computations.
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Pros of pandas
- More intuitive data structures for tabular data (DataFrame, Series)
- Built-in functionality for data analysis, cleaning, and manipulation
- Powerful time series capabilities and date range generation
Cons of pandas
- Slower performance for large numerical computations
- Higher memory usage compared to NumPy arrays
- Steeper learning curve due to more complex API
Code Comparison
pandas:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
result = df.groupby('A').sum()
NumPy:
import numpy as np
arr = np.array([[1, 4], [2, 5], [3, 6]])
result = np.sum(arr, axis=0)
pandas excels at handling structured data and provides high-level operations for data analysis. NumPy focuses on efficient numerical computations with multi-dimensional arrays. While pandas builds on top of NumPy, it offers more specialized tools for working with labeled data and time series. NumPy is generally faster for pure numerical operations, but pandas provides a more user-friendly interface for data manipulation tasks.
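To make the trade-off concrete, here is one way a group-by sum could be written with NumPy alone. It is a sketch under assumed sample data (the keys and values arrays are invented for illustration), and it shows why pandas is usually the more convenient choice for labeled data:

```python
import numpy as np

keys = np.array([1, 2, 1, 3, 2])         # group label for each row
values = np.array([10, 20, 30, 40, 50])  # value to aggregate

# np.unique returns the sorted distinct keys; return_inverse maps each
# row to the index of its group within `groups`.
groups, inverse = np.unique(keys, return_inverse=True)

# bincount with weights adds up the values that fall into each group.
sums = np.bincount(inverse, weights=values)

print(groups)  # [1 2 3]
print(sums)    # [40. 70. 40.]
```

Note that np.bincount with weights returns floating-point sums even for integer input.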
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Automatic differentiation and GPU acceleration for deep learning
- Dynamic computational graphs for flexible model development
- Seamless integration with Python ecosystem and libraries
Cons of PyTorch
- Steeper learning curve for beginners compared to NumPy
- Less mature ecosystem for non-deep learning tasks
- Potentially slower for simple array operations
Code Comparison
NumPy example:
import numpy as np
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
z = np.dot(x, y)
PyTorch example:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.dot(x, y)
Both libraries provide similar functionality for basic array operations, but PyTorch's tensor objects are designed to work seamlessly with neural networks and GPU acceleration. PyTorch also allows for dynamic computation graphs, which can be beneficial for certain types of models and research applications.
While NumPy is more general-purpose and widely used in scientific computing, PyTorch specializes in deep learning tasks and offers more advanced features for building and training neural networks.
An Open Source Machine Learning Framework for Everyone
Pros of TensorFlow
- Specialized for deep learning and neural networks
- Supports distributed computing and GPU acceleration
- Offers high-level APIs like Keras for easier model building
Cons of TensorFlow
- Steeper learning curve compared to NumPy
- Larger library size and more complex installation process
- Less flexible for general-purpose numerical computing
Code Comparison
NumPy example:
import numpy as np
x = np.array([1, 2, 3, 4, 5])
y = x * 2
print(y)
TensorFlow example:
import tensorflow as tf
x = tf.constant([1, 2, 3, 4, 5])
y = x * 2
print(y.numpy())
Both libraries can perform basic array operations, but TensorFlow is designed for building and training machine learning models, while NumPy is more general-purpose. TensorFlow uses tensors and computational graphs, which can be more complex for simple operations but offer advantages for deep learning tasks. NumPy is typically easier to use for basic numerical computing and data manipulation.
scikit-learn: machine learning in Python
Pros of scikit-learn
- Higher-level machine learning functionality
- Extensive collection of algorithms for classification, regression, clustering, etc.
- Comprehensive documentation and tutorials
Cons of scikit-learn
- Slower performance for basic numerical operations
- Less flexibility for low-level array manipulations
- Steeper learning curve for beginners
Code Comparison
NumPy example:
import numpy as np
# Create a 2D array and perform element-wise operations
arr = np.array([[1, 2], [3, 4]])
result = np.sqrt(arr) + np.sin(arr)
scikit-learn example:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# Load dataset, split, and train a classifier
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = SVC().fit(X_train, y_train)
NumPy focuses on efficient array operations and mathematical functions, while scikit-learn provides high-level machine learning tools built on top of NumPy. scikit-learn is ideal for quickly implementing machine learning models, while NumPy is better suited for low-level numerical computations and array manipulations.
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
Pros of CNTK
- Designed for deep learning and neural networks, offering specialized tools and optimizations
- Supports distributed training across multiple GPUs and machines
- Provides a high-level API for easier model creation and training
Cons of CNTK
- Less versatile for general-purpose numerical computing compared to NumPy
- Smaller community and ecosystem than NumPy
- No longer actively developed; version 2.7 was announced as the last main release
Code Comparison
CNTK example (creating a simple neural network):
import cntk as C
with C.layers.default_options(init=C.glorot_uniform()):
    model = C.layers.Sequential([
        C.layers.Dense(128, activation=C.relu),
        C.layers.Dense(10, activation=None)
    ])
NumPy example (matrix multiplication):
import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.dot(A, B)
Summary
While CNTK is specialized for deep learning tasks with powerful distributed training capabilities, NumPy remains the go-to library for general-purpose numerical computing in Python. CNTK offers high-level APIs for neural networks, but NumPy's versatility and extensive ecosystem make it more suitable for a wider range of scientific computing tasks.
README
NumPy is the fundamental package for scientific computing with Python.
- Website: https://www.numpy.org
- Documentation: https://numpy.org/doc
- Mailing list: https://mail.python.org/mailman/listinfo/numpy-discussion
- Source code: https://github.com/numpy/numpy
- Contributing: https://www.numpy.org/devdocs/dev/index.html
- Bug reports: https://github.com/numpy/numpy/issues
- Report a security vulnerability: https://tidelift.com/docs/security
It provides:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
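Broadcasting, mentioned in the list above, lets arrays of different but compatible shapes be combined without explicit loops or copies. A minimal sketch, using made-up arrays for illustration:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)
col = np.array([[100], [200]])             # shape (2, 1)

# `row` is applied to every row of `matrix`.
added_row = matrix + row
print(added_row)
# [[11 22 33]
#  [14 25 36]]

# `col` is applied to every column of `matrix`.
added_col = matrix + col
print(added_col)
# [[101 102 103]
#  [204 205 206]]
```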
Testing:
NumPy requires pytest and hypothesis. Tests can then be run after installation with:
python -c "import numpy, sys; sys.exit(numpy.test() is False)"
Code of Conduct
NumPy is a community-driven open source project developed by a diverse group of contributors. The NumPy leadership has made a strong commitment to creating an open, inclusive, and positive community. Please read the NumPy Code of Conduct for guidance on how to interact with others in a way that makes our community thrive.
Call for Contributions
The NumPy project welcomes your expertise and enthusiasm!
Small improvements or fixes are always appreciated. If you are considering larger contributions to the source code, please contact us through the mailing list first.
Writing code isn't the only way to contribute to NumPy. You can also:
- review pull requests
- help us stay on top of new and old issues
- develop tutorials, presentations, and other educational materials
- maintain and improve our website
- develop graphic design for our brand assets and promotional materials
- translate website content
- help with outreach and onboard new contributors
- write grant proposals and help with other fundraising efforts
For more information about the ways you can contribute to NumPy, visit our website. If you're unsure where to start or how your skills fit in, reach out! You can ask on the mailing list or here, on GitHub, by opening a new issue or leaving a comment on a relevant issue that is already open.
Our preferred channels of communication are all public, but if you'd like to speak to us in private first, contact our community coordinators at numpy-team@googlegroups.com or on Slack (write numpy-team@googlegroups.com for an invitation).
We also have a biweekly community call, details of which are announced on the mailing list. You are very welcome to join.
If you are new to contributing to open source, this guide helps explain why, what, and how to successfully get involved.