handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

29,131

13,121

29,131

229

Top Related Projects

tensorflow

190,523

An Open Source Machine Learning Framework for Everyone

scikit-learn

62,466

scikit-learn: machine learning in Python

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

ML-For-Beginners

73,270

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

Quick Overview

Handson-ml2 is a comprehensive repository containing Jupyter notebooks and Python scripts that accompany the O'Reilly book "Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow". It provides practical examples and exercises covering various machine learning and deep learning concepts using popular libraries.

Pros

Extensive coverage of machine learning topics, from basic to advanced
Well-structured notebooks with clear explanations and code examples
Regularly updated to keep pace with the latest versions of libraries
Includes both Jupyter notebooks and Python scripts for flexibility

Cons

May be overwhelming for absolute beginners in machine learning
Requires a significant time investment to work through all materials
Some examples might become outdated as libraries evolve rapidly
Dependency on multiple libraries can lead to potential compatibility issues

Code Examples

Loading and preprocessing data using Scikit-Learn:

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

housing = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(housing.data, housing.target, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
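
The snippet above fits the scaler on the training set and only transforms the test set. As a rough illustration of why that matters, here is a minimal pure-Python sketch of standardization for a single feature column. The helper names `fit_scaler` and `transform` are hypothetical, not part of Scikit-Learn, and the sketch assumes the population standard deviation is used, with a fallback of 1.0 for constant columns.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Learn the mean and standard deviation from training values only."""
    mean = fmean(column)
    std = pstdev(column)  # population standard deviation (assumption)
    if std == 0:
        std = 1.0  # avoid dividing by zero for constant columns
    return mean, std


def transform(column, mean, std):
    """Apply previously learned statistics to any column (train or test)."""
    return [(x - mean) / std for x in column]


train = [1.0, 2.0, 3.0, 4.0, 5.0]
test = [6.0]

mean, std = fit_scaler(train)          # statistics come from the training data
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)  # test data reuses the same statistics

print(train_scaled)
print(test_scaled)
```

Reusing the training statistics keeps information about the test set out of the preprocessing step, which is the same reason `fit_transform` is called on `X_train` and plain `transform` on `X_test`.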

Creating and training a simple neural network with Keras:

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=X_train.shape[1:]),
    keras.layers.Dense(1)
])
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))
history = model.fit(X_train_scaled, y_train, epochs=20, validation_split=0.2)

Implementing a random forest classifier using Scikit-Learn:

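from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# A classifier needs discrete labels, so this example uses the Iris dataset
# instead of the continuous California housing target from the first example.
iris = load_iris()
X_tr, X_te, y_tr, y_te = train_test_split(iris.data, iris.target, random_state=42)

rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_clf.fit(X_tr, y_tr)
y_pred = rf_clf.predict(X_te)
accuracy = accuracy_score(y_te, y_pred)
print(f"Random Forest Accuracy: {accuracy:.2f}")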

Getting Started

Clone the repository:

git clone https://github.com/ageron/handson-ml2.git

Install dependencies:

cd handson-ml2
pip install -r requirements.txt

Launch Jupyter Notebook:
```
jupyter notebook
```
Open and run the notebooks in the handson-ml2 directory to start exploring the examples and exercises.

Competitor Comparisons

tensorflow

190,523

An Open Source Machine Learning Framework for Everyone

Pros of tensorflow

Comprehensive, official library for machine learning and deep learning
Extensive ecosystem with tools, extensions, and community support
High-performance, scalable for large-scale deployments

Cons of tensorflow

Steeper learning curve for beginners
More complex setup and configuration
Frequent updates may lead to compatibility issues

Code comparison

handson-ml2:

import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

tensorflow:

import tensorflow as tf
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))

Summary

handson-ml2 is a practical, beginner-friendly repository focused on hands-on machine learning examples using various libraries, including TensorFlow. It provides a gentler introduction to machine learning concepts and implementation.

tensorflow is the official repository for the TensorFlow library, offering a powerful and flexible framework for machine learning and deep learning. It's more suitable for advanced users and large-scale projects but requires more expertise to utilize effectively.

Both repositories use similar code structures for creating neural networks, with handson-ml2 often providing more context and explanations around the code examples.

scikit-learn

62,466

scikit-learn: machine learning in Python

Pros of scikit-learn

Comprehensive machine learning library with a wide range of algorithms and tools
Well-established, mature project with extensive documentation and community support
Designed for production use with efficient implementations and scalability features

Cons of scikit-learn

Steeper learning curve for beginners due to its extensive functionality
Less focus on deep learning and neural networks compared to handson-ml2
May require additional libraries for more advanced machine learning tasks

Code Comparison

handson-ml2:

import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

scikit-learn:

from sklearn.neural_network import MLPClassifier
model = MLPClassifier(hidden_layer_sizes=(64,), activation='relu', solver='adam')
model.fit(X_train, y_train)

While handson-ml2 focuses on TensorFlow and Keras for neural networks, scikit-learn provides a simpler API for various machine learning algorithms, including neural networks. The handson-ml2 repository is more suited for learning and experimentation, especially with deep learning, while scikit-learn is designed for practical implementation of machine learning models in production environments.

keras

63,156

Deep Learning for humans

Pros of Keras

Comprehensive deep learning library with extensive documentation
Supports multiple backend engines (TensorFlow, JAX and PyTorch in Keras 3; older releases supported Theano and CNTK)
Large community and ecosystem of extensions

Cons of Keras

Less focus on machine learning concepts and theory
May be overwhelming for beginners due to its extensive API
Limited coverage of non-neural network algorithms

Code Comparison

Keras:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(10,)),
    Dense(1, activation='sigmoid')
])

Handson-ml2:

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

Key Differences

Handson-ml2 is a comprehensive machine learning tutorial with practical examples
Keras is a high-level neural network library focused on deep learning
Handson-ml2 covers a broader range of ML topics, including data preprocessing and visualization
Keras provides more advanced deep learning features and model architectures
Handson-ml2 uses TensorFlow's implementation of Keras, while Keras supports multiple backends

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

Comprehensive deep learning framework with extensive functionality
Large, active community and ecosystem of tools and libraries
Flexible and dynamic computational graph for easier debugging

Cons of PyTorch

Steeper learning curve for beginners compared to handson-ml2
Less focus on practical, hands-on examples and tutorials
Requires more setup and configuration for basic tasks

Code Comparison

handson-ml2:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

PyTorch:

import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
    nn.Softmax(dim=1)
)

Summary

handson-ml2 is a practical, beginner-friendly repository focused on machine learning tutorials and examples using various libraries. PyTorch, on the other hand, is a comprehensive deep learning framework offering more advanced features and flexibility. While PyTorch provides a powerful toolset for experienced practitioners, handson-ml2 may be more suitable for those looking to learn machine learning concepts through hands-on examples.

ML-For-Beginners

73,270

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

Pros of ML-For-Beginners

More comprehensive curriculum structure with 26 lessons covering various ML topics
Includes hands-on projects and quizzes for practical learning
Offers content in multiple languages, making it accessible to a wider audience

Cons of ML-For-Beginners

Less focus on deep learning compared to handson-ml2
May not cover advanced topics in as much depth as handson-ml2
Primarily uses Scikit-learn, while handson-ml2 explores more libraries like TensorFlow

Code Comparison

ML-For-Beginners (using Scikit-learn):

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

handson-ml2 (using TensorFlow):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)

fastai

27,300

The fastai deep learning library

Pros of fastai

Provides a high-level API for quick and easy model development
Includes advanced techniques like mixed precision training and learning rate finder
Offers a comprehensive ecosystem with integrated libraries and tools

Cons of fastai

Steeper learning curve for beginners due to its opinionated approach
Less flexibility for low-level customization compared to handson-ml2
Primarily focused on PyTorch, limiting options for other frameworks

Code Comparison

fastai:

from fastai.vision.all import *
path = untar_data(URLs.PETS)
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=lambda x: x[0].isupper(), item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)

handson-ml2:

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

housing = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(housing.data, housing.target, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
mlp = MLPRegressor(random_state=42)
mlp.fit(X_train_scaled, y_train)

README

Machine Learning Notebooks

⚠ The 3rd edition of my book will be released in October 2022. The notebooks are available at ageron/handson-ml3 and contain more up-to-date code.

This project aims at teaching you the fundamentals of Machine Learning in Python. It contains the example code and solutions to the exercises in the second edition of my O'Reilly book Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow.

Note: If you are looking for the first edition notebooks, check out ageron/handson-ml. For the third edition, check out ageron/handson-ml3.

Quick Start

Want to play with these notebooks online without having to install anything?

Use any of the following services (I recommend Colab or Kaggle, since they offer free GPUs and TPUs).

WARNING: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you download any data you care about.

Just want to quickly look at some notebooks, without executing any code?

github.com's notebook viewer also works but it's not ideal: it's slower, the math equations are not always displayed correctly, and large notebooks often fail to open.

Want to run this project using a Docker image?

Read the Docker instructions.

Want to install this project on your own machine?

Start by installing Anaconda (or Miniconda), git, and if you have a TensorFlow-compatible GPU, install the GPU driver, as well as the appropriate version of CUDA and cuDNN (see TensorFlow's documentation for more details).

Next, clone this project by opening a terminal and typing the following commands (do not type the first $ signs on each line, they just indicate that these are terminal commands):

$ git clone https://github.com/ageron/handson-ml2.git
$ cd handson-ml2

Next, run the following commands:

$ conda env create -f environment.yml
$ conda activate tf2
$ python -m ipykernel install --user --name=python3

Finally, start Jupyter:

$ jupyter notebook

If you need further instructions, read the detailed installation instructions.
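
After the environment is created and activated, a quick sanity check can confirm that the main libraries are importable before opening the notebooks. The sketch below is only an illustration: the helper `missing_modules` and the list of module names are assumptions, so adjust the list to match what environment.yml actually installs.

```python
from importlib.util import find_spec

# Assumed top-level import names; edit to match your environment.yml.
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "sklearn", "tensorflow"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

Run it with the Python interpreter from the activated environment. If anything is reported as missing, the wrong environment is probably active.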

FAQ

Which Python version should I use?

I recommend Python 3.8. If you follow the installation instructions above, that's the version you will get. Most code will work with other versions of Python 3, but some libraries do not support Python 3.9 or 3.10 yet, which is why I recommend Python 3.8.
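
To see which version a notebook kernel is actually running, a small check like the following can help. It is a sketch only: the function name `version_note` and the wording of the messages are hypothetical, and the recommended version is taken from the answer above.

```python
import sys

RECOMMENDED = (3, 8)  # version recommended in this FAQ


def version_note(version_info=sys.version_info):
    """Describe how the given Python version relates to the recommendation."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) == RECOMMENDED:
        return "OK: running the recommended Python 3.8"
    if major == 3:
        return (
            f"Python {major}.{minor}: most code should work, "
            "but some libraries may not support this version"
        )
    return f"Python {major}.{minor}: this project targets Python 3"


print(version_note())
```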

I'm getting an error when I call load_housing_data()

Make sure you call fetch_housing_data() before you call load_housing_data(). If you're getting an HTTP error, make sure you're running the exact same code as in the notebook (copy/paste it if needed). If the problem persists, please check your network configuration.
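
The ordering matters because the load step reads a file that the fetch step is expected to have downloaded and extracted. The following sketch illustrates that dependency with a hypothetical helper, `load_housing_rows`, which is not the notebook's function. It assumes the extracted file is named housing.csv inside the data directory and uses only the standard library instead of pandas.

```python
import csv
from pathlib import Path


def load_housing_rows(housing_path):
    """Hypothetical stand-in for load_housing_data(): read the CSV as dicts.

    Raises a descriptive error when the data has not been fetched yet.
    """
    csv_path = Path(housing_path) / "housing.csv"  # assumed file name
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found: call fetch_housing_data() first "
            "to download and extract the dataset"
        )
    with csv_path.open(newline="") as f:
        return list(csv.DictReader(f))
```

With a guard like this, calling the loader too early produces a message that points at the missing fetch step rather than a bare file error.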

I'm getting an SSL error on MacOSX

You probably need to install the SSL certificates (see this StackOverflow question). If you downloaded Python from the official website, then run /Applications/Python\ 3.8/Install\ Certificates.command in a terminal (change 3.8 to whatever version you installed). If you installed Python using MacPorts, run sudo port install curl-ca-bundle in a terminal.
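
Before or after installing the certificates, it can be useful to see where Python looks for them. The sketch below only reports the default locations and whether they exist. It does not fix anything, and the helper name `describe_ca_locations` is hypothetical.

```python
import os
import ssl


def describe_ca_locations():
    """Report the default CA file and directory that the ssl module would use."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "cafile_exists": bool(paths.cafile) and os.path.exists(paths.cafile),
        "capath_exists": bool(paths.capath) and os.path.isdir(paths.capath),
    }


for key, value in describe_ca_locations().items():
    print(f"{key}: {value}")
```

If neither location exists, missing certificates are a plausible cause of the SSL error, and the steps above are the place to start.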

I've installed this project locally. How do I update it to the latest version?

See INSTALL.md

How do I update my Python libraries to the latest versions, when using Anaconda?

See INSTALL.md

Contributors

I would like to thank everyone who contributed to this project, either by providing useful feedback, filing issues or submitting Pull Requests. Special thanks go to Haesun Park and Ian Beauregard who reviewed every notebook and submitted many PRs, including help on some of the exercise solutions. Thanks as well to Steven Bunkley and Ziembla who created the docker directory, and to GitHub user SuperYorio who helped on some exercise solutions.
