handson-ml

⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.


Top Related Projects

models

77,618

Models and examples built with TensorFlow

scikit-learn

62,466

scikit-learn: machine learning in Python

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

ML-For-Beginners

73,270

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

Quick Overview

This repository accompanies the first edition of Aurélien Géron's O'Reilly book "Hands-On Machine Learning with Scikit-Learn and TensorFlow". It contains Jupyter notebooks with the book's example code and the solutions to its exercises, covering machine learning and deep learning topics in Python. The repository is deprecated in favor of handson-ml3, which covers the third edition.

Pros

Extensive coverage of machine learning concepts with practical implementations
Well-structured notebooks with clear explanations and visualizations
Successor repositories (such as handson-ml3) track newer library versions and ML techniques
Includes both Scikit-Learn (for traditional ML) and TensorFlow/Keras (for deep learning)

Cons

May be overwhelming for absolute beginners in machine learning
Some examples might become outdated as libraries evolve rapidly
Requires a significant time investment to work through all the material
Dependency on specific library versions may cause compatibility issues

Code Examples

Loading and preprocessing data using Scikit-Learn:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Creating and training a simple neural network with Keras:

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=[X_train.shape[1]]),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1)
])
model.compile(optimizer="adam", loss="mse")
history = model.fit(X_train, y_train, epochs=100, validation_split=0.2)

Implementing a random forest classifier with Scikit-Learn:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_clf.fit(X_train, y_train)
y_pred = rf_clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
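
The snippets above assume that `X` and `y` already exist. As a minimal end-to-end sketch, the version below generates a synthetic dataset with `make_classification` (an assumption made for illustration; the notebooks use real datasets) and chains the split, scaling, and random forest steps. The scaler is fitted on the training split only, so no information from the test split leaks into preprocessing.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical; replace with your own X and y).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

# Hold out 20% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split, then reuse it on the test split.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train the classifier and score it on the held-out split.
rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_clf.fit(X_train_scaled, y_train)
accuracy = accuracy_score(y_test, rf_clf.predict(X_test_scaled))
print(f"Accuracy: {accuracy:.2f}")
```

Random forests do not need feature scaling, so the scaler is kept here only to show how the two earlier snippets fit together.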

Getting Started

To get started with the project:

Clone the repository:
```
git clone https://github.com/ageron/handson-ml.git
```
Install the required dependencies:
```
pip install -r requirements.txt
```
Launch Jupyter Notebook:
```
jupyter notebook
```
Open and run the notebooks in the handson-ml directory to explore the examples and exercises.

Competitor Comparisons

models

77,618

Models and examples built with TensorFlow

Pros of models

Extensive collection of official TensorFlow models and examples
Regularly updated with state-of-the-art implementations
Comprehensive documentation and tutorials for each model

Cons of models

Steeper learning curve for beginners
Less focus on hands-on, step-by-step learning
May be overwhelming due to the large number of models and implementations

Code comparison

handson-ml:

from sklearn.ensemble import RandomForestClassifier

forest_clf = RandomForestClassifier(n_estimators=100, random_state=42)
forest_clf.fit(X_train, y_train)
y_pred = forest_clf.predict(X_test)

models:

import tensorflow as tf
from official.nlp import bert
import official.nlp.bert.tokenization as tokenization

bert_config = bert.BertConfig.from_json_file(bert_config_file)
model = bert.BertModel(bert_config)

The handson-ml example demonstrates a simpler approach using scikit-learn, while the models example showcases a more complex BERT model implementation using TensorFlow.

scikit-learn

62,466

scikit-learn: machine learning in Python

Pros of scikit-learn

Comprehensive machine learning library with a wide range of algorithms and tools
Well-established, extensively documented, and actively maintained by a large community
Designed for production use with a focus on performance and scalability

Cons of scikit-learn

Steeper learning curve for beginners due to its extensive functionality
Less focus on deep learning and neural networks compared to handson-ml
May require additional libraries for more advanced machine learning tasks

Code Comparison

scikit-learn:

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_features=4)
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X, y)

handson-ml:

import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

Summary

scikit-learn is a robust, production-ready machine learning library with a wide range of algorithms, while handson-ml focuses more on practical examples and tutorials, including deep learning with TensorFlow. scikit-learn is better suited for traditional machine learning tasks, while handson-ml provides a more accessible introduction to various ML concepts and techniques.

fastai

27,300

The fastai deep learning library

Pros of fastai

Provides a high-level API for rapid prototyping and experimentation
Offers built-in support for advanced techniques like transfer learning and mixed precision training
Includes a comprehensive library of pre-trained models and datasets

Cons of fastai

Steeper learning curve for beginners due to its more opinionated approach
Less flexibility for low-level customization compared to handson-ml
Primarily focused on PyTorch, limiting options for other frameworks

Code Comparison

handson-ml:

from sklearn.ensemble import RandomForestClassifier

rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_clf.fit(X_train, y_train)
y_pred = rf_clf.predict(X_test)

fastai:

from fastai.vision.all import *

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(4)
preds, _ = learn.get_preds(dl=dls.test_dl(test_images))

The handson-ml example demonstrates a more traditional scikit-learn approach, while fastai showcases its high-level API for quickly building and training a CNN model.

keras

63,156

Deep Learning for humans

Pros of Keras

Comprehensive deep learning library with extensive documentation
Supports multiple backend engines (TensorFlow, JAX, and PyTorch in Keras 3; older multi-backend releases supported TensorFlow, Theano, and CNTK)
Large community and ecosystem of extensions

Cons of Keras

Less focus on machine learning concepts and theory
May be overwhelming for beginners due to its extensive API

Code Comparison

Keras:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])

Handson-ml:

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])

Key Differences

Handson-ml provides a more educational approach with explanations and examples
Keras focuses on providing a powerful, flexible deep learning framework
Handson-ml covers a broader range of machine learning topics beyond deep learning
Keras offers more advanced features and customization options for deep learning

Target Audience

Handson-ml: Beginners and intermediate learners seeking a comprehensive ML education
Keras: Developers and researchers looking for a production-ready deep learning library

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

Extensive, production-ready deep learning framework
Large community and ecosystem of tools/extensions
Flexible and dynamic computational graph

Cons of PyTorch

Steeper learning curve for beginners
More complex setup and installation process
Less focus on traditional machine learning algorithms

Code Comparison

handson-ml (using TensorFlow):

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=[8]),
    keras.layers.Dense(1)
])
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))

PyTorch:

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(8, 30)
        self.fc2 = nn.Linear(30, 1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = Net()
optimizer = optim.SGD(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

ML-For-Beginners

73,270

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

Pros of ML-For-Beginners

More comprehensive curriculum structure with lesson plans and quizzes
Covers a wider range of ML topics, including ethics and real-world applications
Designed for self-paced learning with clear progression

Cons of ML-For-Beginners

Less focus on hands-on coding exercises compared to handson-ml
May not delve as deeply into technical details of ML algorithms
Newer repository with potentially fewer community contributions

Code Comparison

ML-For-Beginners example (Python):

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression()
model.fit(X_train, y_train)

handson-ml example (Python):

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge"))
])
svm_clf.fit(X_train, y_train)

Both repositories offer valuable resources for learning machine learning, with ML-For-Beginners providing a more structured curriculum and handson-ml focusing on practical implementation. The choice between them depends on the learner's preferences and goals.


README

Machine Learning Notebooks

THE THIRD EDITION OF MY BOOK IS NOW AVAILABLE.

This project is for the first edition, which is now outdated.

This project aims at teaching you the fundamentals of Machine Learning in Python. It contains the example code and solutions to the exercises in my O'Reilly book Hands-on Machine Learning with Scikit-Learn and TensorFlow.

Quick Start

Want to play with these notebooks online without having to install anything?

Use any of the following services.

WARNING: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you download any data you care about.

- Recommended: open this repository in Colaboratory.
- Or open it in Binder. Note: most of the time, Binder starts up quickly and works great, but when handson-ml is updated, Binder creates a new environment from scratch, and this can take quite some time.
- Or open it in Deepnote.

Just want to quickly look at some notebooks, without executing any code?

Browse this repository using jupyter.org's notebook viewer:

Note: github.com's notebook viewer also works but it is slower and the math equations are not always displayed correctly.

Want to run this project using a Docker image?

Read the Docker instructions.

Want to install this project on your own machine?

Start by installing Anaconda (or Miniconda), git, and if you have a TensorFlow-compatible GPU, install the GPU driver, as well as the appropriate version of CUDA and cuDNN (see TensorFlow's documentation for more details).

Next, clone this project by opening a terminal and typing the following commands (do not type the leading $ sign on each line; it only indicates that these are terminal commands):

$ git clone https://github.com/ageron/handson-ml.git
$ cd handson-ml

Next, run the following commands:

$ conda env create -f environment.yml
$ conda activate tf1
$ python -m ipykernel install --user --name=python3

Finally, start Jupyter:

$ jupyter notebook

If you need further instructions, read the detailed installation instructions.
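
Once the environment is active, a quick check that the main libraries can be found may save a failed notebook run. The sketch below is only an illustration: the helper name `check_libraries` and the list of module names are assumptions, so adjust the list to match environment.yml.

```python
import importlib.util


def check_libraries(names):
    """Map each top-level module name to True if it can be located, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Hypothetical list of modules; edit it to match your environment.
status = check_libraries(["numpy", "sklearn", "tensorflow"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

`find_spec` only locates a module without importing it, so the check stays fast even for large libraries.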

FAQ

Which Python version should I use?

I recommend Python 3.7. If you follow the installation instructions above, that's the version you will get. Most code will work with other versions of Python 3, but some libraries do not support Python 3.8 or 3.9 yet, which is why I recommend Python 3.7.
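
To see which interpreter a notebook kernel is actually running, a small check like the following can help. It is a sketch under stated assumptions: the function name and the messages are made up for illustration, and only the recommended version (3.7) comes from the answer above.

```python
import sys

RECOMMENDED = (3, 7)  # the version recommended in the answer above


def describe_python_version(version_info=sys.version_info, recommended=RECOMMENDED):
    """Return a short message comparing an interpreter version to the recommendation."""
    current = (version_info[0], version_info[1])
    if current == recommended:
        return "OK: running the recommended Python {}.{}".format(*recommended)
    if current[0] != recommended[0]:
        return "Unsupported: Python {}.{} is not Python 3".format(*current)
    return "Warning: running Python {}.{}; some libraries may not support it".format(*current)


print(describe_python_version())
```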

I'm getting an error when I call load_housing_data()

Make sure you call fetch_housing_data() before you call load_housing_data(). If you're getting an HTTP error, make sure you're running the exact same code as in the notebook (copy/paste it if needed). If the problem persists, please check your network configuration.
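
The ordering matters because the load step reads a file that only exists after the fetch step has downloaded and extracted it. The sketch below illustrates that dependency using the standard library only; it is not the notebook's code (which reads the CSV with pandas), and the parameter names and the `housing.tgz` / `housing.csv` file names are assumptions.

```python
import csv
import tarfile
import urllib.request
from pathlib import Path


def fetch_housing_data(housing_url, housing_path):
    """Download the archive at housing_url and extract it into housing_path."""
    housing_path = Path(housing_path)
    housing_path.mkdir(parents=True, exist_ok=True)
    tgz_path = housing_path / "housing.tgz"
    urllib.request.urlretrieve(housing_url, tgz_path)
    with tarfile.open(tgz_path) as archive:
        archive.extractall(path=housing_path)


def load_housing_data(housing_path):
    """Read housing.csv from housing_path as a list of dict rows.

    Raises FileNotFoundError if fetch_housing_data() has not been run yet.
    """
    csv_path = Path(housing_path) / "housing.csv"
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))
```

Calling `load_housing_data()` on a directory that was never populated raises `FileNotFoundError`, which is the error this FAQ entry describes.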

I'm getting an SSL error on MacOSX

You probably need to install the SSL certificates (see this StackOverflow question). If you downloaded Python from the official website, then run /Applications/Python\ 3.7/Install\ Certificates.command in a terminal (change 3.7 to whatever version you installed). If you installed Python using MacPorts, run sudo port install curl-ca-bundle in a terminal.
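
Before reinstalling certificates, it can help to see where Python looks for them. The following is a diagnostic sketch using the standard `ssl` module; the helper name `describe_ca_paths` is made up for illustration.

```python
import ssl


def describe_ca_paths():
    """Return where this Python build looks for CA certificates by default."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # existing CA bundle file, or None
        "capath": paths.capath,                  # existing CA directory, or None
        "openssl_cafile": paths.openssl_cafile,  # compiled-in default bundle path
        "openssl_capath": paths.openssl_capath,  # compiled-in default directory
    }


for key, value in describe_ca_paths().items():
    print(f"{key}: {value}")
```

If both `cafile` and `capath` are `None`, no certificates were found at the default locations, which may explain the SSL error.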

I've installed this project locally. How do I update it to the latest version?

See INSTALL.md

How do I update my Python libraries to the latest versions, when using Anaconda?

See INSTALL.md

Contributors

I would like to thank everyone who contributed to this project, either by providing useful feedback, filing issues or submitting Pull Requests. Special thanks go to Haesun Park and Ian Beauregard who reviewed every notebook and submitted many PRs, including help on some of the exercise solutions. Thanks as well to Steven Bunkley and Ziembla who created the docker directory, and to github user SuperYorio who helped on some exercise solutions.
