Top Related Projects
Models and examples built with TensorFlow
scikit-learn: machine learning in Python
The fastai deep learning library
Deep Learning for humans
Tensors and Dynamic neural networks in Python with strong GPU acceleration
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Quick Overview
"Hands-On Machine Learning with Scikit-Learn and TensorFlow" (handson-ml) is the GitHub repository accompanying the first edition of the popular book by Aurélien Géron. It contains Jupyter notebooks with example code and exercise solutions covering machine learning and deep learning topics using popular Python libraries.
Pros
- Extensive coverage of machine learning concepts with practical implementations
- Well-structured notebooks with clear explanations and visualizations
- Notebooks were revised over time to track library changes, although this first-edition repository is now marked as outdated by its author
- Includes both Scikit-Learn (for traditional ML) and TensorFlow/Keras (for deep learning)
Cons
- May be overwhelming for absolute beginners in machine learning
- Some examples might become outdated as libraries evolve rapidly
- Requires a significant time investment to work through all the material
- Dependency on specific library versions may cause compatibility issues
Code Examples
- Loading and preprocessing data using Scikit-Learn:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
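The preprocessing snippet above fits the scaler on the training set and reuses the same statistics for the test set. The sketch below illustrates that idea in plain Python so it runs without Scikit-Learn. The helper names `fit_scaler` and `transform` and the toy data are made up for illustration; this is not the library's implementation.

```python
from statistics import fmean, pstdev


def fit_scaler(rows):
    """Return per-column means and standard deviations from the training rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation (ddof=0); a zero spread is replaced by 1.0
    # so that constant columns do not cause a division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows using statistics that were computed elsewhere."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy data (hypothetical): two features, three training rows, one test row.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[4.0, 40.0]]

means, stds = fit_scaler(train)               # statistics come from train only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)    # test reuses the train statistics
```

Computing the statistics on the training rows only keeps information about the test set from leaking into the model.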
- Creating and training a simple neural network with Keras:
from tensorflow import keras
model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=[X_train.shape[1]]),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1)
])
model.compile(optimizer="adam", loss="mse")
history = model.fit(X_train, y_train, epochs=100, validation_split=0.2)
- Implementing a random forest classifier with Scikit-Learn:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_clf.fit(X_train, y_train)
y_pred = rf_clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
Getting Started
To get started with the project:
- Clone the repository and move into it:
  git clone https://github.com/ageron/handson-ml.git
  cd handson-ml
- Install the required dependencies:
  pip install -r requirements.txt
- Launch Jupyter Notebook:
  jupyter notebook
- Open and run the notebooks in the handson-ml directory to explore the examples and exercises.
Competitor Comparisons
Models and examples built with TensorFlow
Pros of models
- Extensive collection of official TensorFlow models and examples
- Regularly updated with state-of-the-art implementations
- Comprehensive documentation and tutorials for each model
Cons of models
- Steeper learning curve for beginners
- Less focus on hands-on, step-by-step learning
- May be overwhelming due to the large number of models and implementations
Code comparison
handson-ml:
from sklearn.ensemble import RandomForestClassifier
forest_clf = RandomForestClassifier(n_estimators=100, random_state=42)
forest_clf.fit(X_train, y_train)
y_pred = forest_clf.predict(X_test)
models:
import tensorflow as tf
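from official.nlp import bert
import official.nlp.bert.configs
import official.nlp.bert.bert_models
# Module paths follow older Model Garden releases and may differ in newer ones.
# bert_config_file is the path to a BERT configuration JSON, defined elsewhere.
bert_config = bert.configs.BertConfig.from_json_file(bert_config_file)
bert_classifier, bert_encoder = bert.bert_models.classifier_model(bert_config, num_labels=2)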
The handson-ml example demonstrates a simpler approach using scikit-learn, while the models example showcases a more complex BERT model implementation using TensorFlow.
scikit-learn: machine learning in Python
Pros of scikit-learn
- Comprehensive machine learning library with a wide range of algorithms and tools
- Well-established, extensively documented, and actively maintained by a large community
- Designed for production use with a focus on performance and scalability
Cons of scikit-learn
- Steeper learning curve for beginners due to its extensive functionality
- Less focus on deep learning and neural networks compared to handson-ml
- May require additional libraries for more advanced machine learning tasks
Code Comparison
scikit-learn:
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_features=4)
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X, y)
handson-ml:
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
Summary
scikit-learn is a robust, production-ready machine learning library with a wide range of algorithms, while handson-ml focuses more on practical examples and tutorials, including deep learning with TensorFlow. scikit-learn is better suited for traditional machine learning tasks, while handson-ml provides a more accessible introduction to various ML concepts and techniques.
The fastai deep learning library
Pros of fastai
- Provides a high-level API for rapid prototyping and experimentation
- Offers built-in support for advanced techniques like transfer learning and mixed precision training
- Includes a comprehensive library of pre-trained models and datasets
Cons of fastai
- Steeper learning curve for beginners due to its more opinionated approach
- Less flexibility for low-level customization compared to handson-ml
- Primarily focused on PyTorch, limiting options for other frameworks
Code Comparison
handson-ml:
from sklearn.ensemble import RandomForestClassifier
rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_clf.fit(X_train, y_train)
y_pred = rf_clf.predict(X_test)
fastai:
from fastai.vision.all import *
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(4)
preds, _ = learn.get_preds(dl=dls.test_dl(test_images))
The handson-ml example demonstrates a more traditional scikit-learn approach, while fastai showcases its high-level API for quickly building and training a CNN model.
Deep Learning for humans
Pros of Keras
- Comprehensive deep learning library with extensive documentation
- Supports multiple backend engines (TensorFlow, JAX, and PyTorch in Keras 3; older standalone releases supported TensorFlow, Theano, and CNTK)
- Large community and ecosystem of extensions
Cons of Keras
- Less focus on machine learning concepts and theory
- May be overwhelming for beginners due to its extensive API
Code Comparison
Keras:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])
handson-ml:
import tensorflow as tf
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])
Key Differences
- Handson-ml provides a more educational approach with explanations and examples
- Keras focuses on providing a powerful, flexible deep learning framework
- Handson-ml covers a broader range of machine learning topics beyond deep learning
- Keras offers more advanced features and customization options for deep learning
Target Audience
- Handson-ml: Beginners and intermediate learners seeking a comprehensive ML education
- Keras: Developers and researchers looking for a production-ready deep learning library
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Extensive, production-ready deep learning framework
- Large community and ecosystem of tools/extensions
- Flexible and dynamic computational graph
Cons of PyTorch
- Steeper learning curve for beginners
- More complex setup and installation process
- Less focus on traditional machine learning algorithms
Code Comparison
handson-ml (using TensorFlow):
from tensorflow import keras
model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=[8]),
    keras.layers.Dense(1)
])
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))
PyTorch:
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 30)   # 8 input features, 30 hidden units
        self.fc2 = nn.Linear(30, 1)   # single regression output

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return self.fc2(x)

model = Net()
optimizer = optim.SGD(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Pros of ML-For-Beginners
- More comprehensive curriculum structure with lesson plans and quizzes
- Covers a wider range of ML topics, including ethics and real-world applications
- Designed for self-paced learning with clear progression
Cons of ML-For-Beginners
- Less focus on hands-on coding exercises compared to handson-ml
- May not delve as deeply into technical details of ML algorithms
- Newer repository with potentially fewer community contributions
Code Comparison
ML-For-Beginners example (Python):
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression()
model.fit(X_train, y_train)
handson-ml example (Python):
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge"))
])
svm_clf.fit(X_train, y_train)
Both repositories offer valuable resources for learning machine learning, with ML-For-Beginners providing a more structured curriculum and handson-ml focusing on practical implementation. The choice between them depends on the learner's preferences and goals.
README
Machine Learning Notebooks
THE THIRD EDITION OF MY BOOK IS NOW AVAILABLE.
This project is for the first edition, which is now outdated.
This project aims at teaching you the fundamentals of Machine Learning in python. It contains the example code and solutions to the exercises in my O'Reilly book Hands-on Machine Learning with Scikit-Learn and TensorFlow:
Quick Start
Want to play with these notebooks online without having to install anything?
Use any of the following services.
WARNING: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you download any data you care about.
- Recommended: open this repository in Colaboratory.
- Or open it in Binder.
  Note: Most of the time, Binder starts up quickly and works great, but when handson-ml is updated, Binder creates a new environment from scratch, and this can take quite some time.
- Or open it in Deepnote.
Just want to quickly look at some notebooks, without executing any code?
Browse this repository using jupyter.org's notebook viewer:
Note: github.com's notebook viewer also works but it is slower and the math equations are not always displayed correctly.
Want to run this project using a Docker image?
Read the Docker instructions.
Want to install this project on your own machine?
Start by installing Anaconda (or Miniconda), git, and if you have a TensorFlow-compatible GPU, install the GPU driver, as well as the appropriate version of CUDA and cuDNN (see TensorFlow's documentation for more details).
Next, clone this project by opening a terminal and typing the following commands (do not type the leading $ sign on each line; it just indicates that these are terminal commands):
$ git clone https://github.com/ageron/handson-ml.git
$ cd handson-ml
Next, run the following commands:
$ conda env create -f environment.yml
$ conda activate tf1
$ python -m ipykernel install --user --name=python3
Finally, start Jupyter:
$ jupyter notebook
If you need further instructions, read the detailed installation instructions.
FAQ
Which Python version should I use?
I recommend Python 3.7. If you follow the installation instructions above, that's the version you will get. Most code will work with other versions of Python 3, but some libraries do not support Python 3.8 or 3.9 yet, which is why I recommend Python 3.7.
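If you are unsure which interpreter and libraries your environment actually provides, a quick check along the following lines may help. It is only a sketch: the function name `check_environment` and the package list are assumptions chosen for illustration, not something this project ships.

```python
import sys
from importlib.util import find_spec

RECOMMENDED = (3, 7)  # the Python version recommended above
# Hypothetical list of import names to look for; adjust it to your needs.
PACKAGES = ["numpy", "pandas", "sklearn", "tensorflow", "matplotlib"]


def check_environment(packages=PACKAGES):
    """Report the running Python version and any packages that cannot be found."""
    version = sys.version_info[:2]
    return {
        "python": "{}.{}".format(*version),
        "python_matches": version == RECOMMENDED,
        # find_spec returns None when a top-level module is not installed.
        "missing": [name for name in packages if find_spec(name) is None],
    }


report = check_environment()
print(report)
```

Run it inside the activated environment; a non-empty "missing" list suggests the environment was not created from environment.yml or is not the active one.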
I'm getting an error when I call load_housing_data()
Make sure you call fetch_housing_data() before you call load_housing_data(). If you're getting an HTTP error, make sure you're running the exact same code as in the notebook (copy/paste it if needed). If the problem persists, please check your network configuration.
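The ordering matters because one function downloads and extracts the data while the other only reads it from disk. The sketch below shows that dependency using the standard library only. The URL and paths are assumptions modeled on this repository's layout, and the notebook's own version may differ (for example, it returns a pandas DataFrame rather than a list of dictionaries).

```python
import csv
import tarfile
import urllib.request
from pathlib import Path

# Assumed locations; adjust them if your copy of the notebook uses others.
DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml/master/"
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"
HOUSING_PATH = Path("datasets") / "housing"


def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    """Download the archive and extract housing.csv into housing_path."""
    housing_path = Path(housing_path)
    housing_path.mkdir(parents=True, exist_ok=True)
    tgz_path = housing_path / "housing.tgz"
    urllib.request.urlretrieve(housing_url, tgz_path)
    with tarfile.open(tgz_path) as housing_tgz:
        housing_tgz.extractall(path=housing_path)


def load_housing_data(housing_path=HOUSING_PATH):
    """Read housing.csv; fails if fetch_housing_data() has not run yet."""
    csv_path = Path(housing_path) / "housing.csv"
    if not csv_path.exists():
        raise FileNotFoundError(f"{csv_path} not found; call fetch_housing_data() first")
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))
```

Calling fetch_housing_data() once and then load_housing_data() should avoid the missing-file error; later sessions can skip the download as long as the extracted file is still on disk.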
I'm getting an SSL error on MacOSX
You probably need to install the SSL certificates (see this StackOverflow question). If you downloaded Python from the official website, then run /Applications/Python\ 3.7/Install\ Certificates.command in a terminal (change 3.7 to whatever version you installed). If you installed Python using MacPorts, run sudo port install curl-ca-bundle in a terminal.
I've installed this project locally. How do I update it to the latest version?
See INSTALL.md
How do I update my Python libraries to the latest versions, when using Anaconda?
See INSTALL.md
Contributors
I would like to thank everyone who contributed to this project, either by providing useful feedback, filing issues or submitting Pull Requests. Special thanks go to Haesun Park and Ian Beauregard who reviewed every notebook and submitted many PRs, including help on some of the exercise solutions. Thanks as well to Steven Bunkley and Ziembla who created the docker directory, and to GitHub user SuperYorio who helped on some exercise solutions.