ML-For-Beginners

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

73,270

16,061

Top Related Projects

tensorflow

190,523

An Open Source Machine Learning Framework for Everyone

scikit-learn

62,466

scikit-learn: machine learning in Python

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Quick Overview

The Microsoft ML for Beginners repository is a 12-week, 26-lesson curriculum that teaches the fundamentals of classic machine learning. It covers topics from data preparation and model training to building a web app around a trained model, using primarily Scikit-learn and avoiding deep learning.

Pros

Comprehensive Curriculum: The repository provides a structured learning path, covering a diverse range of machine learning concepts and techniques.
Hands-on Approach: The tutorials include practical examples and code snippets, allowing learners to apply the concepts they've learned.
Beginner-friendly: The content is tailored for individuals new to machine learning, with a focus on providing clear explanations and step-by-step guidance.
Actively Maintained: The repository is regularly updated by the Microsoft team, ensuring the content remains relevant and up-to-date.

Cons

Limited Depth: While the repository covers a broad range of topics, the depth of coverage may be limited for more advanced learners.
Some Microsoft Ecosystem Ties: The lessons rely primarily on Scikit-learn, but supporting material such as the Microsoft Learn collection and the option to deploy the quiz app to Azure points toward Microsoft services.
Limited Interactive Elements: Beyond the quizzes, the lessons are primarily written material, so hands-on practice depends on learners running the code themselves.
Potential Bias: As the repository is maintained by Microsoft, there may be a potential bias towards Microsoft's products and services.

Getting Started

To get started with the Microsoft ML for Beginners repository, follow these steps:

Clone the repository to your local machine:

git clone https://github.com/microsoft/ML-For-Beginners.git

Navigate to the repository directory:

cd ML-For-Beginners

Explore the directory structure and navigate to the specific topic or module you're interested in. For example, to access the "Introduction to Machine Learning" module:

cd 1-Introduction

Open the README.md file in the module directory to access the tutorial content and follow the step-by-step instructions.
Optionally, you can set up a virtual environment and install the required dependencies for the specific module you're working on. The repository provides guidance on the necessary setup in each module's README.
As you progress through the tutorials, try to apply the concepts you've learned by experimenting with the provided code examples or by working on your own machine learning projects.
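Once the repository is cloned, the exploration step above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name find_lesson_readmes is hypothetical, and it assumes only that lesson folders sit one or two levels below the repository root and each contain a README.md.

```python
from pathlib import Path


def find_lesson_readmes(repo_root):
    """Return README.md paths one or two levels below repo_root, sorted.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    root = Path(repo_root)
    readmes = []
    for pattern in ("*/README.md", "*/*/README.md"):
        readmes.extend(root.glob(pattern))
    # Skip hidden directories such as .git or .github.
    visible = [
        p for p in readmes
        if not any(part.startswith(".") for part in p.relative_to(root).parts)
    ]
    return sorted(visible)


if __name__ == "__main__":
    # "ML-For-Beginners" is the directory created by the git clone step.
    repo = Path("ML-For-Beginners")
    if repo.is_dir():
        for readme in find_lesson_readmes(repo):
            print(readme)
```

Run from the directory that contains the clone, this prints one README path per module or lesson folder, which gives a quick map of where each tutorial lives.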

Competitor Comparisons

tensorflow

190,523

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

Comprehensive, production-ready deep learning framework
Extensive ecosystem with tools like TensorBoard for visualization
Supports multiple programming languages and platforms

Cons of TensorFlow

Steeper learning curve for beginners
More complex setup and configuration
Larger codebase, which can be overwhelming for newcomers

Code Comparison

ML-For-Beginners (Python with scikit-learn):

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LogisticRegression()
model.fit(X_train, y_train)

TensorFlow (Python):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X_train, y_train, epochs=10)

ML-For-Beginners focuses on teaching machine learning concepts using simpler libraries and tools, making it more accessible for beginners. TensorFlow, on the other hand, is a powerful framework designed for building and deploying large-scale machine learning models, offering more advanced features but with a steeper learning curve.
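The scikit-learn snippet above leaves X and y undefined. As a minimal sketch of the same workflow that runs on its own, the version below substitutes synthetic data from make_classification (an assumption made for illustration, not data from the curriculum) and adds a held-out accuracy check.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; the snippets above leave X and y undefined).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Hold out 20% of the rows for evaluation, as in the snippet above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# max_iter is raised from the default to give the solver room to converge.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score the fitted model on data it has not seen during training.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

Fixing random_state makes the split and the synthetic data reproducible, which helps when comparing runs while learning.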

scikit-learn

62,466

scikit-learn: machine learning in Python

Pros of scikit-learn

Comprehensive machine learning library with a wide range of algorithms and tools
Well-established, mature project with extensive documentation and community support
Designed for production use and integration into real-world applications

Cons of scikit-learn

Steeper learning curve for beginners due to its extensive feature set
Less focus on educational content and step-by-step learning
May be overwhelming for those new to machine learning concepts

Code Comparison

ML-For-Beginners (Python example):

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

scikit-learn (Python example):

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

ML-For-Beginners focuses on explaining concepts and providing step-by-step tutorials, while scikit-learn offers a robust toolkit for implementing machine learning algorithms in practice. ML-For-Beginners is better suited for educational purposes, whereas scikit-learn is ideal for developing and deploying machine learning models in real-world scenarios.

keras

63,156

Deep Learning for humans

Pros of Keras

Production-ready deep learning library with extensive documentation and community support
Offers high-level APIs for quick prototyping and experimentation
Seamless integration with TensorFlow backend for enhanced performance

Cons of Keras

Steeper learning curve for beginners compared to ML-For-Beginners
Less focus on fundamental ML concepts and more on implementation
May be overwhelming for those new to machine learning

Code Comparison

ML-For-Beginners (Python):

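from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# X and y are assumed to be an existing feature matrix and target vector
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LinearRegression()
model.fit(X_train, y_train)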

Keras (Python):

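from keras import Input
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Input(shape=(20,)))  # 20 input features
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')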

ML-For-Beginners provides a more beginner-friendly approach with simpler code examples, while Keras offers more advanced and flexible deep learning capabilities.

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

Comprehensive deep learning framework with extensive functionality
Large, active community providing support and contributions
Widely used in industry and research

Cons of PyTorch

Steeper learning curve for beginners
Less focus on educational content and tutorials
Requires more advanced programming skills

Code Comparison

ML-For-Beginners (using scikit-learn):

from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

PyTorch:

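import torch
import torch.nn as nn

# Hypothetical sizes and random data so the sketch runs on its own
input_size, output_size = 10, 1
inputs, target = torch.randn(32, input_size), torch.randn(32, output_size)

model = nn.Linear(input_size, output_size)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

optimizer.zero_grad()                 # clear gradients from any previous step
output = model(inputs)
loss_value = loss_fn(output, target)
loss_value.backward()                 # backpropagate the loss
optimizer.step()                      # update the weights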

ML-For-Beginners is designed as an educational resource for beginners, focusing on teaching machine learning concepts with simple implementations. It covers a broad range of topics and uses various tools and libraries.

PyTorch, on the other hand, is a powerful deep learning framework used for building and training neural networks. It offers more advanced features and flexibility but requires a deeper understanding of machine learning and programming concepts.

fastai

27,300

The fastai deep learning library

Pros of fastai

Provides a high-level API for rapid deep learning model development
Includes cutting-edge techniques and best practices in deep learning
Offers comprehensive documentation and a supportive community

Cons of fastai

Steeper learning curve for beginners in machine learning
More focused on deep learning, less coverage of traditional ML algorithms
Requires some programming experience, particularly in Python

Code Comparison

ML-For-Beginners:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

fastai:

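from fastai.vision.all import *

# path is assumed to point at a folder of images organized by class
dls = ImageDataLoaders.from_folder(path, valid_pct=0.2, item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(4)
prediction = learn.predict(test_image)  # predict takes a single item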

ML-For-Beginners focuses on a broader range of ML concepts with simpler implementations, making it more accessible for beginners. fastai, on the other hand, provides a powerful deep learning library with advanced features, catering to more experienced practitioners and those specifically interested in deep learning applications.

README

Machine Learning for Beginners - A Curriculum

Travel around the world as we explore Machine Learning by means of world cultures

Cloud Advocates at Microsoft are pleased to offer a 12-week, 26-lesson curriculum all about Machine Learning. In this curriculum, you will learn about what is sometimes called classic machine learning, using primarily Scikit-learn as a library and avoiding deep learning, which is covered in our AI for Beginners' curriculum. Pair these lessons with our 'Data Science for Beginners' curriculum, as well!

Travel with us around the world as we apply these classic techniques to data from many areas of the world. Each lesson includes pre- and post-lesson quizzes, written instructions to complete the lesson, a solution, an assignment, and more. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'.

Hearty thanks to our authors Jen Looper, Stephen Howell, Francesca Lazzeri, Tomomi Imura, Cassie Breviu, Dmitry Soshnikov, Chris Noring, Anirban Mukherjee, Ornella Altunyan, Ruth Yakubu and Amy Boyd

Thanks as well to our illustrators Tomomi Imura, Dasani Madipalli, and Jen Looper

Special thanks to our Microsoft Student Ambassador authors, reviewers, and content contributors, notably Rishit Dagli, Muhammad Sakib Khan Inan, Rohan Raj, Alexandru Petrescu, Abhishek Jaiswal, Nawrin Tabassum, Ioan Samuila, and Snigdha Agarwal

Extra gratitude to Microsoft Student Ambassadors Eric Wanjau, Jasleen Sondhi, and Vidushi Gupta for our R lessons!

Getting Started

Follow these steps:

Fork the Repository: Click on the "Fork" button at the top-right corner of this page.
Clone the Repository: git clone https://github.com/microsoft/ML-For-Beginners.git

Find all additional resources for this course in our Microsoft Learn collection.

Students, to use this curriculum, fork the entire repo to your own GitHub account and complete the exercises on your own or with a group:

Start with a pre-lecture quiz.
Read the lecture and complete the activities, pausing and reflecting at each knowledge check.
Try to create the projects by comprehending the lessons rather than running the solution code; however, that code is available in the /solution folders in each project-oriented lesson.
Take the post-lecture quiz.
Complete the challenge.
Complete the assignment.
After completing a lesson group, visit the Discussion Board and "learn out loud" by filling out the appropriate PAT rubric. A 'PAT' is a Progress Assessment Tool, a rubric you fill out to further your learning. You can also react to other PATs so we can learn together.

For further study, we recommend following these Microsoft Learn modules and learning paths.

Teachers, we have included some suggestions on how to use this curriculum.

Video walkthroughs

Some of the lessons are available as short-form videos. You can find them all in-line in the lessons, or on the ML for Beginners playlist on the Microsoft Developer YouTube channel by clicking the image below.

Meet the Team

Gif by Mohit Jaisal

Click the image above for a video about the project and the folks who created it!

Pedagogy

We have chosen two pedagogical tenets while building this curriculum: ensuring that it is hands-on and project-based, and that it includes frequent quizzes. In addition, this curriculum has a common theme to give it cohesion.

Aligning the content with projects makes the process more engaging for students and improves retention of concepts. In addition, a low-stakes quiz before a class sets the intention of the student towards learning a topic, while a second quiz after class ensures further retention. This curriculum was designed to be flexible and fun and can be taken in whole or in part. The projects start small and become increasingly complex by the end of the 12-week cycle. This curriculum also includes a postscript on real-world applications of ML, which can be used as extra credit or as a basis for discussion.

Find our Code of Conduct, Contributing, and Translation guidelines. We welcome your constructive feedback!

Each lesson includes

optional sketchnote
optional supplemental video
video walkthrough (some lessons only)
pre-lecture warmup quiz
written lesson
for project-based lessons, step-by-step guides on how to build the project
knowledge checks
a challenge
supplemental reading
assignment
post-lecture quiz

A note about languages: These lessons are primarily written in Python, but many are also available in R. To complete an R lesson, go to the /solution folder and look for the R lessons. They use the .rmd extension, which denotes an R Markdown file: a Markdown document that embeds code chunks (of R or other languages) and a YAML header that controls how outputs such as PDF are formatted. This makes it a useful authoring framework for data science, since it lets you combine your code, its output, and your notes in one document. R Markdown documents can be rendered to output formats such as PDF, HTML, or Word.

A note about quizzes: All quizzes are contained in the quiz-app folder, for 52 total quizzes of three questions each. They are linked from within the lessons, but the quiz app can also be run locally; follow the instructions in the quiz-app folder to host it locally or deploy it to Azure.
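The quiz totals follow from the lesson count: 26 lessons, each with a pre-lecture and a post-lecture quiz, and three questions per quiz. A small sketch of that arithmetic (the constant names are chosen here for illustration only):

```python
LESSONS = 26             # lesson count stated in the curriculum description
QUIZZES_PER_LESSON = 2   # one pre-lecture quiz and one post-lecture quiz
QUESTIONS_PER_QUIZ = 3   # stated in the note about quizzes

total_quizzes = LESSONS * QUIZZES_PER_LESSON
total_questions = total_quizzes * QUESTIONS_PER_QUIZ

print(total_quizzes, total_questions)  # prints: 52 156
```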

Lesson Number	Topic	Lesson Grouping	Learning Objectives	Linked Lesson	Author
01	Introduction to machine learning	Introduction	Learn the basic concepts behind machine learning	Lesson	Muhammad
02	The History of machine learning	Introduction	Learn the history underlying this field	Lesson	Jen and Amy
03	Fairness and machine learning	Introduction	What are the important philosophical issues around fairness that students should consider when building and applying ML models?	Lesson	Tomomi
04	Techniques for machine learning	Introduction	What techniques do ML researchers use to build ML models?	Lesson	Chris and Jen
05	Introduction to regression	Regression	Get started with Python and Scikit-learn for regression models	Python, R	Jen (Python), Eric Wanjau (R)
06	North American pumpkin prices	Regression	Visualize and clean data in preparation for ML	Python, R	Jen (Python), Eric Wanjau (R)
07	North American pumpkin prices	Regression	Build linear and polynomial regression models	Python, R	Jen and Dmitry (Python), Eric Wanjau (R)
08	North American pumpkin prices	Regression	Build a logistic regression model	Python, R	Jen (Python), Eric Wanjau (R)
09	A Web App	Web App	Build a web app to use your trained model	Python	Jen
10	Introduction to classification	Classification	Clean, prep, and visualize your data; introduction to classification	Python, R	Jen and Cassie (Python), Eric Wanjau (R)
11	Delicious Asian and Indian cuisines	Classification	Introduction to classifiers	Python, R	Jen and Cassie (Python), Eric Wanjau (R)
12	Delicious Asian and Indian cuisines	Classification	More classifiers	Python, R	Jen and Cassie (Python), Eric Wanjau (R)
13	Delicious Asian and Indian cuisines	Classification	Build a recommender web app using your model	Python	Jen
14	Introduction to clustering	Clustering	Clean, prep, and visualize your data; introduction to clustering	Python, R	Jen (Python), Eric Wanjau (R)
15	Exploring Nigerian Musical Tastes	Clustering	Explore the K-Means clustering method	Python, R	Jen (Python), Eric Wanjau (R)
16	Introduction to natural language processing	Natural language processing	Learn the basics about NLP by building a simple bot	Python	Stephen
17	Common NLP Tasks	Natural language processing	Deepen your NLP knowledge by understanding common tasks required when dealing with language structures	Python	Stephen
18	Translation and sentiment analysis	Natural language processing	Translation and sentiment analysis with Jane Austen	Python	Stephen
19	Romantic hotels of Europe	Natural language processing	Sentiment analysis with hotel reviews 1	Python	Stephen
20	Romantic hotels of Europe	Natural language processing	Sentiment analysis with hotel reviews 2	Python	Stephen
21	Introduction to time series forecasting	Time series	Introduction to time series forecasting	Python	Francesca
22	World Power Usage - time series forecasting with ARIMA	Time series	Time series forecasting with ARIMA	Python	Francesca
23	World Power Usage - time series forecasting with SVR	Time series	Time series forecasting with Support Vector Regressor	Python	Anirban
24	Introduction to reinforcement learning	Reinforcement learning	Introduction to reinforcement learning with Q-Learning	Python	Dmitry
25	Help Peter avoid the wolf!	Reinforcement learning	Reinforcement learning Gym	Python	Dmitry
Postscript	Real-World ML scenarios and applications	ML in the Wild	Interesting and revealing real-world applications of classical ML	Lesson	Team
Postscript	Model Debugging in ML using RAI dashboard	ML in the Wild	Model Debugging in Machine Learning using Responsible AI dashboard components	Lesson	Ruth Yakubu

Offline access

You can run this documentation offline by using Docsify. Fork this repo, install Docsify on your local machine, and then in the root folder of this repo, type docsify serve. The website will be served on port 3000 on your localhost: localhost:3000.

PDFs

Find a pdf of the curriculum with links here.

Help Wanted

Would you like to contribute a translation? Please read our translation guidelines and add a templated issue to manage the workload here.

Other Courses

Our team produces other courses! Check out:
