tensorwatch
Debugging, monitoring and visualization for Python Machine Learning and Data Science
Quick Overview
TensorWatch is a debugging and visualization tool for machine learning and data science developed by Microsoft. It provides real-time visualizations, interactive dashboards, and extensible features for monitoring and analyzing ML models and data streams.
Pros
- Flexible and extensible architecture allowing custom visualizations and data sources
- Real-time monitoring and visualization of ML model training and inference
- Integration with popular ML frameworks like PyTorch and TensorFlow
- Support for both local and remote execution environments
Cons
- Steeper learning curve compared to some simpler visualization tools
- Documentation could be more comprehensive for advanced features
- Limited community support compared to more established tools
- Some features may require additional setup or dependencies
Code Examples
- Creating a simple line plot (in a Jupyter Notebook):
import tensorwatch as tw

w = tw.Watcher()
stream = w.create_stream()

# attach a line-chart visualizer to the stream
line_plot = tw.Visualizer(stream, vis_type='line')
line_plot.show()

# write (x, y) pairs; the chart updates as values arrive
for x in [1, 2, 3, 4, 5]:
    stream.write((x, x * x))
- Monitoring a PyTorch model during training:
import tensorwatch as tw
import torch.nn as nn

model = nn.Linear(10, 1)
w = tw.Watcher()

for epoch in range(10):
    # training loop
    loss = ...
    # expose the current values to TensorWatch
    w.observe(epoch=epoch, loss=loss)

# observed values can then be queried and visualized from a Jupyter Notebook
- Creating a custom visualization:
import tensorwatch as tw
import matplotlib.pyplot as plt

def custom_viz(data):
    plt.figure()
    plt.scatter(data['x'], data['y'])
    plt.title('Custom Scatter Plot')
    return plt.gcf()

w = tw.Watcher()
stream = w.create_stream()
stream.write({'x': [1, 2, 3], 'y': [4, 5, 6]})
w.create_plot(stream, custom_viz)
w.show()
Getting Started
To get started with TensorWatch:
- Install TensorWatch:
pip install tensorwatch
- Import and create a Watcher:
import tensorwatch as tw
w = tw.Watcher()
- Create a stream and write (x, y) pairs to it:
stream = w.create_stream()
for x in [1, 2, 3, 4, 5]:
    stream.write((x, x * x))
- Create a visualization and show it (in a Jupyter Notebook):
line_plot = tw.Visualizer(stream, vis_type='line')
line_plot.show()
Competitor Comparisons
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
Pros of wandb
- More comprehensive and feature-rich logging and visualization capabilities
- Stronger community support and wider adoption in the ML industry
- Cloud-based solution with easy collaboration and experiment sharing
Cons of wandb
- Requires an account and internet connection for full functionality
- Potential privacy concerns with data being stored on external servers
- Steeper learning curve due to more extensive features
Code Comparison
wandb:
import wandb
wandb.init(project="my-project")
wandb.log({"loss": 0.5, "accuracy": 0.8})
wandb.finish()
tensorwatch:
import tensorwatch as tw

w = tw.Watcher()
w.observe(loss=0.5, accuracy=0.8)
# observed values are queried and visualized from a Jupyter Notebook
Key Differences
- wandb offers a more comprehensive suite of tools for experiment tracking, visualization, and collaboration
- tensorwatch is more lightweight and focused on real-time visualization
- wandb has a cloud-based infrastructure, while tensorwatch is primarily local
- wandb has a larger user base and more frequent updates
- tensorwatch provides more flexibility for custom visualizations
Both tools serve the purpose of ML experiment tracking and visualization, but wandb is generally more suited for team collaborations and large-scale projects, while tensorwatch may be preferred for individual use or when working with sensitive data that cannot be uploaded to external servers.
Open source platform for the machine learning lifecycle
Pros of MLflow
- More comprehensive experiment tracking and model management
- Better integration with various ML frameworks and tools
- Stronger community support and wider adoption
Cons of MLflow
- Steeper learning curve for beginners
- Requires more setup and configuration
Code Comparison
MLflow:
import mlflow
mlflow.start_run()
mlflow.log_param("param1", value1)
mlflow.log_metric("metric1", value2)
mlflow.end_run()
TensorWatch:
import tensorwatch as tw

w = tw.Watcher(filename='metrics.log')
s = w.create_stream(name='metric1')
s.write(metric1)
Key Differences
- MLflow focuses on end-to-end ML lifecycle management, while TensorWatch is primarily for real-time visualization and debugging
- MLflow has a more structured approach to experiment tracking, while TensorWatch offers more flexibility in visualization
- MLflow provides better model versioning and deployment capabilities, whereas TensorWatch excels in interactive debugging
Use Cases
- MLflow: Ideal for teams working on large-scale ML projects requiring comprehensive tracking and reproducibility
- TensorWatch: Better suited for individual developers or small teams focusing on real-time debugging and visualization of ML models
Community and Support
MLflow has a larger and more active community, with frequent updates and extensive documentation. TensorWatch, while powerful, has a smaller user base and less frequent updates.
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
Pros of Aim
- More active development with frequent updates and releases
- Broader support for ML frameworks beyond PyTorch
- User-friendly web UI for experiment tracking and visualization
Cons of Aim
- Less integrated debugging capabilities compared to TensorWatch
- Steeper learning curve for advanced features
- Requires more setup and configuration for complex use cases
Code Comparison
TensorWatch:
import tensorwatch as tw

w = tw.Watcher(filename='test.log')
s = w.create_stream(name='metrics')
s.write((1, 2))  # log an (x, y) pair
Aim:
from aim import Run
run = Run()
run.track(1, name='x')
run.track(2, name='y')
Key Differences
- TensorWatch focuses on real-time debugging and visualization
- Aim emphasizes experiment tracking and comparison across multiple runs
- TensorWatch has tighter integration with Jupyter notebooks
- Aim offers a more comprehensive web-based UI for exploring results
Both tools provide valuable features for ML practitioners, with TensorWatch excelling in debugging scenarios and Aim offering a more robust experiment management system. The choice between them depends on specific project requirements and workflow preferences.
README
Welcome to TensorWatch
TensorWatch is a debugging and visualization tool designed for data science, deep learning and reinforcement learning from Microsoft Research. It works in Jupyter Notebook to show real-time visualizations of your machine learning training and perform several other key analysis tasks for your models and data.
TensorWatch is designed to be flexible and extensible, so you can also build your own custom visualizations, UIs, and dashboards. Besides the traditional "what-you-see-is-what-you-log" approach, it also has the unique capability to execute arbitrary queries against your live ML training process, return a stream as the result of the query, and view this stream using a visualizer of your choice (we call this Lazy Logging Mode).
TensorWatch is under heavy development, with the goal of providing a platform for debugging machine learning in one easy-to-use, extensible, and hackable package.
How to Get It
pip install tensorwatch
TensorWatch supports Python 3.x and is tested with PyTorch 0.4-1.x. Most features should also work with TensorFlow eager tensors. TensorWatch uses graphviz to create network diagrams, and depending on your platform you may sometimes need to install it manually.
How to Use It
Quick Start
Here's a simple script that logs an integer and its square as a tuple every second to TensorWatch:
import tensorwatch as tw
import time

# streams will be stored in test.log file
w = tw.Watcher(filename='test.log')

# create a stream for logging
s = w.create_stream(name='metric1')

# generate Jupyter Notebook to view real-time streams
w.make_notebook()

for i in range(1000):
    # write x,y pair we want to log
    s.write((i, i*i))
    time.sleep(1)
When you run this code, you will notice that a Jupyter Notebook file test.ipynb gets created in your script folder. From a command prompt, type jupyter notebook and select test.ipynb. Choose Cell > Run All in the menu to see the real-time line graph as values get written by your script.
Here's the output you will see in Jupyter Notebook:
To dive deeper into the various other features, please see Tutorials and notebooks.
How does this work?
When you write to a TensorWatch stream, the values get serialized and sent to a TCP/IP socket as well as the file you specified. From Jupyter Notebook, we load the previously logged values from the file and then listen to that TCP/IP socket for any future values. The visualizer listens to the stream and renders the values as they arrive.
Ok, so that's a very simplified description. The TensorWatch architecture is actually much more powerful. Almost everything in TensorWatch is a stream. Files, sockets, consoles and even visualizers are streams themselves. A cool thing about TensorWatch streams is that they can listen to any other streams. This allows TensorWatch to create a data flow graph. This means that a visualizer can listen to many streams simultaneously, each of which could be a file, a socket or some other stream. You can recursively extend this to build arbitrary data flow graphs. TensorWatch decouples streams from how they get stored and how they get visualized.
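As a concrete illustration of this producer/consumer split, here is a minimal sketch of both halves, reusing the Quick Start API above; the WatcherClient and open_stream calls follow the project's simple-logging tutorial as I recall it, so treat the exact names as assumptions to verify:
# --- training script (producer side) ---
import tensorwatch as tw

w = tw.Watcher(filename='test.log')   # values go to the file...
s = w.create_stream(name='metric1')   # ...and to a TCP/IP socket
s.write((0, 0.5))                     # every write is broadcast to listeners

# --- Jupyter Notebook (consumer side) ---
import tensorwatch as tw

client = tw.WatcherClient()                   # attach to the running script
stream = client.open_stream('metric1')        # subscribe to the named stream
vis = tw.Visualizer(stream, vis_type='line')  # a visualizer is just another stream listener
vis.show()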
Visualizations
In the above example, the line graph is used as the default visualization. However, TensorWatch supports many other diagram types including histograms, pie charts, scatter charts, bar charts and 3D versions of many of these plots. You can log your data, specify the chart type you want and let TensorWatch take care of the rest.
One of the significant strengths of TensorWatch is the ability to combine, compose, and create custom visualizations effortlessly. For example, you can choose to visualize an arbitrary number of streams in the same plot. Or you can visualize the same stream in many different plots simultaneously. Or you can place an arbitrary set of visualizations side-by-side. You can even create your own custom visualization widget simply by creating a new Python class, implementing a few methods.
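As a hedged sketch of what this composition looks like in code (the stream names here are illustrative, and the host= argument used to overlay one plot on another is taken from the tutorials, so verify it against the current API):
import tensorwatch as tw

w = tw.Watcher(filename='test.log')
train_loss = w.create_stream(name='train_loss')
test_loss = w.create_stream(name='test_loss')

# the same stream can feed different chart types
line_plot = tw.Visualizer(train_loss, vis_type='line')
hist_plot = tw.Visualizer(train_loss, vis_type='histogram')

# overlay a second stream on an existing plot (host= is assumed from the tutorials)
tw.Visualizer(test_loss, vis_type='line', host=line_plot)

line_plot.show()
hist_plot.show()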
Comparing Results of Multiple Runs
Each TensorWatch stream may contain a metric of your choice. By default, TensorWatch saves all streams in a single file, but you could also choose to save each stream in separate files or not to save them at all (for example, sending streams over sockets or into the console directly, zero hit to disk!). Later you can open these streams and direct them to one or more visualizations. This design allows you to quickly compare the results from your different experiments in your choice of visualizations easily.
Training within Jupyter Notebook
Often you might prefer to do data analysis, ML training, and testing - all from within Jupyter Notebook instead of from a separate script. TensorWatch can help you do sophisticated, real-time visualizations effortlessly from code that is run within a Jupyter Notebook end-to-end.
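For instance, a minimal in-notebook loop might look like the sketch below; the training step is a stand-in, and the stream and visualizer calls mirror the Quick Start:
import tensorwatch as tw
import time

# everything runs inside one notebook: no log file, no separate script
w = tw.Watcher()
loss_stream = w.create_stream(name='loss')

vis = tw.Visualizer(loss_stream, vis_type='line')
vis.show()                      # the chart renders in the cell output

for epoch in range(100):
    loss = 1.0 / (epoch + 1)    # stand-in for a real training step
    loss_stream.write((epoch, loss))
    time.sleep(0.1)             # the chart updates as values arrive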
Lazy Logging Mode
A unique feature in TensorWatch is the ability to query the live running process, retrieve the result of this query as a stream and direct this stream to your preferred visualization(s). You don't need to log any data beforehand. We call this new way of debugging and visualization a lazy logging mode.
For example, during the training of an autoencoder on a fruits dataset, we visualize randomly sampled input and output image pairs. These images were not logged beforehand in the script; instead, the user sends a query as a Python lambda expression, which results in a stream of images that gets displayed in the Jupyter Notebook.
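A simplified sketch of the same pattern with scalar metrics instead of images is shown below; the create_stream(expr=...) call follows the lazy-logging tutorial, so treat the exact signature as an assumption:
# --- training script: expose values, but log nothing up front ---
import tensorwatch as tw
import time

watcher = tw.Watcher()
for step in range(1000):
    loss = 1.0 / (step + 1)                # stand-in for a real training step
    # make current values visible to live queries; nothing is stored yet
    watcher.observe(step=step, loss=loss)
    time.sleep(1)

# --- Jupyter Notebook (separate process): query the live script ---
import tensorwatch as tw

client = tw.WatcherClient()
# the lambda runs inside the training process on each observe() call
stream = client.create_stream(expr='lambda d: (d.step, d.loss)')
vis = tw.Visualizer(stream, vis_type='line')
vis.show()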
Pre-Training and Post-Training Tasks
TensorWatch leverages several excellent libraries including hiddenlayer, torchstat, Visual Attribution to allow performing the usual debugging and analysis activities in one consistent package and interface.
For example, you can view the model graph with tensor shapes with a one-liner:
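For illustration, with a torchvision AlexNet standing in for your model, the call in a notebook looks like this:
import tensorwatch as tw
import torchvision.models

alexnet_model = torchvision.models.alexnet()
# draw the network graph with tensor shapes for the given input size
tw.draw_model(alexnet_model, [1, 3, 224, 224])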
You can view statistics for different layers, such as FLOPs, number of parameters, etc.:
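Using the same illustrative model, the layer-statistics call is similar; model_stats is the helper name as I recall it from the README, so verify it against the current API:
import tensorwatch as tw
import torchvision.models

alexnet_model = torchvision.models.alexnet()
# per-layer statistics such as FLOPs, parameter counts, and memory usage
tw.model_stats(alexnet_model, [1, 3, 224, 224])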
You can view the dataset in a lower-dimensional space using techniques such as t-SNE.
Prediction Explanations
We wish to provide various tools for explaining predictions to help with debugging models. Currently, we offer several explainers for convolutional networks, including LIME. For example, an explainer can highlight the areas that cause the ResNet50 model to predict class 240 on the ImageNet dataset.
Tutorials
Paper
More technical details are available in the TensorWatch paper (EICS 2019 conference). Please cite it as:
@inproceedings{tensorwatch2019eics,
  author    = {Shital Shah and Roland Fernandez and Steven M. Drucker},
  title     = {A system for real-time interactive analysis of deep learning training},
  booktitle = {Proceedings of the {ACM} {SIGCHI} Symposium on Engineering Interactive Computing Systems, {EICS} 2019, Valencia, Spain, June 18-21, 2019},
  pages     = {16:1--16:6},
  year      = {2019},
  crossref  = {DBLP:conf/eics/2019},
  url       = {https://arxiv.org/abs/2001.01215},
  doi       = {10.1145/3319499.3328231},
  timestamp = {Fri, 31 May 2019 08:40:31 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/eics/ShahFD19},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
Contribute
We would love your contributions, feedback, questions, and feature requests! Please file a GitHub issue or send us a pull request. Please review the Microsoft Code of Conduct to learn more.
Contact
Join the TensorWatch group on Facebook to stay up to date or ask any questions.
Credits
TensorWatch utilizes several open source libraries for many of its features. These include: hiddenlayer, torchstat, Visual-Attribution, pyzmq, receptivefield, nbformat. Please see the install_requires section in setup.py for an up-to-date list.
License
This project is released under the MIT License. Please review the License file for more details.