dastergon/awesome-chaos-engineering

A curated list of Chaos Engineering resources.

Top Related Projects

  • Chaos Monkey - A resiliency tool that helps applications tolerate random instance failures.
  • Chaos Toolkit - Chaos Engineering Toolkit & Orchestration for Developers.
  • Litmus - Helps SREs and developers practice chaos engineering in a Cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q
  • Chaos Mesh - A Chaos Engineering Platform for Kubernetes.

Quick Overview

The "awesome-chaos-engineering" repository is a curated list of Chaos Engineering resources, tools, and platforms. It serves as a comprehensive guide for practitioners and enthusiasts in the field of Chaos Engineering, providing links to articles, books, papers, conferences, and various tools used in the industry.

Pros

  • Extensive collection of resources covering various aspects of Chaos Engineering
  • Regularly updated with new tools and resources
  • Well-organized structure, making it easy to find specific information
  • Community-driven project, allowing for contributions from experts in the field

Cons

  • May be overwhelming for beginners due to the large amount of information
  • Some listed resources might become outdated over time
  • Lacks detailed explanations or comparisons of the listed tools
  • Primarily focuses on listing resources rather than providing in-depth tutorials

Competitor Comparisons

Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures.

Pros of Chaosmonkey

  • Actual implementation of a chaos engineering tool, ready for deployment
  • Integrates with Spinnaker and should work with the cloud backends Spinnaker supports, including AWS
  • Developed and open-sourced by Netflix, a pioneer of chaos engineering

Cons of Chaosmonkey

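  • Limited to terminating virtual machine instances and containers; it does not inject other faults such as latency
  • Requires Spinnaker to manage the target applications, plus its own setup and configuration
  • Covers a single technique rather than the broader range of chaos engineering resources and techniques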

Code Comparison

Chaosmonkey (Go):

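// Generate crawls the configured accounts for instances and passes them,
// together with the config and the time t, to generateInstances.
func (s *Schedule) Generate(config *Config, t time.Time) ([]Instance, error) {
    // Discover candidate instances across all configured accounts.
    instances, err := s.Crawler.Crawl(config.Accounts...)
    if err != nil {
        return nil, err
    }
    // Build the resulting instance list from what the crawler found.
    return s.generateInstances(config, instances, t), nil
}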

Awesome Chaos Engineering (Markdown):

## Tools
- [Chaos Monkey](https://github.com/Netflix/chaosmonkey) - A resiliency tool that helps applications tolerate random instance failures.
- [kube-monkey](https://github.com/asobti/kube-monkey) - An implementation of Netflix's Chaos Monkey for Kubernetes clusters.

Summary

Chaosmonkey is a single-purpose tool that randomly terminates instances, while Awesome Chaos Engineering is a curated list of resources, tools, and information about chaos engineering. Chaosmonkey offers a concrete implementation but is limited in scope, whereas Awesome Chaos Engineering provides a broader overview of the field and links to tools rather than implementing any. The code comparison shows the difference between an actual tool implementation and a resource list.

Chaos Engineering Toolkit & Orchestration for Developers

Pros of chaostoolkit

  • Provides a complete toolkit for chaos engineering experiments
  • Offers a CLI and Python API for easy integration and automation
  • Supports various cloud platforms and technologies out-of-the-box

Cons of chaostoolkit

  • Requires more setup and configuration compared to a curated list
  • May have a steeper learning curve for beginners
  • Custom extensions are written in Python, which may not match every team's stack

Code Comparison

chaostoolkit:

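from chaoslib.experiment import run_experiment

# An experiment is a plain dictionary; the elided values are placeholders.
experiment = {
    "steady-state-hypothesis": {...},  # probes describing normal behavior
    "method": [...]  # actions and probes applied during the experiment
}

# Execute the experiment; the call returns a journal of what happened.
run_experiment(experiment)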

awesome-chaos-engineering:

## Tools

- [Chaos Monkey](https://github.com/Netflix/chaosmonkey) - A resiliency tool that helps applications tolerate random instance failures.
- [Chaos Toolkit](https://github.com/chaostoolkit/chaostoolkit) - A chaos engineering toolkit to help you build confidence in your software system.

Summary

chaostoolkit is a comprehensive toolkit for implementing chaos engineering experiments, offering a CLI and Python API for automation. It supports various platforms but may require more setup and have a steeper learning curve.

awesome-chaos-engineering is a curated list of chaos engineering resources, tools, and articles. It provides a broader overview of the field but doesn't offer direct implementation capabilities.

Choose chaostoolkit for hands-on experimentation and automation, or awesome-chaos-engineering for a comprehensive resource guide and exploration of available tools.
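To make the elided `steady-state-hypothesis` and `method` values in the chaostoolkit snippet more concrete, here is a minimal sketch that builds an experiment as a Python dictionary and serializes it to JSON. The service URL, the probe and action names, and the `systemctl` target are hypothetical placeholders chosen for illustration, not values taken from either project.

```python
import json

# Hypothetical endpoint of the service under test (an assumption).
SERVICE_URL = "http://localhost:8080/health"

experiment = {
    "version": "1.0.0",
    "title": "Service stays healthy when a worker process is stopped",
    "description": "Illustrative experiment; names and targets are placeholders.",
    # What "normal" looks like: the health endpoint answers with HTTP 200.
    "steady-state-hypothesis": {
        "title": "Health endpoint responds",
        "probes": [
            {
                "type": "probe",
                "name": "health-endpoint-returns-200",
                "tolerance": 200,
                "provider": {"type": "http", "url": SERVICE_URL},
            }
        ],
    },
    # The fault to inject: stop a hypothetical worker unit.
    "method": [
        {
            "type": "action",
            "name": "stop-worker",
            "provider": {
                "type": "process",
                "path": "systemctl",
                "arguments": "stop example-worker",
            },
        }
    ],
    "rollbacks": [],
}


def to_json(exp: dict) -> str:
    """Serialize the experiment so it can be saved as experiment.json."""
    return json.dumps(exp, indent=2)


if __name__ == "__main__":
    print(to_json(experiment))
```

Assuming Chaos Toolkit is installed, saving the output as `experiment.json` and running `chaos run experiment.json` should execute it against whatever the placeholders are replaced with.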

Litmus helps SREs and developers practice chaos engineering in a Cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q

Pros of Litmus

  • Provides a complete chaos engineering platform with a wide range of experiments
  • Offers a user-friendly web interface for managing and monitoring chaos experiments
  • Integrates well with Kubernetes environments and supports multiple cloud providers

Cons of Litmus

  • Requires more setup and infrastructure compared to a curated list
  • May have a steeper learning curve for beginners in chaos engineering
  • Limited to Kubernetes-based environments, while the awesome list covers broader topics

Code Comparison

Litmus example (experiment manifest):

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: nginx-chaos
spec:
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
  chaosServiceAccount: pod-delete-sa
  experiments:
    - name: pod-delete

Awesome Chaos Engineering (no code, example list entry):

- [Chaos Monkey](https://github.com/Netflix/chaosmonkey) - A resiliency tool that helps applications tolerate random instance failures.

Summary

The Awesome Chaos Engineering repository is a curated list of chaos engineering resources, tools, and articles, while Litmus is a full-fledged chaos engineering platform. Litmus provides hands-on experimentation capabilities, whereas the awesome list serves as a comprehensive reference for various chaos engineering topics and tools.

A Chaos Engineering Platform for Kubernetes.

Pros of Chaos Mesh

  • Comprehensive chaos engineering platform with a wide range of fault injection types
  • User-friendly web interface for experiment management and visualization
  • Native Kubernetes integration for cloud-native environments

Cons of Chaos Mesh

  • Focused solely on Kubernetes environments, limiting its applicability
  • Steeper learning curve for users new to Kubernetes and cloud-native concepts
  • Requires more setup and infrastructure compared to a curated list of resources

Code Comparison

Chaos Mesh (YAML configuration):

apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-failure-example
spec:
  action: pod-failure
  mode: one
  selector:
    namespaces:
      - default

Awesome Chaos Engineering (Markdown list):

## Tools
- [Chaos Monkey](https://github.com/Netflix/chaosmonkey) - A resiliency tool that helps applications tolerate random instance failures.
- [Chaos Toolkit](https://github.com/chaostoolkit/chaostoolkit) - A chaos engineering toolkit to help you build confidence in your software system.

Summary

Chaos Mesh is a comprehensive platform for chaos engineering in Kubernetes environments, offering a wide range of fault injection types and a user-friendly interface. However, it's limited to Kubernetes and has a steeper learning curve. Awesome Chaos Engineering, on the other hand, is a curated list of resources covering various tools and platforms, making it more accessible for beginners and applicable to diverse environments. The choice between them depends on your specific needs and infrastructure.

README

Awesome Chaos Engineering

A curated list of awesome Chaos Engineering resources.

What is Chaos Engineering?

Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. - Principles Of Chaos Engineering website.
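As a rough illustration of what "experimenting" means in this definition, the sketch below runs one experiment against a toy in-memory system: check the steady state, inject a fault, check again, then roll back. The function `run_chaos_experiment` and the replica model are hypothetical and exist only for this example; real tools operate on live infrastructure.

```python
from typing import Callable


def run_chaos_experiment(
    steady_state: Callable[[], bool],
    inject_fault: Callable[[], None],
    rollback: Callable[[], None],
) -> str:
    """Run one experiment and report the outcome as a short string."""
    # 1. Verify the system is healthy before touching it.
    if not steady_state():
        return "aborted: steady state not met before injection"
    try:
        # 2. Introduce the turbulent condition.
        inject_fault()
        # 3. Check whether the steady state still holds under the fault.
        held = steady_state()
    finally:
        # 4. Always undo the fault, even if a step above raised.
        rollback()
    return "hypothesis held" if held else "weakness found"


# Toy system: three replicas, considered healthy while at least two are up.
replicas = {"a": True, "b": True, "c": True}


def healthy() -> bool:
    return sum(replicas.values()) >= 2


def kill_one() -> None:
    replicas["a"] = False


def restore() -> None:
    replicas["a"] = True


if __name__ == "__main__":
    print(run_chaos_experiment(healthy, kill_one, restore))
```

With three replicas and a two-replica threshold, losing one replica leaves the toy system healthy, so the hypothesis holds; lowering the replica count to two would surface a weakness instead.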

Contents

Culture

Books

Education

Notable Tools

  • Chaos Monkey - A resiliency tool that helps applications tolerate random instance failures.
  • orchestrator - MySQL replication topology management and HA.
  • kube-monkey - An implementation of Netflix's Chaos Monkey for Kubernetes clusters.
  • Gremlin Inc. - Failure as a Service.
  • Chaos Toolkit - A chaos engineering toolkit to help you build confidence in your software system.
  • steadybit - A Chaos Engineering platform (SaaS or On-Prem) with auto discovery features, different attack types, user management and many more.
  • PowerfulSeal - Adds chaos to your Kubernetes clusters, so that you can detect problems in your systems as early as possible. It kills targeted pods and takes VMs up and down.
  • drax - DC/OS Resilience Automated Xenodiagnosis tool. It helps to test DC/OS deployments by applying a Chaos Monkey-inspired, proactive and invasive testing approach.
  • Wiremock - API mocking (Service Virtualization) which enables modeling real-world faults and delays.
  • MockLab - API mocking (Service Virtualization) as a service which enables modeling real-world faults and delays.
  • Pod-Reaper - A rules-based pod-killing container. Pod-Reaper was designed to kill pods that meet specific conditions, which can be used for chaos testing in Kubernetes.
  • Muxy - A chaos testing tool for simulating real-world distributed system failures.
  • Toxiproxy - A TCP proxy to simulate network and system conditions for chaos and resiliency testing.
  • Chaos engineering for Docker:
    • Pumba - Chaos testing and network emulation for Docker containers (and clusters).
    • Blockade - Docker-based utility for testing network failures and partitions in distributed applications.
  • chaos-lambda - Randomly terminate ASG instances during business hours.
  • Namazu - Programmable fuzzy scheduler for testing distributed systems.
  • Chaos Monkey for Spring Boot - Injects latencies, exceptions, and terminations into Spring Boot applications
  • Byte-Monkey - Bytecode-level fault injection for the JVM. It works by instrumenting application code on the fly to deliberately introduce faults like exceptions and latency.
  • GomJabbar - ChaosMonkey for your private cloud
  • Turbulence - Tool focused on BOSH environments capable of stressing VMs, manipulating network traffic, and more. It is very similar to Gremlin.
  • chaosblade - An Easy to Use and Powerful Chaos Engineering Toolkit.
  • KubeInvaders - Gamified Chaos Engineering tool for Kubernetes clusters.
  • Cthulhu - Chaos Engineering tool that helps evaluate the resiliency of microservice systems by simulating various disaster scenarios against a target infrastructure in a data-driven manner.
  • VMware Mangle - Orchestrating Chaos Engineering.
  • Byteman - A Swiss Army Knife for Byte Code Manipulation.
  • Litmus - Framework for Kubernetes environments that enables users to run test suites, capture logs, generate reports and perform chaos tests.
  • Perses - A project to cause (controlled) destruction to a JVM application.
  • ChaosKube - chaoskube periodically kills random pods in your Kubernetes cluster.
  • Chaos Mesh - Chaos Mesh is a cloud-native Chaos Engineering platform that orchestrates chaos on Kubernetes environments.
  • failure-lambda - A small Node module for injecting failure into AWS Lambda using latency, exception, statuscode or diskspace.
  • aws-chaos-scripts - Collection of Python scripts to run failure injection on AWS infrastructure.
  • chaos-ssm-documents - Collection of AWS SSM Documents to perform Chaos Engineering experiments.
  • aws-lambda-chaos-injection - A library for injecting chaos into AWS Lambda. It offers simple Python decorators for delay, exception, and statusCode injection, and a class to add delay to any third-party dependency.
  • chaos-dingo - A tool to mess with Azure services using the Azure NodeJS SDK.
  • Chaos HTTP Proxy - Introduce failures into HTTP requests via a proxy server
  • Chaos Lemur - A self-hostable application to randomly destroy virtual machines in a BOSH-managed environment
  • Simoorg - LinkedIn's very own failure inducer framework.
  • react-chaos - A chaos engineering tool for your React apps
  • vue-chaos - A chaos engineering tool for your Vue apps
  • Chaos Engine - A tool designed to intermittently destroy or degrade application resources running in cloud-based infrastructure.
  • kubedoom - Kill Kubernetes pods by playing Id's DOOM.
  • kubethanos - Kills half of your randomly selected Kubernetes pods.
  • go-fault - Fault injection middleware in Go
  • Proofdock's Chaos Engineering Platform - A chaos engineering platform that seamlessly integrates in Azure DevOps and has a focus on the Azure cloud platform.
  • Pystol - Pystol is a fault injection platform allowing users to execute fault injection Actions in cloud-native environments in a controlled and prescribed way.
  • AWSSSMChaosRunner - Amazon's light-weight open-source library for chaos engineering on AWS. It can be used for EC2, ECS (with EC2 launch type) and Fargate.
  • Kraken - Chaos and resiliency testing tool for Kubernetes and OpenShift.
  • kube-burner - A tool aimed at stressing Kubernetes clusters by creating or deleting a high quantity of objects.
  • Chaos Experimentation Framework - An extensible platform for infrastructure management including Chaos Engineering
  • NetHavoc - A Chaos Engineering Tool for Linux, K8s, Windows, PCF, Cloud, and Containers for injecting Resource, Infrastructure, Network, and Application failures.
  • gorm-sqlchaos - A runtime SQL manipulator for your Golang applications based on gorm.
  • Chaos Frontend Toolkit - A set of tools to apply Chaos Engineering to frontend
  • Mitigant - The Continuous Security Verification Platform enables confidence in cloud security posture by leveraging security chaos engineering.
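Several entries above (for example failure-lambda and aws-lambda-chaos-injection) describe injecting latency and exceptions into function calls. The sketch below shows one way that idea might look as a plain Python decorator. The names `chaos` and `InjectedFault` and the sample `fetch_price` function are hypothetical; this is not the API of any listed library.

```python
import functools
import random
import time


class InjectedFault(Exception):
    """Raised instead of calling the wrapped function when a fault fires."""


def chaos(delay_s=0.0, error_rate=0.0, rng=None):
    """Wrap a function so calls are delayed and/or fail with some probability."""
    rng = rng or random.Random()

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Latency injection: pause before doing any real work.
            if delay_s > 0:
                time.sleep(delay_s)
            # Exception injection: fail a fraction of calls at random.
            if rng.random() < error_rate:
                raise InjectedFault(f"injected failure in {func.__name__}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@chaos(delay_s=0.01, error_rate=0.0)
def fetch_price(item: str) -> int:
    """Stand-in for a call to a downstream dependency."""
    return {"apple": 3}.get(item, 0)


if __name__ == "__main__":
    print(fetch_price("apple"))
```

Passing a seeded `random.Random` as `rng` makes the injected failures reproducible, which tends to help when a test needs to assert on how callers handle the fault.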

Retired tools

  • The Simian Army - A suite of tools for keeping your cloud operating in top form.
  • ChaoSlingr - Introducing Security Chaos Engineering. ChaoSlingr focuses primarily on the experimentation on AWS Infrastructure to proactively instrument system security failure through experimentation.

Cloud Services

Papers

Gamedays

Blogs & Newsletters

Podcasts

  • Break Things On Purpose - Monthly podcast about Chaos Engineering presented by Gremlin Inc. Also available on Spotify, Google Play, and Stitcher.

Conferences & Meetups

Forums

Contributing

Please take a look at the contribution guidelines first. Contributions are always welcome!