OTRF/Security-Datasets

Re-play Security Events

Top Related Projects

  • Azure-Sentinel: Cloud-native SIEM for intelligent security analytics for your entire enterprise.
  • security_content: Splunk Security Content
  • MISP (core software): Open Source Threat Intelligence and Sharing Platform
  • TheHive: a Scalable, Open Source and Free Security Incident Response Platform

Quick Overview

The OTRF/Security-Datasets repository is a collection of security datasets and logs for research, threat hunting, and data analysis. It draws on data sources such as Windows event logs, network traffic, and application logs, so that security professionals and researchers can replay and analyze specific security scenarios.

Pros

  • Diverse collection of security-related datasets from various sources
  • Well-organized and categorized for easy navigation and access
  • Regularly updated with new datasets and contributions from the community
  • Includes detailed metadata and context for each dataset

Cons

  • Some datasets may be large and require significant storage and processing power
  • Not all datasets are consistently formatted, which may require additional preprocessing
  • Limited documentation on how to effectively use or analyze some of the datasets
  • Some datasets may be outdated or no longer relevant to current security landscapes

Getting Started

To get started with the OTRF/Security-Datasets repository:

  1. Clone the repository:

    git clone https://github.com/OTRF/Security-Datasets.git
    
  2. Navigate to the desired dataset folder:

    cd Security-Datasets/datasets/<category>/<dataset_name>
    
  3. Read the README.md file in the dataset folder for specific information about the dataset and how to use it.

  4. Download or access the dataset files as needed for your analysis or research.

Note: Some datasets may require additional tools or software for processing and analysis. Refer to the dataset-specific documentation for more information.
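
As a minimal sketch of step 4, the snippet below reads events from a downloaded dataset file. It assumes the dataset is a zip archive whose .json members hold one JSON object per line; that layout and the helper name load_events are assumptions for illustration, so check the dataset's own documentation for the actual format.

```python
import json
import zipfile
from pathlib import Path


def load_events(archive_path):
    """Return a list of event dicts from every .json member of a zip archive.

    Assumes each member holds one JSON object per line (hypothetical layout).
    """
    events = []
    with zipfile.ZipFile(Path(archive_path)) as archive:
        for member in archive.namelist():
            if not member.endswith(".json"):
                continue  # skip anything that is not an event file
            with archive.open(member) as handle:
                for raw_line in handle:
                    line = raw_line.decode("utf-8").strip()
                    if line:  # ignore blank lines
                        events.append(json.loads(line))
    return events
```

The returned list of dicts can be passed directly to pandas.DataFrame(events) if a tabular view is more convenient.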

Competitor Comparisons

Cloud-native SIEM for intelligent security analytics for your entire enterprise.

Pros of Azure-Sentinel

  • Comprehensive cloud-native SIEM and SOAR solution
  • Extensive integration with Azure services and third-party tools
  • Active development and regular updates from Microsoft

Cons of Azure-Sentinel

  • Requires Azure subscription and associated costs
  • Steeper learning curve for users unfamiliar with Azure ecosystem
  • Limited customization options compared to open-source alternatives

Code Comparison

Security-Datasets:

{
  "title": "Windows Security Event Log",
  "description": "Windows Security events collected from a Windows workstation",
  "platform": "Windows",
  "log_source": "Security",
  "log_name": "Security.evtx",
  "file_type": "evtx"
}

Azure-Sentinel:

id: 123456789
name: Suspicious PowerShell Command Line
description: Detects suspicious PowerShell command line parameters
severity: Medium
requiredDataConnectors:
  - connectorId: WindowsSecurityEvents
    dataTypes:
      - SecurityEvent
queryFrequency: 1h
queryPeriod: 1h

The Security-Datasets repository focuses on providing sample datasets for security analysis, while Azure-Sentinel offers a complete SIEM solution with detection rules, analytics, and automation capabilities. Security-Datasets is more suitable for research and testing, whereas Azure-Sentinel is designed for production environments and real-time threat detection.

Splunk Security Content

Pros of security_content

  • Extensive collection of pre-built detection rules and analytics
  • Regular updates and contributions from the Splunk security community
  • Includes machine learning models for advanced threat detection

Cons of security_content

  • Primarily focused on Splunk-specific content and formats
  • May require more setup and configuration for non-Splunk environments
  • Less diverse in terms of raw datasets compared to Security-Datasets

Code Comparison

Security-Datasets example (YAML):

title: Windows Security Event Log
platform: Windows
log_source:
  product: Windows
  service: Security

security_content example (YAML):

name: Detect Suspicious Process Creation
search: |
  index=windows sourcetype=WinEventLog:Security EventCode=4688
  | stats count by NewProcessName

Both repositories use YAML for configuration, but security_content focuses on Splunk search queries, while Security-Datasets provides more general metadata about datasets.
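
The aggregation in the Splunk search above can be approximated offline against a dataset's events. The following is a rough sketch, assuming the events are already loaded as dicts that carry EventID and NewProcessName keys; field names vary between datasets, and count_by_new_process is a hypothetical helper rather than part of either repository.

```python
from collections import Counter


def count_by_new_process(events):
    """Count process-creation events (Security event 4688) per NewProcessName."""
    return Counter(
        event["NewProcessName"]
        for event in events
        # Tolerate EventID stored as int or str; skip events lacking the field.
        if str(event.get("EventID")) == "4688" and "NewProcessName" in event
    )
```

Calling count_by_new_process(events).most_common(10) would list the ten most frequent process names, roughly what the stats count by NewProcessName clause reports.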

Summary

Security-Datasets offers a broader range of security-related datasets across various platforms, making it more versatile for different security tools and environments. security_content, on the other hand, provides a rich set of pre-built detection rules and analytics specifically tailored for Splunk environments, making it more immediately actionable for Splunk users but potentially less flexible for other platforms.

MISP (core software) - Open Source Threat Intelligence and Sharing Platform

Pros of MISP

  • Comprehensive threat intelligence platform with extensive sharing capabilities
  • Active community and regular updates
  • Supports various data formats and integrations with other security tools

Cons of MISP

  • Steeper learning curve due to its complexity
  • Requires more resources to set up and maintain
  • May be overkill for smaller organizations or simpler use cases

Code Comparison

MISP (Python):

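from pymisp import MISPEvent, PyMISP

misp = PyMISP('https://misp.example.com', 'YOUR_API_KEY')
# Build the event locally, then push it to the MISP instance
event = MISPEvent()
event.from_dict(info='Suspicious Activity', distribution=0, threat_level_id=2, analysis=0)
response = misp.add_event(event)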

Security-Datasets (No specific code, as it's a collection of datasets):

# No direct code comparison available
# Security-Datasets provides pre-formatted datasets for analysis

Summary

MISP is a powerful threat intelligence platform with extensive features, while Security-Datasets offers pre-formatted datasets for security analysis. MISP provides more comprehensive capabilities but requires more setup and maintenance. Security-Datasets is simpler to use but lacks the advanced sharing and analysis features of MISP. The choice between them depends on specific organizational needs and resources available for threat intelligence management.

TheHive: a Scalable, Open Source and Free Security Incident Response Platform

Pros of TheHive

  • Comprehensive incident response platform with case management features
  • Integrates with other security tools and supports automation workflows
  • Active community and regular updates

Cons of TheHive

  • Steeper learning curve and more complex setup
  • Requires more resources to run and maintain
  • Focused on incident response rather than providing diverse security datasets

Code Comparison

TheHive (Python API example):

from thehive4py.api import TheHiveApi
from thehive4py.models import Case

api = TheHiveApi('http://localhost:9000', 'api_key')
case = Case(title='Suspicious Activity', description='Investigating unusual network traffic')
response = api.create_case(case)

Security-Datasets (illustrative dataset usage; the file path and the process_name column are placeholders, not a verified dataset):

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/OTRF/Security-Datasets/master/datasets/atomic/windows/process_creation/process_creation_win_mshta_javascript.csv')
suspicious_processes = df[df['process_name'] == 'mshta.exe']

While TheHive focuses on incident response management with API interactions, Security-Datasets provides ready-to-use datasets for analysis and testing. TheHive is more suitable for operational security teams, whereas Security-Datasets is ideal for researchers and analysts looking for pre-compiled security data.

Pros of detection-rules

  • Focuses on providing ready-to-use detection rules for Elastic Security
  • Offers a comprehensive set of rules covering various attack techniques
  • Includes a rule testing framework for validation and quality assurance

Cons of detection-rules

  • Limited to Elastic Security ecosystem, less versatile for other platforms
  • Requires Elastic Stack knowledge for optimal use and customization
  • May have a steeper learning curve for users unfamiliar with Elastic products

Code Comparison

detection-rules:

name: Suspicious Process Creation in Unusual Location
type: eql
risk_score: 50
description: Detects process creation in unusual directories
query: |
  process where event.type == "creation" and
    not process.executable : ("C:\\Windows\\*", "C:\\Program Files\\*")

Security-Datasets:

{
  "EventID": 1,
  "Image": "C:\\Users\\Admin\\AppData\\Local\\Temp\\suspicious.exe",
  "CommandLine": "C:\\Users\\Admin\\AppData\\Local\\Temp\\suspicious.exe -enc payload",
  "ParentImage": "C:\\Windows\\System32\\cmd.exe"
}

Summary

detection-rules provides a robust set of detection rules specifically for Elastic Security, with built-in testing capabilities. Security-Datasets offers a broader collection of security-related datasets for various platforms and use cases. While detection-rules is more focused and integrated with Elastic products, Security-Datasets provides greater flexibility for different security tools and analysis approaches.
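
To show how the two samples relate, here is a hedged Python sketch that applies the logic of the example rule to a dataset event. It assumes Sysmon-style events where EventID 1 denotes process creation and Image holds the executable path, as in the sample event above; the function name and the field mapping are illustrative assumptions, not part of either project.

```python
from fnmatch import fnmatchcase

# Directory patterns the example rule treats as expected locations.
EXPECTED_LOCATIONS = ("C:\\Windows\\*", "C:\\Program Files\\*")


def is_unusual_process_creation(event):
    """Return True for a process-creation event outside the expected paths."""
    if event.get("EventID") != 1:  # assumed: Sysmon Event ID 1 = process creation
        return False
    image = event.get("Image", "").lower()
    if not image:
        return False  # nothing to evaluate without an executable path
    # Lower-case both sides to mimic a case-insensitive wildcard match.
    return not any(fnmatchcase(image, p.lower()) for p in EXPECTED_LOCATIONS)
```

Run against the sample event above, the function flags suspicious.exe in the Temp directory, while an event whose Image is under C:\Windows is not flagged.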

README

Security Datasets

The Security Datasets project is an open-source initiative that contributes malicious and benign datasets, from different platforms, to the infosec community to expedite data analysis and threat research.

Docs

Goals

  • Provide open portable datasets to expedite the development of data analytics.
  • Facilitate and expedite adversary techniques simulation.
  • Allow security analysts around the world to test their skills with real data.
  • Improve the testing and validation of detection analytics in an easier, practical, modular and more affordable way.
  • Enable data scientists to have labeled and unlabeled data for initial research and features development.
  • Help the community map datasets to other open source projects such as Sigma, Atomic Red Team, Threat Hunter Playbook (Jupyter Notebooks) and MITRE ATT&CK.
  • Provide datasets for other social/community events such as Capture The Flags (CTFs) or hackathons to encourage collaboration.

Projects Using Security Datasets

Authors

Contributing

Help us build the largest library of datasets for the InfoSec community! The project documentation explains how to contribute.

License: GPL-3.0

Security Datasets's GNU General Public License