janusgraph

JanusGraph: an open-source, distributed graph database

Top Related Projects

tinkerpop

2,051

Apache TinkerPop - a graph computing framework

orientdb

4,847

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.

neo4j

14,798

Graphs for Everyone

arangodb

13,878

🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

dgraph

21,062

high-performance graph database for real-time use cases

nebula

11,447

A distributed, fast open-source graph database featuring horizontal scalability and high availability

Quick Overview

JanusGraph is an open-source, distributed graph database optimized for storing and querying large graphs with billions of vertices and edges. It supports various storage backends, including Apache Cassandra and HBase, and integrates with analytics systems like Apache Spark.

Pros

Scalable and distributed architecture for handling massive graphs
Supports multiple storage backends and indexing systems
Compatible with Apache TinkerPop graph computing framework
Offers advanced features like geo, numeric, and text search capabilities

Cons

Steeper learning curve compared to simpler graph databases
Configuration and setup can be complex, especially for large-scale deployments
Limited built-in visualization tools
Performance may vary depending on the chosen storage backend and configuration

Code Examples

Creating a graph and adding vertices and edges:

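// Declare the variable as JanusGraph: newTransaction() is not part of the TinkerPop Graph interface
JanusGraph graph = JanusGraphFactory.open("inmemory");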
JanusGraphTransaction tx = graph.newTransaction();

Vertex alice = tx.addVertex("name", "Alice");
Vertex bob = tx.addVertex("name", "Bob");
alice.addEdge("knows", bob, "since", 2010);

tx.commit();

Querying the graph:

GraphTraversalSource g = graph.traversal();
List<Vertex> friends = g.V().has("name", "Alice").out("knows").toList();

Using indexing for efficient queries:

JanusGraphManagement mgmt = graph.openManagement();
PropertyKey name = mgmt.makePropertyKey("name").dataType(String.class).make();
mgmt.buildIndex("byNameComposite", Vertex.class).addKey(name).buildCompositeIndex();
mgmt.commit();

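// Once the index is enabled (data written before the index existed must be reindexed first),
// equality lookups on "name" can be served by it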
g.V().has("name", "Alice").hasLabel("person").valueMap().toList();
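To see why such an index helps equality lookups, consider a toy model built from plain Java collections. This is a sketch of the general idea only, not JanusGraph's storage layout; the class `EqualityIndexSketch`, the `ToyVertex` type, and the sample data are hypothetical names introduced for illustration. Without an index, a lookup inspects every vertex; with one, it is a single map lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EqualityIndexSketch {

    // Toy vertex: an id plus a single "name" property (hypothetical type).
    static final class ToyVertex {
        final long id;
        final String name;

        ToyVertex(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Without an index: inspect every vertex, so cost grows with graph size.
    static List<Long> scanByName(List<ToyVertex> vertices, String name) {
        List<Long> ids = new ArrayList<>();
        for (ToyVertex v : vertices) {
            if (v.name.equals(name)) {
                ids.add(v.id);
            }
        }
        return ids;
    }

    // Build step: map each property value to the ids of the vertices carrying it.
    static Map<String, List<Long>> buildIndex(List<ToyVertex> vertices) {
        Map<String, List<Long>> index = new HashMap<>();
        for (ToyVertex v : vertices) {
            index.computeIfAbsent(v.name, k -> new ArrayList<>()).add(v.id);
        }
        return index;
    }

    // With the index: one hash lookup, regardless of how many vertices exist.
    static List<Long> lookupByName(Map<String, List<Long>> index, String name) {
        return index.getOrDefault(name, Collections.emptyList());
    }

    // Illustrative sample data.
    static List<ToyVertex> sample() {
        return Arrays.asList(
                new ToyVertex(1L, "Alice"),
                new ToyVertex(2L, "Bob"),
                new ToyVertex(3L, "Alice"));
    }

    public static void main(String[] args) {
        List<ToyVertex> vertices = sample();
        Map<String, List<Long>> index = buildIndex(vertices);
        System.out.println("scan:  " + scanByName(vertices, "Alice"));
        System.out.println("index: " + lookupByName(index, "Alice"));
    }
}
```

Both methods return the same ids; only the amount of work differs. A map of this kind answers equality lookups only, so it cannot serve range or full-text predicates.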

Getting Started

Add JanusGraph dependency to your project:

<dependency>
    <groupId>org.janusgraph</groupId>
    <artifactId>janusgraph-core</artifactId>
    <version>0.6.2</version>
</dependency>

Create a configuration file janusgraph-config.properties:

storage.backend=inmemory
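For a persistent deployment, the same file would name a real storage backend instead of `inmemory`. The following is a minimal sketch that writes such a file using only `java.util.Properties`. It assumes a Cassandra node (via the `cql` backend) and an Elasticsearch node reachable at the given hosts; the class name `JanusGraphConfigWriter`, the host values, and the index name `search` are illustrative assumptions, not part of the original instructions.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

final class JanusGraphConfigWriter {

    // Builds key/value pairs for a CQL (Cassandra) storage backend plus an
    // Elasticsearch index backend registered under the name "search".
    // The host arguments are placeholders chosen by the caller.
    static Properties cqlWithElasticsearch(String storageHost, String indexHost) {
        Properties props = new Properties();
        props.setProperty("storage.backend", "cql");
        props.setProperty("storage.hostname", storageHost);
        props.setProperty("index.search.backend", "elasticsearch");
        props.setProperty("index.search.hostname", indexHost);
        return props;
    }

    // Writes the properties to a file in standard .properties format.
    static void write(Properties props, Path target) throws IOException {
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            props.store(out, "JanusGraph configuration (illustrative)");
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get("janusgraph-config.properties");
        write(cqlWithElasticsearch("127.0.0.1", "127.0.0.1"), target);
        System.out.println("Wrote " + target.toAbsolutePath());
    }
}
```

The resulting file can then be passed to `JanusGraphFactory.open(...)` in the same way as the in-memory configuration, provided the matching backend modules are on the classpath.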

Initialize and use JanusGraph:

Graph graph = JanusGraphFactory.open("janusgraph-config.properties");
GraphTraversalSource g = graph.traversal();
// Use the graph...
graph.close();

Competitor Comparisons

tinkerpop

2,051

Apache TinkerPop - a graph computing framework

Pros of TinkerPop

More mature and widely adopted project with a larger community
Provides a standardized graph computing framework that works across multiple graph databases
Offers Gremlin, a powerful graph traversal language

Cons of TinkerPop

Steeper learning curve, especially for newcomers to graph databases
Less focus on specific optimizations for large-scale distributed graph processing

Code Comparison

TinkerPop (Gremlin):

g.V().hasLabel('person').
  has('name', 'John').
  out('knows').
  values('name')

JanusGraph:

graph.traversal().V().hasLabel("person").
  has("name", "John").
  out("knows").
  values("name")

Key Differences

JanusGraph is built on top of TinkerPop, extending its capabilities for large-scale graph processing. While TinkerPop provides a general-purpose graph computing framework, JanusGraph focuses on distributed graph database functionality with support for various storage backends.

JanusGraph offers additional features like advanced indexing, geo-spatial search, and better support for very large graphs. However, TinkerPop's Gremlin language is more widely supported across different graph databases, offering greater flexibility in choosing backend systems.

Both projects are open-source and have active communities, but TinkerPop has a longer history and broader adoption in the graph database ecosystem.

orientdb

4,847

Pros of OrientDB

Native multi-model database supporting graph, document, key-value, and object models
Built-in support for clustering and sharding, offering better scalability out-of-the-box
More mature project with a larger community and ecosystem

Cons of OrientDB

Less focus on distributed graph processing compared to JanusGraph
Limited support for external storage backends, primarily relying on its own storage engine
Steeper learning curve due to its multi-model nature and unique query language

Code Comparison

OrientDB (SQL-like syntax):

CREATE VERTEX Person SET name = 'John', age = 30
CREATE EDGE Knows FROM (SELECT FROM Person WHERE name = 'John') TO (SELECT FROM Person WHERE name = 'Jane')

JanusGraph (Gremlin syntax):

g.addV('person').property('name', 'John').property('age', 30)
g.V().has('name', 'John').addE('knows').to(__.V().has('name', 'Jane'))

Both databases offer powerful graph capabilities, but OrientDB provides a more versatile multi-model approach, while JanusGraph excels in distributed graph processing and integration with big data ecosystems. The choice between them depends on specific project requirements and existing technology stacks.

neo4j

14,798

Graphs for Everyone

Pros of Neo4j

More mature and widely adopted, with a larger community and ecosystem
Native graph database with optimized performance for graph operations
Powerful Cypher query language for intuitive graph querying

Cons of Neo4j

Limited scalability for very large datasets compared to JanusGraph
Proprietary licensing for enterprise features
Less flexible storage backend options

Code Comparison

Neo4j (Cypher query):

MATCH (n:Person)-[:KNOWS]->(m:Person)
WHERE n.name = 'Alice'
RETURN m.name

JanusGraph (Gremlin query):

g.V().has('name', 'Alice').out('KNOWS').values('name')

Both examples show a simple query to find friends of a person named Alice. Neo4j uses its Cypher query language, while JanusGraph uses Gremlin traversals. The syntax differs, but both achieve similar results in querying graph data.

JanusGraph offers more flexibility in terms of storage backends and integration with big data ecosystems, while Neo4j provides a more optimized native graph database experience. The choice between them depends on specific project requirements, scalability needs, and ecosystem preferences.

arangodb

13,878

Pros of ArangoDB

Multi-model database supporting graphs, documents, and key-value pairs
Native multi-threaded implementation for better performance
Built-in web interface for easy management and querying

Cons of ArangoDB

Less focus on distributed graph processing compared to JanusGraph
Smaller community and ecosystem than JanusGraph
Limited support for some advanced graph algorithms

Code Comparison

ArangoDB (AQL):

FOR v, e IN 1..3 OUTBOUND 'users/john' GRAPH 'social'
  RETURN {user: v.name, relationship: e.type}

JanusGraph (Gremlin):

g.V().has('name', 'john').
  repeat(__.outE().inV()).times(3).emit().
  path().by('name').by('type')

Both databases offer query languages for graph traversal, but ArangoDB uses AQL (ArangoDB Query Language), while JanusGraph uses Gremlin. AQL is more SQL-like, whereas Gremlin is a graph-specific language.
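The `1..3 OUTBOUND` clause in the AQL query bounds the traversal depth (at least 1 hop, at most 3), which is different from capping the number of results. A minimal sketch of that depth-bounded semantics on a toy adjacency map follows. The class `BoundedTraversalSketch` and the sample graph are hypothetical, and the sketch ignores the uniqueness options a real engine applies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BoundedTraversalSketch {

    // Returns the end vertex of every outbound path whose length lies in
    // [minDepth, maxDepth], in depth-first order.
    static List<String> outbound(Map<String, List<String>> edges, String start,
                                 int minDepth, int maxDepth) {
        List<String> reached = new ArrayList<>();
        walk(edges, start, 0, minDepth, maxDepth, reached);
        return reached;
    }

    private static void walk(Map<String, List<String>> edges, String current, int depth,
                             int minDepth, int maxDepth, List<String> reached) {
        if (depth >= minDepth) {
            reached.add(current); // emit once the minimum depth is reached
        }
        if (depth == maxDepth) {
            return; // never expand beyond the maximum depth
        }
        for (String next : edges.getOrDefault(current, Collections.emptyList())) {
            walk(edges, next, depth + 1, minDepth, maxDepth, reached);
        }
    }

    // Illustrative graph: john -> mary -> anna -> luke -> zoe, and john -> paul.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("john", Arrays.asList("mary", "paul"));
        edges.put("mary", Arrays.asList("anna"));
        edges.put("anna", Arrays.asList("luke"));
        edges.put("luke", Arrays.asList("zoe"));
        return edges;
    }

    public static void main(String[] args) {
        // zoe sits four hops from john, so a 1..3 traversal does not reach her.
        System.out.println(outbound(sampleGraph(), "john", 1, 3));
    }
}
```

In Gremlin, the corresponding depth bound is usually expressed with `repeat()` together with `times()` and `emit()`, whereas `limit(n)` only truncates the result stream.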

ArangoDB provides a more versatile database solution with its multi-model approach, while JanusGraph focuses specifically on distributed graph processing. JanusGraph may be better suited for large-scale graph applications, while ArangoDB offers more flexibility for projects requiring different data models.

dgraph

21,062

high-performance graph database for real-time use cases

Pros of Dgraph

Native GraphQL support, allowing for easier integration with GraphQL-based applications
Designed for horizontal scalability, making it more suitable for large-scale distributed systems
Built-in support for full-text search and geospatial queries

Cons of Dgraph

Less flexible schema management compared to JanusGraph's dynamic schema approach
Smaller community and ecosystem, potentially leading to fewer resources and third-party integrations
Limited support for complex graph traversals and analytics compared to JanusGraph's Gremlin-based querying

Code Comparison

Dgraph query example:

{
  user(func: eq(name, "Alice")) {
    name
    age
    friends {
      name
    }
  }
}

JanusGraph query example (using Gremlin):

g.V().has('name', 'Alice').
  project('user', 'friends').
    by(__.valueMap('name', 'age')).
    by(__.out('friends').values('name').fold())

Both examples retrieve a user named Alice along with their age and friends' names. Dgraph uses DQL, its GraphQL-inspired query language, while JanusGraph uses the Gremlin traversal language, showcasing the different query approaches of the two systems.

nebula

11,447

A distributed, fast open-source graph database featuring horizontal scalability and high availability

Pros of Nebula

Designed for large-scale distributed environments, offering better scalability
Faster query performance, especially for complex graph traversals
Native support for time series data and geospatial queries

Cons of Nebula

Steeper learning curve due to its unique query language (nGQL)
Less mature ecosystem and community support compared to JanusGraph
Limited support for OLAP-style graph analytics

Code Comparison

JanusGraph query example:

graph.traversal().V().has("name", "Alice")
    .out("knows").has("age", P.gt(30))
    .values("name")

Nebula query example:

MATCH (v:person{name:"Alice"})-[:knows]->(friend:person)
WHERE friend.age > 30
RETURN friend.name

Both examples demonstrate a simple graph traversal, but Nebula uses its custom nGQL language, while JanusGraph leverages the more widely-adopted Gremlin query language.

README

JanusGraph is a highly scalable graph database optimized for storing and querying large graphs with billions of vertices and edges distributed across a multi-machine cluster. JanusGraph is a transactional database that can support thousands of concurrent users, complex traversals, and analytic graph queries.

Learn More

The project homepage contains more information on JanusGraph and provides links to documentation, getting-started guides and release downloads.

Visualization

JanusGraph has a web-based graph visualizer, located in the janusgraph-visualizer repository.

In addition to the web-based graph visualizer, JanusGraph supports a range of third-party graph visualizers, listed below:

Arcade Analytics
Cytoscape
Gephi plugin for Apache TinkerPop
Graphexp
Graph Explorer
Graphlytic
Gremlin-Visualizer - the original repository on which JanusGraph-Visualizer is based.
G.V() - Gremlin IDE
KeyLines by Cambridge Intelligence
Ogma by Linkurious
ReGraph by Cambridge Intelligence
Tom Sawyer Perspectives

Community

GitHub Discussions: see GitHub Discussions for all general discussions and questions about JanusGraph
Discord for interactive discussions and questions about JanusGraph: Join the server
Stack Overflow: see the janusgraph tag
Twitter: follow @JanusGraph for news and updates
LinkedIn: follow JanusGraph for news and updates
Mailing lists:
- janusgraph-users (at) lists.lfaidata.foundation (archives) for questions about using JanusGraph, installation, configuration, integrations
  
  To join with a LF AI & Data account, use the web UI; to subscribe/unsubscribe with an arbitrary email address, send an email to:
  - janusgraph-users+subscribe (at) lists.lfaidata.foundation
  - janusgraph-users+unsubscribe (at) lists.lfaidata.foundation
- janusgraph-dev (at) lists.lfaidata.foundation (archives) for discussion of the internal implementation of JanusGraph itself
  
  To join with a LF AI & Data account, use the web UI; to subscribe/unsubscribe with an arbitrary email address, send an email to:
  - janusgraph-dev+subscribe (at) lists.lfaidata.foundation
  - janusgraph-dev+unsubscribe (at) lists.lfaidata.foundation
- janusgraph-announce (at) lists.lfaidata.foundation (archives) for new releases and news announcements
  
  To join with a LF AI & Data account, use the web UI; to subscribe/unsubscribe with an arbitrary email address, send an email to:
  - janusgraph-announce+subscribe (at) lists.lfaidata.foundation
  - janusgraph-announce+unsubscribe (at) lists.lfaidata.foundation

Contributing

Please see CONTRIBUTING.md for more information, including CLAs and best practices for working with GitHub.

Powered by JanusGraph

Apache Atlas - metadata management for governance (website)
Eclipse Keti - access control service to protect RESTful APIs (website)
Exakat - PHP static analysis (website)
Open Network Automation Platform (ONAP) - automation and orchestration for Software-Defined Networks
Uber Knowledge Graph (event info)
Express-Cassandra - Cassandra ORM/ODM/OGM for Node.js with optional support for Elassandra & JanusGraph
Windup by RedHat - application migration and assessment tool (website)

Users

The following users have deployed JanusGraph in production.

CELUM
eBay - video
FiNC
G DATA - blog post series about malware analysis use case
Mapped
Netflix - video and slides (graph discussion starts at #86)
Qihoo 360 (about)
Red Hat - application migration and assessment tool built on Windup
Times Internet
Uber

Cite

If you use JanusGraph in your research, please cite it using this metadata:

@software{janusgraph_2024,
    title = {JanusGraph: an open-source, distributed graph database},
    author = {{JanusGraph Contributors}},
    year = {2024},
    version = {1.1.0},
    publisher = {Zenodo},
    doi = {10.5281/zenodo.14807604}
}
