Top Related Projects
Ceph is a distributed object, block, and file storage platform
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.
MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.
Gluster Filesystem : Build your distributed storage in minutes
Most popular & widely deployed Open Source Container Native Storage platform for Stateful Persistent Applications on Kubernetes.
Quick Overview
JuiceFS is an open-source, high-performance distributed file system designed for cloud-native environments. It supports various object storage services as the underlying storage layer and provides POSIX-compatible file system interfaces. JuiceFS is optimized for big data and AI workloads, offering features like data compression, encryption, and caching.
Pros
- Seamless integration with existing cloud storage services (S3, Google Cloud Storage, etc.)
- High performance with intelligent caching and metadata management
- POSIX-compatible, allowing easy integration with existing applications
- Strong data consistency and reliability
Cons
- Requires additional setup and configuration compared to native file systems
- Potential network latency when accessing remote object storage
- Learning curve for users unfamiliar with distributed file systems
- Dependency on external object storage services
Code Examples
- Mounting a JuiceFS volume:
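juicefs mount -d redis://localhost:6379/1 /mnt/jfs
This command mounts, in the background, the JuiceFS volume whose metadata is stored in Redis. The object storage (for example S3) is chosen when the volume is created with juicefs format, so it is not passed to mount.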
- Creating a new JuiceFS volume:
juicefs format \
--storage s3 \
--bucket https://mybucket.s3.amazonaws.com \
--access-key myAccessKey \
--secret-key mySecretKey \
redis://localhost:6379/1 \
myjfs
This example creates a new JuiceFS volume named "myjfs" using S3 for storage and Redis for metadata.
- Configuring cache settings:
juicefs mount --cache-dir /var/jfsCache --cache-size 100000 redis://localhost:6379/1 /mnt/jfs
This command mounts a JuiceFS volume with a local cache directory and a cache size limit of 100000 MiB (roughly 98 GiB).
Getting Started
To get started with JuiceFS:
- Install JuiceFS:
curl -sSL https://d.juicefs.com/install | sh -
- Set up a Redis server for metadata storage.
- Create a new JuiceFS volume:
juicefs format --storage s3 --bucket https://mybucket.s3.amazonaws.com \
--access-key myAccessKey --secret-key mySecretKey \
redis://localhost:6379/1 myjfs
- Mount the JuiceFS volume:
juicefs mount redis://localhost:6379/1 /mnt/jfs
- Start using the mounted file system at /mnt/jfs.
Competitor Comparisons
Ceph is a distributed object, block, and file storage platform
Pros of Ceph
- Highly scalable and distributed storage system
- Supports multiple storage types (object, block, file)
- Mature project with extensive documentation and community support
Cons of Ceph
- Complex setup and configuration process
- Higher resource requirements for deployment
- Steeper learning curve for management and maintenance
Code Comparison
JuiceFS:
jfs, err := juicefs.NewJFS(meta, conf)
if err != nil {
log.Fatalf("Initialize JuiceFS: %s", err)
}
Ceph:
import rados
cluster = rados.Rados(conffile='ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('my_pool')
JuiceFS is designed as a lightweight, POSIX-compatible file system that can be easily integrated into existing applications. It focuses on simplicity and ease of use, making it suitable for various cloud storage scenarios.
Ceph, on the other hand, is a comprehensive storage platform that offers a wider range of storage solutions, including object, block, and file storage. It's designed for large-scale deployments and provides advanced features like data replication and self-healing capabilities.
While JuiceFS excels in simplicity and cloud integration, Ceph offers more flexibility and scalability for complex storage requirements. The choice between the two depends on specific use cases and infrastructure needs.
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.
Pros of SeaweedFS
- Simpler architecture, easier to set up and maintain
- Better performance for small file operations
- Native support for object storage and S3 API
Cons of SeaweedFS
- Less mature POSIX compatibility compared to JuiceFS
- Limited support for advanced features like data compression and encryption
- Smaller community and ecosystem
Code Comparison
SeaweedFS (Go):
func (v *Volume) Write(needle *Needle) (size uint32, err error) {
nv, ok := v.nm.Get(needle.Id)
if !ok || nv.Offset == 0 {
size, err = v.appendNeedle(needle)
} else {
size, err = v.updateNeedle(needle, nv.Offset)
}
return
}
JuiceFS (Go):
func (v *Volume) Write(ctx context.Context, inode Ino, offset uint64, data []byte) (err error) {
v.Lock()
defer v.Unlock()
return v.write(ctx, inode, offset, data)
}
Both projects use Go and implement distributed file systems, but SeaweedFS focuses on object storage with a simpler design, while JuiceFS provides a more feature-rich POSIX-compliant file system. SeaweedFS may be better suited for scenarios requiring high-performance small file operations, while JuiceFS offers more advanced features and better POSIX compatibility.
MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.
Pros of MinIO
- Highly scalable and performant object storage system
- Supports S3 API compatibility out of the box
- Extensive ecosystem with integrations for various cloud services
Cons of MinIO
- Limited file system features compared to JuiceFS
- Lacks built-in data compression and encryption capabilities
- May require more complex setup for distributed deployments
Code Comparison
MinIO (Go):
func (xl xlObjects) PutObject(ctx context.Context, bucket, object string, data *PutObjReader, opts ObjectOptions) (objInfo ObjectInfo, err error) {
// Implementation details
}
JuiceFS (Go):
func (v *VFS) Create(ctx context.Context, path string, mode uint32) (File, error) {
// Implementation details
}
Both projects are written in Go, but they serve different purposes. MinIO focuses on object storage with S3 compatibility, while JuiceFS provides a distributed file system with POSIX compliance. The code snippets show different approaches to handling file operations, reflecting their distinct architectures and use cases.
Gluster Filesystem : Build your distributed storage in minutes
Pros of GlusterFS
- Mature and battle-tested distributed file system with a large user base
- Supports a wide range of storage configurations and deployment scenarios
- Native integration with many enterprise-level applications and systems
Cons of GlusterFS
- Can be complex to set up and manage, especially for smaller deployments
- Performance may degrade with a large number of small files
- Limited support for object storage interfaces
Code Comparison
GlusterFS volume creation:
gluster volume create test-volume replica 2 server1:/exp1 server2:/exp2
gluster volume start test-volume
JuiceFS volume creation:
juicefs format \
--storage s3 \
--bucket https://mybucket.s3.amazonaws.com \
redis://localhost/1 \
myvol
GlusterFS focuses on distributed storage across multiple nodes, while JuiceFS emphasizes cloud-native object storage integration. GlusterFS uses a more traditional volume-based approach, whereas JuiceFS leverages object storage backends with metadata stored separately.
Both systems aim to provide scalable and distributed file storage, but JuiceFS offers a more modern, cloud-oriented approach with simpler setup and management. GlusterFS, on the other hand, provides more flexibility in terms of storage configurations and has a longer track record in enterprise environments.
Most popular & widely deployed Open Source Container Native Storage platform for Stateful Persistent Applications on Kubernetes.
Pros of OpenEBS
- Provides storage for stateful applications in Kubernetes with multiple storage engines
- Supports local and cloud storage, offering flexibility in deployment
- Has a large community and is a CNCF project, ensuring long-term support and development
Cons of OpenEBS
- Can be complex to set up and manage compared to JuiceFS's simpler architecture
- May have higher resource overhead due to its multiple storage engine options
- Performance can vary depending on the chosen storage engine and configuration
Code Comparison
OpenEBS (using cStor engine):
apiVersion: openebs.io/v1alpha1
kind: StoragePoolClaim
metadata:
  name: cstor-disk-pool
  annotations:
    cas.openebs.io/config: |
      - name: PoolResourceRequests
        value: |-
          memory: 2Gi
JuiceFS:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: juicefs-sc
  resources:
    requests:
      storage: 5Gi
Both OpenEBS and JuiceFS provide storage solutions for Kubernetes, but they have different approaches. OpenEBS offers multiple storage engines and is more flexible, while JuiceFS focuses on a distributed file system with POSIX compatibility. The choice between them depends on specific use cases and requirements.
README
JuiceFS is a high-performance POSIX file system released under Apache License 2.0, particularly designed for the cloud-native environment. The data, stored via JuiceFS, will be persisted in Object Storage (e.g. Amazon S3), and the corresponding metadata can be persisted in various compatible database engines such as Redis, MySQL, and TiKV based on the scenarios and requirements.
With JuiceFS, massive cloud storage can be directly connected to big data, machine learning, artificial intelligence, and various application platforms in production environments. Without modifying code, the massive cloud storage can be used as efficiently as local storage.
Document: Quick Start Guide
Highlighted Features
- Fully POSIX-compatible: Use as a local file system, seamlessly docking with existing applications without breaking business workflow.
- Fully Hadoop-compatible: JuiceFS' Hadoop Java SDK is compatible with Hadoop 2.x and Hadoop 3.x as well as a variety of components in the Hadoop ecosystems.
- S3-compatible: JuiceFS' S3 Gateway provides an S3-compatible interface.
- Cloud Native: A Kubernetes CSI Driver is provided for easily using JuiceFS in Kubernetes.
- Shareable: JuiceFS is a shared file storage that can be read and written by thousands of clients.
- Strong Consistency: The confirmed modification will be immediately visible on all the servers mounted with the same file system.
- Outstanding Performance: Latency can be as low as a few milliseconds, and throughput can scale almost without limit (depending on the size of the Object Storage). See the test results.
- Data Encryption: Supports data encryption in transit and at rest (please refer to the guide for more information).
- Global File Locks: JuiceFS supports both BSD locks (flock) and POSIX record locks (fcntl).
- Data Compression: JuiceFS supports LZ4 or Zstandard to compress all your data.
Architecture | Getting Started | Advanced Topics | POSIX Compatibility | Performance Benchmark | Supported Object Storage | Who is using | Roadmap | Reporting Issues | Contributing | Community | Usage Tracking | License | Credits | FAQ
Architecture
JuiceFS consists of three parts:
- JuiceFS Client: Coordinates Object Storage and the metadata engine, and implements file system interfaces such as POSIX, Hadoop, Kubernetes, and S3 gateway.
- Data Storage: Stores the data, with support for a variety of storage media, e.g., local disk, public or private cloud Object Storage, and HDFS.
- Metadata Engine: Stores the corresponding metadata, including file name, file size, permission group, creation and modification time, and directory structure, with support for different metadata engines, e.g., Redis, MySQL, SQLite, and TiKV.
JuiceFS can store file system metadata on different metadata engines, such as Redis, a fast, open-source, in-memory key-value store that is particularly suitable for metadata; meanwhile, all the data is stored in Object Storage through the JuiceFS client.
Each file stored in JuiceFS is split into fixed-size "Chunks", with a default upper limit of 64 MiB. Each Chunk is composed of one or more "Slices", whose length varies depending on how the file is written. Each Slice is composed of fixed-size "Blocks", which are 4 MiB by default. These Blocks are ultimately stored in Object Storage, while the metadata of the file and its Chunks, Slices, and Blocks is stored in the metadata engine via JuiceFS.
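The Chunk and Block arithmetic described above can be sketched in a few lines of Python. This is only an illustration using the default sizes quoted in the text, not JuiceFS code: the names layout and ChunkLayout are hypothetical, and the sketch assumes a single sequential write (so each Chunk holds exactly one Slice) and that the final Block of a Slice simply holds the remaining bytes.

```python
from dataclasses import dataclass

MIB = 1024 * 1024
CHUNK_SIZE = 64 * MIB   # default upper limit of a Chunk, as stated in the text
BLOCK_SIZE = 4 * MIB    # default Block size, as stated in the text


@dataclass
class ChunkLayout:
    index: int          # position of the Chunk within the file
    length: int         # bytes of file data that fall into this Chunk
    block_sizes: list   # sizes of the Blocks holding that data


def layout(file_size: int) -> list:
    """Split a file of file_size bytes into Chunks and Blocks.

    Assumption: the file was written once, sequentially, so each Chunk
    contains a single Slice covering all of its data, and the last Block
    of that Slice holds whatever is left over.
    """
    chunks = []
    offset = 0
    while offset < file_size:
        length = min(CHUNK_SIZE, file_size - offset)
        full, rest = divmod(length, BLOCK_SIZE)
        blocks = [BLOCK_SIZE] * full + ([rest] if rest else [])
        chunks.append(ChunkLayout(len(chunks), length, blocks))
        offset += length
    return chunks


if __name__ == "__main__":
    for c in layout(130 * MIB):
        print(c.index, c.length // MIB, "MiB in", len(c.block_sizes), "blocks")
```

Under these assumptions a 130 MiB file yields three Chunks (64 MiB, 64 MiB, and 2 MiB) stored as 16, 16, and 1 Blocks respectively. Files that are overwritten or appended in pieces would produce more Slices per Chunk than this sketch models.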
When using JuiceFS, files will eventually be split into Chunks, Slices and Blocks and stored in Object Storage. Therefore, the source files stored in JuiceFS cannot be found in the file browser of the Object Storage platform; instead, there are only a chunks directory and a bunch of digitally numbered directories and files in the bucket. Don't panic! This is just the secret of the high-performance operation of JuiceFS!
Getting Started
Before you begin, make sure you have:
- One supported metadata engine, see How to Set Up Metadata Engine
- One supported Object Storage for storing data blocks, see Supported Object Storage
- JuiceFS Client downloaded and installed
Please refer to Quick Start Guide to start using JuiceFS right away!
Command Reference
Check out all the command line options in command reference.
Containers
JuiceFS can be used as a persistent volume for Docker and Podman, please check here for details.
Kubernetes
It is also very easy to use JuiceFS on Kubernetes. Please find more information here.
Hadoop Java SDK
If you want to use JuiceFS in Hadoop, check Hadoop Java SDK.
Advanced Topics
- Redis Best Practices
- How to Setup Object Storage
- Cache
- Fault Diagnosis and Analysis
- FUSE Mount Options
- Using JuiceFS on Windows
- S3 Gateway
Please refer to JuiceFS Document Center for more information.
POSIX Compatibility
JuiceFS has passed all of the compatibility tests (8813 in total) in the latest pjdfstest.
All tests successful.
Test Summary Report
-------------------
/root/soft/pjdfstest/tests/chown/00.t (Wstat: 0 Tests: 1323 Failed: 0)
TODO passed: 693, 697, 708-709, 714-715, 729, 733
Files=235, Tests=8813, 233 wallclock secs ( 2.77 usr 0.38 sys + 2.57 cusr 3.93 csys = 9.65 CPU)
Result: PASS
Aside from the POSIX features covered by pjdfstest, JuiceFS also provides:
- Close-to-open consistency. Once a file is written and closed, any client that subsequently opens and reads it is guaranteed to see the written data. Within the same mount point, all written data can be read immediately.
- Rename and all other metadata operations are atomic, as guaranteed by the transactions of the supported metadata engine.
- Opened files remain accessible after unlink from the same mount point.
- Mmap (tested with FSx).
- Fallocate with punch hole support.
- Extended attributes (xattr).
- BSD locks (flock).
- POSIX record locks (fcntl).
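As a rough illustration of the two lock types in the list above, the following Python sketch takes a BSD lock and a POSIX record lock through the standard fcntl module (Unix only). The function name demo_locks is hypothetical, and the example runs against a temporary local file; pointing it at a file under a JuiceFS mount point is left as an assumption about your environment.

```python
import fcntl
import os
import tempfile


def demo_locks(path: str) -> bool:
    """Take a BSD lock and a POSIX record lock on path, then release both."""
    with open(path, "r+b") as f:
        # BSD lock (flock): covers the whole file.
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.flock(f, fcntl.LOCK_UN)

        # POSIX record lock (fcntl): covers a byte range, here the first 10 bytes.
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB, 10, 0, os.SEEK_SET)
        fcntl.lockf(f, fcntl.LOCK_UN, 10, 0, os.SEEK_SET)
    return True


if __name__ == "__main__":
    # A temporary local file stands in for a file on a mounted volume.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 32)
    try:
        print(demo_locks(tmp.name))
    finally:
        os.remove(tmp.name)
```

LOCK_NB makes both calls fail immediately with an OSError instead of blocking if another process already holds a conflicting lock, which is usually the easier behavior to reason about when several clients share a file.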
Performance Benchmark
Basic benchmark
JuiceFS provides a subcommand (juicefs bench) that runs a few basic benchmarks to help you understand how it performs in your environment.
Throughput
A sequential read/write benchmark has also been performed on JuiceFS, EFS and S3FS using fio.
The results show that JuiceFS can provide 10X more throughput than the other two (see more details).
Metadata IOPS
A simple metadata benchmark has been performed on JuiceFS, EFS and S3FS using mdtest.
The result shows that JuiceFS can provide significantly more metadata IOPS than the other two (see more details).
Analyze performance
See Real-Time Performance Monitoring if you encounter performance issues.
Supported Object Storage
- Amazon S3 (and other S3 compatible Object Storage services)
- Google Cloud Storage
- Azure Blob Storage
- Alibaba Cloud Object Storage Service (OSS)
- Tencent Cloud Object Storage (COS)
- Qiniu Cloud Object Storage (Kodo)
- QingStor Object Storage
- Ceph RGW
- MinIO
- Local disk
- Redis
- ...
JuiceFS supports numerous Object Storage services. Learn more.
Who is using
JuiceFS is production ready and used by thousands of machines in production. A list of users has been assembled and documented here. In addition JuiceFS has several collaborative projects that integrate with other open source projects, which we have documented here. If you are also using JuiceFS, please feel free to let us know, and you are welcome to share your specific experience with everyone.
The storage format is stable, and will be supported by all future releases.
Roadmap
- User and group quotas
- Snapshots
- Write once read many (WORM)
Reporting Issues
We use GitHub Issues to track community reported issues. You can also contact the community for any questions.
Contributing
Thank you for your contribution! Please refer to the JuiceFS Contributing Guide for more information.
Community
Welcome to join the Discussions and the Slack channel to connect with JuiceFS team members and other users.
Usage Tracking
JuiceFS collects anonymous usage data by default to help us better understand how the community is using JuiceFS. Only core metrics (e.g. version number) will be reported, and user data and any other sensitive data will not be included. The related code can be viewed here.
You can also disable reporting with the command line option --no-usage-report:
juicefs mount --no-usage-report
License
JuiceFS is open-sourced under Apache License 2.0, see LICENSE.
Credits
The design of JuiceFS was inspired by Google File System, HDFS and MooseFS. Thanks for their great work!
FAQ
Why doesn't JuiceFS support XXX Object Storage?
JuiceFS supports many Object Storage services. Please check out this list first. If the Object Storage you want to use is compatible with S3, you can treat it as S3. Otherwise, please report an issue.
Can I use Redis Cluster as metadata engine?
Yes. Since v1.0.0 Beta3 JuiceFS supports the use of Redis Cluster as the metadata engine, but it should be noted that Redis Cluster requires that the keys of all operations in a transaction must be in the same hash slot, so a JuiceFS file system can only use one hash slot.
See "Redis Best Practices" for more information.
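The hash slot constraint mentioned above follows from how Redis Cluster maps keys to slots: the slot is CRC16 of the key modulo 16384, and when a key contains a non-empty {...} hash tag only the tag is hashed. The Python sketch below implements that general rule. The key names are hypothetical and are not JuiceFS's actual key scheme, which the text does not show.

```python
import binascii


def hash_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot (0..16383) for key.

    If the key contains '{' followed later by '}' with at least one
    character in between, only that substring (the hash tag) is hashed.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:          # tag found and not empty
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is CRC16 (XMODEM),
    # the variant used by Redis Cluster.
    return binascii.crc_hqx(key, 0) % 16384


if __name__ == "__main__":
    # Hypothetical keys sharing the hash tag "{1}" land in the same slot,
    # so a transaction may touch all of them.
    for k in (b"{1}inode:42", b"{1}dir:1", b"untagged-key"):
        print(k.decode(), hash_slot(k))
```

Because every key that shares a hash tag resolves to the same slot, all keys touched by one transaction can be kept together, at the cost of placing the whole file system's metadata on the single node that owns that slot.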
What's the difference between JuiceFS and XXX?
See "Comparison with Others" for more information.
For more FAQs, please see the full list.