Comparison of distributed file systems

In computing, a distributed file system (DFS) or network file system is any file system that allows access from multiple hosts to files shared via a computer network. This makes it possible for multiple users on multiple machines to share files and storage resources.

Distributed file systems differ in their performance, mutability of content, handling of concurrent writes, handling of permanent or temporary loss of nodes or storage, and their policy of storing content.

Locally managed

FOSS

| Client | Written in | License | Access API | High availability | Shards | Efficient Redundancy | Redundancy Granularity | Initial release year | Memory requirements (GB) |
|---|---|---|---|---|---|---|---|---|---|
| Alluxio (Virtual Distributed File System) | Java | Apache License 2.0 | HDFS, FUSE, HTTP/REST, S3 | hot standby | No | Replication[1] | File[2] | 2013 | |
| Ceph | C++ | LGPL | librados (C, C++, Python, Ruby), S3, Swift, FUSE | Yes | Yes | Pluggable erasure codes[3] | Pool[4] | 2010 | 1 per TB of storage |
| Coda | C | GPL | C | Yes | Yes | Replication | Volume[5] | 1987 | |
| GlusterFS | C | GPLv3 | libglusterfs, FUSE, NFS, SMB, Swift, libgfapi | mirror | Yes | Reed-Solomon[6] | Volume[7] | 2005 | |
| HDFS | Java | Apache License 2.0 | Java and C client, HTTP, FUSE[8] | transparent master failover | No | Reed-Solomon[9] | File[10] | 2005 | |
| IPFS | Go | Apache 2.0 or MIT | HTTP gateway, FUSE, Go client, Javascript client, command line tool | Yes | with IPFS Cluster | Replication[11] | Block[12] | 2015[13] | |
| LizardFS[14] | C++ | GPLv3 | POSIX, FUSE, NFS-Ganesha, Ceph FSAL (via libcephfs) | master | No | Reed-Solomon[15] | File[16] | 2013 | |
| Lustre | C | GPLv2 | POSIX, NFS-Ganesha, NFS, SMB | Yes | Yes | No redundancy[17][18] | No redundancy[19][20] | 2003 | |
| MinIO | Go | AGPL3.0 | AWS S3 API, FTP, SFTP | Yes | Yes | Reed-Solomon[21] | Object[22] | 2014 | |
| MooseFS | C | GPLv2 | POSIX, FUSE | master | No | Replication[23] | File[24] | 2008 | |
| OpenAFS | C | IBM Public License | Virtual file system, Installable File System | | | Replication | Volume[25] | 2000[26] | |
| OpenIO[27] | C | AGPLv3 / LGPLv3 | Native (Python, C, Java), HTTP/REST, S3, Swift, FUSE (POSIX, NFS, SMB, FTP) | Yes | | Pluggable erasure codes[28] | Object[29] | 2015 | 0.5 |
| Quantcast File System | C | Apache License 2.0 | C++ client, FUSE (C++ server: MetaServer and ChunkServer are both in C++) | master | No | Reed-Solomon[30] | File[31] | 2012 | |
| RozoFS | C, Python | GPLv2 | FUSE, SMB, NFS, key/value | Yes | | Mojette[32] | Volume[33] | 2011[34] | |
| Tahoe-LAFS | Python | GNU GPL[35] | HTTP (browser or CLI), SFTP, FTP, FUSE via SSHFS, pyfilesystem | | | Reed-Solomon[36] | File[37] | 2007 | |
| XtreemFS | Java, C++ | BSD License | libxtreemfs (Java, C++), FUSE | | | Replication[38] | File[39] | 2009 | |
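
The "Efficient Redundancy" column separates plain replication from erasure codes such as Reed-Solomon. The trade-off can be made concrete with a little arithmetic. The following is a minimal sketch, assuming an idealised code in which any k of the k + m stored fragments are enough to rebuild the data. The class and function names and the parameters (3 copies, 6 + 3 fragments) are illustrative assumptions, not the defaults of any system listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """A redundancy scheme described by its fragment counts (hypothetical helper)."""
    name: str
    data_fragments: int   # k: fragments needed to rebuild the data
    extra_fragments: int  # m: additional fragments stored for redundancy

    @property
    def storage_overhead(self) -> float:
        """Raw bytes stored per byte of user data: (k + m) / k."""
        return (self.data_fragments + self.extra_fragments) / self.data_fragments

    @property
    def tolerated_losses(self) -> int:
        """Number of fragments that can be lost without losing data (m)."""
        return self.extra_fragments


def replication(copies: int) -> Scheme:
    # n-way replication is the k = 1 case: one copy plus (n - 1) extra copies.
    return Scheme(f"{copies}-way replication", 1, copies - 1)


def reed_solomon(k: int, m: int) -> Scheme:
    # Idealised erasure code: any k of the k + m fragments suffice.
    return Scheme(f"Reed-Solomon ({k}+{m})", k, m)


if __name__ == "__main__":
    for s in (replication(3), reed_solomon(6, 3)):
        print(f"{s.name}: {s.storage_overhead:.2f}x raw storage, "
              f"tolerates {s.tolerated_losses} lost fragments")
```

Under these assumptions, 3-way replication stores 3 bytes per byte of data and survives 2 losses, while a 6 + 3 erasure code stores 1.5 bytes per byte and survives 3. The sketch ignores the extra CPU and network cost of encoding and rebuilding fragments.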

Proprietary

| Client | Written in | License | Access API |
|---|---|---|---|
| BeeGFS | C / C++ | FRAUNHOFER FS (FhGFS) EULA,[40] GPLv2 client | POSIX |
| ObjectiveFS[41] | C | Proprietary | POSIX, FUSE |
| Spectrum Scale (GPFS) | C, C++ | Proprietary | POSIX, NFS, SMB, Swift, S3, HDFS |
| MapR-FS | C, C++ | Proprietary | POSIX, NFS, FUSE, S3, HDFS, CLI |
| Isilon OneFS | C/C++ | Proprietary | POSIX, NFS, SMB/CIFS, HDFS, HTTP, FTP, SWIFT Object, CLI, Rest API |
| Qumulo | C/C++ | Proprietary | POSIX, NFS, SMB/CIFS, CLI, S3, Rest API |
| Scality | C | Proprietary | FUSE, NFS, REST, AWS S3 |

Remote access

| Name | Run by | Access API |
|---|---|---|
| Amazon S3 | Amazon.com | HTTP (REST/SOAP) |
| Google Cloud Storage | Google | HTTP (REST) |
| SWIFT (part of OpenStack) | Rackspace, Hewlett-Packard, others | HTTP (REST) |
| Microsoft Azure | Microsoft | HTTP (REST) |
| IBM Cloud Object Storage | IBM (formerly Cleversafe)[42] | HTTP (REST) |

Comparison

A 2013 study presented a functional and experimental analysis of several distributed file systems, including HDFS, Ceph, Gluster, Lustre and an old (1.6.x) version of MooseFS. Much of its information is now outdated; for example, MooseFS had no high availability for its Metadata Server at that time.[43]

Cloud-based remote distributed storage services from major vendors have different APIs and different consistency models.[44]
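
To illustrate what a difference in consistency model can mean for a client, here is a toy sketch of a key-value store with one primary and one replica. It is not a model of any vendor's service; the class `ToyReplicatedStore` and its methods are hypothetical. With synchronous replication a read from the replica sees the latest write immediately. With asynchronous replication the read may be stale until the replica catches up.

```python
class ToyReplicatedStore:
    """Toy store with one primary and one replica (illustration only)."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes not yet applied to the replica

    def put(self, key, value):
        # Every write lands on the primary first.
        self.primary[key] = value
        if self.synchronous:
            # Strong behaviour: the replica is updated before put() returns.
            self.replica[key] = value
        else:
            # Eventual behaviour: the replica is updated later, in sync().
            self.pending.append((key, value))

    def get_from_replica(self, key):
        # Returns None if the replica has not seen the key yet.
        return self.replica.get(key)

    def sync(self):
        # Apply queued writes in order, then clear the queue.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()


if __name__ == "__main__":
    strong = ToyReplicatedStore(synchronous=True)
    strong.put("report.txt", "v1")
    print(strong.get_from_replica("report.txt"))

    eventual = ToyReplicatedStore(synchronous=False)
    eventual.put("report.txt", "v1")
    print(eventual.get_from_replica("report.txt"))  # stale read before sync()
    eventual.sync()
    print(eventual.get_from_replica("report.txt"))
```

An application written against the synchronous variant can read its own writes from any copy, whereas one written against the asynchronous variant has to tolerate or retry stale reads.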

References

  1. "Caching: Managing Data Replication in Alluxio". https://docs.alluxio.io/os/user/stable/en/core-services/Caching.html#managing-data-replication-in-alluxio

  2. "Caching: Managing Data Replication in Alluxio". https://docs.alluxio.io/os/user/stable/en/core-services/Caching.html#managing-data-replication-in-alluxio

  3. "Erasure Code Profiles". https://docs.ceph.com/en/latest/rados/operations/erasure-code-profile/

  4. "Pools". https://docs.ceph.com/en/latest/rados/operations/pools/

  5. Satyanarayanan, Mahadev; Kistler, James J.; Kumar, Puneet; Okasaki, Maria E.; Siegel, Ellen H.; Steere, David C. "Coda: A Highly Available File System for a Distributed Workstation Environment" (PDF). https://www.csee.umbc.edu/courses/graduate/CMSC621/fall2006/lectures/coda.pdf

  6. "Erasure coding implementation". GitHub. 2 November 2021. https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/ec-implementation.md

  7. "Setting up GlusterFS Volumes". https://docs.gluster.org/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/

  8. "MountableHDFS". https://cwiki.apache.org/confluence/display/HADOOP2/MountableHDFS

  9. "HDFS-7285 Erasure Coding Support inside HDFS". https://issues.apache.org/jira/browse/HDFS-7285

  10. "Apache Hadoop: setrep". https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep

  11. Erasure coding plan: "Reed-Solomon layer over IPFS #196". GitHub. https://github.com/ipfs/notes/issues/196; "Erasure Coding Layer #6". GitHub.

  12. "CLI Commands: ipfs bitswap wantlist". https://docs.ipfs.io/reference/cli/#ipfs-bitswap-wantlist

  13. "Why The Internet Needs IPFS Before It's Too Late". 4 October 2015. https://techcrunch.com/2015/10/04/why-the-internet-needs-ipfs-before-its-too-late/

  14. "Is LizardFS development still alive?". GitHub. https://github.com/lizardfs/lizardfs/issues/805#issuecomment-2238866486

  15. "Configuring Replication Modes". https://docs.lizardfs.com/adminguide/replication.html

  16. "Configuring Replication Modes: Set and show the goal of a file/directory". https://docs.lizardfs.com/adminguide/replication.html#set-and-show-the-goal-of-a-file-directory

  17. "Lustre Operations Manual: What a Lustre File System Is (and What It Isn't)". https://doc.lustre.org/lustre_manual.xhtml#understandinglustre.whatislustre

  18. Reed-Solomon in progress: "LU-10911 FLR2: Erasure coding". https://jira.whamcloud.com/browse/LU-10911

  19. "Lustre Operations Manual: Lustre Features". https://doc.lustre.org/lustre_manual.xhtml#idm139974537188976

  20. File-level redundancy plan: "File Level Redundancy Solution Architecture". https://wiki.lustre.org/File_Level_Redundancy_Solution_Architecture

  21. "MinIO Erasure Code Quickstart Guide". https://docs.min.io/docs/minio-erasure-code-quickstart-guide.html

  22. "MinIO Storage Class Quickstart Guide". GitHub. https://github.com/minio/minio/tree/master/docs/erasure/storage-class

  23. Only available in the proprietary version 4.x "[feature] erasure-coding #8". GitHub. https://github.com/moosefs/moosefs/issues/8

  24. "mfsgoal(1)". https://fossies.org/linux/moosefs/mfsmanpages/mfsgoal.1

  25. "Replicating Volumes (Creating Read-only Volumes)". http://docs.openafs.org/AdminGuide/HDRWQ192.html

  26. "OpenAFS". https://www.openafs.org/release/openafs-1.0.html

  27. "OpenIO SDS Documentation". docs.openio.io. https://docs.openio.io/

  28. "Erasure Coding". https://docs.openio.io/latest/source/admin-guide/configuration_ec.html

  29. "Declare Storage Policies". https://docs.openio.io/latest/source/admin-guide/configuration_storagepolicies.html

  30. "The Quantcast File System" (PDF). https://www.cs.utah.edu/~hari/teaching/bigdata/qfs-ovsiannikov.pdf

  31. "qfs/src/cc/tools/cptoqfs_main.cc". GitHub. 8 December 2021. https://github.com/quantcast/qfs/blob/2.2.2/src/cc/tools/cptoqfs_main.cc#L259

  32. "About RozoFS: Mojette Transform". http://rozofs.github.io/rozofs/develop/AboutRozoFS.html#mojette-transform

  33. "Setting up RozoFS: Exportd Configuration File". http://rozofs.github.io/rozofs/develop/SettingUpRozoFS.html#exportd-configuration-file

  34. "Initial commit". GitHub. https://github.com/rozofs/rozofs/commit/9818e92f73fe4432c8d29236158e271da9ee3bf2

  35. "About Tahoe-LAFS". GitHub. 24 February 2022. https://github.com/tahoe-lafs/tahoe-lafs/blob/master/README.rst#licence

  36. "zfec -- a fast C implementation of Reed-Solomon erasure coding". GitHub. 24 February 2022. https://github.com/tahoe-lafs/zfec

  37. "Tahoe-LAFS Architecture: File Encoding". https://tahoe-lafs.readthedocs.io/en/latest/architecture.html#file-encoding

  38. "Under the Hood: File Replication". http://www.xtreemfs.org/how_replication_works.php

  39. "Quickstart: Replicate A File". http://www.xtreemfs.org/quickstart_repl.php

  40. "FRAUNHOFER FS (FhGFS) END USER LICENSE AGREEMENT". Fraunhofer Society. 2012-02-22. http://www.fhgfs.com/docs/FraunhoferFS_EULA.txt

  41. "ObjectiveFS official website". https://objectiveFS.com

  42. "IBM Plans to Acquire Cleversafe for Object Storage in Cloud". www-03.ibm.com. 2015-10-05. Archived from the original on October 8, 2015. Retrieved 2019-05-06. https://web.archive.org/web/20151008001155/http://www-03.ibm.com/press/us/en/pressrelease/47776.wss

  43. Séguin, Cyril; Depardon, Benjamin; Le Mahec, Gaël. "Analysis of Six Distributed File Systems" (PDF). HAL. https://hal.archives-ouvertes.fr/file/index/docid/789086/filename/a_survey_of_dfs.pdf

  44. "Data Consistency Models of Public Cloud Storage Services: Amazon S3, Google Cloud Storage and Windows Azure Storage". SysTutorials. 4 February 2014. Retrieved 19 June 2017. https://www.systutorials.com/3551/data-consistency-models-of-public-cloud-storage-services-amazon-s3-google-cloud-storage-and-windows-azure-storage/