International Business Machines Corporation
Enabling a Hadoop file system with POSIX compliance
Last updated:
Abstract:
A distributed file system (DFS) is provided that is configured to store data in a General Parallel File System (GPFS) and to interface with a client configured to interface with a HADOOP Distributed File System (HDFS). The DFS includes a first node and a plurality of second nodes including the GPFS. The first node is configured to convert an HDFS command from the client into a GPFS command, apply the GPFS command to the GPFS to access a GPFS file, format an HDFS data structure to include identifiers of a set of the second nodes storing the GPFS file, a filename of the GPFS file, and an offset into the GPFS file, and send the HDFS data structure to the client. Each of the second nodes is configured to access the GPFS using a part of the HDFS data structure received from the client.
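The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory `SharedStore` merely stands in for the GPFS, and every name here (`HDFS_TO_GPFS`, `get_block_locations`, `lookup_file`, `HdfsLocation`, `FirstNode`, `SecondNode`) is hypothetical and does not correspond to a real HDFS or GPFS API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical command mapping; the names are illustrative only.
HDFS_TO_GPFS = {"get_block_locations": "lookup_file"}


@dataclass(frozen=True)
class HdfsLocation:
    """HDFS-style structure sent back to the client (assumed layout)."""
    node_ids: Tuple[str, ...]  # second nodes storing the GPFS file
    filename: str              # name of the GPFS file
    offset: int                # byte offset into the GPFS file


class SharedStore:
    """In-memory stand-in for the GPFS shared by the second nodes."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}
        self.placement: Dict[str, List[str]] = {}

    def lookup_file(self, filename: str) -> List[str]:
        if filename not in self.files:
            raise FileNotFoundError(filename)
        return self.placement[filename]


class FirstNode:
    """Converts an HDFS command and formats the reply for the client."""

    def __init__(self, store: SharedStore) -> None:
        self.store = store

    def handle(self, hdfs_command: str, filename: str, offset: int = 0) -> HdfsLocation:
        gpfs_command = HDFS_TO_GPFS[hdfs_command]               # convert the command
        node_ids = getattr(self.store, gpfs_command)(filename)  # apply it to the store
        return HdfsLocation(tuple(node_ids), filename, offset)


class SecondNode:
    """Serves data using part of the structure the client received."""

    def __init__(self, node_id: str, store: SharedStore) -> None:
        self.node_id = node_id
        self.store = store

    def read(self, filename: str, offset: int, length: int) -> bytes:
        return self.store.files[filename][offset:offset + length]


store = SharedStore()
store.files["/data/log.txt"] = b"hello gpfs world"
store.placement["/data/log.txt"] = ["node-2", "node-3"]

# The client asks the first node, then reads from one of the listed second nodes.
location = FirstNode(store).handle("get_block_locations", "/data/log.txt", offset=6)
data = SecondNode(location.node_ids[0], store).read(location.filename, location.offset, 4)
```

In this sketch the client passes only the filename and offset from the returned structure to a second node, mirroring the abstract's statement that each second node uses a part of the HDFS data structure.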
Utility
9 Jun 2016
31 Aug 2021