The Gfarm filesystem daemon (gfsd) runs on each node of the Gfarm pool. It provides access-controlled remote file operations on the Gfarm filesystem through a lightweight GFS RPC, and supports dynamic loading of executables from the Gfarm server. It also monitors and controls node resource status.
Metadata for files in the Gfarm filesystem is stored in the Gfarm Meta Database. The metadata consists of a mapping from a logical Gfarm file name to the physical file names of its distributed fragments; file status information, including file size, protection, access/modification/change times, and checksum; a replica catalog; and a history. The history is needed to recompute the data when a node or a disk fails, or to verify how the data was generated. Metadata is registered when each Gfarm fragment is closed, and its validity is checked after all parallel processes terminate. If one of the user processes terminates unexpectedly without registering its metadata while the other processes register theirs correctly, the metadata remains invalid and is deleted by the system.
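The registration and validity rule above can be illustrated with a minimal sketch. It assumes a simplified in-memory catalog; the names FragmentMeta, GfarmFileMeta, and finalize are hypothetical and are not part of the Gfarm Meta Database interface, and the history and most file status fields are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FragmentMeta:
    """Hypothetical per-fragment record, registered when the fragment is closed."""
    index: int
    size: int
    checksum: str
    replicas: List[str] = field(default_factory=list)  # hosts holding a copy


@dataclass
class GfarmFileMeta:
    """Hypothetical record for one logical Gfarm file."""
    logical_name: str
    expected_fragments: int
    fragments: Dict[int, FragmentMeta] = field(default_factory=dict)

    def register(self, frag: FragmentMeta) -> None:
        # Called at the close operation of a fragment.
        self.fragments[frag.index] = frag

    def is_valid(self) -> bool:
        # Valid only if every expected fragment registered its metadata.
        return all(i in self.fragments for i in range(self.expected_fragments))


def finalize(catalog: Dict[str, GfarmFileMeta], name: str) -> bool:
    """Run after all parallel processes terminate.

    Keeps the entry if it is complete; otherwise deletes it, mirroring the
    rule that invalid metadata is removed by the system.
    """
    if catalog[name].is_valid():
        return True
    del catalog[name]
    return False
```

In this sketch, a file written by three processes of which only two reach the close operation fails the check, and its entry is dropped from the catalog.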
The Gfarm server is based on the network-enabled server, a major component of GridRPC, extended with Gfarm filesystem capability. It authenticates the Gfarm client using the Generic Security Service for mutual authentication and single sign-on, and executes a parallel program, which may be a user program registered by the Gfarm client, on the Gfarm pool spread over the Grid. The Gfarm server analyses the input and output Gfarm files and selects the Gfarm pool nodes on which to run the program by querying the Gfarm Meta Database. Scheduling should take into account the physical locations of the Gfarm file fragments, the replica information, and the node status in the Gfarm pool.
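One way to combine the three scheduling inputs named above (fragment locations, replica information, and node status) is a greedy locality-aware assignment. The sketch below is only an illustration under stated assumptions, not the Gfarm server's scheduler: the function schedule and its inputs are hypothetical, node status is reduced to a single load value, and a host missing from the load table is treated as down.

```python
from typing import Dict, List


def schedule(replicas: Dict[int, List[str]], load: Dict[str, float]) -> Dict[int, str]:
    """Assign each fragment to a live host that already stores a copy of it.

    replicas: fragment index -> hosts holding a replica (assumed to come
              from the replica catalog)
    load:     host -> current load; hosts absent from this table are
              treated as unavailable (an assumption of this sketch)
    """
    assigned: Dict[str, int] = {}  # processes already placed on each host
    plan: Dict[int, str] = {}
    for index in sorted(replicas):
        candidates = [h for h in replicas[index] if h in load]
        if not candidates:
            raise RuntimeError(f"no live node holds fragment {index}")
        # Prefer the least loaded host, counting work placed earlier in
        # this plan; ties are broken by host name for determinism.
        best = min(candidates, key=lambda h: (load[h] + assigned.get(h, 0), h))
        plan[index] = best
        assigned[best] = assigned.get(best, 0) + 1
    return plan
```

Because every process is placed on a node that stores its fragment, no fragment data has to cross the network in this sketch; a fragment with no live replica is reported as an error rather than scheduled remotely.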
The Gfarm client interacts with the Gfarm server and the Gfarm system through GUIs or a shell front end called the Gfarm shell. More sophisticated client programs can use GridRPC, whose dynamic Interface Description Language (IDL) loading and management makes it easy for users to execute remote procedures. Using the shell, users can register and execute their analysis software as well as monitor and administer the system.
Several tools connect the Gfarm filesystem with conventional filesystems or with network streams, for example via GridFTP. gfimport imports and scatters large-scale data, and gfexport gathers and exports it. The Gfarm system balances load by redistributing the data based on program profiling.
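The scatter and gather roles of the import and export tools can be pictured with a small round-trip sketch. It assumes the simplest possible policy, contiguous fragments of near-equal size; the actual partitioning used by gfimport is not described here, and the functions scatter and gather are hypothetical stand-ins rather than the tools themselves.

```python
from typing import List


def scatter(data: bytes, nfragments: int) -> List[bytes]:
    """Split data into nfragments contiguous pieces of near-equal size."""
    if nfragments <= 0:
        raise ValueError("nfragments must be positive")
    base, extra = divmod(len(data), nfragments)
    fragments: List[bytes] = []
    offset = 0
    for i in range(nfragments):
        # The first `extra` fragments take one additional byte each.
        size = base + (1 if i < extra else 0)
        fragments.append(data[offset:offset + size])
        offset += size
    return fragments


def gather(fragments: List[bytes]) -> bytes:
    """Concatenate fragments in index order to recover the original data."""
    return b"".join(fragments)
```

Gathering the fragments in index order reproduces the input exactly, which is the property an import followed by an export is expected to preserve.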