Integrating Theory and Practice in Parallel File Systems

Thomas H. Cormen, Dartmouth College
David Kotz, Dartmouth College

Describes the file system capabilities that parallel I/O algorithms need in order to use a parallel disk system effectively. Cite cormen:integrate.

Abstract

Several algorithms for parallel disk systems that have recently appeared in the literature are asymptotically optimal in the number of disk accesses. Scalable systems with parallel disks must be able to run these algorithms. We present a list of capabilities that the system must provide to support these optimal algorithms: control over declustering, queries about the configuration, independent I/O, the ability to turn off file caching and prefetching, and the ability to bypass parity. We summarize recent theoretical and empirical work that justifies the need for these capabilities.
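
As a rough illustration of the capabilities listed above, the following Python sketch models a file spread over several disks entirely in memory. It is not an interface taken from the paper: the class ToyParallelFile, the function round_robin, and every method and parameter name are hypothetical, chosen only to show what each capability might look like to a program. The caching, prefetching, and parity settings are recorded as plain flags and have no effect in this toy model.

```python
def round_robin(logical_block, num_disks):
    """Hypothetical default layout: stripe logical blocks across disks in turn."""
    return logical_block % num_disks, logical_block // num_disks


class ToyParallelFile:
    """Hypothetical in-memory model of a file declustered over several disks."""

    def __init__(self, num_disks, block_size, layout=round_robin,
                 caching=True, prefetching=True, parity=True):
        self.num_disks = num_disks
        self.block_size = block_size
        # Control over declustering: the caller supplies the block-to-disk mapping.
        self.layout = layout
        # Turning off caching/prefetching and bypassing parity: modeled as flags
        # only; a real system would change its I/O path based on them.
        self.caching = caching
        self.prefetching = prefetching
        self.parity = parity
        # One dict per disk, mapping a disk-local block number to its bytes.
        self.disks = [dict() for _ in range(num_disks)]

    def configuration(self):
        """Querying the configuration: report what the algorithm needs to know."""
        return {
            "num_disks": self.num_disks,
            "block_size": self.block_size,
            "caching": self.caching,
            "prefetching": self.prefetching,
            "parity": self.parity,
        }

    def locate(self, logical_block):
        """Return (disk, disk-local block) for a logical block under the layout."""
        return self.layout(logical_block, self.num_disks)

    def parallel_write(self, requests):
        """Independent I/O: requests maps disk -> (local_block, data).

        Each disk appears at most once, and the local block numbers may differ
        from disk to disk.
        """
        for disk, (local_block, data) in requests.items():
            if len(data) != self.block_size:
                raise ValueError("data must be exactly one block long")
            self.disks[disk][local_block] = data

    def parallel_read(self, requests):
        """Independent I/O: requests maps disk -> local_block; returns disk -> data."""
        return {disk: self.disks[disk][local_block]
                for disk, local_block in requests.items()}


# Usage: a 4-disk file with caching, prefetching, and parity all switched off.
f = ToyParallelFile(num_disks=4, block_size=2,
                    caching=False, prefetching=False, parity=False)
f.parallel_write({0: (3, b"ab"), 2: (0, b"cd")})  # different block on each disk
blocks = f.parallel_read({0: 3, 2: 0})
```

The point of the sketch is the shape of the calls: a single parallel request may name a different block on each disk, and the layout and system settings are visible to, and chosen by, the program rather than hidden by the file system.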