
Scalable Parallel File System Simulation

This project was a LANL/UCSC collaboration that attracted strong interest at labs and universities. The goal was to create a simulator for parallel file systems. Such a simulator would:

  • enable file system designers and researchers to try out innovative data placement strategies and other novel subsystems at scale,
  • facilitate file system deployment by providing a low-cost platform for “what-if” workload and file system tuning scenarios,
  • empower scientists to quickly tune existing file systems for specific workloads,
  • aid instructors by providing a platform for classroom experiments.

In the first phase, we began building the simulator on a very simple model of parallel file systems and a set of placement strategies from commonly used systems (so far: PVFS, PanFS, and Ceph). The plan was to validate the simulator by replaying traces collected by the Petascale Data Storage Institute (PDSI), LLNL, and industry. Validation was to follow a careful and disciplined process of adding and removing features in the simulator’s file system model to arrive at the minimal set of features necessary to reproduce a real system’s behavior.
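To make the idea of replaying a trace against a pluggable placement strategy concrete, here is a minimal sketch in Python. It is not the project's simulator and does not reproduce the actual placement logic of PVFS, PanFS, or Ceph. The stripe size, the round-robin placement function, the trace record format (file_id, offset, length), and all names below are assumptions made purely for illustration.

```python
from collections import Counter

STRIPE_SIZE = 64 * 1024  # assumed stripe unit in bytes (hypothetical value)


def round_robin_placement(file_id, stripe_index, num_servers):
    """Map a stripe to a server by plain round-robin striping (illustrative only)."""
    return (file_id + stripe_index) % num_servers


def replay(trace, num_servers, placement=round_robin_placement):
    """Replay (file_id, offset, length) records; return bytes handled per server."""
    load = Counter()
    for file_id, offset, length in trace:
        end = offset + length
        while offset < end:
            # Split each request at stripe boundaries and charge each piece
            # to the server chosen by the placement strategy.
            stripe_index = offset // STRIPE_SIZE
            stripe_end = (stripe_index + 1) * STRIPE_SIZE
            chunk = min(end, stripe_end) - offset
            load[placement(file_id, stripe_index, num_servers)] += chunk
            offset += chunk
    return load


# A tiny made-up trace, not taken from any real system.
trace = [(0, 0, 128 * 1024), (1, 32 * 1024, 64 * 1024)]
load = replay(trace, num_servers=4)
```

Passing a different function as `placement` would let the same trace be replayed under another strategy, which is the kind of "what-if" comparison the project description has in mind.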

The work so far was presented at FAST 2009 as a work-in-progress (WiP) talk and poster. The project, however, was never completed.

Mentors

  • John Bent
  • Gary Grider
  • James Nunez
  • Scott Brandt
  • Kleoni Ioannidou
  • Carlos Maltzahn

Sponsors

  • Petascale Data Storage Institute (PDSI)
  • Institute for Scalable Scientific Data Management (ISSDM)
  • GAANN