Abstract
Sorrento: A Self-Organizing Storage Cluster for Parallel Data-Intensive Applications
by: Hong Tang, Aziz Gulbeden, Jingyu Zhou, Lingkun Chu, and Tao Yang
Abstract:
This paper describes the design and implementation of Sorrento -- a self-organizing storage cluster built upon commodity components. Sorrento complements previous research on distributed file/storage systems by focusing on incremental expandability and manageability of the system and on design choices for optimizing performance of parallel data-intensive applications with low write-sharing patterns. Sorrento virtualizes distributed storage devices as incrementally expandable volumes and automatically manages storage node additions and failures. Its consistency model uses a version-based scheme for data updating and replica management, which is especially suitable for data-intensive applications where distributed processes access disjoint datasets most of the time. To further facilitate parallel I/O, Sorrento provides load-aware or locality-driven data placement and an adaptive migration strategy. This paper presents experimental results to demonstrate features and performance of Sorrento using both microbenchmarks and trace-replay of real applications from several domains, including scientific computing, data mining, and offline processing for web search.
Keywords:
storage cluster, distributed file systems, parallel I/O, manageability, incremental expansion, load balancing
Date:
October 2003
Document: 2003-30