Snowball: Scalable Storage on Networks of Workstations with Balanced Load

Abstract

Networks of workstations are an emerging architectural paradigm for high-performance parallel and distributed systems. Exploiting networks of workstations for massive data management poses exciting challenges. We consider here the problem of managing record-structured data in such an environment. For example, managing collections of HTML documents on a cluster of WWW servers is an important application for which our approach provides support. The records are accessed by a dynamically growing set of clients based on a search key (e.g., a URL). To scale up the throughput of client accesses with approximately constant response time, the records and thus also their access load are dynamically redistributed across a growing set of workstations. The paper addresses two problems of realistic workloads: skewed access frequencies to the records and evolving access patterns where previously cold records may become hot and vice versa. Our solution incorporates load tracking at different levels of granularity and automatically chooses the appropriate granularity for dynamic data migrations. Experimental results based on a detailed simulation model show that our method provides scalable cost/performance and allows the cost/performance level to be controlled explicitly.
