Scalable community-driven data sharing in e-science grids

Tobias Scholl, Bernhard Bauer, Benjamin Gufler, Richard Kuntschke, Angelika Reiser, Alfons Kemper

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

E-science projects across various disciplines face a fundamental challenge: thousands of users want to obtain new scientific results through application-specific, dynamic correlation of data from globally distributed sources. Given the enormous and exponentially growing data volumes involved, centralized data management reaches its limits. Since scientific data are often highly skewed and exploration tasks exhibit a large degree of spatial locality, we propose the locality-aware allocation of data objects onto a distributed network of interoperating databases. HiSbase is an approach to data management in scientific federated Data Grids that addresses the scalability issue by combining established database research techniques, namely spatial data structures (quadtrees), histograms, and parallel databases, with the scalable resource sharing and load balancing capabilities of decentralized Peer-to-Peer (P2P) networks. The proposed combination constitutes a complementary e-science infrastructure enabling load balancing and increased query throughput.
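The idea of a quadtree-based histogram over skewed data can be illustrated with a minimal sketch. This is not the HiSbase implementation; it only assumes a two-dimensional data space, a sample of data points, and a fixed budget of regions. The names `Region`, `build_regions`, and `peer_for`, the "split the fullest cell first" rule, and the round-robin mapping of regions to peers are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Region:
    """Half-open cell [x0, x1) x [y0, y1) plus the sample points inside it."""
    x0: float
    y0: float
    x1: float
    y1: float
    points: List[Point] = field(default_factory=list)

    def contains(self, p: Point) -> bool:
        return self.x0 <= p[0] < self.x1 and self.y0 <= p[1] < self.y1

    def split(self) -> List["Region"]:
        """Quadtree step: cut the cell into four equal quadrants."""
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        quads = [
            Region(self.x0, self.y0, mx, my),
            Region(mx, self.y0, self.x1, my),
            Region(self.x0, my, mx, self.y1),
            Region(mx, my, self.x1, self.y1),
        ]
        for x, y in self.points:
            # Quadrant index: bit 0 = right half, bit 1 = upper half.
            quads[(x >= mx) + 2 * (y >= my)].points.append((x, y))
        return quads


def build_regions(sample: List[Point], bounds, max_regions: int) -> List[Region]:
    """Assumed heuristic: keep splitting the fullest cell within the budget."""
    leaves = [Region(*bounds, points=list(sample))]
    while len(leaves) + 3 <= max_regions:  # each split adds three cells
        i = max(range(len(leaves)), key=lambda k: len(leaves[k].points))
        if len(leaves[i].points) <= 1:
            break  # nothing left worth splitting
        leaves[i:i + 1] = leaves[i].split()
    return leaves


def peer_for(leaves: List[Region], p: Point, n_peers: int) -> int:
    """Hypothetical placement: region i is stored on peer i mod n_peers."""
    for i, region in enumerate(leaves):
        if region.contains(p):
            return i % n_peers
    raise ValueError("point lies outside the partitioned data space")


# Skewed sample: a dense 10 x 10 cluster near the origin plus four outliers.
SAMPLE = [(i / 40, j / 40) for i in range(10) for j in range(10)]
SAMPLE += [(0.6, 0.6), (0.9, 0.2), (0.2, 0.9), (0.7, 0.8)]
LEAVES = build_regions(SAMPLE, (0.0, 0.0, 1.0, 1.0), max_regions=16)

# The per-region point counts form the histogram used to judge balance.
print(sorted(len(r.points) for r in LEAVES))
print(peer_for(LEAVES, (0.6, 0.6), n_peers=4))
```

Because the fullest cell is always split first, the dense cluster ends up covered by many small regions while the sparse remainder stays coarse. Nearby points fall into the same region and therefore onto the same peer, which is the locality property the abstract refers to.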

Original language: English
Pages (from-to): 290-300
Number of pages: 11
Journal: Future Generation Computer Systems
Volume: 25
Issue number: 3
State: Published - Mar 2009

Keywords

  • Data sharing (H.3.5)
  • Distributed databases (H.2.4)
  • Scientific databases (H.2.8)
