TY - JOUR
T1 - Scalable community-driven data sharing in e-science grids
AU - Scholl, Tobias
AU - Bauer, Bernhard
AU - Gufler, Benjamin
AU - Kuntschke, Richard
AU - Reiser, Angelika
AU - Kemper, Alfons
PY - 2009/3
Y1 - 2009/3
AB - E-science projects of various disciplines face a fundamental challenge: thousands of users want to obtain new scientific results by application-specific and dynamic correlation of data from globally distributed sources. Considering the involved enormous and exponentially growing data volumes, centralized data management reaches its limits. Since scientific data are often highly skewed and exploration tasks exhibit a large degree of spatial locality, we propose the locality-aware allocation of data objects onto a distributed network of interoperating databases. HiSbase is an approach to data management in scientific federated Data Grids that addresses the scalability issue by combining established techniques of database research in the field of spatial data structures (quadtrees), histograms, and parallel databases with the scalable resource sharing and load balancing capabilities of decentralized Peer-to-Peer (P2P) networks. The proposed combination constitutes a complementary e-science infrastructure enabling load balancing and increased query throughput.
KW - Data sharing (H.3.5)
KW - Distributed databases (H.2.4)
KW - Scientific databases (H.2.8)
UR - http://www.scopus.com/inward/record.url?scp=55949107606&partnerID=8YFLogxK
U2 - 10.1016/j.future.2008.05.006
DO - 10.1016/j.future.2008.05.006
M3 - Article
AN - SCOPUS:55949107606
SN - 0167-739X
VL - 25
SP - 290
EP - 300
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
IS - 3
ER -