TY - GEN
T1 - Community training
T2 - e-Science 2007, 3rd IEEE International Conference on e-Science and Grid Computing
AU - Scholl, Tobias
AU - Kuntschke, Richard
AU - Reiser, Angelika
AU - Kemper, Alfons
PY - 2007
Y1 - 2007
N2 - In federated Data Grids, individual institutions share their data sets within a community to enable collaborative data analysis. Data access needs to be provided in a scalable fashion because, in most e-science communities, data sets not only grow exponentially but also become increasingly popular. If data autonomy is retained, each individual institution has to ensure efficient access to its data. Analyzing application-specific data properties (such as data skew) or query characteristics (query patterns) and distributing data within Data Grids accordingly allows for improved throughput for data-intensive applications and enables better load balancing between shared resources. We propose a framework for investigating application-specific index structures for creating suitable partitioning schemes. We evaluate two variants of the well-known Quadtree data structure as well as the Zones approach, an index structure from the astrophysics domain, according to several criteria. Our framework improves data access within federated Data Grids and can be combined with well-established Grid methods as well as with more flexible P2P technologies.
AB - In federated Data Grids, individual institutions share their data sets within a community to enable collaborative data analysis. Data access needs to be provided in a scalable fashion because, in most e-science communities, data sets not only grow exponentially but also become increasingly popular. If data autonomy is retained, each individual institution has to ensure efficient access to its data. Analyzing application-specific data properties (such as data skew) or query characteristics (query patterns) and distributing data within Data Grids accordingly allows for improved throughput for data-intensive applications and enables better load balancing between shared resources. We propose a framework for investigating application-specific index structures for creating suitable partitioning schemes. We evaluate two variants of the well-known Quadtree data structure as well as the Zones approach, an index structure from the astrophysics domain, according to several criteria. Our framework improves data access within federated Data Grids and can be combined with well-established Grid methods as well as with more flexible P2P technologies.
UR - http://www.scopus.com/inward/record.url?scp=44949242948&partnerID=8YFLogxK
U2 - 10.1109/E-SCIENCE.2007.20
DO - 10.1109/E-SCIENCE.2007.20
M3 - Conference contribution
AN - SCOPUS:44949242948
SN - 0769530648
SN - 9780769530642
T3 - Proceedings - e-Science 2007, 3rd IEEE International Conference on e-Science and Grid Computing
SP - 195
EP - 202
BT - Proceedings - e-Science 2007, 3rd IEEE International Conference on e-Science and Grid Computing
Y2 - 10 December 2007 through 13 December 2007
ER -