TY - GEN
T1 - In-database machine learning
T2 - Datenbanksysteme für Business, Technologie und Web, BTW 2019 und 18. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", DBIS 2019 - Database Systems for Business, Technology and Web, BTW 2019 and 18th Symposium of the GI Department "Databases and Information Systems", DBIS 2019
AU - Schüle, Maximilian
AU - Simonis, Frédéric
AU - Heyenbrock, Thomas
AU - Kemper, Alfons
AU - Günnemann, Stephan
AU - Neumann, Thomas
N1 - Publisher Copyright:
© 2019 Gesellschaft für Informatik (GI). All rights reserved.
PY - 2019
Y1 - 2019
N2 - Machine learning tasks such as regression, clustering, and classification are typically performed outside of database systems using dedicated tools, necessitating the extraction, transformation, and loading of data. We argue that database systems, when extended to enable automatic differentiation, gradient descent, and tensor algebra, are capable of solving machine learning tasks more efficiently by eliminating the need for costly data communication. We demonstrate our claim by implementing tensor algebra and stochastic gradient descent using lambda expressions for loss functions as a pipelined operator in a main memory database system. Our approach enables common machine learning tasks to be performed faster than by extended disk-based database systems, and as fast as dedicated tools, by eliminating the time needed for data extraction. This work aims to incorporate gradient descent and tensor data types into database systems, allowing them to handle a wider range of computational tasks.
AB - Machine learning tasks such as regression, clustering, and classification are typically performed outside of database systems using dedicated tools, necessitating the extraction, transformation, and loading of data. We argue that database systems, when extended to enable automatic differentiation, gradient descent, and tensor algebra, are capable of solving machine learning tasks more efficiently by eliminating the need for costly data communication. We demonstrate our claim by implementing tensor algebra and stochastic gradient descent using lambda expressions for loss functions as a pipelined operator in a main memory database system. Our approach enables common machine learning tasks to be performed faster than by extended disk-based database systems, and as fast as dedicated tools, by eliminating the time needed for data extraction. This work aims to incorporate gradient descent and tensor data types into database systems, allowing them to handle a wider range of computational tasks.
UR - http://www.scopus.com/inward/record.url?scp=85072114855&partnerID=8YFLogxK
U2 - 10.18420/btw2019-16
DO - 10.18420/btw2019-16
M3 - Conference contribution
AN - SCOPUS:85072114855
T3 - Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)
SP - 247
EP - 266
BT - Datenbanksysteme für Business, Technologie und Web, BTW 2019 und 18. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", DBIS 2019
A2 - Grust, Torsten
A2 - Naumann, Felix
A2 - Böhm, Alexander
A2 - Lehner, Wolfgang
A2 - Härder, Theo
A2 - Rahm, Erhard
A2 - Heuer, Andreas
A2 - Klettke, Meike
A2 - Meyer, Holger
PB - Gesellschaft für Informatik (GI)
Y2 - 4 March 2019 through 8 March 2019
ER -