Large 00016

Rethinking the Foundations of Databases

Christoph Koch

Recorded 25 November 2010 in Lausanne, Vaud, Switzerland

Event: KTN - Know Thy Neighbor


Contemporary database query languages are ultimately founded on logic and feature an additive operation usually a form of (multi)set union or disjunction that is asymmetric in that additions or updates do not always have an inverse.

This asymmetry puts a greater part of the machinery of mainstream mathematics for equation solving outside the reach of databases. However, such equation solving would be a key functionality that problems such as query equivalence testing, the view update problem, and data integration could be reduced to: In the presence of an asymmetric additive operation they are undecidable. Moreover, query languages with a symmetric additive operation (i.e., which has an inverse and is thus based on ring theory) would open up databases for a large range of scientific applications that are currently the domain of computer algebra systems.

This talk motivates an effort to reinvent database management systems with a foundation in abstract algebra and specifically in ring theory. The presence of an additive inverse allows to cleanly define differences between queries. This gives rise to a database analog of differential calculus that leads to radically new incremental and adaptive query evaluation algorithms that substantially outperform the state of the art techniques.

These algorithms enable a new class of systems which I call Dynamic Data Management Systems. Such systems can maintain continuously fresh query views at extremely high update rates and have important applications in interactive Large-scale Data Analysis. There is a natural connection between differences and updates, motivating the group theoretic study of updates that will lead to better ways of creating out-of-core data processing algorithms for new storage devices. Basing queries on ring theory leads to a new class of systems, Algebraic Data Management Systems, which herald a convergence of database systems and computer algebra systems.

Watched 10059 times.