FUnctionality Sharing In Open eNvironments
Heinz Nixdorf Chair for Distributed Information Systems

QUIS: Query Heterogeneous Data In-Situ

Start date: 2011-09-01

Finish date: 2017-12-31

Status: completed

Birgitta König-Ries
Barzan Mozafari
H.V. Jagadish

Former Member:
Javad Chamanara


Data of interest are often found in a variety of data sources, many of which are not relational databases, but have their own data organization and query capabilities. To answer questions of interest, one has to run queries across data from these heterogeneous sources. The traditional approach is to perform multiple individual data transformation tasks, one per data source, to import the data into a common repository where they can be queried and analyzed. Drawbacks of this approach include the manual effort and the cost of transforming and importing potentially large data sets, and the lost opportunity to exploit any query facilities provided by the data sources.

QUIS (QUery In-Situ) proposes an approach for querying the data “in-situ” to the greatest extent possible, by taking the user query and transforming appropriate portions of it into corresponding query expressions on individual data sources. Realizing this approach requires the development of a unified query model. This model can extract sub-queries matching heterogeneous capabilities of individual sources, perform heterogeneous joins on intermediate results as necessary, and deal with barriers such as incompatible type systems en route.
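The idea of pushing parts of a query down to each source and joining the intermediate results can be illustrated with a minimal Python sketch. It is not the QUIS engine or its query language. The sample data, the table and column names, and the helpers scan_sqlite, scan_csv, hash_join and run_query are all hypothetical and chosen only for illustration. The sketch assumes two sources: an in-memory SQLite table, which can evaluate a filter itself, and a CSV text, which has no query capability and yields string values only.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; neither the schema nor the values come from QUIS.
CSV_TEXT = "site_id,species\n1,oak\n2,beech\n3,pine\n"


def make_db():
    """Build an in-memory SQLite source with a hypothetical 'sites' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sites (id INTEGER, name TEXT, elevation REAL)")
    conn.executemany(
        "INSERT INTO sites VALUES (?, ?, ?)",
        [(1, "A", 450.0), (2, "B", 60.0), (3, "C", 750.0)],
    )
    return conn


def scan_sqlite(conn, min_elevation):
    """Source with query capability: the filter is pushed down as SQL."""
    cur = conn.execute(
        "SELECT id, name FROM sites WHERE elevation >= ?", (min_elevation,)
    )
    return [{"id": row[0], "name": row[1]} for row in cur]


def scan_csv(text):
    """Source without query capability: all rows are read, values stay strings."""
    return list(csv.DictReader(io.StringIO(text)))


def hash_join(left, right, left_key, right_key, coerce=int):
    """Join intermediate results, coercing both keys to a common type."""
    index = {}
    for row in left:
        index.setdefault(coerce(row[left_key]), []).append(row)
    joined = []
    for row in right:
        for match in index.get(coerce(row[right_key]), []):
            joined.append({**match, **row})
    return joined


def run_query(min_elevation):
    """Species observed at sites at or above a given elevation."""
    conn = make_db()
    try:
        sites = scan_sqlite(conn, min_elevation)  # sub-query runs in the source
    finally:
        conn.close()
    observations = scan_csv(CSV_TEXT)  # no pushdown possible for this source
    return hash_join(sites, observations, "id", "site_id")


if __name__ == "__main__":
    for row in run_query(400.0):
        print(row["name"], row["species"])
```

In this sketch the elevation filter runs inside SQLite, while the join runs outside both sources. The integer keys from SQLite and the string keys from the CSV text are converted to a common type before matching, a small instance of the type-system barrier mentioned above.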

Early experiments have shown that QUIS nearly eliminates data preparation time at only a small cost in query execution time compared with a fully integrated, indexed, and loaded relational database.

The system is an open-source project maintained on GitHub. It consists of a query execution engine, GUI and command-line clients, and an R package, RQUIS, that provides the functionality from inside R.

A set of executable versions and their documentation are available online. A Docker image is also provided for easy installation on Linux machines as well as on private or public clouds. You can pull the image by issuing this command in your terminal: docker pull javadch/rquis