This project seeks to develop a general understanding of language-integrated query techniques, including the approach taken in practical languages such as C# and F# as well as research prototypes such as Kleisli and Links.
Provenance, or information about the origin or derivation of data, is important for assessing the trustworthiness of data and identifying and correcting mistakes. Most prior implementations of data provenance have involved heavyweight modifications to database systems and little attention has been paid to how the provenance data can be used outside such a system. We present extensions to the Links programming language that build on its support for language-integrated query to support provenance queries by rewriting and normalizing monadic comprehensions and extending the type system to distinguish provenance metadata from normal data. The main contribution of this paper is to show that the two most common forms of provenance can be implemented efficiently and used safely as a programming language feature with no changes to the database system.
Nested relational query languages have been explored extensively, and underlie industrial language-integrated query systems such as Microsoft's LINQ. However, relational databases do not natively support nested collections in query results. This can lead to major performance problems: if programmers write queries that yield nested results, then such systems typically either fail or generate a large number of queries. We present a new approach to query shredding, which converts a query returning nested data to a fixed number of SQL queries. Our approach, in contrast to prior work, handles multiset semantics, and generates an idiomatic SQL:1999 query directly from a normal form for nested queries. We provide a detailed description of our translation and present experiments showing that it offers comparable or better performance than a recent alternative approach on a range of examples.
Language-integrated query techniques have been explored in a number of different language designs. We consider two different, type-safe approaches employed by Links and F#. Both approaches provide rich dynamic query generation capabilities, and thus amount to a form of heterogeneous staged computation, but to date there has been no formal investigation of their relative expressiveness. We present two core calculi Eff and Quot, respectively capturing the essential aspects of language-integrated querying using effects in Links and quotation in LINQ. We show via translations from Eff to Quot and back that the two approaches are equivalent in expressiveness. Based on the translation from Eff to Quot, we extend a simple Links compiler to handle queries.
Language-integrated query is receiving renewed attention, in part because of its support through Microsoft's LINQ framework. We present a theory of language-integrated query based on quotation and normalisation of quoted terms. Our technique supports abstraction over values and predicates, composition of queries, dynamic generation of queries, and queries with nested intermediate data. Higher-order features prove useful even for constructing first-order queries. We prove that normalisation always succeeds in translating any query of flat relation type to SQL. We present experimental results confirming our technique works, even in situations where Microsoft's LINQ framework either fails to produce an SQL query or, in one case, produces an avalanche of SQL queries.
A practical theory of language-integrated query, J. Cheney, S. Lindley and P. Wadler, to appear, ICFP 2013
Work is in progress to develop a contributed F# 3.0 library that provides composable queries asa drop-in replacement for the standard F# 3.0 query operator. The FSharpComposableQuery library is currently in an beta development stage; the current release can be obtained from NuGet or GitHub from the links below.
The essence of language-integrated query, J. Cheney, S. Lindley and P. Wadler, draft. This is an earlier version of the ICFP 2013 paper.
This file incorporates modifications to the F# PowerPack needed to run the examples in the paper.
This .zip file contains the code for the Practical LINQ library and tests presented in the paper. To run all of the examples successfully it is necessary to first build the FSharp.PowerPack.Linq.modified.dll library using the modified library code above. For the examples to run it is necessary to configure SQL Server by adding databases with appropriate schemas and data, and modify the connection strings.
This code will eventually migrate to a public repository and library packaged to make it easier for F# programmers to use.
The XML file used for the tests is a subset of this movie database, described further here. The actual file used is here.