Latest modified: Jan 12, 2015
Many computing applications require indexed data structures, such as arrays, nested sequences, hash tables, or distributed data parallel entities. Collection-oriented languages offer a convenient way to specify computations over such structures, through operations which act directly on the structures rather than on the individual elements. Data Field Haskell is an attempt to extend Haskell with flexible and generic features for collection-oriented programming.
Data Field Haskell provides an instance of data fields. Data fields generalize arrays: they are pairs (f,b) where f is a function and b is a "bound", an entity which can be interpreted as a predicate (or set). The bound could be two tuples of numbers defining the extents of an array: however, the data field model allows also bounds defining other "shapes" (for instance nested, sparse, and even for nonnumerical indices). The model is designed not to make any undue assumptions on the bounds: rather, it postulates some properties which are necessary to make common collection-oriented operations well-defined. Thus, bounds and data fields can be seen as abstract data types.
A data field can be applied to an index and thus defines a partial function from index domain to value domain. Partial function are conveniently defined through lambda-syntax, and a similar syntax is defined for data fields: just as \x->t defines a function (using Haskell syntax for lambda-expressions), the expression forall x->t defines a data field. Here, the bound is not given explicitly but is instead derived from the bounds in data fields appearing in t according to a set of rules. Bounds correspond to function domains: thus, these rules are designed to compute bounds for forall-expressions corresponding to the function domains for the corresponding lambda-expressions. For instance, forall x->((f,b1)!x + (g,b2)!x) obtains the bound b1 `meet` b2 since (assuming "+" is strict in both arguments) \x->(f x + g x) is defined on the intersection of the domains of f and g. (Here, "!" denotes data field application, and "meet" an abstract operation on bounds which corresponds to set intersection.) A number of rules of this kind can be defined. They make the forall-syntax convenient for expressing many collection-oriented algorithms with a minimum of explicit handling of bounds.
Data field bounds can be overapproximations of the domain of the corresponding partial function. Semantically, a particular error value represents the result of calls falling out of bounds. This value can be stored within a data field, which then represents a partial function with a domain strictly smaller than the set defined by its bound. This makes it possible to, e.g., to use data fields to model SIMD-like data parallel entities, which often are regular arrays where certain elements can be "turned off".
There is a theory for data fields which includes a formal calculus for "forall-abstraction". The homepage of the Data Field project, which encompasses the Data Field Haskell effort, contains a number of references.
Data Field Haskell is a Haskell dialect where the arrays have been replaced with an instance of data fields. This particular instance provides dense, "traditional" array-like data fields, sparse data fields, infinite data fields, and multidimensional data fields which can be sparse in some dimension, dense in some, and infinite in others. The type of data field is determined by the type of bound, which can be finite dense (pair of indices), finite sparse (set of indices), general predicate, which is infinite, and products which are multidimensional bounds formed from simpler bounds. There are also the special bounds empty and universe which represent the empty and universal set, respectively.
Data Field Haskell provides a number of operations on data fields and bounds. There are functions to create data fields and extract their bounds, and to apply data fields to indices. For finite data fields there are two more kinds of operations: data field evaluators, which force an evaluation to various degree of all elements in data fields, and a number of folds over data fields. There are also a number of operations on bounds: join and meet, which correspond to union and intersection of sets, explicit restriction of a data field with a bound, a test for finiteness, an enumeration of the set defined by a finite bound, the size, and a test whether an index belongs to the set defined by a bound or not. However, the intention is that a programmer should not have to use these operations explicitly that often. Data Field Haskell provides forall-abstraction which works as sketched above, and the idea is that a programmer, by using this mechanism, could avoid much explicit handling of bounds. The syntax of forall-abstraction is precisely analogous with the one of lambda-abstraction and the typing is similar. For a concise description of the data field extensions in Data Field Haskell, see here. A more detailed description is found here.
The current implementation of Data Field Haskell is based on nhc98 and thus extends Haskell 98. There is also an earlier Haskell 1.3 version, which is based on nhc13.
This is an experimental compiler based on nhc98 pre-release 19 (2000-06-05). Nhc stands for "Nearly a Haskell Compiler". It was originally written by Nicklas Röjemo, and has since then been extended and maintained by the functional programming group at York. Note that neither Nicklas Röjemo nor the York group takes any responsibility for our modified version.
There are binaries for Linux and Sparc-Solaris 2:
Haskell source code.
This compiler is based on nhc13 release 980327. There are binaries for Linux and Solaris 5.5.1:
Here is the Haskell source code for this version.
nhc98, the current version of nhc from York.
The Haskell home page
This article has been translated into French by Vicky Rotarova.Björn Lisper, email@example.com