
Page 1: Optimizing Recursive Queries with Monotonic Aggregates in DeALS (yellowstone.cs.ucla.edu/~yang/paper/icde2015-p134.pdf)

Optimizing Recursive Queries with Monotonic Aggregates in DeALS

Alexander Shkapsky, Mohan Yang, Carlo Zaniolo
University of California, Los Angeles

{shkapsky, yang, zaniolo}@cs.ucla.edu

Abstract—The exploding demand for analytics has refocused the attention of data scientists on applications requiring aggregation in recursion. After resisting the efforts of researchers for more than twenty years, this problem is being addressed by innovative systems that are raising logic-oriented data languages to the levels of generality and performance that are needed to efficiently support a broad range of applications. Foremost among these new systems, the Deductive Application Language System (DeALS) achieves superior generality and performance via new constructs and optimization techniques for monotonic aggregates, which are described in this paper. The use of a special class of monotonic aggregates in recursion was made possible by recent theoretical results proving that they preserve the rigorous least-fixpoint semantics of core Datalog programs. This paper thus describes how DeALS extends their definitions and modifies their syntax to enable a concise expression of applications that, without them, could not be expressed in performance-conducive ways, or could not be expressed at all. The paper then turns to the performance issue and introduces novel implementation and optimization techniques that outperform traditional approaches, including Semi-naive evaluation. An extensive experimental evaluation compared DeALS with other systems on large datasets. The results suggest that, unlike other systems, DeALS indeed combines superior generality with superior performance.

I. INTRODUCTION

The fast-growing demand for analytics has placed renewed focus on improving support for aggregation in recursion. Aggregates in recursive queries are essential in many important applications and are increasingly being applied in areas such as computer networking [1] and social networks [2]. Many significant applications require iterating over counts or probability computations, including machine learning algorithms for Markov chains and hidden Markov models, and data mining algorithms such as Apriori. Besides these new applications, we can mention a long list of traditional ones such as Bill of Materials (BOM), a.k.a. subparts explosion: this classical recursive query for DBMSs requires aggregating the various parts in the part-subpart hierarchy. Finally, we have problems such as computing the shortest paths or counting the number of paths between vertices in a graph, which are now covered as foundations by most CS101 textbooks.

Although aggregates were not covered in E.F. Codd's definition of the relational calculi [3], it did not take long before early versions of relational languages such as SQL included support for aggregate functions, namely count, sum, avg, min and max, along with associated constructs such as group-by. However, a general extension of recursive query theory and implementation to allow for aggregates proved an elusive goal, and even recent versions of SQL that provide strong support for OLAP and other advanced aggregates disallow the use of aggregates in recursion and only support queries that are stratified w.r.t. aggregates.

Yet the desirability of extending aggregates to recursive queries was widely recognized early, and many partial solutions were proposed over the years for Datalog languages [4]–[10]. The fact that, in general, aggregates are non-monotonic w.r.t. set-containment led to proposals based on non-monotonic theories, such as locally stratified programs and perfect models [11], [12], well-founded models [13] and stable models [14]. An alternative approach was due to Ross and Sagiv [9], who observed that particular aggregates, such as continuous count, are monotonic over lattices other than set-containment and thus can be used in non-stratified programs. However, practical difficulties with this approach were soon pointed out, namely that it would be quite difficult for programmers and compilers to determine the correct lattices [15], and this prevented the deployment of the monotonicity idea in practical query languages for a long time. Fortunately, we recently witnessed dramatic developments that changed the situation completely. Firstly, Hellerstein et al., after announcing a resurgence of Datalog, showed that monotonicity in special lattices can be very useful in proving formal properties such as eventual consistency [16]. Secondly, we see monotonic aggregates making a strong comeback in practical query languages thanks to the results published in [17], [18] and in [2], summarized below.

The formalization of monotonic aggregates proposed in [17], [18] preserves monotonicity w.r.t. set-containment, and it is thus conducive to simplicity and performance, which follow respectively from the facts that (i) users no longer have to deal with lattices, and (ii) query optimization techniques such as Semi-naive evaluation and magic sets remain applicable [17]. SociaLite [2] also made an important contribution by showing that shortest-path queries, and other algorithms using aggregates in recursion, can be implemented very efficiently, so that in many situations the Datalog approach becomes preferable to hand-coding big-data analytics in some procedural language.

These dramatic advances represented a major source of opportunities and challenges for our Deductive Application Language (DeAL) and its system DeALS. In fact, unlike the design of the SociaLite system, in which the performance of recursive graph algorithms with aggregates had played a central role, DeALS has been designed as a very general system seeking to satisfy the many needs and lessons learned in the course of a long experience with logic-based data languages, and the LDL [19] and LDL++ [20] experiences in particular. Thus, DeALS supports key non-monotonic constructs having formal stable model semantics, including, e.g.,


XY-stratification and the choice construct, which were found quite useful in program analysis [21], and user-defined aggregates that enabled important knowledge-discovery applications [22].

In addition to a rich set of constructs, DeALS was also designed to support a roster of optimization techniques including magic sets, supplementary magic sets and existential quantification. Introducing powerful new constructs and their optimization techniques by retrofitting a system that already supports a rich set of constructs and optimizations represented a difficult technical challenge. In this paper, we describe how this challenge was met with the introduction of new optimization techniques for monotonic aggregates. We will show that DeALS now achieves both performance and generality, and we will underscore this by comparing not only with SociaLite but also with systems such as DLV [23] and LogicBlox [24] that realize different performance/generality tradeoffs.

Overview. The first of the two main parts of this paper begins with Section II, which presents the syntax and semantics for the min (mmin) and max (mmax) monotonic aggregates. Section III discusses the evaluation and optimization of monotonic aggregate programs. Section IV presents implementation details for mmin and mmax and the DeALS storage manager. Section V presents experimental results for Sections II-IV. The second part of this paper begins with Section VI, discussing the count (mcount) and sum (msum) monotonic aggregates, followed by their implementation in Section VII and experimental validation in Section VIII. Section IX provides additional DeAL program examples. Section X presents the formal semantics on which our aggregates are based. Additional related works are reviewed in Section XI and we conclude in Section XII.

Preliminaries. A Datalog program P is a finite set of rules, or Horn Clauses, where rule r in P has the form A ← A1, . . . , An. The atom A is the head of r. A1, . . . , An, the body of r, are literals, or goals, where each literal can be either a positive or negated atom. An atom has the form p(t1, . . . , tj) where p is a predicate and t1, . . . , tj are terms, which can be constants, variables or functions. A rule r with an empty body is a fact. A successful assignment of all variables of the rule body goals results in a successful derivation for the rule head predicate. Datalog programs use set semantics and are (typically) stratified (i.e., partitioned into levels based on rule dependencies) and executed in level-by-level order, in a bottom-up fashion. Datalog programs can be evaluated using an iterative approach such as Semi-naive evaluation [10].
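The bottom-up, fixpoint-based evaluation described above can be sketched for a small transitive-closure program (a minimal Python illustration, not DeALS code; `edge` and `naive_tc` are hypothetical names used only for this example):

```python
# Bottom-up evaluation of:  tc(X,Y) <- edge(X,Y).  tc(X,Y) <- tc(X,Z), edge(Z,Y).
edge = {("a", "b"), ("b", "c"), ("c", "d")}

def naive_tc(edge):
    tc = set(edge)                      # exit rule seeds the relation
    while True:
        # apply the recursive rule to every fact derived so far
        new = {(x, w) for (x, y) in tc for (z, w) in edge if y == z}
        if new <= tc:                   # fixpoint: no new facts derived
            return tc
        tc |= new

print(sorted(naive_tc(edge)))
```

Set semantics makes the iteration terminate: once an application of the rules adds no fact that is not already in the relation, a fixpoint has been reached.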

II. MMIN AND MMAX MONOTONIC AGGREGATES

An mmin or mmax monotonic aggregate rule has the form:

p(K1, . . . , Km, aggr〈T〉) ← Rule Body.

In the rule head, K1, . . . , Km are the zero or more group-by arguments, which we also refer to as K, aggr ∈ {mmax, mmin} is the monotonic aggregate, and T, the aggregate term, is a variable.

The aggregate functions mmin and mmax map an input set or multiset, which we will call G, to an output set, which we will call D. Given G, for each element g ∈ G, mmin will put g into the output set D if g is less than the least value mmin has previously computed (observed) for G. Similarly, given an input set G, for each element g ∈ G, mmax will put g in the output set D if g is greater than the greatest value mmax has previously computed for G. The mmin and mmax aggregates are monotonic w.r.t. set-containment and can be used in recursive rules, and G should be viewed as a set containing the union of all values for a single group (group-by key) across all iterations. These aggregates memorize the most recently computed value and thus require a single pass¹ over G. When viewed as a sequence, the values produced by mmin and mmax are monotonic.
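The per-group behavior described above can be sketched as follows (a hypothetical Python stand-in for mmin, not the DeALS implementation): a value is emitted only when it is strictly less than the least value memorized so far for its group.

```python
# Sketch of the mmin aggregate's per-group behavior.
class MMin:
    def __init__(self):
        self.memo = {}                      # group key -> least value observed

    def put(self, key, value):
        """Return True (i.e., emit the fact) iff value improves the min."""
        prev = self.memo.get(key)
        if prev is None or value < prev:
            self.memo[key] = value          # memorize the new least value
            return True
        return False

agg = MMin()
print([agg.put(("a", "b"), v) for v in [3, 5, 2, 2]])  # [True, False, True, False]
```

When read as a sequence, the emitted values for a group (here 3, then 2) are monotonically decreasing, which is exactly the monotonicity property the paper exploits.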

A. Running Example

The All-Pairs Shortest Paths (APSP) program has received much attention in the literature [6], [9], [13], [25], [26]. APSP calculates the length of the shortest path between each pair of connected vertices in a weighted directed graph.

Example 1: APSP with mmin
r1. spaths(X, Y, mmin〈D〉) ← edge(X, Y, D).
r2. spaths(X, Y, mmin〈D〉) ← spaths(X, Z, D1), edge(Z, Y, D2), D = D1 + D2.
r3. shortestpaths(X, Y, min〈D〉) ← spaths(X, Y, D).

Example 1 is the DeAL APSP program with the mmin aggregate. The edge predicate denotes the edges of the graph. The intuition for this program is as follows. In the recursion (r1, r2), an spaths fact will be derived if a path from X to Y is either i) new or ii) shorter than the currently known length from X to Y. r1 finds the shortest path for each edge. r2 is the left-linear recursive rule that computes new shortest paths for spaths by extending previously derived paths in spaths with an edge. Logically, this approach can result in many spaths facts for X, Y, each with a different length. Therefore, the program is stratified using a traditional (non-monotonic) min aggregate (r3) to select the shortest path for each X, Y.

APSP By Example. Next, we walk through a Semi-naive evaluation of the APSP program from Example 1.

edge(a, b, 1). edge(a, c, 3). edge(a, d, 4).
edge(b, c, 1). edge(b, d, 4). edge(c, d, 1).

Fig. 1. edge Facts for Example 1

First, r1 in Example 1, the exit rule, is evaluated on the edge facts in Fig. 1. In the rule head of r1, X and Y, the non-aggregate arguments, are the group-by arguments. The mmin aggregate is applied to each of the six edge facts and six spaths facts are successfully derived (not displayed to conserve space), because no aggregate values had been previously computed (memorized) and each group (e.g., (a, b)) was represented among the facts only once. For the spaths predicate, mmin is now initialized with a value for each group.

spaths(a, c, 2) ← spaths(a, b, 1), edge(b, c, 1), 2=1+1.

FAIL ← spaths(a, b, 1), edge(b, d, 4), 5=1+4. [i]
FAIL ← spaths(a, c, 3), edge(c, d, 1), 4=3+1. [ii]

spaths(b, d, 2) ← spaths(b, c, 1), edge(c, d, 1), 2=1+1.

Fig. 2. Derivations of Example 1, r2 - Iteration 1

Semi-naive evaluates the recursive rule r2 from Example 1 using the six spaths facts derived by r1. Fig. 2 displays four derivations attempted by r2 in its first iteration. Derivations not displayed failed to join spaths and edge facts. The first attempt results in a new spaths fact because spaths(a, c, 2) has an aggregate value less than the previous value for (a, c), which was 3 (from r1). The failures denoted [i] and [ii] occurred because the facts to be derived would have aggregate values not less than the previous value for (a, d), which is 4. Finally, spaths(b, d, 2) is derived (2 < 4 for (b, d)).

¹SQL 2003 max, min, count and sum aggregates on the unlimited preceding window are similar to DeAL's monotonic aggregates.

spaths(a, d, 3) ← spaths(a, c, 2), edge(c, d, 1), 3=2+1.

Fig. 3. Derivations of Example 1, r2 - Iteration 2

Using the two facts derived in Fig. 2, Semi-naive performs a second iteration using r2. As displayed in Fig. 3, spaths(a, d, 3) is derived because 3 < 4 for (a, d). Now, no new facts can be derived and a fixpoint is reached.

shortestpaths(a, c, 2) ← {spaths(a, c, 3), spaths(a, c, 2)}
shortestpaths(a, d, 3) ← {spaths(a, d, 4), spaths(a, d, 3)}
shortestpaths(b, d, 2) ← {spaths(b, d, 4), spaths(b, d, 2)}

Fig. 4. Derivations of Example 1, r3

Lastly, r3 is evaluated over the spaths facts derived during recursion and uses a stratified min aggregate to derive only the fact with the shortest path for each group. Fig. 4 displays the derivations of r3 on groups that had multiple facts derived in recursion, showing why rules like r3 are necessary with our semantics. In Section IV, we will discuss optimizations so that rules such as r3 do not have to be evaluated.
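The walkthrough above can be reproduced with a small Python rendering of Example 1 over the Fig. 1 edge facts (a sketch, not DeALS code; `apsp` is a hypothetical name): r1 and r2 become an iterated relaxation that keeps only improved lengths, as mmin does, and the final dictionary plays the role of the stratified min in r3.

```python
# The Fig. 1 edge facts.
edges = [("a", "b", 1), ("a", "c", 3), ("a", "d", 4),
         ("b", "c", 1), ("b", "d", 4), ("c", "d", 1)]

def apsp(edges):
    spaths = {(x, y): d for (x, y, d) in edges}   # r1: mmin over the edge facts
    changed = True
    while changed:                                 # iterate to fixpoint
        changed = False
        for (x, z), d1 in list(spaths.items()):    # r2: extend known paths by an edge
            for (z2, y, d2) in edges:
                if z == z2 and spaths.get((x, y), float("inf")) > d1 + d2:
                    spaths[(x, y)] = d1 + d2       # improved min: keep it
                    changed = True
    return spaths

print(apsp(edges))
```

The result agrees with the derivations above: (a, c) ends at 2, (a, d) at 3, and (b, d) at 2.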

III. MONOTONIC AGGREGATE EVALUATION

In this section, we present optimized evaluation techniques for programs with monotonic aggregates. We start with a review of Semi-naive fixpoint evaluation, the technique that serves as the basis for our optimized evaluation approaches.

In Fig. 5, the algorithm for Semi-naive, M is the initial model (database), S contains all facts obtained thus far, δS and δS′ contain facts obtained during the previous and current iteration, respectively, and TE and TR are the Immediate Consequence Operators (ICOs) for the exit rule(s) and the recursive rule(s), respectively. The algorithm evaluates as follows. Firstly, Semi-naive applies TE (i.e., the exit rules) on M to derive the first set of new δ facts, δS (line 2). Then, until no new facts are derived during an iteration, Semi-naive evaluates TR on δS to derive new facts to be used in the next iteration. The new set of δ facts (δS′) is produced only after the removal of facts found in previous steps (line 5).

1: S := M;
2: δS := TE(M);
3: S := S ∪ δS;
4: while δS ≠ ∅ do
5:   δS′ := TR(δS) − S;
6:   S := S ∪ δS′;
7:   δS := δS′;
8: return S;

Fig. 5. Semi-naive Evaluation
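The Fig. 5 algorithm can be transcribed almost line for line (a Python sketch, not the DeALS interpreter, assuming `T_E` and `T_R` are passed in as set-valued ICOs; all names here are illustrative):

```python
def semi_naive(M, T_E, T_R):
    S = set(M)                  # line 1: all facts obtained so far
    dS = set(T_E(M))            # line 2: exit rules produce the first deltas
    S |= dS                     # line 3
    while dS:                   # line 4: stop when an iteration adds nothing
        dS2 = set(T_R(dS)) - S  # line 5: keep only genuinely new facts
        S |= dS2                # line 6
        dS = dS2                # line 7
    return S                    # line 8

# Usage: transitive closure, with T_R joining only the delta facts against edge.
edge = {("a", "b"), ("b", "c")}
result = semi_naive(edge, lambda M: set(M),
                    lambda d: {(x, w) for (x, y) in d for (z, w) in edge if y == z})
print(sorted(result))
```

The key saving over naive evaluation is on line 5: TR is applied only to the previous iteration's deltas, never to all of S.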

Symbolic differentiation rules [10] can be applied to monotonic aggregate rules in a straightforward manner to produce rules for Semi-naive. We omit the details in the interest of space.

Although Semi-naive efficiently evaluates general Datalog programs, monotonic aggregate programs can be evaluated with even greater efficiency than Semi-naive provides. The max-based optimization [18] identified that counting only needs to be performed on maximum (max) values if only monotonic arithmetic and boolean functions are used. In this work, we expand this observation, which we refer to as the Monotonic Optimization. The intuition behind the Monotonic Optimization is that with our monotonic aggregates, monotonicity is preserved and values other than the max (mmax) or min (mmin) will add no new results and thus can be ignored. Only the max (min) intermediate values need to be used in derivations to produce the final max (min) value. In fact, the last fact produced by the aggregate for a group contains the greatest (mmax) or least (mmin) aggregate value, making this fact the only fact for the group that we need to produce for the next iteration.

A. Monotonic Aggregate Semi-naive Evaluation

1: S := M;
2: δS := getLast(TE(M));
3: S := S ∪ δS;
4: while δS ≠ ∅ do
5:   δS′ := getLast(TR(δS)) − S;
6:   S := S ∪ δS′;
7:   δS := δS′;
8: return S;

Fig. 6. Monotonic Aggregate Semi-naive Evaluation (MASN)

The Monotonic Optimization enables an optimized Semi-naive for monotonic aggregates we call Monotonic Aggregate Semi-naive Evaluation (MASN). Fig. 6 is the algorithm for MASN, which closely resembles Semi-naive. MASN's differences with Semi-naive are as follows. Here we use getLast()² to produce, from the input set, a set containing i) all facts from predicates that do not have monotonic aggregates, and ii) the last derived fact for each group from monotonic aggregate predicates. Now, after TE or TR produces a set of facts, getLast is applied to produce the actual new δS′. Otherwise, MASN is the same as Semi-naive.
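A possible shape for getLast() on monotonic aggregate predicates, assuming delta facts arrive as (group key, aggregate value) pairs in derivation order (an illustrative sketch; DeALS itself maintains a single fact per group, per the footnote):

```python
# Keep only the last derived fact for each group: later derivations for a
# group carry the better (more extreme) aggregate value, so earlier ones
# can be discarded before the next iteration.
def get_last(facts):
    last = {}
    for key, value in facts:        # later derivations overwrite earlier ones
        last[key] = value
    return {(k, v) for k, v in last.items()}

delta = [(("a", "c"), 3), (("a", "c"), 2), (("b", "d"), 2)]
print(sorted(get_last(delta)))  # [(('a', 'c'), 2), (('b', 'd'), 2)]
```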

B. Eager Monotonic Aggregate Semi-naive Evaluation

MASN employs the level-by-level iteration boundary of a breadth-first search (BFS) algorithm, where δ facts derived during the current iteration will be held for use until the next iteration. However, facts produced by monotonic aggregate rules can be used immediately upon derivation. Looking at the derivations in the walk-through evaluation of APSP in Section II-A, one can see a case where Semi-naive (and in this case MASN, as it would have evaluated the same as Semi-naive) did not capitalize on this property of monotonic aggregates.

Fig. 7 shows the derivations of interest extracted from Fig. 2.

²DeALS supports MASN by maintaining a single fact per group in δS′.

Example 1 r2 evaluation with Semi-naive or MASN
spaths(a, c, 2) ← spaths(a, b, 1), edge(b, c, 1), 2=1+1.
FAIL ← spaths(a, c, 3), edge(c, d, 1), 4=3+1.

Fig. 7. Example of Iteration Boundary of MASN

The second derivation, performed using spaths(a, c, 3) (from δS), results in failure because the value for (a, d) was 4. However, at the time the derivation is attempted, spaths(a, c, 2), the result of the immediately previous derivation, already existed. Had spaths(a, c, 2) been used, spaths(a, d, 3) would have been derived here, instead of requiring another iteration (Fig. 3).

To further capitalize on the Monotonic Optimization, we propose Eager Monotonic Aggregate Semi-naive Evaluation (EMSN). With EMSN, facts produced by monotonic aggregate rules are immediately available to be used in derivations. EMSN evaluates recursive rules with monotonic aggregates in a fact-oriented (fact-at-a-time) manner, and the facts to use in an iteration are determined by the set of groups (keys) that had aggregate facts derived during the previous iteration. With EMSN, derivations with monotonic aggregates are always performed with the current aggregate value for the group.

Fig. 8 is EMSN. In EMSN, recursive rules with monotonic aggregates are evaluated fact-at-a-time and all other recursive rules are evaluated using Semi-naive. Rules are partitioned into two sets, each with its own ICOs: TEA and TRA are the ICOs for the monotonic aggregate exit and recursive rules, respectively, and TEN and TRN are the ICOs for the remaining exit and recursive rules, respectively. TRA will be applied one fact at a time. δSA and δS′A are the sets of facts obtained during the previous and current iteration, respectively, for the monotonic aggregate rules, while δS and δS′ are the sets of facts obtained during the previous and current iteration, respectively, for the remaining rules. δSAKeys is the set of keys for the aggregate groups that had facts derived during the previous iteration. M is the initial model, and S contains all facts obtained thus far. newfact is a fact derived from a single application of TRA.

Important points of Fig. 8 are as follows. We use getKeys() to project out the aggregate value from the aggregate facts to produce the set of facts representing the groups (keys) to use in derivations in the next iteration. For example, getKeys({spaths(a, b, 1)}) would produce {spaths(a, b)}.

1: S := M;
2: δSA := TEA(M);
3: δSAKeys := getKeys(δSA);
4: δS := TEN(M);
5: S := S ∪ δS ∪ δSA;
6: while δS ≠ ∅ and δSAKeys ≠ ∅ do
7:   for all key ∈ δSAKeys do
8:     while (newfact := TRA(getFact(key))) do
9:       δS′A := δS′A ∪ {newfact};
10:  δS′ := TRN(δS) − S;
11:  S := S ∪ δS′ ∪ δS′A;
12:  δSAKeys := getKeys(δS′A);
13:  δS := δS′; δS′A := ∅;
14: return S;

Fig. 8. Eager Monotonic Aggregate Semi-naive Evaluation Sketch (EMSN)

getKeys() is applied to the set produced by TEA(M) to produce the initial δSAKeys (line 3). TEN (remaining rules) is applied to M to produce the initial δS (line 4). Once in the recursion, each key in δSAKeys is individually used to retrieve its group's current fact from the aggregate relation (getFact(key)), to which TRA is then applied (line 8). Successful derivations result in newfact being added to δS′A (line 9). Then, TRN (remaining rules) is applied to δS and duplicates are eliminated, producing δS′, the set of non-monotonic-aggregate facts to be used in the next iteration (line 10). getKeys(δS′A) produces the set of keys to be used in derivations in the next iteration (line 12). This process repeats until no new facts are produced during an iteration.

Example 1 r2 evaluation with EMSN
spaths(a, c, 2) ← spaths(a, b, 1), edge(b, c, 1), 2=1+1.
spaths(a, d, 3) ← spaths(a, c, 2), edge(c, d, 1), 3=2+1.

Fig. 9. EMSN Fact-at-a-time Efficiency

Now, consider the same scenario from Fig. 7, but this time using EMSN. As shown in Fig. 9, after spaths(a, c, 2) is produced, it is immediately used in the next derivation, resulting in spaths(a, d, 3) being derived an iteration earlier than with Semi-naive or MASN. Moreover, spaths(a, c, 3) will not be used in any further derivations, as with the existence of spaths(a, c, 2) it would only result in the derivation of facts that cannot lead to a final answer.
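The eager, fact-at-a-time behavior can be sketched for the spaths recursion (a Python illustration of the EMSN idea on the Fig. 1 edges, not the DeALS implementation): because the aggregate relation is read at derivation time, a value improved earlier in an iteration is visible to derivations later in that same iteration.

```python
edges = [("a", "b", 1), ("a", "c", 3), ("a", "d", 4),
         ("b", "c", 1), ("b", "d", 4), ("c", "d", 1)]

def emsn_apsp(edges):
    agg = {(x, y): d for (x, y, d) in edges}      # exit rule: current mins per group
    delta_keys = set(agg)                         # groups changed in the last round
    while delta_keys:
        changed = set()
        for (x, z) in delta_keys:
            for (z2, y, d2) in edges:
                if z2 == z:
                    d = agg[(x, z)] + d2          # reads the *current* value, eagerly
                    if d < agg.get((x, y), float("inf")):
                        agg[(x, y)] = d           # improvement visible immediately
                        changed.add((x, y))
        delta_keys = changed                      # only changed groups drive next round
    return agg

print(emsn_apsp(edges))
```

Depending on the order in which keys are processed, (a, c) → 2 can be derived and then used for (a, d) → 3 within the same iteration, which is exactly the saving shown in Fig. 9.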

Discussion. With the application of the ICO for recursive monotonic aggregate rules (TRA) on an individual fact, rather than on a set of facts, EMSN can use facts immediately upon derivation. Although EMSN is based on Semi-naive, and therefore on BFS, EMSN has depth-first search (DFS) characteristics. Like BFS, EMSN still uses a level-at-a-time (iteration) approach guided by the facts in δS and δSA that were derived during the previous iteration. However, because EMSN uses the most recent aggregate value for the group, regardless of when that value was computed, EMSN can evaluate deeper than a single level of the search space during an iteration of evaluation. The result is higher (mmax) or lower (mmin) aggregate values being derived earlier in evaluation, which in turn prunes the search space and avoids the derivation of facts that will not lead to final values.

We considered an alternative approach for EMSN that instead maintains the set of aggregate facts derived during an iteration (δS′A), where a modification to the aggregate relation results in either an update or an insert into δS′A. However, with each iteration δS′A would become δSA and a new δS′A would be started; therefore every modification to the aggregate relation would require both δSA and the new δS′A to be searched and updated with the new value, even if the aggregate group is not present in δSA. Although δSA and δS′A are generally smaller than the aggregate relation, if an aggregate group has many new results during an iteration, the efficiency gained from searching these smaller sets, instead of searching the larger aggregate relation to retrieve the aggregate value when needed (line 8 in Fig. 8), would be offset by searching these sets many times. Furthermore, this approach requires a more complicated implementation to properly synchronize facts across multiple data structures.


IV. MMIN AND MMAX IMPLEMENTATION

This section contains details of our system implementationfor supporting the mmin and mmax aggregates.

A. System Overview

DeALS is an interpreted Datalog system with three main components: the compiler, the interpreter and the storage manager. Monotonic aggregate rules are supported by the compiler with an aggregate rewriting approach based on techniques from [27]. The compiler produces an instantiated AND/OR Graph [20] representing the given program, which the interpreter evaluates to produce query results. The interpreter uses tuple-at-a-time pipelining and evaluates nodes of the AND/OR graph representing goals in rule bodies in a left-to-right fashion with intelligent backtracking [19]. The interpreter uses nested loops joins, but will create an index and use an index nested loops join if the argument binding analysis identifies a bound join argument. For instance, for r2 in Example 1, the interpreter will use an index nested loops join: edge will be indexed on Z (its first argument), which is bound by spaths.

B. Storage Manager Overview

The DeALS storage manager provides support for main memory storage and indexing for predicate relations. DeALS supports several B+Tree data structure variants for tuple storage and indexing. A B+Tree stores fixed-size keys in internal and leaf nodes and non-key attributes in leaf nodes. Leaf nodes have pointers to their right neighbors to enable fast scanning of the tree. Through testing we determined that our implementations perform best on average using a linear key search at both internal and leaf nodes, with 256 bytes allocated for keys (e.g., 32 64-bit long keys) in each node, which results in shallow trees. DeALS supports B+Tree TupleStores, which store tuples in a B+Tree. DeALS also supports an Unordered Heap TupleStore (UHT), where tuples are stored as fixed-size entries in insertion order in large byte arrays. UHTs can be given multiple indexes (e.g., B+Tree), with which they remain synchronized at all times. UHTs enable a highwatermark approach for Semi-naive, where each iteration is a contiguous range of tuples.

1) B+Tree Aggregators: Early experimentation found aggregation using either i) UHT with B+Tree or Linear Hashing-based secondary indexes or ii) B+Tree TupleStores lacking in execution time performance. The B+Tree Aggregator TupleStore (B+AT) is a B+Tree TupleStore optimized for pipelined aggregation in recursive queries that provides both good read and write performance. A B+AT stores fixed-size keys in internal and leaf nodes and fixed-size aggregate values in leaf nodes. Keys are unique and only one aggregate value per key is maintained. Leaf nodes have pointers to their right neighbors, and linear search is used in both internal and leaf nodes. In a B+AT, aggregation is performed in the leaves; therefore only one search of the tree is needed to retrieve the previous value, compare it with the new value, and perform the update.

Unlike with UHT, facts in a B+AT are not easily tracked by reference or range because of node splitting. Therefore, during evaluation of recursive queries with EMSN, after an aggregate value is modified, the B+AT inserts the modified entry's key into a set of keys, which is also a B+Tree3, to process for the next iteration (δSAKeys in Fig. 8). This approach requires two tree searches per aggregate value modification: one B+AT search that results in a modified aggregate value and one δSAKeys search to record the key. Should no modification occur, only the B+AT is searched. To retrieve aggregate facts to use in derivations (line 8 in Fig. 8), a specialized cursor scans δSAKeys, using each key to retrieve the key's aggregate value from the B+AT.
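The update-and-record protocol can be sketched as follows, with a Python dict standing in for the B+AT and a set for the delta-keys tree (an mmin-style comparison is assumed for concreteness; names are ours):

```python
# Sketch: tracking modified keys for the next EMSN iteration.
# A dict stands in for the B+AT (one aggregate value per key) and a set
# stands in for the delta-keys B+Tree. On an improving update we record
# the key; otherwise only the aggregate store is consulted.

aggregates = {}        # B+AT analogue: key -> current aggregate value
delta_keys = set()     # delta-keys analogue: keys modified this iteration

def put_min(key, value):
    """mmin-style update: keep the smaller value, note the key if changed."""
    old = aggregates.get(key)
    if old is None or value < old:
        aggregates[key] = value
        delta_keys.add(key)        # second structure touched only on change
        return True
    return False                   # no modification: one lookup, no delta entry

put_min(("a", "b"), 5)
put_min(("a", "b"), 3)   # improves: recorded in delta_keys
put_min(("a", "b"), 7)   # no improvement: ignored

# Derivations for the next iteration read values back through the keys.
next_delta = [(k, aggregates[k]) for k in sorted(delta_keys)]
print(next_delta)
```

The sorted read-back mirrors the reason a B+Tree is used for the key set: the scan visits keys in order.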

Hash table approaches can be an appealing alternative to B+Trees. In Section V we present experimental results comparing DeALS with a system that utilizes a hash table approach and highlight some of the key differences between using B+Trees and hash tables.

C. mmin and mmax Implementation

The mmin and mmax implementation tracks the least (mmin) or greatest (mmax) value computed for each group, where each group has one tuple in the TupleStore. We use a single relation schema with one column for each of the predicate's group-by arguments and a column for the aggregate value. Specifically, B+AT keys are the group-by arguments, with the aggregate value stored in the leaf. UHT are given indexes with the group-by arguments as keys. For instance, spaths in Example 1 uses a B+AT with keys (X, Y), and each X, Y is stored with its current value (D) in a leaf node.

D. Operational Optimizations

No Recursive Relation Storage. Due to the Monotonic Optimization, we only need to maintain a single fact per group, and when a new value for the group is successfully derived, we overwrite the previous value. If the recursive predicate and monotonic aggregate use separate stores, then with EMSN and pipelining the recursive relation store is merely kept synchronized with the aggregate relation store. Therefore, we do not allocate the recursive predicate a store, and instead have it read from the monotonic aggregate store.

Final Results via Monotonic Aggregate. Since the monotonic aggregate maintains the value for each group in its TupleStore, when a fixpoint is reached, its TupleStore contains the final results. For instance, instead of evaluating r3 in Example 1, the recursion is materialized by the system, as it would have been by the stratified aggregate, and the final values are retrieved from the monotonic aggregate's TupleStore.

V. MMIN & MMAX PERFORMANCE ANALYSIS

All experiments on synthetic graphs were run on a machine with an i7-4770 CPU and 32GB memory running Ubuntu 14.04 LTS 64-bit. The experiments on real-life graphs were run on a machine with four AMD Opteron 6376 CPUs and 256GB memory running Ubuntu 12.04 LTS 64-bit. Memory utilization is collected by the Linux time command. Execution time and memory utilization are calculated by performing the same experiment five times, discarding the highest and lowest values, and taking the average of the remaining three values. All experiments on systems written in Java were run using Java

3 Since we scan the set, we use a B+Tree, which stores the keys in order and can benefit EMSN.

Page 6: Optimizing Recursive Queries with Monotonic Aggregates in DeALS, yellowstone.cs.ucla.edu/~yang/paper/icde2015-p134.pdf

[Fig. 10 charts: execution time (ms) and memory utilization (MB), both on log scales, for graphs G1–G18; legend: SociaLite, DLV, DeALS]

Directed acyclic graphs (DAGs): G1(10K/20K), G2(10K/50K), G3(10K/100K), G4(20K/40K), G5(20K/100K), G6(20K/200K)
Random graphs: G7(10K/20K), G8(10K/50K), G9(10K/100K), G10(20K/40K), G11(20K/100K), G12(20K/200K)
Scale-free graphs: G13(10K/20K), G14(10K/50K), G15(10K/100K), G16(20K/40K), G17(20K/100K), G18(20K/200K)

Fig. 10. Execution time and memory utilization of APSP on synthetic graphs

1.8.0, except for SociaLite (0.8.1), which did not support Java 1.8.0. Experiments for SociaLite were run using Java 1.7.0.

Datasets. An n-vertex graph used in experiments has integer vertex labels ranging from 0 to n − 1. We used three kinds of synthetic graphs — 1) directed acyclic graphs (DAGs), generated by connecting each pair of vertices i and j (i < j) with (edge) probability p; 2) random graphs, generated by connecting each pair of vertices with (edge) probability p; 3) scale-free graphs, generated using GTgraph4. The graphs are shuffled after generation: one random permutation is applied to the vertex labels and another random permutation is applied to the edges. The real-life graphs are not shuffled, but we relabeled graphs whose vertex labels are beyond the range of [0, n − 1] while maintaining the original edge order. A description such as "10K/20K" indicates the graph has 10,000 vertices and 20,000 edges.
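The DAG and random-graph generation described above might be sketched as follows (Python, illustrative only; we assume ordered pairs for random graphs, and GTgraph's scale-free generator is not reproduced):

```python
# Sketch: synthetic graph generation as described above.
# DAGs connect each pair i < j with probability p; random graphs connect
# each ordered pair of distinct vertices with probability p (an assumption).
# After generation, vertex labels and edge order are shuffled.

import random

def gen_dag(n, p, seed=0):
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]

def gen_random_graph(n, p, seed=0):
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and rng.random() < p]

def shuffle_graph(edges, n, seed=0):
    """Relabel vertices by a random permutation, then permute edge order."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    relabeled = [(perm[u], perm[v]) for u, v in edges]
    rng.shuffle(relabeled)
    return relabeled

dag = gen_dag(100, 0.1, seed=42)
assert all(u < v for u, v in dag)   # acyclic by construction before shuffling
print(len(shuffle_graph(dag, 100, seed=7)) == len(dag))
```

Shuffling destroys the telltale i < j ordering of DAG edges so that systems cannot exploit the generation order.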

Configuration. B+AT and B+Tree indexes for UHT were configured with 256 bytes allocated for keys in each node (internal and leaf). Other than the experiments in Section V-B, DeALS used EMSN with B+AT.

A. Datalog Implementation Comparison

DeALS is a sequential, interpreted, main memory Java implementation. We compare DeALS execution time and memory utilization against three Datalog implementations supporting aggregates in recursion — 1) the DLV system5, the state-of-the-art implementation of disjunctive logic programming; 2) the commercial LogicBlox system, which supports aggregates in recursion using staged partial fixpoint semantics6. From log files produced during execution of recursive queries with aggregates, we determined LogicBlox indeed uses an approach akin to Semi-naive, which uses only new facts found in the current iteration in derivations in the

4 GTgraph, http://www.cse.psu.edu/~madduri/software/GTgraph/.
5 DLV with recursive aggregates support, http://www.dbai.tuwien.ac.at/proj/dlv/dlvRecAggr/.
6 LogicBlox 4 migration guide, https://download.logicblox.com/wp-content/uploads/2014/05/LB-MigrationGuide-40.pdf.

next iteration; 3) the SociaLite7 graph query language, which is compiled into code-generated Java, efficiently evaluates single-source shortest paths (SSSP) queries using an approach with Dijkstra's algorithm-like efficiency [2], and supports a left-linear recursive APSP which it evaluates using Semi-naive. We used SociaLite 0.8.1 as it had the best sequential execution time of the SociaLite versions available to us.

APSP on Synthetic Graphs. Fig. 10 shows the results of APSP on synthetic graphs with random integer edge costs between 1-100, executed with DeALS, SociaLite and DLV. We experimented with two versions of LogicBlox — 3.10.15 and 4.1.3. The former does not support aggregates in recursion, and although we can express APSP as a stratified program, it only terminates on DAGs, and only G1 (0.286s) and G2 (18.328s) finish within 24 hours. The latter supports aggregates in recursion; however, only G1 (4.809s), G2 (6.697m), G7 (4.774h) and G13 (3.977h) finish within 24 hours. We therefore do not report LogicBlox results in Fig. 10 or for the remaining experiments.

Among the 18 graphs described in Fig. 10, we found SociaLite has the fastest execution time on three graphs and DeALS has the fastest execution time on the remaining 15 graphs. DeALS is more than two times faster than SociaLite on sparse graphs where the average degree of each vertex is only two (e.g., G7, G10 and G16). This advantage decreases as the average degree increases from two to ten. The main reason for this change is the different designs used by DeALS and SociaLite. SociaLite uses an array of hash tables with an initial capacity of around 1,000 entries to maintain the delta relations, whereas DeALS uses a B+Tree. The initialization cost of a hash table is higher than that of a B+Tree, while the cost of accessing a hash table is lower than that of a B+Tree. For graphs with small average degree, the initialization cost may account for a large percentage of the execution time, and thus DeALS is faster than SociaLite. The impact of the initialization cost shrinks as the average degree increases, and thus SociaLite is faster than DeALS on denser graphs. However, this faster execution time comes

7 https://sites.google.com/site/socialitelang/


at the expense of higher memory utilization. SociaLite uses more than two times the memory of DeALS on all 18 graphs. Although the C-based DLV has significantly lower memory utilization than both Java-based DeALS and SociaLite, DLV is extremely slow compared with both DeALS and SociaLite on DAGs. These results suggest that DeALS achieves the best execution time versus memory utilization trade-off on sparse graphs among the three compared systems.

APSP on Real-life Graphs. Table I shows the results of APSP on three real-life graphs from the Stanford Large Network Dataset Collection8. The provided graphs do not have edge costs, therefore we assigned unit cost to each edge. The results are similar to those on synthetic graphs — DeALS executes fastest while DLV has the lowest memory utilization. These results suggest that on real-life workloads the B+Tree-based design (low initialization cost) adopted by DeALS is more favorable than the hash table-based design used by SociaLite.

TABLE I. EXECUTION TIME AND MEMORY UTILIZATION OF APSP ON REAL-LIFE GRAPHS

                  HepTh                  Gnutella                 Slashdot
            S1     S2     S3       S1     S2     S3       S1      S2      S3
Time(h)   0.98  17.72   0.39    13.31   4.69   0.49   >24.00  >24.00    2.72
Mem(GB)  12.76   0.99   7.19    41.57   0.48  23.46   >89.59   >1.06   64.70

S1, S2, S3 represent SociaLite, DLV and DeALS, respectively.
HepTh(28K/353K): High-energy physics theory citation network.
Gnutella(63K/148K): Gnutella peer-to-peer network.
Slashdot(82K/549K): Slashdot social network.

SSSP on Real-Life Graphs. Fig. 11 shows the results of SSSP on five real-life graphs from the USA road networks datasets9. For each graph, we evaluate SSSP on ten randomly selected vertices, and we report the min / (geometric) average / max execution time in the form of error bars. Since the execution time captured for DLV includes time for loading the graph, evaluating the query and outputting the result, and query evaluation only accounts for a small percentage of the overall time observed, timing for DLV is less informative for this experiment; thus we only report results for SociaLite and DeALS. SociaLite generates a Java program that evaluates the query using Dijkstra's algorithm. The generated code achieves more than one order of magnitude speedup compared to LogicBlox [2]. However, our interpreted DeALS is faster than the code-generated SociaLite for SSSP on the road network graphs, as shown in Fig. 11. This result is not surprising in the sense that EMSN optimizes Semi-naive, and the Bellman-Ford algorithm (equivalent to Semi-naive) usually yields performance comparable to Dijkstra's algorithm on large sparse graphs.
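The delta-driven relaxation that makes Semi-naive behave like Bellman-Ford can be sketched in Python (illustrative, with made-up edge facts; this is not the DeALS implementation):

```python
# Sketch: SSSP by delta-driven relaxation, the Bellman-Ford-style
# computation that Semi-naive evaluation with an mmin aggregate
# corresponds to. Only vertices whose distance improved in the last
# round (the delta) are expanded in the next round.

def sssp(edges, source):
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
    dist = {source: 0}
    delta = {source}
    while delta:
        new_delta = set()
        for u in delta:
            for v, w in adj.get(u, ()):
                d = dist[u] + w
                if d < dist.get(v, float("inf")):   # mmin-style update
                    dist[v] = d
                    new_delta.add(v)
        delta = new_delta
    return dist

edges = [("s", "a", 2), ("s", "b", 7), ("a", "b", 3), ("b", "c", 1)]
print(sssp(edges, "s"))
```

On sparse graphs most vertices settle after a few rounds, which is why this delta-driven scheme can compete with a priority-queue Dijkstra.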

USA road networks: NY(264K/734K), E(3,598K/8,778K), W(6,262K/15,248K), CTR(14,082K/34,292K), USA(23,947K/58,333K)

[Fig. 11 chart: execution time (ms), log scale, on NY, E, W, CTR, USA; legend: SociaLite, DeALS]

Fig. 11. Execution time of SSSP on Road Networks Datasets

8 Stanford large network dataset, http://snap.stanford.edu/data/index.html.
9 USA road networks, http://www.dis.uniroma1.it/challenge9/download.shtml.

B. DeALS Storage Manager Evaluation

This experiment shows i) how EMSN performs relative to MASN and ii) how B+AT performs relative to UHT. We evaluated APSP on synthetic graphs G1, G2, G7, G8, G13 and G14 from Fig. 10. The (geometric) average execution time and memory utilization over the six graphs are displayed in Table II. Using UHT with B+Tree indexes, EMSN has a 13% lower average execution time than MASN. A noticeable difference in performance is observed when using B+AT vs. UHT for EMSN: B+AT is 2.6 times faster than UHT and requires only approximately 25% of the memory needed by UHT.

TABLE II. EVALUATION TYPE BY STORAGE CONFIGURATION

Evaluation Type   Storage Configuration    Time (s)   Memory (GB)
MASN              UHT w/ B+Tree index        77.743         4.087
EMSN              UHT w/ B+Tree index        68.927         4.131
EMSN              B+AT                       26.450         1.048

C. Statistical Analysis of Evaluation Methods

To characterize the relative performance of EMSN compared to Semi-naive, we perform an analysis over sets of graphs using the statistical estimation technique from [28]. The statistics calculated10 are 1) the total number of facts derived by the recursive predicate (derived facts) and 2) the total aggregate size of δS across all iterations (δ facts). These two statistics help quantify the amount of computation the evaluation method must perform. For each statistic and each vertex number/edge probability combination, after each run of the program, the statistic is included in the average (m) until a statistically significant number (30) of different graphs has been used AND m is within 5% error with 95% confidence11.
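The stopping rule can be sketched in Python as follows (the constants 30, 5% and 1.96 are those stated above; the statistic-producing run is abstracted as a callable, and the sample values are hypothetical):

```python
# Sketch: sequential estimation with the stopping rule described above.
# Keep sampling until at least 30 graphs have been used AND the half-width
# eps = 1.96 * sigma / sqrt(k) is within 5% of the running mean m.

import statistics

def estimate(run_once, min_runs=30, rel_err=0.05, z=1.96, max_runs=10000):
    samples = []
    while len(samples) < max_runs:
        samples.append(run_once())
        k = len(samples)
        if k >= min_runs:
            m = statistics.mean(samples)
            sigma = statistics.stdev(samples)
            eps = z * sigma / k ** 0.5
            if eps < rel_err * m:       # accept m at 95% confidence
                return m, k
    return statistics.mean(samples), len(samples)

# Hypothetical statistic source: a deterministic cycle of values.
vals = iter([100, 102, 98, 101, 99] * 2000)
m, k = estimate(lambda: next(vals))
print(k >= 30 and abs(m - 100) < 2)
```

With low-variance samples like these, the rule accepts at the minimum of 30 runs; noisier statistics force more runs before the half-width shrinks below 5% of the mean.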

We perform the analysis comparing APSP evaluated using EMSN to APSP evaluated using Semi-naive. We use randomly generated DAGs and random graphs with edge probability between 0.1 and 0.9 (increments of 0.1) and random integer edge costs between 1-50. EMSN and Semi-naive use the same sequence of graphs. Semi-naive requires 3-11% more derived and δ facts than EMSN on DAGs, and 13-18% more on random graphs.

VI. MCOUNT AND MSUM MONOTONIC AGGREGATES

With efficient support for monotonic count and sum aggregates, DeALS supports many exciting applications.

An mcount or msum monotonic aggregate rule has the form:

p(K1, . . . , Km, aggr〈(T, PT)〉)← Rule Body.

In the rule head, K1, . . . , Km are the zero or more group-by arguments (K), aggr ∈ {mcount, msum} is the monotonic aggregate, and (T, PT) is the aggregate term pair passed from the body, where T is a variable and PT is a constant or a variable indicating the partial count/sum contributed by T.

10 For these statistics, we assume a normal distribution N(µ, σ2), with mean µ and variance σ2.
11 As in [28], m is accepted when ε < (0.05 ∗ m), where ε = (1.96 ∗ σ)/√k, σ is the standard deviation, and k is the number of graphs used. The value 1.96, from the tables for the standard normal distribution at 0.975, gives the 95% confidence coefficient.


As with mmin and mmax, the mcount and msum aggregates are monotonic w.r.t. set-containment and can be used in recursive rules. When viewed as a sequence, the values produced by mcount and msum are monotonic. The mcount and msum aggregate functions map an input set or multiset, which we call G, to an output set, which we call D. Elements g ∈ G have the form (J, NJ), where NJ indicates the partial count/sum contributed by J. Note that J maps to T and NJ maps to PT in the definition of the aggregate term above. Now, given G, for each element g ∈ G, if NJ > NJprev, where NJprev is the memorized previous count (sum) for J, or 0 if no previous count (sum) for J exists, mcount (msum) computes the count (sum) for G by summing the maximum partial count (sum) NJ over all J. Since only the maximum NJ for each J is summed, no double counting (summing) occurs. Lastly, msum only computes with positive numbers, thereby ensuring its monotonicity.
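The definition above, keeping only the maximum partial count per contributor J and summing the memorized maxima, can be sketched in Python (illustrative, not the DeALS implementation):

```python
# Sketch: mcount over a group G of (J, NJ) pairs, as defined above.
# For each contributor J only the maximum partial count is kept, so
# summing the memorized values never double counts.

def mcount(G):
    max_per_j = {}                      # memorized NJprev for each J
    counts = []                         # monotonic sequence of emitted counts
    for J, NJ in G:
        if NJ > max_per_j.get(J, 0):    # only improvements are applied
            max_per_j[J] = NJ
            counts.append(sum(max_per_j.values()))
    return counts

# Two contributors; the final (J=a) pair has a smaller NJ and is ignored.
print(mcount([("a", 1), ("b", 1), ("a", 3), ("a", 2)]))
```

The emitted sequence only ever grows, which is the monotonicity property that makes the aggregate safe inside recursion.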

A. Running Example

Example 2 is the DeAL program to count the paths between pairs of vertices in an acyclic graph. This program is not expressible in Datalog with stratified aggregation [8]. We will use Example 2 as our running example for mcount to explain monotonic counting in DeAL.

Example 2: Counting Paths in a DAG
r1. cpaths(X, Y, mcount〈(X, 1)〉) ← edge(X, Y).
r2. cpaths(X, Y, mcount〈(Z, C)〉) ← cpaths(X, Z, C), edge(Z, Y).
r3. countpaths(X, Y, max〈C〉) ← cpaths(X, Y, C).

In Example 2, r1 counts each edge as one path between its vertices. In r2, any edge(Z, Y) that extends from a computed path count cpaths(X, Z, C) establishes that there are C distinct paths from X to Y through Z. The mcount〈(Z, C)〉 aggregate in the head sums the count of paths from X to Y through every Z to produce the count from X to Y. Lastly, r3 indicates that only the maximum count for each pair X, Y in cpaths is desired. As explained in Section IV-D, r3 does not have to be evaluated.

Counting Paths By Example. Next, we walk through an evaluation of Counting Paths in Example 2 using EMSN to further explain mcount; this explanation generalizes easily to msum. The edge facts in Fig. 12 are the example dataset.

First, r1 in Example 2 is evaluated, resulting in the six cpaths derivations shown in Fig. 13. Each cpaths fact has a count of 1, indicating one path between each pair of vertices connected by edge facts. Displayed in the right column of Fig. 13 is the memorized partial count (recall (J, NJ)) for each group. For example, in the first derivation, J = a, NJ = 1, and (a, 1) is memorized for group (a, b).

Facts
edge(a, b). edge(a, c). edge(a, d).
edge(b, c). edge(b, d). edge(c, d).

Fig. 12. Facts

r1 Successful Derivations          Partial Count
cpaths(a, b, 1) ← edge(a, b).      (a, 1) for (a, b)
cpaths(a, c, 1) ← edge(a, c).      (a, 1) for (a, c)
cpaths(a, d, 1) ← edge(a, d).      (a, 1) for (a, d)*
cpaths(b, c, 1) ← edge(b, c).      (b, 1) for (b, c)
cpaths(b, d, 1) ← edge(b, d).      (b, 1) for (b, d)
cpaths(c, d, 1) ← edge(c, d).      (c, 1) for (c, d)

Fig. 13. Derivations of Example 2, r1 evaluation

r2 Successful Derivations                         Partial Count
cpaths(a, c, 2) ← cpaths(a, b, 1), edge(b, c).    (b, 1) for (a, c)
cpaths(a, d, 2) ← cpaths(a, b, 1), edge(b, d).    (b, 1) for (a, d)*
cpaths(a, d, 4) ← cpaths(a, c, 2), edge(c, d).    (c, 2) for (a, d)*
cpaths(b, d, 2) ← cpaths(b, c, 1), edge(c, d).    (c, 1) for (b, d)

Fig. 14. Derivations of Example 2, r2 - Iteration 1

EMSN evaluates the recursive rule r2 from Example 2 using the cpaths facts derived by r1. Fig. 14 shows the successful derivations performed by r2. As each cpaths fact is derived, it replaces the previous fact for the group (i.e., cpaths(a, c, 2) replaces cpaths(a, c, 1)). Note the derivation of cpaths(a, d, 2) from joining cpaths(a, b, 1) and edge(b, d). It represents a count of two for (a, d), even though the rule body contributed only one path count. However, looking at the *-ed entry in Fig. 13, we see a partial count of (a, 1) towards (a, d) was accrued during evaluation of r1. Therefore, when computing the new count for (a, d), (a, 1) and the newly derived (b, 1) are summed to produce cpaths(a, d, 2). Next, we observe the benefits of using EMSN with the derivation of cpaths(a, d, 4). Since cpaths(a, c, 2) existed, even though it was derived this iteration, it was used and successfully joined with edge(c, d). Then the partial counts for (a, d), which are (a, 1), (b, 1) and (c, 2), are summed to produce cpaths(a, d, 4). Finally, with no new facts produced after those in Fig. 14, a fixpoint is reached, and since there is no need to evaluate r3 (Section IV-D) we have our result.
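The walkthrough can be reproduced with a Python sketch of mcount-style path counting on the edge facts of Fig. 12 (a naive fixpoint loop stands in for EMSN, so it re-scans groups rather than tracking deltas):

```python
# Sketch: counting paths in a DAG with mcount-style partial counts,
# on the edge facts of Fig. 12. partial[(X, Y)][J] memorizes the maximum
# count contributed through J; a group's count is the sum of its partials.

edges = [("a", "b"), ("a", "c"), ("a", "d"),
         ("b", "c"), ("b", "d"), ("c", "d")]

partial = {}   # (X, Y) -> {J: NJ}

def contribute(x, y, j, nj):
    g = partial.setdefault((x, y), {})
    if nj > g.get(j, 0):     # apply only larger partial counts
        g[j] = nj
        return True
    return False

# r1: each edge contributes one path, keyed by its source vertex.
for x, y in edges:
    contribute(x, y, x, 1)

# r2: iterate to fixpoint, extending counted paths by one edge.
changed = True
while changed:
    changed = False
    for (x, z), g in list(partial.items()):
        c = sum(g.values())                 # current count for (x, z)
        for z2, y in edges:
            if z2 == z and contribute(x, y, z, c):
                changed = True

cpaths = {k: sum(g.values()) for k, g in partial.items()}
print(cpaths[("a", "d")])   # four distinct paths from a to d
```

The four paths from a to d (a-d, a-b-d, a-c-d, a-b-c-d) match the cpaths(a, d, 4) fact derived in Fig. 14.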

VII. MCOUNT AND MSUM IMPLEMENTATION

In this section, we present implementation details for the mcount and msum aggregates. We use definitions from Section VI (e.g., G). Note that G is a single group produced by the implicit group-by for a distinct assignment of K, the zero or more group-by arguments. We will also refer to the TupleStore descriptions from Section IV-B. Lastly, although we use mcount to present our efficient count/sum technique, the discussion generalizes to msum.

For mcount and msum, we use an approach based on delta-maintenance (∆-Maintenance) techniques. Recalling our explanation of mcount in Section VI, given a new partial count NJ > NJprev, mcount will sum all maximum partial counts to compute the new total count for G. However, rather than recompute the total count, we can instead use ∆-Maintenance to increase N (the current total count for G) by NJ − NJprev and put the updated count, now the current total count for G, into the output set D. This produces the same result as if the maximum partial counts NJ for all J were summed to produce the total count N for G, but it avoids the re-summation of all NJ with each greater NJ. This requires memorizing both N for G and NJ for all J.
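The ∆-Maintenance step, adjusting the total by NJ − NJprev instead of re-summing, can be sketched as (Python, illustrative):

```python
# Sketch: delta-maintenance for mcount/msum as described above.
# The running total N for a group is adjusted by NJ - NJprev, avoiding
# a re-summation of all memorized partial counts on every update.

class DeltaGroup:
    def __init__(self):
        self.N = 0          # current total count for the group
        self.NJ = {}        # memorized maximum partial count per J

    def update(self, J, nj):
        prev = self.NJ.get(J, 0)
        if nj > prev:
            self.NJ[J] = nj
            self.N += nj - prev       # delta-maintenance step
            return self.N
        return None                   # no new, larger partial count

g = DeltaGroup()
g.update("a", 1)
g.update("b", 1)
g.update("a", 3)
# The maintained total equals a full re-summation of the maxima.
print(g.N, g.N == sum(g.NJ.values()))
```

Each update is O(1) beyond the lookup of NJprev, instead of O(|J|) for re-summation.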

TABLE III. STORAGE DESIGN SCHEMAS

Name        Schema                          Indexes
Double      (K, N) | (K, T, PT)             K | (K, T)
List        (K, N, List[(T, PT)])           K
B+Tree      (K, N, B+Tree[(T, PT)])         K
Hashtable   (K, N, Hashtable[(T, PT)])      K


A. Storage Designs

Table III displays the storage designs we investigated for mcount and msum. Here we use N to indicate the current count/sum for the group-by arguments (K). As in Section VI, each T contributes a partial count/sum PT towards a distinct assignment of K (group).

Double uses two relations: one relation (K, N), indexed on K, to store tuples containing the group's total aggregate value, and a second relation (K, T, PT), indexed on (K, T), to store the partial count PT for each distinct assignment of (K, T). Early testing showed Double using UHT without ∆-Maintenance taking 2-5 times longer to execute than with ∆-Maintenance.

We investigated designs using KeyValue-type columns as a more efficient way of managing (T, PT) pairs. We developed three single relation designs (K, N, KeyValue[(T, PT)]), where N is the total count for K and KeyValue[(T, PT)] is a reference to the tuple's own KeyValue-type data structure. The relation is indexed on K and each group has a single tuple. The KeyValue types each represent a different retrieval time complexity: a List (O(n)) type, a B+Tree (O(log(n))) type, and a Hashtable (O(1)) type. Hashtable is based on Linear Hashing and stores the hashed key in the bucket to avoid rehashing. B+Tree stores keys (T) in internal and leaf nodes and non-key attributes (PT) in leaf nodes, and uses linear search. Lastly, List stores (T, PT) pairs ordered by T and uses linear search. These are main memory structures, so the designs attempt to limit the number of objects (e.g., List uses byte arrays). KeyValue designs use ∆-Maintenance.

For the designs shown in Table III, DeALS supports List, Hashtable and B+Tree with B+AT, and all designs with UHT indexed as shown. For example, using r1, r2 from Example 2 with B+Tree, the B+AT would have X, Y as keys, and each entry in a leaf would have the current total count N and a reference to a B+Tree KeyValue-type structure storing (T, PT) pairs. This design is essentially a B+Tree of B+Trees.

VIII. MCOUNT AND MSUM PERFORMANCE ANALYSIS

Configuration. B+Tree TupleStores and indexes, B+AT and the B+Tree KeyValue design were configured with 256 bytes allocated for keys in each node (internal and leaf). The Hashtable KeyValue design used a directory and segment size of 256, 16 initial buckets and a load factor of 10.

A. Statistical Analysis of Evaluation Methods

We also perform the statistical analysis described in Section V-C, comparing the evaluation of Counting Paths (Example 2) using EMSN to Counting Paths using Semi-naive. The experiment uses randomly generated DAGs of 100-250 vertices (increments of 50) and edge probability between 0.1 and 0.9 (increments of 0.1). EMSN and Semi-naive use the same sequence of graphs.

Fig. 15(a) and Fig. 15(b) show the results of the analysis. Each point on a line represents the ratio of Semi-naive to EMSN for the number of derived facts (Fig. 15(a)) or number of δ facts (Fig. 15(b)) for the graph size indicated by the line and the edge probability indicated by the x-axis. For example, in Fig. 15(b), Semi-naive produces more than three times as many δ facts as EMSN for graphs of 200 and 250 vertices starting at

[Fig. 15 charts: ratio of Semi-naive to EMSN vs. edge probability (0.2-0.8), one line per graph size (100, 150, 200, 250 vertices); (a) Derived Facts, (b) δ Facts]

Fig. 15. Ratio Semi-naive/EMSN Derivations - Counting Paths

0.2 (20%) edge probability. Fig. 15(a) and Fig. 15(b) show that, for the test graphs, Counting Paths using Semi-naive derives 1.94 to 3.48 times more facts than EMSN. As edge probability increases, so does the ratio between Semi-naive and EMSN, because the higher edge probability allows EMSN to derive and use facts earlier, which prunes the search space faster.

B. Storage Design Evaluation

This experiment tests how each of the storage designs presented in Section VII-A performs on DAGs. Fig. 16 shows the (geometric) average execution time and memory utilization, along with minimum and maximum values, on 45 random 250-vertex DAGs (5 graphs for each edge probability from 0.1 to 0.9) for each design. In Fig. 16, results are shown in left-to-right order from worst to best average execution time.

D1: Double (B+Tree)   D2: Double (UHT)   D3: List (UHT)   D4: Hashtable (UHT)
D5: B+Tree (UHT)   D6: Hashtable (B+AT)   D7: List (B+AT)   D8: B+Tree (B+AT)

[Fig. 16 charts: execution time (ms) and memory utilization (MB), both on log scales, for designs D1-D8]

Fig. 16. mcount and msum Storage Design Performance

Recall the TupleStore descriptions from Section IV-B and the storage designs from Section VII-A. D2 is the Double design as described in Table III, executed using UHT with B+Tree indexes. D1 is Double using a B+Tree TupleStore for the (K, T, PT) relation12. D3-D5 and D6-D8 are KeyValue designs executed using UHT and B+AT, respectively. In Fig. 16 we see that D6-D8, the three B+AT KeyValue designs, have the best execution time performance. D7 and D8 have the lowest average execution times, with B+Tree (D8) having the better maximum execution time (51s vs. 62s). Compared with D7, D8 has slightly better average memory utilization (969MB vs. 989MB) and lower maximum memory utilization (1.4GB vs. 2GB). Note that D1 and D2 have the lowest average memory utilization, but their average execution times are nearly twice those of D7 and D8. Of the designs, D8, the B+Tree design using B+AT, best balances good average execution time with good average memory utilization.

12 In Double, the (K, T, PT) relation will contain many times more tuples than the (K, N) relation (still UHT), so we focused on optimizing the larger.


C. Discussion

Finding other systems with which to perform an experimental comparison of mcount and msum proved challenging. Support in Datalog implementations for count and sum aggregates that can be used in recursion is not as mature as that for min and max aggregates. Using LogicBlox version 4, we were able to execute the Counting Paths program but experienced similarly slow performance as with APSP (Section V-A). Using SociaLite, we were unable to execute the Counting Paths program, and BOM queries, such as subparts explosion, produced results different from ground truth. Lastly, we were able to execute a version of the Counting Paths program using DLV, but again the results differed from ground truth.

IX. ADDITIONAL DeAL PROGRAMS

This section includes additional programs that show DeAL's expressiveness and support for a variety of applications. More examples are found in [29] and on the DeALS website13.

Example 3: How many days until delivery?
r1. delivery(Part, mmax〈Days〉) ← basic(Part, Days, _).
r2. delivery(Part, mmax〈Days〉) ← assb(Part, Sub, _), delivery(Sub, Days).
r3. actualDays(Part, max〈Days〉) ← delivery(Part, Days).

Example 4: What is the maximum cost of a part?
r1. cost(Part, msum〈(Part, Cost)〉) ← basic(Part, _, Cost).
r2. cost(Part, msum〈(Sub, Cost)〉) ← assb(Part, Sub, Num), cost(Sub, Scost), Cost = Scost ∗ Num.
r3. totalCost(Part, max〈Cost〉) ← cost(Part, Cost).

Examples 3 and 4 are the Bill of Materials (BOM) programs for finding the days required to deliver a part and for computing the max cost of a part from the costs of its subparts, respectively. The assb predicate denotes each part's required subparts and the number required, and basic denotes the number of days for a part to be received and the part's cost.

Example 5: Viterbi Algorithm
r1. calcV(0, X, mmax〈L〉) ← s(0, EX), p(X, EX, L1), pi(X, L2), L = L1 ∗ L2.
r2. calcV(T, Y, mmax〈L〉) ← s(T, EY), p(Y, EY, L1), T1 = T − 1, t(X, Y, L2), calcV(T1, X, L3), L = L1 ∗ L2 ∗ L3.
r3. viterbi(T, Y, max〈L〉) ← calcV(T, Y, L).

Example 5 is the Viterbi algorithm for hidden Markov models. Four base predicates are used — t denotes the transition probability L2 from state X to Y; s denotes the observed sequence of length L+1; pi denotes the likelihood L2 that X is the initial state; p denotes the likelihood L1 that state X (Y) emitted EX (EY). r1 finds the most likely initial observation for each X. r2 finds the most likely transition for observation T for each Y. r3 finds the max likelihood for T, Y.

Example 6: Max Probability Path
r1. reach(X, Y, mmax〈P〉) ← net(X, Y, P).
r2. reach(X, Y, mmax〈P〉) ← reach(X, Z, P1), reach(Z, Y, P2), P = P1 ∗ P2.
r3. maxP(X, Y, max〈P〉) ← reach(X, Y, P).

13 DeALS website, http://wis.cs.ucla.edu/deals.

Example 6 is the non-linear program for computing the max probability path between two nodes in a network. The net predicate denotes the probability P of reaching Y from X.

X. FORMAL SEMANTICS

So far we have worked with the operational semantics of our monotonic aggregates and shown how it is conducive to the expression of algorithms by programmers. While most users only need to work at this level, it is important that we also show how this coincides with the formal semantics discussed in the two DatalogFS papers [17], [18], inasmuch as properties such as least fixpoint and stable models follow from it.

We start with the example inspired by [9] for determining who will come to a party. In this program, some people will come to the party for sure, whereas others only join when at least three of their friends are coming. Example 7 is the DeAL version of this program. The idea is that with cntComing each person watches the number of their friends that are coming grow, and once that number reaches three, the person will come to the party too. To count the number of friends, rather than the final count used in [9], we can use the mcount continuous count aggregate, which enumerates all the integers up to the actual maximum, i.e., it returns I, the actual maximum, representing the integer interval [1, I].

Example 7: Who will come to the party?
r1. coming(X) ← sure(X).
r2. coming(X) ← cntComing(X, N), N ≥ 3.
r3. cntComing(Y, mcount〈(X, 1)〉) ← friend(Y, X), coming(X).

Here the use of mcount over count is justified on the grounds of performance, since it is inefficient to count all of a person's friends when only three are required. More importantly, though, while count is non-monotonic (unless we use the special lattices suggested by [9]), mcount is monotonic in the lattice of set containment used by standard Datalog. So no ad hoc semantic extension is needed, and concepts and techniques such as magic sets, perfect models and stable models can be immediately generalized to programs with mcount.
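Operationally, Example 7's fixpoint can be sketched in Python (the sure and friend facts are hypothetical):

```python
# Sketch: naive fixpoint for Example 7. coming grows monotonically;
# a person is added once at least three of their friends are coming,
# mirroring what the continuous mcount lets r2 detect early.

sure = {"ann", "bob", "carl"}
friend = {"dan": ["ann", "bob", "carl", "eve"],
          "eve": ["ann", "dan"]}

coming = set(sure)
changed = True
while changed:
    changed = False
    for person, fs in friend.items():
        if person not in coming and sum(f in coming for f in fs) >= 3:
            coming.add(person)
            changed = True

print(sorted(coming))
```

Note that the loop can stop counting a person's friends as soon as three are found coming, which is exactly the early-exit opportunity mcount exposes and the final count does not.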

A. DeAL Interval Semantics

The lessons learned with mcount tell us that we can derive the monotonic counterpart of an aggregate by simply assuming that it produces an interval of integer values, rather than just one value. In the following we (i) apply this idea to max and min to obtain mmax and mmin, and then (ii) generalize these monotonic aggregates to arbitrary numbers, and show that the least fixpoint computation under this formal interval-based semantics can be implemented using the Semi-naive evaluation of Section II under general conditions that hold for all our examples. Due to space limitations the discussion is kept at an informal level: formal proofs are given in [17], [18].

Before we introduce DeAL's interval semantics, consider the following example of interval semantics for counting, based on the DatalogFS interval semantics, depicted in Fig. 17(a). If r2 in Example 2 produced cpaths(a, b, 3) and then cpaths(a, b, 4), we cannot sum the aggregate values to get a new count for group (a, b). Instead, with the counts for cpaths(a, b, 3) and cpaths(a, b, 4) represented by [1, 3]


(a) Counting (b) +/- Rational Numbers (c) Minimum

Fig. 17. DeAL Interval Semantics

and [1, 4], respectively, [1, 3] ∪ [1, 4] = [1, 4], which represents max(3, 4) = 4. Thus cpaths(a, b, 4) represents (a, b)'s count.
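The counting case can be checked in a few lines of Python, modeling intervals explicitly as sets (an illustration of the semantics, not of how DeALS actually stores counts):

```python
# Interval semantics for counting: the count c for a group is represented
# by the set {1, ..., c}; the union of two derivations for the same group
# is again such a set, and it represents the larger count.

def count_interval(c):
    return set(range(1, c + 1))     # count c  ~  [1, c]

i3 = count_interval(3)              # cpaths(a, b, 3)  ~  [1, 3]
i4 = count_interval(4)              # cpaths(a, b, 4)  ~  [1, 4]
merged = i3 | i4                    # set union: monotonic w.r.t. containment
assert merged == count_interval(4)
assert max(merged) == max(3, 4) == 4
```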

+/- Rational Numbers. Computation on numbers in mantissa * exponent representation is tantamount to integer arithmetic on their numerators over a fixed large denominator. For instance, for floating point representations, the smallest exponent supported could be −95, whereby every number can be viewed as the integer numerator of a rational number with denominator 10^95. However, because of the limited length of the mantissa, this numerator will often be rounded off, a monotonic operation that preserves the existence of a least-fixpoint semantics for our programs [17]. Thus floating-point-type representations in programming languages can be used without violating monotonicity [17], [18].
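As a toy illustration of this fixed-denominator view (using an exponent bound of −3 instead of the paper's −95):

```python
# With exponents bounded below by -3, every value m * 10**e can be viewed
# as the integer numerator m * 10**(e + 3) over the fixed denominator 10**3.
# The bound 3 is illustrative; the text uses 95.

DEN_EXP = 3

def numerator(mantissa, exponent):
    assert exponent >= -DEN_EXP
    return mantissa * 10 ** (exponent + DEN_EXP)

assert numerator(314, -2) == 3140   # 3.14 = 3140 / 1000
assert numerator(5, 0) == 5000      # 5    = 5000 / 1000
```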

Now, say Lb represents the lower bound of all negative numbers supported by the system architecture. We can use the interval [Lb, N] to represent any number N regardless of its sign. The result of unioning the sets representing these numbers is a set representing the max, independent of whether this is positive or negative. Using Fig. 17(b) as an example, with N1 and N2 represented by [Lb, N1] and [Lb, N2], respectively, [Lb, N1] ∪ [Lb, N2] represents the larger of the two, i.e., max(N1, N2) = N2. Thus, we can support negative numbers.

Minimum. Now, say Ub represents the upper bound of all positive numbers supported by the system architecture. We can use the set of all numbers between N and Ub, i.e., the interval [N, Ub], to represent the number N. Observe that a number smaller than N is represented by an interval that contains the interval [N, Ub]. As before, if we take the union of two or more such representations, the result is the largest interval. Using Fig. 17(c) as an example, with N1 and N2 represented by [N1, Ub] and [N2, Ub], respectively, [N1, Ub] ∪ [N2, Ub] represents the smaller of the two, i.e., min(N1, N2) = N2.

As another example of these semantics, consider the first derivation in Fig. 2: spaths(a, c, 2) was derived because the previous value for (a, c) was 3. In the interval semantics, spaths(a, c, 2) would be represented as [2, Ub], and 3 as [3, Ub]; thus [2, Ub] ∪ [3, Ub] = [2, Ub], which represents min(2, 3) = 2.
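Both encodings can be checked with small finite bounds in Python; Lb and Ub here are illustrative stand-ins for the architecture's actual limits, not values from the paper.

```python
# mmax encodes N as [Lb, N]; mmin encodes N as [N, Ub]. In both cases the
# union of two encodings is the encoding of the max (resp. min).

Lb, Ub = -100, 100  # illustrative bounds

def mmax_enc(n):
    return set(range(Lb, n + 1))   # N  ~  [Lb, N]

def mmin_enc(n):
    return set(range(n, Ub + 1))   # N  ~  [N, Ub]

assert max(mmax_enc(-7) | mmax_enc(5)) == max(-7, 5) == 5   # works for negatives
assert min(mmin_enc(2) | mmin_enc(3)) == min(2, 3) == 2     # the spaths example
```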

B. Normal Programs

DeAL programs that only use monotonic arithmetic and monotonic boolean (comparison) functions on values produced by monotonic aggregates will be called normal¹⁴. All practical algorithms we have considered only require the use of normal DeAL programs. Two classes of normal programs exist.

Class 1. This class of normal programs uses monotonic aggregates to compute the max (min) values for use outside the recursion (e.g., to return the facts with the final max (min)

¹⁴The compiler can easily check whether a program is normal when the program contains only arithmetic and simple functions (e.g., addition, multiplication).

values). Programs in this class include Examples 1–6, which are expressed using a stratified max (min) aggregate to select the max (min) value produced by the recursive rules when a fixpoint is reached. Examining the intermediate results of Class 1 programs: at each step of the fixpoint iteration, we have (i) the max (min) value and (ii) values less than the max value (greater than the min value). However, we do not need (ii), as long as the values in the head are computed from those in the body via an arithmetic function that is monotonic.

Class 2. Values produced by monotonic aggregates in Class 2 normal programs are not passed to rule heads, but are tested against conditions in the rule body. Here too, as long as the functions applied to the values are monotonic, rules are satisfied if and only if they are satisfied for the max (min) values. Example 7 is a Class 2 normal program.

C. Normal Program Evaluation

Recall the algorithm for MASN in Fig. 6. Let us call L the set produced by TE(M) or TR(δS), and F the set produced by applying getLast() to L. For Semi-naive, δS will be L, whereas for MASN, δS will be F. Let W = L − F. W = ∅ when, for facts of monotonic aggregate predicates, each group with facts derived during the iteration has only one fact. For an iteration, if W = ∅, MASN evaluates the program the same as Semi-naive. Otherwise, W contains facts that will not lead to final answers for Class 1 normal programs, and for Class 2 normal programs, any condition satisfied (in a rule body) by a fact in W will also be satisfied by the fact of the same group in F. Thus, MASN does not need to evaluate W. In the next iteration, Semi-naive will derive all of the same facts as MASN, but will also derive facts by evaluating W. We have already established that these facts, because they were derived from W, will not lead to final answers, and thus MASN does not need to derive them either.

Theorem 10.1: A normal program with monotonic aggregates evaluated using MASN will produce the same result as that program evaluated using Semi-naive.
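The core of the MASN optimization, pruning W = L − F, can be sketched as follows; get_last and the (group, value) fact representation are simplified stand-ins for the structures in Fig. 6, not DeALS's actual code.

```python
# get_last keeps one fact per group: the best aggregate value derived in
# this iteration. MASN uses F = get_last(L) as the next delta; the facts
# in W = L - F need never be evaluated.

def get_last(L, better):
    best = {}
    for group, value in L:
        if group not in best or better(value, best[group]):
            best[group] = value
    return {(g, v) for g, v in best.items()}

# Shortest paths (mmin): "better" means smaller.
L = {(("a", "c"), 3), (("a", "c"), 2), (("a", "b"), 1)}
F = get_last(L, lambda new, old: new < old)
W = L - F
assert F == {(("a", "c"), 2), (("a", "b"), 1)}
assert W == {(("a", "c"), 3)}   # pruned: cannot lead to a final answer
```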

Recall the algorithm for EMSN in Fig. 8. Assume we are evaluating a normal program with Semi-naive. Let us call KSN the set of facts used in derivations (the union of all δS) by Semi-naive. Now assume we are evaluating the same normal program with EMSN. Let us call KEMSN the set of all facts used in derivations by EMSN; KEMSN thus contains only facts that were retrieved from the aggregate's relation, meaning they held the current value for their group at the time they were used in a derivation. Now, let C = KSN − KEMSN. If C = ∅, EMSN evaluates the program the same as Semi-naive. Otherwise, C contains facts that were not used by EMSN because, at the time of derivation, the values in the aggregate's relation for those facts' groups were greater (mmax, mcount, msum) or smaller (mmin) than the aggregate value in the fact. We know KSN ∩ KEMSN = KEMSN and KEMSN ⊂ KSN, because EMSN will ignore some facts in derivations that Semi-naive will use, but EMSN will not use any facts that Semi-naive does not. Stated another way, Semi-naive will attempt derivations with all facts that EMSN attempts derivations with. Therefore, for Class 1 normal programs, facts in C will not lead to final answers. For Class 2 normal programs, any condition satisfied (in a rule body) by a fact in C would also have been satisfied by the value that was used instead. Thus EMSN does not need to evaluate C. We have:

Theorem 10.2: A normal program with monotonic aggregates evaluated using EMSN will produce the same result as that program evaluated using Semi-naive.
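The fact-staleness test at the heart of EMSN can be sketched as below; the usable function and the dictionary-based aggregate relation are illustrative assumptions, not DeALS's actual implementation.

```python
# Before using a fact in a derivation, EMSN checks the aggregate's
# relation: a fact is skipped if its group already holds a better value.

def usable(fact, agg_relation, better):
    group, value = fact
    current = agg_relation.get(group)
    return current is not None and not better(current, value)

agg = {("a", "c"): 2}            # current mmin value for group (a, c)
smaller = lambda x, y: x < y     # mmin: smaller is better

assert not usable((("a", "c"), 3), agg, smaller)  # stale fact, in C, skipped
assert usable((("a", "c"), 2), agg, smaller)      # holds current value, used
```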

XI. ADDITIONAL RELATED WORK

We have previously discussed the contributions of many works on supporting aggregation in recursion, including [2], [9], [23]. For extrema aggregates, [25] proposes rewriting programs by pushing the aggregate into the recursion, and Greedy Fixpoint evaluation to select the next min/max value to execute. Another approach proposes using aggregate selections to identify irrelevant facts that are to be discarded by an extended version of Semi-naive evaluation [26]. Nondeterministic constructs and stable models have been proposed to define monotone aggregates [27]. The DRed algorithm [30] incrementally maintains recursive views that can include negation and aggregation. The BloomL distributed programming language [16] uses logical monotonicity via built-in and user-defined lattice types to support eventual consistency.

XII. CONCLUSION AND FUTURE WORK

With the renaissance of Datalog, the monotonicity property has been placed at the center of its ability to provide a declarative treatment of distributed computation [31]. In this paper, we have shown how this property can be extended to the aggregates involved in recursive computations while preserving the syntax, semantics, and optimization techniques of traditional Datalog. The significance of this result follows from the fact that this problem had long remained unsolved, and that many new applications can be expressed with the proposed extensions, making them amenable to parallel execution on multiprocessor and distributed systems. Lines of future work include (i) supporting advanced KDD algorithms in DeALS, (ii) parallel/distributed extensions of the techniques from this paper, and (iii) extending DeALS with highly-parallel evaluation techniques for recursive queries [32].

XIII. ACKNOWLEDGMENT

We thank Tyson Condie, Kai Zeng, Matteo Interlandi and Neng Lu for their discussions and suggestions on versions of this work. We thank the reviewers for their thoughtful comments. This work was supported by NSF awards IIS 1218471 and IIS 1302698.

REFERENCES

[1] B. T. Loo, T. Condie, M. N. Garofalakis, D. E. Gay, J. M. Hellerstein, P. Maniatis, R. Ramakrishnan, T. Roscoe, and I. Stoica, “Declarative networking,” Commun. ACM, vol. 52, no. 11, pp. 87–95, 2009.

[2] J. Seo, S. Guo, and M. S. Lam, “Socialite: Datalog extensions for efficient social network analysis,” in ICDE, 2013, pp. 278–289.

[3] E. F. Codd, “Relational completeness of data base sublanguages,” in R. Rustin (ed.): Database Systems: 65–98, Prentice Hall and IBM Research Report RJ 987, San Jose, California, 1972.

[4] S. Greco and C. Zaniolo, “Greedy algorithms in datalog,” TPLP, vol. 1, no. 4, pp. 381–407, 2001.

[5] P. G. Kolaitis, “The expressive power of stratified logic programs,” Info. and Computation, pp. 50–66, 1991.

[6] M. P. Consens and A. O. Mendelzon, “Low complexity aggregation in graphlog and datalog,” in ICDT, 1990, pp. 379–394.

[7] I. S. Mumick, H. Pirahesh, and R. Ramakrishnan, “The magic of duplicates and aggregates,” in VLDB, 1990, pp. 264–277.

[8] I. S. Mumick and O. Shmueli, “How expressive is stratified aggregation?” Annals of Mathematics and AI, pp. 407–435, 1995.

[9] K. A. Ross and Y. Sagiv, “Monotonic aggregation in deductive databases,” in PODS, 1992, pp. 114–126.

[10] C. Zaniolo, S. Ceri, C. Faloutsos, R. T. Snodgrass, V. S. Subrahmanian, and R. Zicari, Advanced Database Systems. Morgan Kaufmann, 1997.

[11] C. Zaniolo, N. Arni, and K. Ong, “Negation and aggregates in recursive rules: the ldl++ approach,” in DOOD, 1993, pp. 204–221.

[12] G. Lausen, B. Ludascher, and W. May, “On active deductive databases: The statelog approach,” in Transactions and Change in Logic Databases, 1998, pp. 69–106.

[13] A. Van Gelder, “The well-founded semantics of aggregation,” in PODS, 1992, pp. 127–138.

[14] W. Faber, G. Pfeifer, and N. Leone, “Semantics and complexity of recursive aggregates in answer set programming,” Artif. Intell., vol. 175, no. 1, pp. 278–298, 2011.

[15] A. Van Gelder, “Foundations of aggregation in deductive databases,” in DOOD, 1993, pp. 13–34.

[16] N. Conway, W. R. Marczak, P. Alvaro, J. M. Hellerstein, and D. Maier, “Logic and lattices for distributed programming,” in SoCC, 2012.

[17] M. Mazuran, E. Serra, and C. Zaniolo, “A declarative extension of horn clauses, and its significance for datalog and its applications,” TPLP, vol. 13, no. 4-5, pp. 609–623, 2013.

[18] ——, “Extending the power of datalog recursion,” VLDB J., vol. 22, no. 4, pp. 471–493, 2013.

[19] D. Chimenti, R. Gamboa, R. Krishnamurthy, S. A. Naqvi, S. Tsur, and C. Zaniolo, “The ldl system prototype,” IEEE Trans. Knowl. Data Eng., vol. 2, no. 1, pp. 76–90, 1990.

[20] F. Arni, K. Ong, S. Tsur, H. Wang, and C. Zaniolo, “The deductive database system ldl++,” TPLP, vol. 3, no. 1, pp. 61–94, 2003.

[21] J. M. Hellerstein, “The declarative imperative: experiences and conjectures in distributed logic,” SIGMOD Record, vol. 39, no. 1, pp. 5–19, 2010.

[22] F. Giannotti, G. Manco, and F. Turini, “Specifying mining algorithms with iterative user-defined aggregates,” IEEE Trans. Knowl. Data Eng., vol. 16, no. 10, pp. 1232–1246, 2004.

[23] W. Faber, G. Pfeifer, N. Leone, T. Dell’Armi, and G. Ielpa, “Design and implementation of aggregate functions in the dlv system,” TPLP, vol. 8, no. 5-6, pp. 545–580, 2008.

[24] T. J. Green, M. Aref, and G. Karvounarakis, “Logicblox, platform and language: A tutorial,” in Datalog 2.0, 2012, pp. 1–8.

[25] S. Ganguly, S. Greco, and C. Zaniolo, “Minimum and maximum predicates in logic programming,” in PODS, 1991, pp. 154–163.

[26] S. Sudarshan and R. Ramakrishnan, “Aggregation and relevance in deductive databases,” in VLDB, 1991, pp. 501–511.

[27] C. Zaniolo and H. Wang, “Logic-based user-defined aggregates for the next generation of database systems,” in The Logic Programming Paradigm, K. R. Apt, V. Marek, M. Truszczynski, and D. S. Warren, Eds. Springer Verlag, 1999, pp. 401–426.

[28] S. Ganguly, R. Krishnamurthy, and A. Silberschatz, “An analysis technique for transitive closure algorithms: A statistical approach,” in ICDE, 1991, pp. 728–735.

[29] A. Shkapsky, K. Zeng, and C. Zaniolo, “Graph queries in a next-generation datalog system,” PVLDB, vol. 6, no. 12, pp. 1258–1261, 2013.

[30] A. Gupta, I. S. Mumick, and V. S. Subrahmanian, “Maintaining views incrementally,” in SIGMOD, 1993, pp. 157–166.

[31] P. Alvaro, N. Conway, J. Hellerstein, and W. R. Marczak, “Consistency analysis in bloom: a calm and collected approach,” in CIDR, 2011, pp. 249–260.

[32] M. Yang and C. Zaniolo, “Main memory evaluation of recursive queries on multicore machines,” in IEEE Big Data, 2014, pp. 251–260.