Bhanu Pratap Gupta, Devang Vira, S. Sudarshan, Dept. of Computer Science and Engineering, IIT Bombay
![Page 1: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/1.jpg)
Bhanu Pratap Gupta, Devang Vira, S. Sudarshan
Dept. of Computer Science and Engineering, IIT Bombay
![Page 2: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/2.jpg)
▪ Complex SQL queries are hard to get right
▪ Question: How do we check whether an SQL query is correct?
  ▪ Formal verification is not applicable, since we do not have a separate specification and an implementation
  ▪ State-of-the-art solution: generate test databases and check whether the query gives the intended result
![Page 3: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/3.jpg)
▪ Automated test data generation
  ▪ Based on database constraints and the SQL query
    ▪ Agenda [Chays et al., STVR04]: a tool that generates test cases for database applications and additionally uses user-supplied heuristics
  ▪ Ensuring the query result is not empty
    ▪ Reverse Query Processing [Binnig et al., ICDE07] takes the desired query output and generates relation instances
    ▪ Handles a subset of Select/Project/Join/GroupBy queries
▪ None of the above guarantees anything about detecting errors in SQL queries
▪ Question: How do you model SQL errors?
  ▪ Answer: Query mutation
![Page 4: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/4.jpg)
▪ Mutant: a variation of the given query
▪ Mutations model common programming errors, such as
  ▪ Join used instead of outer join (or vice versa)
  ▪ Join/selection condition errors
    ▪ < vs. <=, missing or extra condition
  ▪ Wrong aggregate (min vs. max)
▪ A mutant may be the intended query
![Page 5: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/5.jpg)
▪ The traditional use of mutation testing has been to check the coverage of a dataset
  ▪ Generate mutants of the original program by modifying the program in a controlled manner
  ▪ A dataset kills a mutant if the query and the mutant give different results on the dataset
  ▪ A dataset is considered complete if it can kill all non-equivalent mutants of the given query
▪ Prior work: Tuya and Suarez-Cabal [IST07] and Chan et al. [QSIC05] defined a class of SQL query mutations
  ▪ Shortcoming: they do not address test data generation
▪ Our goal: generate a dataset for testing the query
  ▪ The test dataset and the query result on that dataset are shown to a human, who verifies that the result is what is expected given this dataset
  ▪ Note that we do not need to actually generate and execute mutants
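The definitions of "kills" and "complete" above can be made concrete with a minimal Python sketch. The helper names `kills` and `is_complete`, the dictionary representation of a database, and the min/max query pair are illustrative assumptions, not part of the authors' tool (which, per the slides, never needs to execute mutants).

```python
from collections import Counter

def kills(dataset, query, mutant):
    """A dataset kills a mutant if query and mutant return different multisets of rows."""
    return Counter(query(dataset)) != Counter(mutant(dataset))

def is_complete(datasets, query, non_equivalent_mutants):
    """A test suite is complete if every non-equivalent mutant is killed by some dataset."""
    return all(any(kills(d, query, m) for d in datasets)
               for m in non_equivalent_mutants)

# Hypothetical query over r(A) and a "wrong aggregate" mutant (min vs. max).
def query(db):
    return [(min(a for (a,) in db["r"]),)]

def mutant(db):
    return [(max(a for (a,) in db["r"]),)]

weak = {"r": [(1,), (1,)]}    # min == max, so the mutant survives
strong = {"r": [(1,), (2,)]}  # min != max, so the mutant is killed

print(kills(weak, query, mutant))    # False
print(kills(strong, query, mutant))  # True
```

Comparing multisets rather than lists keeps the check independent of row order, which SQL does not guarantee without an ORDER BY.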
![Page 6: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/6.jpg)
▪ We address the problem of test data generation for killing non-equivalent mutants
  ▪ Equivalent mutants: r(A,B) ⋈ s(B,C) and r(A,B) ⟕ s(B,C), where r.B is a foreign key to s and is not null, will always produce the same result set
▪ We define a class of:
  ▪ Join/outer join mutations
  ▪ Selection predicate mutations
▪ An algorithm for test data generation that kills all non-equivalent mutants in the above class
  ▪ Under some simplifying assumptions (given in the paper)
  ▪ With the guarantee that generated datasets are small and realistic, to aid human verification of results
![Page 7: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/7.jpg)
▪ Join type mutations: an occurrence of a join operator (⋈, ⟕, ⟖, ⟗) is replaced by one of the other join operators
▪ Defining join mutations in SQL is complicated by the absence of a particular join order
  ▪ SELECT * FROM a, b, c WHERE (a.x = b.x) AND (b.x = c.x)
▪ We consider all relational algebra expressions (trees) equivalent (under inner join reordering) to the given SQL query
▪ We consider join type mutations to single join nodes in each of these trees
![Page 8: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/8.jpg)
▪ Case I: Mutation at the root node, with no foreign key constraints
  ▪ Schema: r(A), s(B)
  ▪ To kill this mutant: ensure that for an r tuple there is no matching s tuple
  ▪ Generated test case: r(A)={(1)}; s(B)={}
▪ Basic idea:
  ▪ (a) run the query on the given database
  ▪ (b) from the result, extract matching tuples for r and s
  ▪ (c) delete the s tuple to ensure there is no matching tuple for r
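A small Python sketch of Case I follows. The slide's query diagram did not survive in this transcript, so the sketch assumes the original query is the inner join of r and s on r.A = s.B and the mutant replaces it with a left outer join; the helper functions and the contents of the "given" database are made up for illustration.

```python
def inner_join(r, s, pred):
    """Inner join: concatenate every pair of tuples that satisfies pred."""
    return [a + b for a in r for b in s if pred(a, b)]

def left_outer_join(r, s, pred, s_width):
    """Left outer join: unmatched r tuples are padded with s_width NULLs (None)."""
    out = []
    for a in r:
        matches = [a + b for b in s if pred(a, b)]
        out.extend(matches or [a + (None,) * s_width])
    return out

def on_a_equals_b(a, b):
    # Assumed join condition r.A = s.B.
    return a[0] == b[0]

# (a) Run the query on a given database (hypothetical contents).
given_r = [(1,), (7,)]
given_s = [(1,), (9,)]
result = inner_join(given_r, given_s, on_a_equals_b)

# (b) From one result row, extract the matching r and s tuples.
row = result[0]
test_r = [row[:1]]
test_s = [row[1:]]

# (c) Delete the s tuple so that the r tuple has no match.
test_s.remove(row[1:])

original = inner_join(test_r, test_s, on_a_equals_b)
mutant = left_outer_join(test_r, test_s, on_a_equals_b, s_width=1)
print(test_r, test_s)    # [(1,)] []
print(original, mutant)  # [] [(1, None)]
```

The two results differ on the generated test case r(A)={(1)}, s(B)={}, so this dataset kills the assumed mutant.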
![Page 9: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/9.jpg)
▪ Case II: Extra join above the mutated node
  ▪ Schema: r(A,B), s(C,D), t(E)
  ▪ To kill this mutant, we must ensure that for an r tuple there is no matching s tuple, but there is a matching t tuple
  ▪ Generated test case: r(A,B)={(1,2)}; s(C,D)={}; t(E)={(2)}
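To see why the matching t tuple is needed in Case II, here is a hedged Python sketch. The join conditions are not recoverable from the transcript, so it assumes the query is (r ⋈ s) ⋈ t with r.A = s.C and r.B = t.E, and that the mutant turns the lower join into a left outer join; all function names are illustrative.

```python
def inner_join(r, s, pred):
    """Inner join: concatenate every pair of tuples that satisfies pred."""
    return [a + b for a in r for b in s if pred(a, b)]

def left_outer_join(r, s, pred, s_width):
    """Left outer join: unmatched r tuples are padded with s_width NULLs (None)."""
    out = []
    for a in r:
        matches = [a + b for b in s if pred(a, b)]
        out.extend(matches or [a + (None,) * s_width])
    return out

def r_matches_s(row, s_tuple):
    # Assumed condition r.A = s.C.
    return row[0] == s_tuple[0]

def r_matches_t(row, t_tuple):
    # Assumed condition r.B = t.E.
    return row[1] == t_tuple[0]

def query(r, s, t):
    return inner_join(inner_join(r, s, r_matches_s), t, r_matches_t)

def mutant(r, s, t):
    lower = left_outer_join(r, s, r_matches_s, s_width=2)
    return inner_join(lower, t, r_matches_t)

r, s = [(1, 2)], []

# With a matching t tuple, the padded row survives the upper join in the mutant only.
print(query(r, s, [(2,)]), mutant(r, s, [(2,)]))  # [] [(1, 2, None, None, 2)]

# Without a matching t tuple, the upper join hides the difference.
print(query(r, s, []), mutant(r, s, []))          # [] []
```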
![Page 10: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/10.jpg)
▪ Given a join expression on relations r1, r2, …, rn
  ▪ Create a dataset where all relations have a set of matching tuples
  ▪ For each relation ri, generate a dataset where the rest of the relations match, but ri is empty
    ▪ Unless making ri empty makes the join graph disconnected
▪ The above procedure kills all join type mutations of the given inner join tree
▪ Outer joins complicate the picture when attributes are projected out
  ▪ May have to make more than one ri empty at a time
▪ Foreign keys may prevent making some ri empty
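The procedure for inner join expressions can be sketched as a planning step that decides which relation to leave empty in each generated dataset. This is a simplified illustration only: it ignores foreign keys and outer joins, and the names `is_connected` and `empty_relation_plans`, as well as the edge-list encoding of the join graph, are assumptions rather than the authors' implementation.

```python
def is_connected(nodes, edges):
    """Check whether the join graph restricted to `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for a, b in edges:
            if a == current:
                other = b
            elif b == current:
                other = a
            else:
                continue
            if other in nodes and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen == nodes

def empty_relation_plans(relations, edges):
    """Return one plan per dataset: the set of relations to leave empty in it."""
    plans = [frozenset()]  # first dataset: every relation has matching tuples
    for ri in relations:
        rest = [x for x in relations if x != ri]
        # Skip ri if emptying it would disconnect the remaining join graph.
        if is_connected(rest, edges):
            plans.append(frozenset([ri]))
    return plans

relations = ["student", "department", "program"]
edges = [("student", "department"), ("student", "program")]
for plan in empty_relation_plans(relations, edges):
    print(sorted(plan))
# []
# ['department']
# ['program']
```

In this example, emptying `student` is skipped because `department` and `program` would no longer be connected by any join condition.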
![Page 11: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/11.jpg)
▪ Case III: Mutation at the root node, with foreign key constraints and a selection on the right side
  ▪ Schema: r(A), s(B,C)
  ▪ Foreign key: r.A → s.B
  ▪ To kill this mutant, we must create an s tuple that matches the r tuple on the foreign key reference, but has s.C ≠ 4
  ▪ Generated test case: r(A)={(2)}; s(B,C)={(2,5)}
▪ The notion of a valid nullable pattern, defined in the paper, specifies which relations can be made null/non-matching, given the foreign key constraints and the join graph
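A hedged Python sketch of Case III follows. The query itself is not in the transcript; from the requirement s.C ≠ 4 the sketch infers a selection s.C = 4 on the right input, a join on r.A = s.B, and a mutant that replaces the inner join with a left outer join. These are assumptions, as are all helper names.

```python
def inner_join(r, s, pred):
    """Inner join: concatenate every pair of tuples that satisfies pred."""
    return [a + b for a in r for b in s if pred(a, b)]

def left_outer_join(r, s, pred, s_width):
    """Left outer join: unmatched r tuples are padded with s_width NULLs (None)."""
    out = []
    for a in r:
        matches = [a + b for b in s if pred(a, b)]
        out.extend(matches or [a + (None,) * s_width])
    return out

def select_c_equals_4(s):
    # Assumed selection predicate s.C = 4 on the right input.
    return [t for t in s if t[1] == 4]

def on_a_equals_b(a, b):
    # Join condition along the foreign key r.A -> s.B.
    return a[0] == b[0]

def foreign_key_holds(r, s):
    """Every r.A value must appear as some s.B value."""
    return all(any(a[0] == b[0] for b in s) for a in r)

def query(r, s):
    return inner_join(r, select_c_equals_4(s), on_a_equals_b)

def mutant(r, s):
    return left_outer_join(r, select_c_equals_4(s), on_a_equals_b, s_width=2)

r, s = [(2,)], [(2, 5)]
print(foreign_key_holds(r, s))   # True
print(query(r, s), mutant(r, s)) # [] [(2, None, None)]
```

Because of the foreign key, s cannot simply be left empty; the s tuple (2,5) satisfies the constraint while still failing the selection, which makes the r tuple unmatched after the selection is applied.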
![Page 12: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/12.jpg)
▪ Implemented using Java and PostgreSQL
▪ Creates datasets by extracting and modifying tuples from a given database
▪ Currently handles join type mutations and selection predicate mutations
▪ For creating a merged dataset
  ▪ Tuples having the same values for join attributes must be blocked from being inserted again
▪ Handling selection predicate mutations
  ▪ E.g., to distinguish r.A < 3 from r.A <= 3, we generate tuples with r.A = 2 and 3
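The selection predicate example can be checked with a few lines of Python. The `select` helper is an illustrative stand-in for a selection on r.A, not the tool's Java code.

```python
import operator

def select(rows, op, constant):
    """Selection on the first attribute: keep rows where op(r.A, constant) holds."""
    return [row for row in rows if op(row[0], constant)]

# Tuples around the boundary value, as on the slide: r.A = 2 and r.A = 3.
boundary = [(2,), (3,)]
print(select(boundary, operator.lt, 3))  # [(2,)]
print(select(boundary, operator.le, 3))  # [(2,), (3,)]

# Tuples that all lie below the boundary cannot tell the two predicates apart.
below = [(1,), (2,)]
print(select(below, operator.lt, 3) == select(below, operator.le, 3))  # True
```

The tuple with r.A = 3 is the one that separates the two predicates; the tuple with r.A = 2 keeps the result of the original query non-empty.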
![Page 13: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/13.jpg)
▪ Ongoing work:
  ▪ Synthetic data generation taking database and query constraints into account, which is non-trivial
    ▪ Idea (from RQP [Binnig et al., ICDE07]): use a model checker to generate data
    ▪ Under implementation using CVC3
  ▪ Extend the technique to handle aggregations and sub-queries
▪ Future work: data generation for application code with multiple queries
![Page 14: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/14.jpg)
Questions
![Page 15: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/15.jpg)
▪ The problem "is Q equivalent to a mutant Q'?" can be reduced to query containment, and vice versa, in polynomial time
▪ The Chase algorithm can be used to generate datasets showing that Q and Q' are not equivalent (for SPJ queries and several extensions)
  ▪ Such a dataset would kill the mutant Q'
  ▪ There is limited work on outer join containment and data generation
▪ However, we do not want to enumerate each mutant and generate separate datasets
  ▪ Too expensive
![Page 16: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/16.jpg)
▪ Under the following conditions we can generate merged datasets:
  ▪ Tuples having the same values for join attributes must be blocked from being inserted again
  ▪ The query must not contain any equality selection on a unique key
  ▪ The result of the query must contain one or more attributes which together form a unique key for any relation
  ▪ The attributes from the result that form a unique key must also be guaranteed to be non-null in the result
![Page 17: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/17.jpg)
▪ Consider the three relations:
  ▪ Student(rollno, name, deptcode, progcode)
  ▪ Department(deptcode, deptname)
  ▪ Program(progcode, progname)
▪ And a query:
  SELECT rollno, name, deptname, progname
  FROM student s
    INNER JOIN department d ON s.deptcode = d.deptcode
    INNER JOIN program p ON s.progcode = p.progcode
![Page 18: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/18.jpg)
▪ Generate mutants by mutating the join operator of a single node, for each of the query trees
[Figure: Query Tree 1, Query Tree 2, Query Tree 3]
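The enumeration of join trees and single-node mutants can be sketched in Python. The three trees shown on the slide are not recoverable from this transcript; the sketch assumes they are the three unordered binary join trees over Student, Department, and Program (one of which joins Department and Program directly). The tuple encoding of trees and the function names are illustrative assumptions.

```python
from itertools import combinations

OPS = ("JOIN", "LEFT OUTER JOIN", "RIGHT OUTER JOIN", "FULL OUTER JOIN")

def join_trees(rels):
    """Yield every unordered binary inner join tree over the given relation names."""
    rels = tuple(rels)
    if len(rels) == 1:
        yield rels[0]
        return
    first, rest = rels[0], rels[1:]
    # Keep `first` in the left subtree so mirror-image splits are not repeated.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = (first,) + extra
            right = tuple(x for x in rest if x not in extra)
            for lt in join_trees(left):
                for rt in join_trees(right):
                    yield ("JOIN", lt, rt)

def mutants(tree):
    """Yield trees that differ from `tree` in the operator of exactly one join node."""
    if isinstance(tree, str):
        return
    op, left, right = tree
    for other in OPS:
        if other != op:
            yield (other, left, right)
    for m in mutants(left):
        yield (op, m, right)
    for m in mutants(right):
        yield (op, left, m)

trees = list(join_trees(["student", "department", "program"]))
all_mutants = [m for t in trees for m in mutants(t)]
print(len(trees), len(all_mutants))  # 3 18
```

Each tree has two join nodes and each node can take one of three other operators, giving six mutants per tree; some of these may be equivalent to the original query once constraints are taken into account.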
![Page 19: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/19.jpg)
Generated data shows:
▪ A student (Devang) with valid program and department
▪ A student (Abhijeet) with invalid department
▪ A student (Sandeep) with invalid program
▪ A student (Aditya) with invalid program and department
▪ A program (PhD) with no student
▪ A department (Mechanical) with no student

Department

| Deptcode | Deptname |
|---|---|
| CS | Computer |
| CH | Chemical |
| ME | Mechanical |

Program

| Progcode | Progname |
|---|---|
| 0 | B.Tech |
| 1 | M.Tech |
| 2 | PhD |

Student

| Rollno | Name | progcode | deptcode |
|---|---|---|---|
| 501 | Devang | 1 | CS |
| 401 | Abhijeet | 0 | CE |
| 701 | Sandeep | 5 | CH |
| 101 | Aditya | 4 | MA |
![Page 20: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/20.jpg)
Generated data shows:
▪ A student (Devang) with valid program and department
▪ A program (B.Tech) with no student
▪ A department (Electrical) with no student

Department

| Deptcode | Deptname |
|---|---|
| CS | Computer |
| EE | Electrical |

Program

| Progcode | Progname |
|---|---|
| 0 | B.Tech |
| 1 | M.Tech |

Student

| Rollno | Name | progcode | deptcode |
|---|---|---|---|
| 501 | Devang | 1 | CS |

Foreign keys are:
▪ Student.progcode → Program.progcode
▪ Student.deptcode → Department.deptcode
![Page 21: Bhanu Pratap Gupta Devang Vira S. Sudarshan Dept. of Computer Science and Engineering, IIT Bombay](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1aee7f8b9ab05997c9a9/html5/thumbnails/21.jpg)
Case of no foreign keys