What to test in a SQL query

SQL is a robust language and contains many different functions we can use. Let’s simplify and look at queries as a composition of predicates. Here are some examples:

  • SELECT * FROM INVOICE WHERE VALUE < 50
  • SELECT * FROM INVOICE I JOIN CUSTOMER C ON I.CUSTOMER_ID = C.ID WHERE C.COUNTRY = 'NL'
  • SELECT * FROM INVOICE WHERE VALUE > 50 AND VALUE < 200

In these examples, value < 50i.customer_id = c.idc.country = 'NL', and value > 50 and value < 200 are the predicates that compose the different queries. As a tester, a possible criterion is to exercise the predicates and check whether the SQL query returns the expected results when predicates are evaluated to different results.

Virtually all the testing techniques we have discussed:

  • Specification-based testing —SQL queries emerge out of a requirement. A tester can analyze the requirements and derive equivalent partitions that need to be tested.
  • Boundary analysis —Such programs have boundaries. Because we expect boundaries to be places with a high bug probability, exercising them is important.
  • Structural testing —SQL queries contain predicates, and a tester can use the SQL’s structure to derive test cases.

Here, we focus on structural testing. If we look at the third SQL example and try to make an analogy with what we have discussed about structural testing, we see that the SQL query contains a single branch composed of two predicates (value > 50 and value < 200). This means there are four possible combinations of results in these two predicates: (true, true)(true, false)(false, true), and (false, false). We can aim at either of the following:

  • Branch coverage —In this case, two tests (one that makes the overall decision evaluate to true and one that makes it evaluate to false) would be enough to achieve 100% branch coverage.
  • Condition + branch coverage —In this case, three tests would be enough to achieve 100% condition + branch coverage: for example, T1 = 150T2 = 40T3 = 250.

In “A practical guide to SQL white-box testing,” a 2006 paper by Tuya, Suárez-Cabal, and De La Riva, the authors suggest five guidelines for designing SQL tests:

  • Adopting modified condition/decision coverage (MC/DC) for SQL conditions —Decisions happen at three places in a SQL query: joinwhere, and having conditions. We can use criteria like MC/DC to fully exercise the query’s predicates. If you do not remember how MC/DC coverage works.
  • Adapting MC/DC for tackling nulls —Because databases have a special way of handling/returning nulls, any (coverage) criteria should be adapted to three-valued logic (truefalsenull). In other words, consider the possibility of values being null in your query.
  • Category-partitioning selected data —SQL can be considered a declarative specification for which we can define partitions to be tested. Directly from Tuya et al.’s paper, we define the following:
    • Rows that are retrieved —We include a test state to force the query to not select any row.
    • Rows that are merged —The presence of unwanted duplicate rows in the output is a common failure in some queries. We include a test state in which identical rows are selected.
    • Rows that are grouped —For each of the group-by columns, we design test states to obtain at least two different groups at the output, such that the value used for the grouping is the same and all the others are different.
    • Rows that are selected in a subquery —For each subquery, we include test states that return zero or more rows, with at least one null and two different values in the selected column.
    • Values that participate in aggregate functions —For each aggregate function (excluding count), we include at least one test state in which the function computes two equal values and another that is different.
    • Other expressions —We also design test states for expressions involving the like predicate, date management, string management, data type conversions, or other functions using category partitioning and boundary checking.
  • Checking the outputs —We should check not only the input domain but also the output domain. SQL queries may return null or empty values in specific columns, which may make the rest of the program break.
  • Checking the database constraints —Databases have constraints. We should make sure the database enforces these constraints.

As you can see, many things can go wrong in a SQL query. It is part of the tester’s job to make sure that does not happen.


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *