Input validation, contracts, or both?

Developers are aware of how important input validation is. A mistake in the validation may lead to security vulnerabilities. Therefore, developers often handle input validation whenever data comes from the end user.

Consider a web application that stores products for an online store. To add a new product, a user must pass a name, a description, and a price. Before saving the new product to the database, the developer performs checks to ensure that the input values are as expected. Here is the greatly simplified pseudo-code.

Listing 4.12 Pseudo-code for input validation

class ProductController {
  // more code here ...
 
  public void add(String productName, String productDescription,
   double price) {                                                      ❶
 
    String sanitizedProductName = sanitize(productName);                ❷
    String sanitizedProductDescription = sanitize(productDescription);  ❷
    if(!isValidProductName(sanitizedProductName)) {                     ❸
       errorMessages.add("Invalid product name");
    }
    if(!isValidProductDescription(sanitizedProductDescription)) {       ❸
       errorMessages.add("Invalid product description");
    }
    if(!isValidPriceRange(price)) {                                     ❸
       errorMessages.add("Invalid price");
    }
 
    if(errorMessages.empty()) {                                         ❹
      Product newProduct = new Product(sanitizedProductName,
        sanitizedProductDescription, price);
      database.save(newProduct);
 
      redirectTo("productPage", newProduct.getId());
    } else {                                                            ❺
      redirectTo("addProduct", errorMessages.getErrors());
    }
  }
}

❶ These parameters come directly from the end user, and they need to be validated before being used.

❷ We use the made-up sanitize() method to sanitize (remove invalid characters from) the inputs.

❸ Ensures that values are within the expected format, range, and so on

❹ Only when the parameters are valid do we create objects. Is this a replacement for design-by-contract?

❺ Otherwise, we return to the Add Product page and display the error messages.

Given all this validation before the objects are even created, you may be thinking, “Do I need to model pre-conditions and post-conditions in the classes and methods? I already know the values are valid!” Let me give you a pragmatic perspective.

First, let’s focus on the difference between validation and contracts. Validation ensures that bad or invalid data that may come from users does not infiltrate our systems. For example, if the user types a string in the Quantity field on the Add Product page, we should return a friendly message saying “Quantity should be a numeric value.” This is what validation is about: it validates that the data coming from the user is correct and, if not, returns a message.

On the other hand, contracts ensure that communication between classes happens without a problem. We do not expect problems to occur—the data is already validated. However, if a violation occurs, the program halts, since something unexpected happened. The application also returns an error message to the user. Figure 4.3 illustrates the difference between validation and code contracts.

Figure 4.3 The difference between validation and code contracts. Each circle represents one input coming to the system.
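To make the distinction concrete, here is a minimal Java sketch of the two mechanisms side by side. The names `QuantityValidator` and `StockItem`, and the non-negative rule, are made up for illustration and are not part of the listing above. The validator expects bad input and answers with friendly messages; the contract expects good input and fails fast when that expectation is broken.

```java
import java.util.ArrayList;
import java.util.List;

// Validation (hypothetical helper): bad input is expected here,
// so we collect user-friendly messages instead of throwing.
class QuantityValidator {
    static List<String> validate(String rawQuantity) {
        List<String> errors = new ArrayList<>();
        if (rawQuantity == null) {
            errors.add("Quantity should be a numeric value.");
            return errors;
        }
        try {
            int quantity = Integer.parseInt(rawQuantity.trim());
            if (quantity < 0) {
                errors.add("Quantity should not be negative.");
            }
        } catch (NumberFormatException e) {
            errors.add("Quantity should be a numeric value.");
        }
        return errors;
    }
}

// Contract (hypothetical class): the data should already be valid,
// so a violation signals a bug and the constructor halts immediately.
class StockItem {
    private final int quantity;

    StockItem(int quantity) {
        if (quantity < 0) {
            throw new IllegalArgumentException(
                "Pre-condition violated: quantity must be >= 0");
        }
        this.quantity = quantity;
    }

    int getQuantity() {
        return quantity;
    }
}
```

In this sketch, a user typing "abc" gets a message back and can try again, whereas a negative quantity reaching `StockItem` means some caller skipped validation, which is a programming error rather than a user error.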

Both validation and contracts should happen, as they serve different purposes. The question is how to avoid repetition. Sometimes the validation and the pre-condition are the same check, which means either the code is duplicated or the same check runs twice.

I tend to be pragmatic. As a rule of thumb, I prefer to avoid repetition. If the input validation already checked for, say, the length of the product description being greater than 10 characters, I don’t re-check it as a pre-condition in the constructor of the Product class. This implies that no instance of Product is ever created without input validation happening first. Your architecture must ensure that some zones of the code are safe and that data has already been cleaned up.

On the other hand, if a contract is very important and should never be broken (the impact could be significant), I do not mind using a little repetition and extra computational power to check it at both input-validation time and contract-checking time. Again, consider the context to decide what works best for each situation.
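When I do decide to check the same rule twice, one way to pay for the extra check without duplicating the rule itself is to keep it in a single place and call it from both sides. The sketch below assumes a hypothetical `ProductRules` helper and a hypothetical `ProductForm` validator; the length rule is the "greater than 10 characters" example from above. Treat it as one possible arrangement, not a prescribed design.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single home for the rule, shared by validation and contract.
class ProductRules {
    static final int MIN_DESCRIPTION_LENGTH = 10;

    static boolean isValidDescription(String description) {
        return description != null
            && description.length() > MIN_DESCRIPTION_LENGTH;
    }
}

// Input-validation side (hypothetical): returns messages for the user.
class ProductForm {
    static List<String> validate(String description) {
        List<String> errors = new ArrayList<>();
        if (!ProductRules.isValidDescription(description)) {
            errors.add("Invalid product description");
        }
        return errors;
    }
}

// Contract side: the same rule, enforced as a pre-condition.
class Product {
    private final String name;
    private final String description;
    private final double price;

    Product(String name, String description, double price) {
        // The check runs a second time here, but the rule is written once.
        if (!ProductRules.isValidDescription(description)) {
            throw new IllegalArgumentException(
                "Pre-condition violated: description is too short");
        }
        this.name = name;
        this.description = description;
        this.price = price;
    }

    String getDescription() {
        return description;
    }
}
```

The cost is a second evaluation of the rule at construction time; the benefit is that a Product can never exist with a too-short description, even if some future caller forgets to validate.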

NOTE Arie van Deursen offers a clear answer on Stack Overflow about the differences between design-by-contract and validation, and I strongly recommend reading it.
