Wednesday, December 12, 2012

State-Query Models

It’s been a while since I have had time to write on my blog. In the meantime I have been busy with paternity leave (we get 10 weeks in Denmark!), a change of roles, and a re-organization in NAV. All bad excuses for not taking the time to write blog posts – apologies for that!

I’m going to shift gears a bit from now on. We recently filed a patent application1 on a new approach to Model-Based Testing. The patent revolves around how to apply Model-Based Testing to systems under test whose state is contained in a data source, for instance a database. With this new approach, models emerge from the responses of the system under test instead of through careful crafting of state-machines. This sounds very fancy, but in reality we probe the system by exercising automated actions and record how it responds by tracing the impact on the data source. With this approach a state model emerges, which has some surprisingly nice benefits.

State-queries vs. state-machines

At the heart of the invention lies the replacement of finite-state-machines with state-queries.

Allow me to illustrate by example:

Consider a sales order in an ERP system. This is a document that has a header with some general information and some lines detailing what items/resources/etc. are being sold. A very simple model could consist of rules for creating an empty sales order, adding a sales line to the document, and posting it. A typical model-generated test case could be to create an empty sales document, add a sales line, and post it. However, the model could also just create a blank sales order and try to post it, which should result in an error.

To model this with Spec Explorer we would need to create a state-machine that counts the number of sales orders and how many lines each sales order has. We would then need to build logic to increase these counters whenever we create new sales orders or add new lines, and to check that we are able to post a sales order with lines.

In Spec Explorer such a model could look like:

Here the state (x:y) represents x sales orders and y sales lines. However, the information about the number of sales orders and the number of lines on each is typically already stored in some backend database.
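To make the bookkeeping concrete, here is a minimal sketch of the kind of hand-crafted counter model the state-machine approach requires. This is plain Python for illustration, not actual Spec Explorer/Cord code, and all names are hypothetical:

# Illustrative counter model in the spirit of the state-machine approach.
# The modeller maintains the counters by hand; nothing ties them to the
# actual Sales Header / Sales Line tables in the database.
class HandCraftedSalesOrderModel:
    def __init__(self):
        self.orders = 0      # x: number of sales orders created so far
        self.lines = {}      # sales order number -> y: number of lines on it

    def create_order(self):
        self.orders += 1
        self.lines[self.orders] = 0

    def add_line(self, order_no):
        assert order_no in self.lines    # precondition: the order must exist
        self.lines[order_no] += 1

    def can_post(self, order_no):
        # posting is only expected to succeed when the order has at least one line
        return self.lines.get(order_no, 0) > 0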

In our new approach the strategy is to create SQL queries that select the relevant information from the database, instead of constructing a fictitious state-machine. These so-called state-queries are then evaluated after every single rule is exercised, to determine which state the model is in. In our example the following SQL query would select all relevant state-variables directly from the system:

SELECT
       h.[Document Type],
       h.[No],
       -- count only matching lines, so a header without lines reports 0
       -- (COUNT(*) would report 1 because of the LEFT JOIN)
       COUNT(l.[Document No]) AS [No of lines]
FROM
       [Sales Header] h
LEFT JOIN
       [Sales Line] l
ON
       h.[Document Type] = l.[Document Type] AND
       h.[No] = l.[Document No]
GROUP BY
       h.[Document Type], h.[No]

An example execution of this query could yield the following dataset:

The resulting dataset contains all the information the model needs to determine if it can add lines or post sales orders, and it was generated at runtime by directly querying the database.
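As a rough sketch of how such a result set can become a model state (assuming a standard DB-API cursor on the backing database; the helper names are mine, not part of the patent), the rows of every state-query can simply be frozen into a hashable fingerprint:

# Sketch: turn the state-query results into a hashable state fingerprint,
# so two database snapshots that produce the same rows count as the same state.
STATE_QUERY = """
SELECT h.[Document Type], h.[No], COUNT(l.[Document No]) AS [No of lines]
FROM [Sales Header] h
LEFT JOIN [Sales Line] l
  ON h.[Document Type] = l.[Document Type] AND h.[No] = l.[Document No]
GROUP BY h.[Document Type], h.[No]
"""

def evaluate_state(cursor, state_queries=(STATE_QUERY,)):
    """Run every state-query and return an order-independent fingerprint."""
    fingerprint = []
    for sql in state_queries:
        cursor.execute(sql)
        rows = sorted(tuple(row) for row in cursor.fetchall())
        fingerprint.append(tuple(rows))
    return tuple(fingerprint)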

Exploring state-query models

Exploration of state-query models is ‘on-the-fly’ testing. This means that the model is being explored by performing actions on the live system under test. Between each executed action the state-queries are evaluated on the database to determine the current state of the system under test. If the combined result of the state-queries is unique, then the newly discovered state is stored.

The following UML diagram shows how to conceptually explore a state-query model in a depth-first fashion:

Notice that for the state-query model to be feasible we must provide a mechanism for restoring a previous state (the ‘Restore database’ action). If the system under test stores its full state in the database (which is common for many service architectures), this is obtained through a database restore. State restore is typically the most costly operation in the exploration algorithm. However, there are many ways to optimize this step; for example, in NAV we track all changes to tables, which lets us roll them back efficiently.
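A conceptual sketch of that depth-first loop could look as follows. Here `actions`, `backup`, and `restore` are assumed hooks (driving the live system, and saving/reinstating the database, e.g. via the change-tracking rollback mentioned above), and `evaluate_state` is the fingerprint helper sketched earlier:

# Conceptual depth-first exploration of a state-query model ('on-the-fly').
# A real implementation would bound the depth and handle failing actions.
def explore(cursor, actions, evaluate_state, backup, restore, seen=None):
    seen = set() if seen is None else seen
    state = evaluate_state(cursor)          # evaluate state-queries on the database
    if state in seen:
        return                              # this state has already been explored
    seen.add(state)                         # unique combined result: store the new state
    for action in actions:
        snapshot = backup()                 # remember the current database state
        action()                            # exercise one rule on the live system
        explore(cursor, actions, evaluate_state, backup, restore, seen)
        restore(snapshot)                   # 'Restore database': backtrack before the next action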

Seamless merging of models

The greatest benefit of this approach is that the emergent state model is coupled to the system implementation (and not to some artificial state-machine). This implies that any integration scenario (from outside the model) that stimulates a response in the system also stimulates the same response in the model. This means that we can create small models of sub-systems of the full system that still respond to integration scenarios. These small models are easily verifiable, and once we have confidence that the small models work as we expect, they can seamlessly be merged into larger, more complicated models that we would still expect to work. Again, very fancy, but let me continue the example and the benefits should become clear:

Keeping in mind our original sales order model, we would now also like to model a sales quote document. This model would look almost identical to our first model, but everything would revolve around quotes. The one thing that is different is that a quote cannot be posted; it needs to be converted to an order first.

Were we again to use Spec Explorer, we could use a new set of variables to track sales quote documents and the number of lines on each in a new model. But what happens now when we convert the sales quote to an order? How do we make that work with the sales order model? The conversion would never register in the sales order model, because the models know nothing of each other. We would have to bring the functionality from the sales order model into the sales quote model (or make a third model), and explicitly enforce the creation of a sales order when a quote is converted. But there is no good way to do that – we are forced to either duplicate the sales order model or couple the two models together. If we duplicate, we have redundant code; if we couple, the models become dependent on each other, and we can no longer change one without risking breaking the other.

However, with our new approach, we can simply ignore this problem! We can just model the sales quote as if it were independent of the sales order, resulting in another state-query that counts sales quotes and the number of lines on them. In a stand-alone sales quote model, what would happen when the model tries to convert a quote to an order? Even though the system under test will convert the quote to an order, the quote would simply appear to vanish – the model is not tracking sales orders, so it will not notice the appearance of a new sales order in the system. Conceptually the models would be disjoint:

However, were we to merge the state queries of both models into a new model, this new model would notice the appearance of a new sales order:

It gets even better: not only will we detect the appearance of a sales order, the sales order sub-model will not know the difference between a sales order generated by itself (the sales order model) and one converted from a quote. This leads to the automatic generation of integration scenarios (the green path in the picture) between models.
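As a sketch of what “merging” can mean in practice (my own minimal representation of a model as a bundle of state-queries, actions, and validation checks – not the patented implementation), the merge is just a union; the integration path emerges because the merged state-queries now observe both document types:

from dataclasses import dataclass, field

@dataclass
class StateQueryModel:
    state_queries: list = field(default_factory=list)  # SQL strings
    actions: list = field(default_factory=list)        # callables that drive the system under test
    checks: list = field(default_factory=list)         # predicates over the query results

def merge(*models):
    """Merging is a plain union; no explicit quote-to-order transition is modelled."""
    merged = StateQueryModel()
    for m in models:
        merged.state_queries += m.state_queries
        merged.actions += m.actions
        merged.checks += m.checks
    return merged

# e.g. full_model = merge(sales_order_model, sales_quote_model)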

The merged model would automatically detect and generate test cases for this integration scenario: creating a sales quote, adding lines to the quote, converting the quote to an order and posting this newly generated order – without us ever telling it anything about integration scenarios between sales quotes and orders. It simply falls out of the new approach for free.

Caveat

This is all nice and dandy, but one thing to be wary of is that the model is coupled to the system under test. In traditional Model-Based Testing this is not the case. When we couple the model and the system, we cannot distinguish between bugs in the model and bugs in the system under test. Furthermore, if we change the behavior of the system, the model will follow suit. In short:

The model reveals how the system behaves, not whether that behavior is correct or not.

However, it is relatively easy to visually inspect a small model and determine whether its behavior is correct. The strength of state-query models is that we inspect each of the individual models in isolation and validate conformance; then we bring the small models together into one (much) larger merged model, which is validated against all the validation rules of the individual models.
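Continuing the same hypothetical representation from above, validating the merged model against all of the individual models’ rules could be as simple as running the union of checks after every action:

# Sketch: after each executed action, evaluate the merged state-queries and
# run every small model's validation checks against the combined state.
def validate(cursor, merged_model, evaluate_state):
    state = evaluate_state(cursor, merged_model.state_queries)
    failures = [check.__name__ for check in merged_model.checks if not check(state)]
    if failures:
        raise AssertionError("Model validation failed: " + ", ".join(failures))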

In the coming articles we will drill down further into this caveat and provide some mechanisms for decoupling your Model-Based Tests from the system under test.

Conclusion

In this article, we presented a novel approach to Model-Based Testing in which we represent the state of the system under test through the results of a set of state-queries executed against the data source backing the system. Using this approach we can construct many small models that validate conformance of system components and then seamlessly merge these into a larger model that covers bigger parts of the system (or even the entire system).

1 U.S. Patent Application No. 13/571717 (AUTOMATIC VERIFICATION OF DATA SOURCES)
