Tuesday, September 6, 2011

Multi-Threaded Modeling – Barrier Synchronization

Okay, all my previous posts have been based on static system models, where rules and actions are static functions in classes. A lot can be accomplished with this, and it is also possible to wrap instance based applications inside static functions. However, Spec Explorer allows us to work with actual instance based models.

So what does this mean for us? Well, it means we are allowed to instantiate any number of objects of some type and Spec Explorer can reference each of these objects individually. For example, you could write a class that represents a complex number, and have Spec Explorer create two instances of this object and then add them together. This would be a basic example, but instead let’s jump to something more tricky – multi-threading!

You can imagine that instance based modeling and multi-threading are closely related. But there are some key issues one must understand first. Instance based modeling is not multi-threaded, it is exactly as sequential as any static model. Spec Explorer will not create a new thread for you when you instantiate an object. Any interaction by Spec Explorer will be from the “main”-thread, you have to handle communication with your threads.

Barrier synchronization
The example I have picked for this post is a barrier synchronization implementation. Barrier synchronization is a pattern for worker threads in multi-threaded application. With barrier synchronization each thread must wait for its co-workers to complete their cycle of work before the next cycle starts. Denoting the number of cycles completed by thread i as ci, this can be stated by the following invariant:
          
Essentially each thread reaches a “barrier” upon completion of its work. This “barrier” is only lowered once all threads have reached it. Upon release of the “barrier” the worker threads continues their work.

Implementation of barrier synchronization
First of all, we capture the threading in a class

    public class BarrierThread
    {
        ...

        Thread thread;
        bool abortOnSynchronize = false;

        volatile int workCycles = 0;
        volatile bool abort = false;

        public BarrierThread()
        {
            thread = new Thread(() => ThreadMain());
            ...
        }

        public void Start()
        {
            thread.Start();
        }


        private void ThreadMain()
        {
            do
            {
                ...
            } while (!abort);
        }

        public int Synchronize()
        {
            ...
        }
    }

There are a couple of things to point out here. I’ve cut away the tricky implementation details at first, but essentially the constructor spawns a new thread which can be started with a call. This thread goes into a loop until aborted. Notice there is a couple of variables that are declared volatile to avoid compiler optimization around these – this is important because they are shared between the main thread and the worker threads!

The difficult parts of the implementation are the thread body, synchronization and abortion, but I’m not going into detail with that right now, you can refer to the sample [1].

Modeling barrier synchronization
Before I go on to describe the model a word of warning is in place. The config.cord file is very different from that of a static model, but I don’t want to dwell on that – I may do a follow up post on this if I find the time for it.

For my implementation the threads are modeled with the following interface:
        enum ThreadState { Unstarted, Started, Aborting, Aborted };

        // Starts the worker thread
        [Rule(Action = "this.Start()")]
        void Start()

        // Aborts the worker thread upon next synchronization
        [Rule(Action = "this.Abort()")]
        void Abort()

        // Barrier synchronization of all worker threads, returns the current worker thread cycle count
        [Rule(Action = "this.Synchronize()/result")]
        int Synchronize()

The ThreadState enum enumerates the states our threads can be in from the model perspective. The Start and Abort rules just change current thread state. Threads that are aborted will be set to an intermediate state for actual abortion upon next synchronization.

It should be noted that each rule in the model applies to any instance of our class. This means that the model exploration will call Start, Abort and Synchronize on any instance. This is important to understand, because if all threads jump into the synchronization rule they will hang waiting on each other. The implementation details of the synchronization rule are:

        // Barrier synchronization of all worker threads, returns the current worker thread cycle count
        [Rule(Action = "this.Synchronize()/result")]
        int Synchronize()
        {
            Condition.IsTrue(synchronizations++ < 3);
            for(int i = 0; i < threads.Count; i++) {
                Condition.IsTrue(threads[i] != ThreadState.Unstarted);
                Condition.IsTrue(threads[i] != ThreadState.Aborted);
                if (threads[i] == ThreadState.Aborting)
                    threads[i] = ThreadState.Aborted;
            }

            return synchronizations;
        }

First we are limiting the number of working cycles to perform to 3. Notice that synchronization is not allowed before all worker threads are started or aborting, this is essential, because otherwise the main thread will be hanging unable to start all workers. This is an example of where we need to remember that model exploration is sequential.

This is all there is to the model. With two threads exploration yields:
One problem with the model/implementation is that you can abort one thread while the other is still active. The implementation does not accurately track that the number of threads decreased, as such this is not allowed and our implementation would run into deadlock in case this happens. Also, if one or more threads are not aborted upon test completion the test will hang because remaining threads have not terminated.

The way we handle this is simply to state that states are only acceptable if all threads have been aborted in the end:

        // Accept only end states where all threads are aborted
        [AcceptingStateCondition]
        bool Accepting()
        {
            for (int i = 0; i < threads.Count; i++)
                if (threads[i] != ThreadState.Aborted)
                    return false;

            return true;
        }

When generating test cases we can filter to only end-states that are accepting states.

Test generation
The test generation is filtered to only accepting end-states using:

machine ThreadTestSuite() : Main where TestEnabled = true
{
    construct test cases where strategy = "longtests" for
        construct accepting paths for Sliced2ThreadModelProgram()
}

The key word here is the “accepting paths” construct. This is pretty nice, because we can keep the model showing that it has this “flaw”, while still generating only tests that we know satisfy our conditions for the implementation (i.e. abort all threads before finishing).

The generated test cases are:


Running concurrent tests
When running concurrent tests, one type of bugs we are looking for is deadlocks. The problem is of course that in this case the test case will hang until it times out. It is a good idea to decrease the test timeout to a small value. In my case I’m decreasing it to 10 seconds as this is sufficient time for my tests to complete.

Doing so requires that you add a testsettings file. This is done from right-clicking the solution and adding a new item. It took me a while to figure this out, as I was trying to add it to the test project at first, but this is not possible.

Conclusion
Although instance based modeling is the way to go when testing concurrent systems, care has to be put into designing the adapter layer. From my experience developing this sample model you will run into situations where the generated tests deadlocks. In many cases this is due to implementation issues, but getting the model right is also tricky. Fortunately a lot of headaches can be avoided by filtering invalid states out, like I did for cases where some threads were left running. In this respect model based testing is actually a very powerful tool for writing concurrent test scenarios.

No comments:

Post a Comment