error handling

Now that we have a basic application running, there is something important we need to add. The first release of the testbed application makes a mistake that is common to much commercial LabVIEW code: It assumes that nothing will ever go wrong. To address that failing, we are going to look at a couple of refinements that will report and record errors, as well as provide the hooks for later refinement of the error handling process.

Startup Errors

The acquisition process right now simply passes a random number to the display process. Consequently, the acquisition needs no initialization, and so presents no real opportunities for an error to occur. However, if you decide to use this testbed as a basis for a real-world application, you will discover that most kinds of acquisition require some sort of initialization that needs to occur when the process starts.

Now the most obvious place to put this initialization is before the event registrations. Unfortunately, this placement can result in a situation where an error in the early parts of the initialization keeps your events from registering, so you have no way of stopping the process, or even reporting that an error has occurred.

You could, of course, put the initialization that can generate errors after the event registrations, and so preserve the functionality of your events. As it turns out though, that move really isn’t very helpful because you still don’t have any way of reporting the error, stopping the process or trying to reinitialize the code. Remember, it is perfectly possible to have transient problems, such as an instrument that was turned off or a momentary network outage.

What I want to show you is an easy way to leverage something you already have in the code to handle initialization errors. I’ll first show you an easy technique that simply stops the process and closes it. This technique is useful for many situations. Then I’ll discuss how to build on that technique to implement a more complex approach that supports retrying the initialization process. Both techniques are based on the fact that all event structures can have a timeout event.

Stop Process on Errors

Implementing the first technique is easy because the event structure already has a timeout case in it. All we have to do is add a little functionality to it. Since the timeout right now is fixed at 1000 msec, the first thing we need to do is provide a way of changing the value being passed to the timeout terminal. The idea is that we will carry the timeout value in a shift register that is initialized with a zero. This setting means that the first time through the loop, the timeout will fire immediately. Inside the timeout event, if the timeout value for the current iteration is zero, the code knows it just finished initializing and so does nothing but check the state of the error cluster. If it is showing an error, the process quits. This case also sets the timeout value shift register to the normal 1000 msec value so if there are no errors, the loop will continue with its normal operation. This is what the check for initialization errors looks like:

The default case (not shown) for the inner case structure contains the code that is currently in the timeout event case.
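Since a LabVIEW block diagram can't be reproduced as text, here is a rough Python analogue of the same idea. It is only a sketch: the function and parameter names are hypothetical, the error cluster is reduced to an optional string, and the event structure is modeled as a loop in which only the timeout case ever fires.

```python
from typing import Callable, Optional

NORMAL_TIMEOUT_MS = 1000  # the normal timeout value mentioned in the text


def run_process(initialize: Callable[[], Optional[str]],
                acquire: Callable[[], None],
                max_iterations: int = 3) -> str:
    """Text analogue of the process loop (all names are hypothetical)."""
    error = initialize()   # assumed to run before the loop; None means "no error"
    timeout_ms = 0         # the "shift register", initialized with zero
    for _ in range(max_iterations):
        # A real event structure would wait up to timeout_ms for an event here.
        # This sketch models only the timeout case firing.
        if timeout_ms == 0:
            # First pass: do nothing but check the result of initialization.
            if error is not None:
                return f"stopped: {error}"
            timeout_ms = NORMAL_TIMEOUT_MS  # switch to normal operation
        else:
            acquire()      # the code that was already in the timeout case
    return "finished"
```

Calling `run_process` with an `initialize` that returns an error string ends the loop on the first pass, before `acquire` is ever called.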

Retry on Error

I haven’t implemented the complete retry technique since it tends to be very specific to what a given process is doing, but based on the previous simple example it isn’t hard to visualize some of the tweaks the code would need.

To begin with, in the simple example there were only two possible states for the timeout case to address:

The VI has been initialized
The VI has not been initialized

Because there are only two possibilities, it is logical — and valid — to essentially “encode” the knowledge about whether the VI is initialized in the timeout value, where a zero means it is not initialized and any other value means it is initialized.

Now however, the situation is more complex because, in this case, there can be two reasons why the VI is not initialized. It could be because the VI is just starting, or it could be because the last attempt at initialization failed. Additionally, if a previous initialization attempt has failed, you might not want to have the code sit there forever retrying, so the code needs to know how many times it has failed. Finally, if you are going to retry the initialization at some interval, you need to be able to set that interval freely, which you can't do if the timeout value also has to encode the initialization state.

Clearly, we are going to need more than one value to represent all that information, so I would recommend two numbers (in separate shift registers). One would be the timeout value and it would have no meaning other than how long to wait for the timeout. The second numeric value would encode the retry state like so:

..-1 = The VI has been initialized
0 = All retry attempts have been exhausted
1.. = The number of retry attempts remaining

It is this new second value that would now control the inner case structure such that the acquisition code would go in the ..-1 case and pass both shift register values through unmodified. The 0 case would stop the loop, and the 1.. case would attempt to initialize the VI. If the attempt fails, the logic would set the timeout to the retry timeout value and decrement the value controlling the case structure. If the attempt is successful, the logic would set the timeout to the acquisition timeout value, and set the control value to -1.
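To make the three cases concrete, here is how the logic just described might look in Python. Treat it as a sketch under assumptions, not the testbed's code: the names, the 5000 msec retry interval, and the boolean returned by `try_init` are all hypothetical.

```python
from typing import Callable, Tuple

ACQUIRE_TIMEOUT_MS = 1000  # acquisition timeout (value from the earlier example)
RETRY_TIMEOUT_MS = 5000    # hypothetical retry interval
INITIALIZED = -1           # control value meaning "the VI has been initialized"


def timeout_case(timeout_ms: int, retries: int,
                 try_init: Callable[[], bool],
                 acquire: Callable[[], None]) -> Tuple[int, int, bool]:
    """One pass through the timeout case.

    Returns the new (timeout_ms, retries, stop) values.
    """
    if retries <= -1:   # the "..-1" case: initialized, so acquire
        acquire()
        return timeout_ms, retries, False
    if retries == 0:    # the "0" case: retries exhausted, stop the loop
        return timeout_ms, retries, True
    # The "1.." case: attempt to initialize.
    if try_init():
        return ACQUIRE_TIMEOUT_MS, INITIALIZED, False
    return RETRY_TIMEOUT_MS, retries - 1, False


def run(try_init: Callable[[], bool], acquire: Callable[[], None],
        max_retries: int = 3, max_passes: int = 10) -> str:
    """Drive the timeout case with two 'shift registers'."""
    timeout_ms, retries = 0, max_retries
    for _ in range(max_passes):
        timeout_ms, retries, stop = timeout_case(timeout_ms, retries,
                                                 try_init, acquire)
        if stop:
            return "gave up"
    return "running"
```

Keeping the timeout and the retry state in separate values is what lets the retry interval be anything you like without it being mistaken for a state.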

And remember, this example is but one of many possible variations on the theme.

Displaying and Recording Errors

Having collected errors, your application also needs a mechanism for displaying them to the operator and/or recording their occurrence. I have seen approaches that try to decentralize this functionality but for many reasons the results are often less than ideal. Chief among these reasons is the additional errors that can result from trying to access a common resource such as a file or database from multiple locations within the code. The solution to this conundrum is to create a separate process, the sole purpose of which is to report and display errors. That is what I am presenting here:

As you can see, the code is very simple. When something fires the error event, the parameters needed to make the error handler VI work are read from the system configuration information. This information consists of what the VI should do with the error, and the labels for the dialog box buttons. The error handler VI itself (Show Error.vi) is a simplified version of the error handler that ships with LabVIEW. For safety reasons, the VI no longer has the ability to abort the running application.

The process’ event structure only services three events:

The error event (shown) — <sys:System Error>:User Event
This event is fired from other processes when they detect an error in their operation. The first VI it calls can either display the error to the operator in a dialog box, save it to file or both. The output from the VI is an enumeration that specifies whether to continue execution of the application, or command a shutdown. If the reporting VI presents the error in a dialog box, this output is set in accordance with the user’s selection. If the reporting VI is only recording the error, the output is always Continue.
The application stop event — <sys:Stop Application>:User Event
This event fires whenever the application is stopping. The only thing unique about this process’s implementation of the event handler is that before stopping itself, it pauses and waits 5 seconds for all the other processes to at least get started on their shutdown operations.
The Stop button value change event
Operationally, there is no real reason for this button since the front panel will, after all, be closed. Its main use is to assist in troubleshooting or preliminary testing where the formal shutdown logic may not be completed. It simply fires the shutdown event.
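For readers who think better in text than in block diagrams, the three events above can be sketched as a single consumer loop in Python. Everything here is an assumption made for illustration: the event names, the `show_error` callback standing in for Show Error.vi, the "Stop"/"Continue" strings standing in for its enumeration output, and the choice to fire the stop event when the result is "Stop".

```python
import queue
import time
from typing import Callable, List, Tuple

Event = Tuple[str, object]  # (event name, payload); names are hypothetical


def error_handler_process(events: "queue.Queue[Event]",
                          show_error: Callable[[str], str],
                          shutdown_delay_s: float = 5.0,
                          sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Service the three events until the application stops."""
    log: List[str] = []
    while True:
        name, data = events.get()
        if name == "system_error":
            log.append(f"error: {data}")   # stands in for recording to file
            # show_error returns "Continue" or "Stop" (assumed encoding).
            if show_error(str(data)) == "Stop":
                events.put(("stop_application", None))
        elif name == "stop_button":
            # The button simply fires the shutdown event.
            events.put(("stop_application", None))
        elif name == "stop_application":
            # Give the other processes time to start shutting down.
            sleep(shutdown_delay_s)
            log.append("stopped")
            return log
```

Because every error funnels through this one loop, only one place in the program ever touches the log, which is the point of centralizing the handler.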

But how do you pass errors to this error handler process? Like all events, the Handle Errors event has a Generate Event.vi. In this case, however, the operation is tweaked slightly.

Because we only want the event generated when an error occurs, the code is structured such that the event is only fired if there is an incoming error, in which case the data for the event is the incoming error cluster. The output error cluster, however, is the result of the event generation. If there is no incoming error, the error cluster is passed through unmodified.
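In text form, the behavior of this tweaked event generator amounts to a small conditional. The Python below is a sketch only; the error cluster is reduced to an optional string, and `fire_event` is a hypothetical stand-in for whatever actually generates the user event.

```python
from typing import Callable, Optional


def generate_error_event(error_in: Optional[str],
                         fire_event: Callable[[str], Optional[str]]
                         ) -> Optional[str]:
    """Fire the error event only when there is an incoming error.

    None means "no error". fire_event returns None on success, or a
    description of what went wrong while generating the event.
    """
    if error_in is None:
        return error_in           # no error: pass through unmodified
    # The incoming error becomes the event data; the output reports
    # only whether generating the event itself succeeded.
    return fire_event(error_in)
```

One consequence of this design is that once the error has been handed off successfully, the caller's error wire comes back clean, so the same error is not reported again on the next iteration.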

The other departure from what is typical for this VI is that, because it is likely to be used in multiple locations, I have set its execution to be shared clone reentrant. This causes LabVIEW to maintain a pool of clones that are handed out to callers as needed. This setting is useful in situations where you don’t want instances of the code blocking each other, but don’t need a unique memory space for each instance.

The event generator is placed in the acquisition and display VIs as the last thing to execute before the loop repeats. The testbed code with the two modifications discussed in this post is available from:

http://svn.notatamelion.com/blogProject/testbed application/Tags/Release 2

You should know the drill for grabbing a copy of it by now.

Until next time…

Mike…

Not a Tame Lion

"Safe" isn't always an option…

Adding Basic Error Handling
