Plotting Large Datasets Without Waiting Forever

Many years ago in his landmark 1977 book, Exploratory Data Analysis, John Tukey made a comment that was pretty radical in his pre-PC, pre-Excel, pre-most-everything-we-think-we-need-to-do-analysis time, but which is today pretty common-sense:

“…there is never a good reason to not look at a plot of your data.”

Any test engineer worth his or her salt will admit that we often do our best work when we heed that truly sage advice. Looking at a plot allows us to see the data’s basic “shape” and begin to develop something of an intuitive understanding of relationships that might exist in the data. The problem is that in a world where “Big Data” is the catchphrase of the day, that simple advice can be difficult to put into practice. It has become increasingly common to see people on the user forum wondering how to effectively view and work with datasets that incorporate hundreds, thousands, or even hundreds of thousands of datapoints.

Seeing the Problem

To assist us in visualizing some of the problems inherent in handling large datasets, I have put together a test dataset consisting of 3 traces, each with over 19,000 datapoints. Now when I just read the data and plot it, this is what I get:

Entire dataset with white bars

Clearly there is an issue here – I mean what is up with the wide vertical bars? But there is an even larger problem. Let’s say I change the size of the plot by making it just 9 pixels wider.

But now the bars have moved (9px)

Now what is going on? The white bars have changed and if you look at the peaks in the data carefully, some of them appear to have moved or even disappeared. In order to get your head wrapped around what is happening, consider what LabVIEW is having to do behind the scenes. I mentioned that the dataset had over 19,000 datapoints (19,555, to be exact) but the active plot area of the display is only 350 pixels wide. If you do the math, you discover that to generate this plot, each pixel has to represent about 56 datapoints. The problem of course is that you can’t subdivide a pixel into 56 pieces. So what is LabVIEW to do?

Well, it does what any graphing package does when it is confronted with this challenge: it decimates the data. In other words, it takes 56-datapoint chunks of the data, performs some sort of statistical operation on each chunk (min, max, mean, etc.) and then uses the resulting summary value to represent that chunk of data on the plot. There are several potential problems with this way of handling the situation, but they typically don’t become an issue unless the dataset being plotted is very large relative to the size of the graph. This, for example, is why the data on the graph appeared to change as a function of the size of the plot area. As the plot area changes (even slightly) the chunking changes, so the data appears to change as well – you can think of it as a sort of visual aliasing.
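LabVIEW doesn’t document exactly which summary operation its graphs apply, but the general idea of chunked decimation can be sketched in a few lines of Python (a minimal illustration of the concept, not what LabVIEW actually does internally):

```python
def decimate(data, plot_width, summarize=max):
    """Reduce a dataset to at most plot_width points by chunking.

    Each chunk is collapsed to a single summary value; min, max,
    and mean are all plausible choices for `summarize`.
    """
    chunk = -(-len(data) // plot_width)   # ceiling division: points per pixel
    return [summarize(data[i:i + chunk]) for i in range(0, len(data), chunk)]

trace = list(range(19555))                # stand-in for one 19,555-point trace
plot = decimate(trace, 350)               # one summary value per pixel column
```

Change the plot width by even a few pixels and the chunk boundaries shift, which is exactly the “visual aliasing” just described.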

More subtle problems have to do with the way the graphing routines “summarize” the data chunks. Depending upon the shape of your dataset, the operations I mentioned earlier can give dramatically different output, and to make matters worse, you have no idea which techniques the graphing functions are using. But even if you can live with the visual effects, there are good reasons to take action to address the issue.

Finally, in order to plot these huge datasets you have to be carrying them around inside your program. Consequently, rather than having just one copy of these monsters, you can have several – perhaps dozens – it all depends on how your code is written. From this discussion we can then see the two imperatives for our solution:

  1. The approach must minimize the number of copies that LabVIEW has to make of the dataset.
  2. It must reduce the number of datapoints that actually need to be plotted.

Let’s start by looking at the data management aspect of the problem, remembering of course that these two issues are inextricably linked together.

Low-Overhead Storage

Decades ago, people in the nascent computer-science discipline realized that if you had a value, like an array, that consisted of multiple items, the most efficient way of making it available throughout your code was to store it in one location in memory and give the code that needed to access it a “pointer” that served as a reference to that value. Originally this mechanism was pretty primitive with the pointer often consisting of simply the value’s starting address in RAM. In addition, there was no real way of preventing race conditions or security intrusions because there was no way of controlling access to the data. It would be nice to think that we have learned the errors of our ways and fixed all the holes, but such is not always the case. Remember the “Heartbleed” bug panic from last year?

The good news is that LabVIEW does not suffer from the same problems because, while we have at our disposal a mechanism that fills the same role as the primitive pointer, it lacks those problems. I am talking about the Data Value Reference, or DVR. It meets the low-overhead storage mandate by accessing the data through a reference that is only 4 bytes long. The DVR is also secure because the buffer that it creates is strongly typed, meaning that you can’t just store anything in it or read whatever you want from it. The data going in and coming out must match the definition of the data structure that was used when the DVR was defined. Finally, the DVR removes problems resulting from simultaneous access to the same resource by defining a new structure that automatically serializes access on a first-come, first-served basis. So the first thing we need to do is get our data into the DVR, and here’s some code to do just that.
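Before looking at that code, a quick aside: the DVR has no direct textual equivalent, but conceptually it behaves like a small reference to a single strongly typed value whose access is serialized by a lock. A rough Python analogy (the class and method names here are mine, purely for illustration):

```python
import threading

class DataValueRef:
    """Toy analog of a DVR: one shared value, strictly serialized access."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()     # access is first-come, first-served

    def apply(self, func):
        """Operate on the stored value under the lock, a bit like code
        inside an inplace structure; returns the updated value."""
        with self._lock:
            self._value = func(self._value)
            return self._value

ref = DataValueRef([])                    # callers pass around this small reference,
ref.apply(lambda data: data + [1, 2, 3])  # never a copy of the data itself
```

The real DVR adds the strict type checking described above; the lock is the part this sketch captures. Now, on to the LabVIEW code itself.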

Load Big Data

The VI starts by reading a binary file containing the data which, to simplify this example, is already formatted correctly for how we are going to use it. The resulting array drives a box called an inplace structure that guarantees there will be no other accesses to the DVR occurring in parallel with this one. However, the structure does something else too: inplace structures operate something like compiler directives, telling the LabVIEW compiler that it’s OK to attempt additional optimizations that would not otherwise be safe to make. For example, they allow LabVIEW to operate on the in-place data without making the copies that the compiler might otherwise need to make.

The other thing to note is that funny-looking function in the middle of the inner inplace structure. It’s called Swap Values and its help description really doesn’t do it justice. If all you did was read the context help you might assume that it is simply some sort of switch for routing signals around, and stifling a yawn, go on to consider matters that you think look more exciting. To see why you should consider this function very exciting, we need to look under LabVIEW’s hood.

To store data internally, LabVIEW uses memory data buffers. In fact, much of what we think of as “dataflow” consists of the manipulation of those buffers. Now when LabVIEW stores a complex datatype like a cluster (which is what the DVR in this case is holding) it uses a combination of techniques. For simple fixed-size data like numerics or booleans, LabVIEW simply includes the data values directly in the cluster’s memory space. However, it needs a different approach when storing data values like arrays or strings that can vary in length. When a cluster includes an item that can change in size, the item is stored outside the cluster in its own memory buffer and the cluster only holds a reference to that buffer. That way, if the item changes in size it can do so without affecting the memory allocation of the cluster containing it.

However this explanation also reveals why the Swap Values node is so important. Let’s look at this code from the standpoint of buffers. Coming into the inner inplace structure there are two buffers allocated that are holding arrays: One contains the data I read from the file, and one the (empty) array that is contained in the cluster that is the contents of the DVR. Now there are two ways that we could initialize that array. The most obvious one is to leave the unbundle (left) side of the cluster inplace structure unwired and wire the array containing the data directly to the bundle (right) side of the cluster inplace structure. While this would work, coding it in that way would result in LabVIEW needing to copy the data contained in the incoming array’s buffer to the array buffer associated with the cluster – and the larger the dataset is, the longer this copy can take.

Now consider what happens when Swap Values is used. Even though the node resides inside an inplace structure, it would seem that you can’t replace an empty array with an array containing thousands of datapoints “in place”. Well, actually you can. The key point to remember is that at a very low level, the clusters don’t actually contain the arrays; rather, they hold references that point to the arrays that are associated with them. So what Swap Values does is leave the two arrays where they are and simply swap the references that the clusters contain. Thanks to this optimization, populating this cluster with data will take the exact same amount of time whether the input data contains 2 datapoints or 200,000 datapoints, because the only thing that is really being moved is a pair of 4-byte memory buffer references.
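The same trick is easy to demonstrate in any language with reference semantics. In this Python sketch, “swapping” a huge list with an empty one exchanges two references; no element is ever copied:

```python
big = list(range(200_000))   # the array read from the file
empty = []                   # the (empty) array living inside the cluster

# Swap Values: exchange the two references, not the data behind them.
# The cost is identical whether big holds 2 elements or 200,000.
empty, big = big, empty

# The cluster's array (empty) now "owns" all 200,000 elements, and the
# input buffer (big) holds the empty array. Nothing was copied.
```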

Getting Data Out

So we have gotten our data into the DVR as efficiently as we can, but if this storage is going to be of any use, there clearly needs to be a way to get data out of it as well. However, here we face the issue of plotting data that is too large: at the same time we are pulling it out, we also need to reduce, or decimate, it to more closely match the size of the available graphing area. To meet those dual requirements I created this VI.

Read and Decimate Big Data

At first this code might seem intimidating, but if you take it step-by-step and analyze what it’s doing, it isn’t really so very different from the example we looked at for initializing the data in the DVR. Starting at the left side, the code unbundles the data array from the DVR and passes it into a loop that will execute three times – once for each plot in the dataset. The first point of optimization is in how this loop operates. Note the node with the “P” in it. The presence of this node means that the for loop is set for parallel operation. There are many situations where, even though you specify a for loop, there is no logical reason that the iterations have to operate sequentially. When LabVIEW encounters a “parallelized” loop, the optimizer essentially flattens the loop out, creates the necessary parallel code to execute each iteration simultaneously, and then reassembles the output data in the correct order. To find out if a loop is parallelizable, there is an option under Tools>>Profile called Find Parallelizable Loops…. This operation opens a dialog that allows you to identify the loops that can and cannot be run in parallel mode.
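That behavior – concurrent iterations whose results are reassembled in the original order – has a close analog in Python’s concurrent.futures. A sketch, under the assumption that each iteration decimates one trace:

```python
from concurrent.futures import ThreadPoolExecutor

def decimate_trace(trace, width=350):
    """Summarize one trace down to at most `width` points (max per chunk)."""
    chunk = -(-len(trace) // width)       # ceiling division
    return [max(trace[i:i + chunk]) for i in range(0, len(trace), chunk)]

traces = [list(range(n)) for n in (19555, 18000, 20000)]   # three plots

# Like a parallelized for loop: the three iterations run concurrently,
# yet map() hands back the results in the original iteration order.
with ThreadPoolExecutor() as pool:
    decimated = list(pool.map(decimate_trace, traces))
```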

Inside the loop, the array drives an inplace structure that indexes out one element, and the resulting cluster feeds a second inplace structure that unbundles the two items in the cluster. The processing of this data occurs in two distinct steps. First the Start and Length inputs produce a subset of the total dataset representing the portion of the data that is to be displayed. Because this operation causes LabVIEW to copy the selected data into a new memory buffer, the code passes the resulting arrays into another inplace structure to ensure that the subset will also be manipulated inplace.

The code inside this innermost inplace structure performs the second half of the processing – the decimation that reduces the size of the data being plotted. Note that if the selected portion of the dataset is already smaller than the width of the plot area, the following code is bypassed. The first step in the decimation process is to reshape the 1D array into a 2D array where each row contains one chunk of data to be statistically summarized. To obtain the final X values, the code takes the first value of each chunk, while the final Y values are the maximum Y for each chunk. Note that this processing occurs in another parallelized loop that auto-indexes the output arrays, which are swapped into the output dataset as they work their way back out through the inplace structures.
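In NumPy terms, the reshape-and-summarize step looks like this (a simplified sketch that just drops the ragged final chunk; the real VI has to handle that remainder too):

```python
import numpy as np

def decimate_xy(x, y, chunk):
    """Reshape 1D arrays into rows of `chunk` points, then summarize each
    row: X keeps the chunk's first value, Y keeps the chunk's maximum."""
    n = (len(y) // chunk) * chunk            # drop the ragged tail for brevity
    x2 = np.reshape(x[:n], (-1, chunk))
    y2 = np.reshape(y[:n], (-1, chunk))
    return x2[:, 0], np.nanmax(y2, axis=1)   # nanmax skips missing points

x = np.arange(12.0)
y = np.array([0, 1, 9, 2, np.nan, 3, 5, 4, 8, 7, 6, 0.5])
xd, yd = decimate_xy(x, y, 4)                # xd = [0, 4, 8]; yd = [9, 5, 8]
```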

Summarizing Options and Challenges

The real heart of this VI is the function that is being used to summarize the Y values for each chunk of data. Right now, I am using the function that returns the minimum and maximum values contained in the array. One of the advantages that it offers is that it deals well with datasets containing missing datapoints represented by the value NaN. This consideration is important because it is a common (and valuable) practice to represent missing datapoints using that value. Without the NaN datapoints, any graph will simply connect the dots on either side of the missing datapoints, resulting in a graph that visually misrepresents the data being presented. However, with the NaN values, the missing points are shown as breaks in the line (or gaps between bars), thus highlighting the missing data.

The statistical function I selected to summarize the data in the chunks simply returns the minimum and maximum values of the elements that are not NaN. However, most other analysis routines follow the basic rule that any calculation which has NaN as an operand will return an answer of NaN – which in this situation will not be really helpful. More often, what you will want is, for example, the average value of the datapoints that are present in the dataset chunk. If you want to use the chunk mean or median value to summarize a dataset that you know contains NaN values, you should include something like this before the statistical operation:

Filtering out NaN

Basically it works by first sorting the array to move any NaN values to the end of the array. It then looks for the first NaN and simply trims it off (along with anything after it). This works because a mean operation doesn’t care about data order, and the first thing a median function does is sort the data anyway.
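The sort-and-trim trick translates directly to NumPy, whose sort likewise pushes NaN values to the end of the array:

```python
import numpy as np

def drop_nan_by_sorting(data):
    """Sort (NaNs land at the end), then trim from the first NaN onward.
    Safe before a mean or median, since neither cares about data order."""
    s = np.sort(data)                        # np.sort places NaN values last
    nan_idx = np.flatnonzero(np.isnan(s))
    return s if nan_idx.size == 0 else s[:nan_idx[0]]

chunk = np.array([3.0, np.nan, 1.0, 2.0, np.nan])
clean = drop_nan_by_sorting(chunk)           # [1.0, 2.0, 3.0]
mean = float(np.mean(clean))                 # 2.0, instead of NaN
```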

Let’s See How it Works

When you run the top-level VI in the linked project, the graph that comes up will look a lot like the first image in this post, but minus the vertical white bars. As you make changes to the display that affect the X axis range, you will notice that the resulting image will zoom in on the data, showing ever greater levels of detail. Try manually typing in new X axis end points, or use the horizontal zoom tool on the graph palette to select a range of datapoints that you want to zoom in on.

Zoom in far enough and you will see why there were white bars on the original plot: There are a lot of missing datapoints. Using the default decimation resulted in wide white bars because the presence of the NaN values effectively hid dozens of real datapoints.

Plotting Large Datasets – Release 1
Toolbox – Release 11

Hopefully this discussion will give you something to think about, and experiment with.

The Big Tease

One of the things that developers often have to face is adding functionality to an existing VI without disrupting, or even modifying, what is already there. One way to accomplish this (seemingly impossible) task is to use what are sometimes called “drop-in” VIs. These routines are simply dropped down on an existing block diagram and they do what they do without interaction with the existing code. To demonstrate how this could work, next time we’ll get back to our test bed application and give it the ability to customize the font and size of the text that is on its various displays.

Until Next Time…
Mike…

More Than One Kind of Modularity

When learning something that you haven’t done before – like .NET – it’s not uncommon to go through a phase where you look at some of the code you wrote early on and cringe (or at least sigh deeply). The problem is that you are often not only learning a new interface or API, but also learning how to best use that interface or API. The cause of all the cringing and sighing is that as you learn more, you begin to realize that some of the assumptions and design decisions you made were misguided, if not flat-out wrong. If you look at the code we put together last time to help us learn about .NET in general, and the NotifyIcon assembly in particular, we see a gold-plated example of just such code. Although it clearly accomplished its original goal of demonstrating how to access .NET functionality and illustrating how the various objects can relate to one another, it is certainly not reusable – or maintainable, or extensible, or any of the other “-ables” that good software needs to be.

In fact, I created the code in that way so that this time we can take the lesson one step further to fix those shortcomings, and thus demonstrate how you can go about cleaning up code (your own or inherited) that is making you cringe or sigh. Remember, it is always worth your time to fix bad design. I can’t tell you how many times I have seen people struggling with bad decisions made years before. Rather than taking a bit of time to fix the root cause of their trouble, they continue to waste hours on project after project in order to work around the problem.

Ok, so where do we start?

Clearly this code would benefit from cleaning up and refactoring, but where and how should we start? Well, if you are working on an older code base, the question of where to start will not be a problem. You start where the most pain is. To put it another way, start with the things that cause you the biggest problems on a day-to-day basis.

This point, however, doesn’t mean that you should just sit around and wait for problems to arise. As you are working always be asking yourself if what you are doing has limitations, or embodies assumptions that might cause problems in the future.

The next thing to remember is that this work can, and should, be iterative. In other words you don’t have to fix everything at once. Start with the most egregious errors, and address the others as you have the opportunity. For example, if you see the code doing something stupid like using a string as a state variable, you can fix that quickly by replacing the strings with a typedef enumeration. I have even fixed some long-standing bugs in doing this replacement because it resolved places where states were subtly misspelled or contained extraneous spaces.

Finally, remember that the biggest payoffs, in terms of long-term benefit, come from improved modularity that corrects basic architectural problems. As we shall see in the following discussion, I include under this broad heading modularity in all its forms: modular functionality, modular logic and modular data.

Revisiting Modular Functionality

Modular functionality is the result of taking small reusable bits of code and encapsulating them in routines that simplify access, standardize the interface, or ensure proper usage. There are good examples of all these usages in the modified application code, starting with Create NotifyIcon.vi:

Create NotifyIcon VI

Your first thought might be to wonder why I bothered turning this functionality into a subVI. After all, it’s just one constructor node. Well, yes, that is true, but it’s also true that in order to create that one node you have to remember multiple steps and object names. Even though this subVI appears rather simple, if you consider what it would take to recreate it multiple times in the future, you realize that it actually encapsulates quite a bit of knowledge. Moreover, I want to point out that this is largely knowledge for which there is no overwhelming benefit to be gained from committing it to memory.

Next, let’s consider the question of standardizing interfaces. Our example in this case is a new subVI I created to handle the task of assigning an icon to the interface we are creating. I have named it Set NotifyIcon Icon.vi:

Set NotifyIcon Icon VI

You will remember from our previous discussion that this task involves passing a .NET object encapsulating the icon we wish to use to a property node for the NotifyIcon object. Originally, this property was combined with several others on a single node. A more flexible approach is to break up that functionality and standardize the interfaces of all the subVIs that will be setting NotifyIcon properties so they simply consist of an object reference and the data to be used to set the property, expressed in a standard LabVIEW datatype – in this case a path input. This approach also clearly simplifies access to the desired functionality.

Finally, there is the matter of ensuring proper usage. A good place to highlight that feature is in the last subVI that the application calls before quitting: Drop NotifyIcon.vi.

Drop NotifyIcon VI

You have probably been warned many times about the necessity of closing references that you open. However, when working with .NET objects, that action by itself is sometimes not sufficient to completely release all the system resources that the assembly had been using. Most of the time, if you don’t completely close out the assembly you may notice memory leaks or errors from attempting to access resources that still think they are busy. However, with the NotifyIcon assembly you will see a problem that is far more noticeable, and embarrassing. If you don’t call the Dispose method, your program will close and release all the memory it was using, but if you go to the System Tray you’ll still see your icon. In fact, you will be able to open its menu and even make selections – it just doesn’t do anything. Moreover, the only way to make it go away is to restart your computer.

Given the consequences of forgetting to include this method in your shutdown sequence, it is a good idea to make it an integral part of the code that you can’t forget to include.
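The same “make cleanup unskippable” pattern exists in text languages as well. In this Python sketch, a context manager guarantees that a (hypothetical) dispose method runs even if the code inside the block fails:

```python
class NotifyIconSession:
    """Make calling dispose() an unskippable part of shutdown."""
    def __init__(self, icon):
        self.icon = icon                 # any object with a dispose() method
    def __enter__(self):
        return self.icon
    def __exit__(self, exc_type, exc, tb):
        self.icon.dispose()              # runs on normal exit AND on error
        return False                     # do not swallow exceptions

class FakeIcon:                          # stand-in for the real .NET object
    disposed = False
    def dispose(self):
        self.disposed = True

icon = FakeIcon()
with NotifyIconSession(icon):
    pass                                 # the application would run here
# icon.disposed is now True: cleanup happened automatically
```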

Getting Down with Modular Logic

But as powerful as this technique is, there can still be situations where the basic concept of modularity needs to be expressed in a slightly different way. To see such a situation, let’s look at the structure that results from simply applying the previous form of modularity to the problem of building the menus that go with the icon.

Create ContextMenu VI

Comparing this diagram to the original one from last time, you can see that I have encapsulated the repetitive code that generated the MenuItem objects into dedicated subVIs. By any measure this change is a significant improvement: the code is cleaner, better organized, and far more readable. For example, it is pretty easy to visualize what menu items are on submenus. However, in cases such as this one, this improved readability can be a bit of a double-edged sword. To see what I mean, consider that for the structure of your code to allow you to visualize your menu organization, said organization must be hard-coded into the structure of the code. Consequently, changes to the menus will, as a matter of course, require modification to the fundamental structure of the code. If the justification for modularity includes concepts like flexibility and reusability, you have just missed the boat.

The solution to this situation is to realize that there is more than one flavor of modularity. In addition to modularizing specific functionality, you can also modularize the logic required to perform complex and changeable tasks (like building menus) that you don’t want to hard-code. If this seems like a strange idea to you, consider that computers spend most of their time using their generalized hardware to perform specialized tasks defined by lists of instructions called “programs”. The thing that makes this process work is a generalized bit of software called a “compiler” that turns the programs into data structures that the generalized hardware can use to perform specialized actions.

Carrying forward with this line of reasoning, what we need is a simple way of defining a menu structure that is external to our program, and a “menu compiler” that turns that definition into the MenuItem references that our program needs. So let’s build one…

Creating the Data for Our Menu Compiler

So what should this menu definition look like? Well, to answer that question we need to start with the data required to define a single MenuItem. As a minimum, every item in a menu has to have a name for display to the user, a tag to identify it, and a parent tag that says whether the item has a parent item (and if so, which item is its parent). In addition, we haven’t really talked about it, but the order of references in an array of menu items defines the order in which the items appear in the menu or submenu – so we need a way to specify each item’s menu position as well. Finally, because in the end the menu will consist of a list (array) of menu item references, it makes sense to also express the overall menu definition that we will eventually compile into that array of references as a list (and eventually an array).

But where should we store this list of menu item definitions? At least part of the answer to this question depends on who you want to be able to modify the menu, and the level of technical expertise that person has. For example, you could store this data in text files as INI keys, or as XML or JSON strings. These files have the advantage of being easy to generate and are readily accessible to anyone who has access to a text editor – of course, that is their major disadvantage as well. Databases, on the other hand, are more secure, but not as easy to access. For the purposes of this discussion, I’ll store the menu definitions in a JSON file because, when done properly, the whole issue of how to parse the data simply goes away.

To see what I mean, here is a nicely indented JSON file that describes the menu we have been using for our example NotifyIcon application:

[
	{
		"Menu Order":0,
		"Item Name":"Larry",
		"Item Tag":"Larry",
		"Parent Tag":"",
		"Enabled":true
	},{
		"Menu Order":1,
		"Item Name":"Moe",
		"Item Tag":"Moe",
		"Parent Tag":"",
		"Enabled":true
	},{
		"Menu Order":2,
		"Item Name":"The Other Stooge",
		"Item Tag":"The Other Stooge",
		"Parent Tag":"",
		"Enabled":true
	},{
		"Menu Order":3,
		"Item Name":"-",
		"Item Tag":"",
		"Parent Tag":"",
		"Enabled":true
	},{
		"Menu Order":4,
		"Item Name":"Quit",
		"Item Tag":"Quit",
		"Parent Tag":"",
		"Enabled":true
	},{
		"Menu Order":0,
		"Item Name":"Curley",
		"Item Tag":"Curley",
		"Parent Tag":"The Other Stooge",
		"Enabled":true
	},{
		"Menu Order":1,
		"Item Name":"Shep",
		"Item Tag":"Shep",
		"Parent Tag":"The Other Stooge",
		"Enabled":true
	},{
		"Menu Order":2,
		"Item Name":"Joe",
		"Item Tag":"Joe",
		"Parent Tag":"The Other Stooge",
		"Enabled":true
	}
]

And here is the LabVIEW code that will convert this string into a LabVIEW array (even if it isn’t nicely indented):

Read JSON String

JSON has a lot of advantages over techniques like XML: for starters, it’s easier to read and a lot more efficient. But the real reason I like using JSON is that it is so very convenient.
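For comparison, this is all it takes to read a (shortened) version of that menu definition into a list of records in Python; as in LabVIEW, the parsing problem simply goes away:

```python
import json

menu_json = """[
  {"Menu Order": 0, "Item Name": "Larry", "Item Tag": "Larry",
   "Parent Tag": "", "Enabled": true},
  {"Menu Order": 0, "Item Name": "Curley", "Item Tag": "Curley",
   "Parent Tag": "The Other Stooge", "Enabled": true}
]"""

items = json.loads(menu_json)       # one call and the string is fully parsed
print(items[1]["Parent Tag"])       # The Other Stooge
```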

Starting the Compilation

Now that we have our raw menu definition string read into LabVIEW and converted into a datatype that will simplify the next step in the processing, we need to ensure that the data is in the right order. To see why, we need to remember that the final data structure we are building is hierarchical, so the order in which we build it matters. For instance, “The Other Stooge” is a top-level menu item, but it is also a submenu so we can’t build it until we have references to all the menu items that are under it. Likewise, if one of the items under it is a submenu, we can’t build it until all its children are created.

So given the importance of order, we need to be careful how we handle the data because none of the available storage techniques can, on their own, guarantee proper ordering. The string formats can all be edited manually, and it’s not reasonable to expect people to always type in data in the right order. And even though databases can sort the results of queries, there isn’t enough information in the menu definition to allow them to do so.

The menu definition we created does have a numeric value that specifies the order of items in their respective menus and submenus. We don’t, however, yet have a way of telling the level at which the items reside relative to the overall menu structure. Logically we can see that “Larry” is a top-level menu item, and “Shep” is one level down, but we can’t yet determine that information programmatically. Still, the information we need is present in the data; it just needs to be massaged a bit. Here is the code for that task:

Ordering the Menu Items

As you can see, the process is basically pretty simple. I first rewrite the Item Tag value by appending the original Item Tag value to the colon-delimited list that starts with the Parent Tag. I then count the number of colons in the resulting string, and that is my Menu Level value. The exceptions to this processing are the top-level menu items, which are easy to identify due to their null parent tags. I simply force their Menu Level values to zero and replace the null string with a known value that will make the subsequent processing easier. The real magic, however, occurs after the loop stops. The code first sorts the array in ascending order and then reverses the array. Due to the way the 1D array sort works when operating on arrays of clusters, the array will be sorted first by Menu Level and then by Menu Order – the first two items in the cluster. This sorting, in concert with the array reversal, guarantees that the children of a submenu will always be processed before the submenu item itself.
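The same ordering pass can be sketched in Python using the same steps: build a colon-delimited path for each item, count the colons to get the menu level, then sort ascending and reverse. This is a one-level sketch (the real code rewrites tags cumulatively so deeper nesting works), and “[top-menu]” stands in for whatever known value the real code substitutes for the null parent tag:

```python
def order_menu_items(items):
    keyed = []
    for it in items:
        if it["Parent Tag"] == "":                   # top-level item
            level, parent = 0, "[top-menu]"
        else:                                        # build the colon path
            path = it["Parent Tag"] + ":" + it["Item Tag"]
            level, parent = path.count(":"), it["Parent Tag"]
        keyed.append((level, it["Menu Order"], it["Item Name"],
                      it["Item Tag"], parent))
    # Sort by (level, order), then reverse: children of a submenu now
    # always come out ahead of the submenu item itself.
    return sorted(keyed)[::-1]

items = [
    {"Menu Order": 2, "Item Name": "The Other Stooge",
     "Item Tag": "The Other Stooge", "Parent Tag": ""},
    {"Menu Order": 0, "Item Name": "Curley", "Item Tag": "Curley",
     "Parent Tag": "The Other Stooge"},
]
ordered = order_menu_items(items)    # Curley (level 1) precedes its parent
```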

Some of you may be wondering why we go to all this trouble. After all, couldn’t we just add a value to the menu definition data to hold the Menu Level? Yes, we could, but it’s not a good idea, and here’s why. In some areas of software development (like database development, for instance) the experts put a lot of store in reducing “redundancy” – which they define basically as storing the same piece of information in more than one place. The problem is that if you have redundant information, you have to decide how to respond when the two pieces of information that are supposed to be the same, aren’t. So let’s say we add a field to the menu definition for the menu level. Now we have the same piece of information stored in two different places: it is stored explicitly in the Menu Level value, while at the same time it is also stored implicitly in Parent Tag. By instead deriving the level from the parent tags when we need it, there is only ever one authoritative source for that information.

Generating the Menu Item “Code”

In order to turn this listing into the MenuItem references we need, we will pass this sorted and ordered array into a loop that will process one element at a time. And here it is:

Compiling the Menu-1

You can see that the loop carries two shift registers. The top SR holds a 1D array of strings that consists of the submenu tags that the loop has encountered so far. The other SR also carries a 1D array but each element in it is a cluster containing an array of MenuItem references associated with the submenu named in the corresponding element of the top SR.

As the screenshot shows, the first thing that happens in the loop is that the code checks to see if the indexed Item Tag is contained in the top SR. If the tag is missing from the array it means that the item is not a submenu, so the code uses its data to create a non-submenu MenuItem. In parallel with that operation, the code is also determining what to do with the reference that is being created by looking to see if the item’s Parent Tag exists in the top SR. If the item’s parent is also missing from the array, the code creates entries for it in both arrays. If the parent’s tag is found in the top SR, it means that one or more of the item’s sibling items has already been processed so code is executed to add the new MenuItem to the array of existing ones:

Compiling the Menu-2

Note that the new reference is added to the top of the array. The reason for this departure from the norm is that, due to the way the sorting works, the menu order is also reversed, and this logic puts the items on each submenu back in their correct order. Note also that during this processing the references associated with the menu items are also accumulated in a separate array that will be used to initialize the callbacks. Because the array indexing operation is conditional, only a MenuItem that is not a submenu will be included in this array.

Generating the Submenu “Code”

If the indexed Item Tag is found in the top SR, the item is a submenu and the MenuItem references needed to create its MenuItem should be in the array of references stored in the bottom SR.

Compiling the Menu-3

So the first thing the code does is delete the tag and its data from the two arrays (since they are no longer needed) and use the data thus obtained to create the submenu’s MenuItem. At the same time, the code is also checking to see whether the submenu’s parent exists in the top SR. As before, if the Parent Tag doesn’t exist in the array, the code creates an entry for it, and if it does…

Compiling the Menu-4

…adds the new MenuItem to the existing array – again at the top of the array. By the time this loop finishes, there should be only one element in each array. The only item left in the top SR should be “[top-menu]” and the bottom SR should be holding the references to the top-level menu items. That array of references is in turn used to create the ContextMenu object, which is then written to the NotifyIcon object’s ContextMenu property.
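The whole loop is easier to reason about in textual form, so here is a hedged Python sketch of the same two-shift-register logic: the `tags` list stands in for the top SR, `children` for the bottom one, and a `(tag, kids)` tuple stands in for a submenu's MenuItem. The child-before-parent, reversed ordering and the `"[top-menu]"` marker follow the description above; all names are my own.

```python
def compile_menu(items):
    """Fold a child-first, reverse-order list of (tag, parent_tag)
    pairs into a nested menu, mirroring the two-shift-register loop."""
    tags = []      # top SR: submenu tags encountered so far
    children = []  # bottom SR: accumulated child entries per tag

    def prepend(parent, entry):
        # Add entry to the FRONT of the parent's list — the sort
        # reversed menu order, so prepending restores it.
        if parent not in tags:
            tags.append(parent)          # parent not seen yet: create it
            children.append([entry])
        else:
            children[tags.index(parent)].insert(0, entry)

    for tag, parent in items:
        if tag not in tags:
            prepend(parent, tag)         # ordinary (non-submenu) item
        else:
            i = tags.index(tag)          # submenu: its children are ready
            kids = children.pop(i)       # delete tag and data from both arrays
            tags.pop(i)
            prepend(parent, (tag, kids))

    # End state check: only "[top-menu]" should remain in the top SR.
    assert tags == ["[top-menu]"], "orphan menus left over: %s" % tags[1:]
    return children[0]

items = [
    ("Curly Joe", "The Other Stooge"),
    ("Shep", "The Other Stooge"),
    ("The Other Stooge", "[top-menu]"),
    ("Moe", "[top-menu]"),
    ("Larry", "[top-menu]"),
]
compile_menu(items)
# ['Larry', 'Moe', ('The Other Stooge', ['Shep', 'Curly Joe'])]
```

Note that the closing `assert` is exactly the low-hanging-fruit error check discussed below: a misspelled parent tag would leave an extra name in the top SR, and the assertion names the orphan.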

What Could Possibly Go Wrong?

At this point, you can run the example code and see an iconic system tray interface that behaves pretty much as it did before, but with a few extra selections. However, we need to have a brief conversation about error checking, and frankly there are two schools of thought on this topic. There is ample opportunity for errors to creep into the menu structure. Something as simple as misspelling a parent tag name could result in an “orphan” menu that would never get displayed – or could end up being the only one that is displayed. So the question is: how much error checking do we really need to do? There are those who think you should spend a lot of time going through the logic, looking for and trapping every possible error.

Given that most menus should be rather minimal, and errors in them are usually obvious, I tend to concentrate on the low-hanging fruit. For example, one simple check that will catch a large number of possible errors is to look at whether, at the end of the processing, there is more than one menu name left in the top SR – and if there is, assert an error that gives the name of the extra menu. You should probably also use this error as an opportunity to abort the application launch, since you could otherwise be left in a situation where you can’t shut down the program because the “Quit” option is missing.

Something else that you might want to consider is what to do if the external file containing the menu definitions comes up missing. The most obvious solution is to, again, abort the application launch with some sort of appropriate error message. However, depending on the application it might be valuable to provide a hard-coded default menu that doesn’t depend on external files and provides a certain minimum level of functionality. In fact, I once worked on an application where this was an explicit requirement because one of the things that the program allowed the user to do was create custom menus, the structure of which was stored in external files.

Stooge Identifier – Release 2
Toolbox – Release 11

The Big Tease

So what are we going to talk about next time? Well, something that I have seen coming up a lot lately on the user forum is the need to work with very large datasets. Often this issue arises when someone tries to display the results of a test that ran for several hours (or days!) only to discover that the complete dataset consists of hundreds of thousands of separate datapoints. While LabVIEW can easily deal with datasets of this magnitude, it should be obvious that you need to really bring your memory management “A” game. Next time we’ll look into how to plot and manage VLDs (Very Large Datasets).

Until Next Time…

Mike…

A Brief Introduction to .NET in LabVIEW

From the earliest days of LabVIEW, National Instruments has recognized that it needed the ability to incorporate code that was developed in other programming environments. Originally this capability was realized through specialized functions called Code Interface Nodes, or CINs. However, as the underlying operating systems continued to develop, LabVIEW acquired the ability to leverage such things as DLLs, ActiveX controls and .NET assemblies. Unfortunately, while .NET solves many of the problems that earlier efforts to standardize sharable code exhibited, far too many LabVIEW developers feel intimidated by what they see as unmanageable complexity. The truth, however, is that there are many well-written .NET assemblies that are no more difficult to use than VI Server.

As an example of how to use .NET, we’ll look at an assembly that comes with all current versions of Windows. Called NotifyIcon, it is the mechanism that Windows itself uses to give you access to programs through the part of the taskbar called the System Tray. However, beyond that usage, it is also an interesting example of how to utilize .NET to implement an innovative interface for background tasks.

The Basic Points

Given that the whole point of this lesson is to learn about creating a System Tray interface for your application, a good place to start the discussion is with a basic understanding of how the bits will fit together. To begin with, it is not uncommon, though technically untrue, to hear someone say that their program was, “…running in the system tray…”. Actually, your program will continue to run in the same execution space, with or without this modification. All this .NET assembly does is provide a different way for your users to interact with the program.

But that explanation raises another question: If the .NET code allows me to create the same sort of menu-driven interface that I see other applications using, how do the users’ selections get communicated back to the application that is associated with the menu?

The answer to that question is another reason I wanted to discuss this technique. As we have talked about before, as soon as you have more than one process running, you encounter the need to communicate between processes – often to tell another process that something just happened. In the LabVIEW world we often do this sort of signalling using UDEs. In the broader Windows environment, there is a similar technique that is used in much the same way. This technique is termed a callback and can seem a bit mysterious at first, so we’ll dig into it, as well.

Creating the Constructor

In the introduction to this post, I likened .NET to VI Server. My point was that while they are in many ways very different, the programming interface for each is exactly the same. You have a reference, and associated with that reference you have properties that describe the referenced object, and methods that tell the object to do something.

To get started, go to the .NET menu under the Connectivity function menu, and select Constructor Node. When you put the resulting node on a block diagram, a second dialog box will open that allows you to browse to the constructor that you want to create. The pop-up at the top of the dialog box has one entry for each .NET assembly installed on your computer – and there will be a bunch. You locate assemblies in this list by name, and the name of the assembly we are interested in is System.Windows.Forms. On your computer there may be more than one assembly with this basic name installed. Pick the one with the highest version (the number in parentheses after the name).

In the Objects portion of the dialog you will now see a list of the objects contained in the assembly. Double click on the plus sign next to System.Windows.Forms and scroll down the list until you find the bullet item NotifyIcon, and select it. In the Constructors section of the dialog you will now see a list of constructors that are available for the selected object. In this case, the default selection (NotifyIcon()) is the one we want so just click the OK button. The resulting constructor node will look like this:

notifyicon constructor

But you may be wondering how you are supposed to know what to select. That is actually pretty easy. You see, Microsoft offers an abundance of example code showing how to use the assemblies, and while they don’t show examples in LabVIEW, they do offer examples in 2 or 3 other languages and – this is the important point – the object, property and method names are the same regardless of language, so it’s a simple matter to look at the example code and, even without knowing the language, figure out what needs to be called, and in what order. Moreover, LabVIEW property and invoke nodes will list all the properties and methods associated with each type of object. As an example of the properties associated with the NotifyIcon object, here is a standard LabVIEW property node showing four properties that we will need to set for even a minimal instance of this interface. I will explain the first three; hopefully you will be able to figure out what the fourth one does on your own.

notifyicon property node

Starting at the top is the Text property. Its function is to provide the tray icon with a label that will appear like a tip-strip when the user’s mouse hovers over the icon. To this we can simply wire a string. You’ll understand the meaning of the label in a moment.

Giving the Interface an Icon

Now that we have created our NotifyIcon interface object and given it a label, we need to give it an icon that it can display in the system tray. In our previous screenshot, we see that the NotifyIcon object also has a property called Icon. This property allows you to assign an icon to the interface we are creating. However, if you look at the node’s context help you see that its datatype is not a path name or even a name, but rather an object reference.

context help window

But don’t despair, we just created one object and we can create another. Drop down another empty .NET constructor but this time go looking for System.Drawing.Icon and once you find the listing of possible constructors, pick the one named Icon(String fileName). Here is the node we get…

icon constructor

…complete with a terminal to which I have wired a path that I have converted to a string. In case you missed what we just did, consider that one of the major failings of older techniques, such as making direct function calls to DLLs, was how to handle complex datatypes. The old way of handling it was through the use of a C or C++ struct, but to make this method work you ended up needing to know way too much about how the function worked internally. In addition, for the LabVIEW developer it was difficult, if not impossible, to build these structures in LabVIEW. By contrast, the .NET methodology utilizes object-oriented techniques to encapsulate complex datatypes into simple-to-manipulate objects that accept standard data inputs and hide all the messy details.

Creating a Context Menu

With a label that will provide the users a reminder of what the interface is for, and an icon to visually identify the interface, we now turn to the real heart of the interface: the menu itself. As with the icon, assigning a menu structure consists of writing to a property a reference that describes the object to be associated with it. In this case the property is named ContextMenu, the object for which we need to create a constructor is System.Windows.Forms.ContextMenu, and the name of the constructor is ContextMenu(MenuItem[] menuItems).

context menu constructor

From this syntax we see that in order to initialize our new constructor we will need to create an array of menuItems. You have to admit, this makes sense: our interface needs a menu, and the menu is constructed from an array of menu items. So now we look at how to create the individual menu items that we want on the menu. Here is a complete diagram of the menu I am creating – clearly inspired by a youth spent watching way too many old movies (nyuk, nyuk, nyuk).

menu constructors

Sorry for the small image, but if you click on the image, you can zoom in on it. As you examine this diagram notice that while there is a single type of menuItem object, there are two different constructors used. The most common one has a single Text initialization value. The NotifyIcon uses that value as the string that will be displayed in the menu. This constructor is used to initialize menu items that do not have any children, or submenus. The other menuItem constructor is used to create a menu item that has other items under it. Consequently, in addition to a Text initialization value, it also has an input that is – wait for it – an array of other menu items. I don’t know if there is a limit to how deeply a menu can be nested, but if that is a concern you need to be rethinking your interface.
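To see the relationship between the two constructors without the diagram, here is a short Python sketch that mimics them with one stand-in function. The dict layout is purely illustrative; only the Text/MenuItem[] split comes from the actual constructors.

```python
def menu_item(text, children=None):
    """Stand-in for the two constructors: MenuItem(Text) builds a
    leaf item, MenuItem(Text, MenuItem[]) builds a submenu."""
    return {"Text": text,
            "MenuItems": [menu_item(*c) if isinstance(c, tuple) else menu_item(c)
                          for c in (children or [])]}

# Nesting simply recurses — a child can itself be a ("Text", [children]) pair.
menu = [
    menu_item("Larry"),
    menu_item("Moe"),
    menu_item("The Other Stooge", ["Shep", "Curly Joe"]),
]
```

Because the submenu constructor just takes more menu items, there is no structural limit on nesting depth — which is why the only real limit is your users' patience.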

In addition to the initialization values that are defined when the item is created, a menuItem object has a number of other properties that you can set as needed. For instance, items can be enabled and disabled, checked, highlighted and split into multiple columns (to name but a few). A property that I use, though its utility might not be readily apparent, is Name. Because it doesn’t appear anywhere in the interface, programmers are pretty much free to use it as they see fit, so I am going to use it as the label to identify each selection programmatically. Which, by the way, is the next thing we need to look at.

Closing the Event Loop

If we stopped with the code at this point, we would have an interface with a perfectly functional menu system, but which would serve absolutely no useful purpose. To correct that situation we have to “close the loop” by providing a way for the LabVIEW-based code to react in a useful way to the selections that the user makes via the .NET assembly. The first part of that work we have already completed by establishing a naming convention for the menu items. This convention largely guarantees menu items will have a unique name by defining each menu item name as a colon-delimited list of the menu item names in the menu structure above it. For example, “Larry” and “Moe” are top-level menu items so their names are the same as their text values. “Shep” however is in a submenu to the menu item “The Other Stooge” so its name is “The Other Stooge:Shep”.
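That convention is easy to generate mechanically. Here is a Python sketch (my own helper, nothing from the assembly) that walks a nested (text, children) menu and yields the Name of every selectable item:

```python
def item_names(items, prefix=()):
    """Yield the colon-delimited Name of every non-submenu item."""
    for text, children in items:
        path = prefix + (text,)
        if children:                       # submenu: recurse, don't emit
            yield from item_names(children, path)
        else:                              # leaf: emit its full name
            yield ":".join(path)

menu = [("Larry", None), ("Moe", None),
        ("The Other Stooge", [("Shep", None), ("Curly Joe", None)])]
list(item_names(menu))
# ['Larry', 'Moe', 'The Other Stooge:Shep', 'The Other Stooge:Curly Joe']
```

Because each Name embeds its full path, two items can share a Text value (say, two different "Settings…" entries) and still have distinct Names.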

The other thing we need in order to handle menu selections is to define the callback operations. To simplify this part of the process, I like to create a single callback process that services all the menu selections by converting them into a single LabVIEW event that I can handle as part of the VI’s normal processing. Here is the code that creates the callback for our test application:

callback generator

The way a callback works is that the callback node incorporates three terminals. The top terminal accepts an object reference. After you wire it up, the terminal changes into a pop-up menu listing all the callback events that the attached item supports. The one we are interested in is the Click event. The second terminal is a reference to the VI that LabVIEW will execute when the event you selected is fired. However, you can’t wire just any VI reference here. For it to be callable from within the .NET environment it has to have a particular set of inputs and a particular connector pane. To help you create a VI with the proper connections, you can right-click on the terminal and select Create Callback VI from the menu. The third terminal on the callback registration node is labelled User Parameters and it provides a way to pass static, application-specific data into the callback event.

There are two important points here: First, as I stated before, the User Parameters data is static. This means that whatever value is passed to the terminal when the callback is registered is from then on essentially treated as a constant. Second, whatever you wire to this terminal modifies the data inputs to the callback VI so if you are going to use this terminal to pass in data, you need to wire it up before you create the callback VI.
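Conceptually, a static User Parameter is just data bound once at registration time – the same effect you get from partial application in a textual language. Here is a Python analogy (the menu names, the `fired` list standing in for the UDE, and the function names are all my inventions):

```python
from functools import partial

fired = []                          # stand-in for the SysTray Callback UDE

def on_click(user_param, name):
    """Callback body: pass the selection name out via the 'UDE'."""
    user_param.append(name)

# One registration per menu item; the user parameter is fixed ("static")
# at registration time, just like the third terminal on the node.
callbacks = {n: partial(on_click, fired, n)
             for n in ("Larry", "Moe", "The Other Stooge:Shep")}

callbacks["Moe"]()                  # the .NET side would invoke this on Click
```

No matter which entry fires later, it carries the value that was bound when it was registered – which is exactly why you must wire User Parameters before creating the callback VI.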

In terms of our specific example, I have an array of the menu items that the main VI will need to handle so I auto-index through this array creating a callback event for each one. In all cases, though, the User Parameter input is populated with a reference to a UDE that I created, so the callbacks can all use the same callback VI. This is what the callback VI looks like on the inside:

callback vi

The Control Ref input (like User Parameter) is a static input, so it contains the reference to the menu item that was passed to the registration node when the callback was created. This reference allows me to read the Name property of the menu item that triggered the callback, and then use that value to fire the SysTray Callback UDE. It’s important when creating a callback VI not to include too much functionality. In fact, this is about as much code as I would ever put in one. The problem is that this code is nearly impossible to debug because it does not actually execute in the LabVIEW environment. The best solution is to get the selection into the LabVIEW environment as quickly as possible and deal with any complexity there. Finally, here is how I handle the UDE in the main VI:

systray callback handler

Here you can see another reason why I created the menu item names as I did. Separating the different levels in the menu structure with colons allows the code to easily parse the selection, and simultaneously organizes the logic.
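As a final sketch, here is what that parsing looks like in Python – splitting the colon-delimited Name recovers the menu hierarchy, so the handler can dispatch on the top level first. The return strings are placeholders for whatever your event handler actually does.

```python
def handle_selection(name):
    """Split a Name like 'The Other Stooge:Shep' into its menu
    levels and dispatch on the top level first."""
    levels = name.split(":")
    if levels[0] == "The Other Stooge":
        return "show factoid for other stooge: " + levels[1]
    return "show factoid for stooge: " + levels[0]

handle_selection("The Other Stooge:Shep")
# 'show factoid for other stooge: Shep'
```

The dispatch structure mirrors the menu structure, so adding a submenu to the interface naturally adds one branch to the handler.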

Future Enhancements

With the explanations done, we can now try running the VI – which disappears as soon as you start it. However, if you look in the system tray, you’ll see its icon. As you make selections from its menu you will see factoids appear about the various Stooges. But this program is just the barest of implementations and there is still a lot you can do. For example, you can open a notification balloon to notify the user of something important, or manipulate the menu properties to show checkmarks on selected items or disable selections to which you want to block access.

The most important changes you should make, however, are architectural. For demonstration purposes the implementation I have presented here is rather bare-bones. While the resulting code is good at helping you visualize the relationships between the various objects, it’s not the kind of code you would want to ship to a customer. Rather, you want code that simplifies operation, improves reusability and promotes maintainability.

Stooge Identifier — Release 1

The Big Tease

So you have the basics of a neat interface, and a basic technique for exploring .NET functionality in general. But what is in store for next time? Well I’m not going to leave you hanging. Specifically, we are going to take a hard look at menu building to see how to best modularize that functionality. Although this might seem a simple task, it’s not as straight-forward as it first seems. As with many things in life, there are solutions that sound good – and there are those that are good.

Until Next Time…

Mike…