Using Worker Threads

Worker threads are an elegant solution to a number of problems about concurrent processing; for example, the need to keep the GUI active while a computation is being performed. This essay addresses some of the issues of using worker threads. A companion essay talks about some techniques dealing with user-interface threads.

(14-Jun-08) Fixed a bug reported by Giovanni Dicanio, missing const on a parameter

(27-Jan-01) In response to several questions about ResumeThread, I've expanded my discussion of this topic.

(27-Jan-01) I've added, in response to some issues in the newsgroup, a discussion of how to wait for a thread to start up.

(20-Jan-01) An alternative mechanism for doing thread shutdown--specifically, how you detect within the thread it is being shut down--is now documented (the method I wished I'd thought of when I wrote the chapter in Win32 Programming, and figured out about six months after it went to press). I had meant to put this in the original essay and forgot about it.

(10-Apr-00) A flaw in the thread shutdown logic was pointed out to me; you have to inhibit auto-deletion of the CWinThread-derived object explicitly!

A new section on shutting down view-related threads has been added.

(28-Jan-00) The description of pausing and resuming threads has been enhanced with a more detailed discussion of why SuspendThread and ResumeThread must be avoided.

Why Worker Threads?

Consider a simple implementation of a program. For our purposes, the program has the task of inverting the color of every pixel of an image, just because we sometimes need a concrete example to illustrate techniques. For the sake of our example, we will assume that the image being processed is 10 megapixels of 24-bit color.

The GUI has a menu item or other means of saying "Invert it now". This calls the doInvert method on the view class:

void CMyView::doInvert()
    {
     for(int y = 0; y < image.height; y++)
          for(int x = 0; x < image.width; x++)
              changePixel(x, y);
    }

This is a perfectly correct program. It traverses all 10 megapixels, happily changing the pixels, until it completes. But it is not a good implementation.

Why not? Because the entire GUI is dead until the operation completes. This means that for some long duration, the user is forced to wait while the operation proceeds, and there is absolutely nothing that can be done about it. If the user decides that the transformation is bogus, and wants to stop it, well, tough. It is going to complete anyway.

Doing it the obsolete, and hard, way

One solution, the antiquated and largely unused 16-bit Windows solution (but still used because it is "well known"), is to use PeekMessage, an API call that does not block when there is no message.

void CMyView::doInvert()
    {
     running = TRUE; 
     for(int y = 0; running && y < image.height; y++)
          for(int x = 0; running && x < image.width; x++)
              { /* change it */
               MSG msg;
               if(PeekMessage(&msg, AfxGetMainWnd()->m_hWnd,
                              0, 0, PM_REMOVE))
                   { /* handle it*/
                    TranslateMessage(&msg);
                    DispatchMessage(&msg);
                   } /* handle it */
               changePixel(x, y);
              } /* change it */
    }

This is bad for several reasons. The most important one is that it puts, in the time-critical main loop, a function whose overhead is substantial. Now, instead of taking k minutes (whatever that was before) we might find the algorithm takes a significant multiple of that to complete. But while it is running, the GUI is still active. You could even, if you were not careful, fire off another thread to paint each green pixel purple. This Is Not A Good Idea.

The performance hack is simple: only poll occasionally. For example, if we assume that the images are roughly rectangular, we could change the code to

void CMyView::doInvert()
    {
     running = TRUE; 
     for(int y = 0; running && y < image.height; y++)
         { /* do one row */
          MSG msg;
          if(PeekMessage(&msg, AfxGetMainWnd()->m_hWnd,
                         0, 0, PM_REMOVE))
             { /* handle it*/
              TranslateMessage(&msg);
              DispatchMessage(&msg);
             } /* handle it */
 
          for(int x = 0; running && x < image.width; x++)
              { /* change it */
               changePixel(x, y);
              } /* change it */
         } /* do one row */
    }

Thus this tests only once per row, which can either be too often if the rows are short, or not often enough if the rows are long. The generalization, changing the test to

if(y % 10 == 0 && PeekMessage(...))

will work if the rows are too short.
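The polling-interval idea can be sketched in portable C++ without any Windows calls. In the sketch below, processRows and the poll callback are hypothetical names standing in for the row loop and the PeekMessage test; the only point is that the expensive check runs once every pollEvery rows rather than once per row.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the row loop above. `poll` plays the role of the
// PeekMessage test and returns false when the operation should stop.
// Returns the number of rows actually processed.
int processRows(int height, int pollEvery, const std::function<bool()>& poll)
{
    assert(pollEvery > 0);
    int done = 0;
    for (int y = 0; y < height; ++y) {
        // Pay for the expensive check only on every pollEvery-th row.
        if (y % pollEvery == 0 && !poll())
            break;
        // ... process all pixels of row y here ...
        ++done;
    }
    return done;
}
```

Choosing pollEvery is the same tradeoff as before: a larger value lowers the overhead but makes the stop request take longer to be noticed.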

There are some problems remaining; for example, if there is a modeless dialog active, you will notice that there is no IsDialogMessage call there to handle it. Oops. Nor, for that matter, is there code to handle ActiveX event notifications. Oops. And it presumes that you are writing in pure C, and doing the dispatch yourself. Life gets more complicated when you have to support the message maps and control paradigms of MFC. Oops, squared.

But why bother when there is a better way?

The Thread Solution

It is almost always the case that you can use threads to do the job more easily. This is not without certain costs and hazards, but it ultimately is the better method.

Here's a solution to handling the invert-everything. Note that we have to move from the pure-C domain (which we interface to via a static method) to the MFC domain.

To the class (in this example, a CView-derived class), add the following declarations:

static UINT run(LPVOID p);
void run();
volatile BOOL running;

To start a thread, your handler does

void CMyView::doInvert()
    {
     running = TRUE;
     AfxBeginThread(run, this);
    }
 
UINT CMyView::run(LPVOID p)
    {
     CMyView * me = (CMyView *)p;
     me->run();
     return 0;
    }
 
void CMyView::run()
   {
     for(int y = 0; running && y < image.height; y++)
          for(int x = 0; running && x < image.width; x++)
              changePixel(x, y);
    running = FALSE;
   }

The command to stop the thread is very simple:

void CMyView::OnStop()
   {
    running = FALSE;
   }

That's all there is to it!

Well, almost. Keep reading.

For example, the above code assumes that the thread will not try to access any member variables of the view unless it is certain they exist. This includes handles to synchronization primitives or pointers to a CRITICAL_SECTION that might be used to synchronize interactions between the thread and its view. This requires a more graceful shutdown mechanism.

Note that the declaration of the running variable includes the modifier volatile. This is because under certain optimizations, the compiler will discover that in the body of the loop there is absolutely nothing that changes the running flag, and therefore, cleverly, it can avoid testing it each time through the loop. This means that although you change the value in another thread, the change is never seen. By adding the volatile modifier, you tell the compiler that it cannot assume the variable will remain unmodified during the execution of the loop, even though there is no code in the loop that can change the variable.
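For readers working outside MFC, the same stop-flag pattern can be sketched in standard C++. This is a portable analog under stated assumptions, not the code above: std::thread replaces AfxBeginThread, std::atomic<bool> replaces the volatile flag (the C++ standard does not define volatile as a cross-thread guarantee, so the sketch swaps it out), and the Inverter class is a hypothetical stand-in for the view.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the view: owns the pixels and the worker thread.
class Inverter {
public:
    explicit Inverter(std::size_t pixelCount) : pixels(pixelCount, 0x00) {}
    ~Inverter() { stop(); }

    void start() {
        stop();                 // reap any earlier worker before starting anew
        done = 0;
        running = true;
        worker = std::thread([this] { run(); });
    }

    // Ask the worker to stop and wait until it has really finished.
    void stop() {
        running = false;
        if (worker.joinable())
            worker.join();
    }

    bool isRunning() const { return running; }
    std::size_t processed() const { return done; }

private:
    void run() {
        // The atomic flag is re-read on every iteration, so a store made by
        // another thread cannot be optimized out of the loop test.
        for (std::size_t i = 0; running && i < pixels.size(); ++i) {
            pixels[i] = static_cast<unsigned char>(~pixels[i]);
            done = i + 1;
        }
        running = false;        // reflect "thread finished", as in run() above
    }

    std::vector<unsigned char> pixels;
    std::atomic<bool> running{false};
    std::atomic<std::size_t> done{0};
    std::thread worker;
};
```

Note that stop() here also joins the thread, which is one simple way to get the more graceful shutdown the text says is needed.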

Worker threads and the GUI I: Enabling controls

The problem is that when your worker thread is running, there are probably lots of things you shouldn't be doing. For example, starting a thread to do a computation. Then you'd have two threads running doing the same or similar computations, and that way madness lies (we'll assume for the moment that this actually is a Bad Thing).

Fortunately, this is easy. Consider your ON_UPDATE_COMMAND_UI handler

void CMyView::OnUpdateInvertImage(CCmdUI * pCmdUI)
    {
     pCmdUI->Enable(!running && (whatever_used_to_be_here));
    }

Generalizing this to cover other menu items is left as an Exercise For The Reader. However, note that this explains why there is that assignment "running = FALSE;" at the end of the thread handling routine above: it explicitly forces the running flag to reflect the running status of the thread. (Well, if you are being very pedantic, note that it is possible to start another thread before the current one finishes if the current one does not quickly get back to test the running flag, so you may wish to use a separate Boolean variable to indicate the thread state. Set it before the thread starts, and clear it only after the thread loop completes. For most thread usages, a single running flag usually suffices.)
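The pedantic two-flag variant might look like the following portable sketch. It uses std::thread and std::atomic rather than MFC, and the names Worker, tryStart, stopRequested and active are all hypothetical: stopRequested is the request, active is the fact, and a second start is refused while the first thread is still alive.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical two-flag worker. The UI would enable "start" only while
// isActive() returns false.
class Worker {
public:
    ~Worker() {
        requestStop();
        if (worker.joinable())
            worker.join();
    }

    // Returns false (and starts nothing) if the previous thread is still alive.
    bool tryStart() {
        bool expected = false;
        if (!active.compare_exchange_strong(expected, true))
            return false;
        if (worker.joinable())
            worker.join();          // previous run has finished; reap it
        stopRequested = false;
        worker = std::thread([this] {
            while (!stopRequested)
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            active = false;         // cleared only after the loop completes
        });
        return true;
    }

    void requestStop() { stopRequested = true; }
    bool isActive() const { return active; }

private:
    std::atomic<bool> stopRequested{false};
    std::atomic<bool> active{false};
    std::thread worker;
};
```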

Worker threads and the GUI II: Don't touch the GUI

That's right. A worker thread must not touch a GUI object. This means that you should not query the state of a control, add something to a list box, set the state of a control, etc.

Why?

Because you can get into a serious deadlock situation. A classic example was posted on one of the discussion boards, and it described something that had happened to me last year. The situation is this: you start a thread, and then decide to wait for the thread to complete. Meanwhile, the thread does something apparently innocuous, such as add something to a list box, or, in the example that was posted, calls FindWindow. In both cases, the process came to a screeching halt, as all threads deadlocked.

Let's analyze these two situations in some detail, so you get a sense of what happened.

In my case, the list box sent a notification, via SendMessage, to its parent. This means the message went to its parent thread. But the parent thread was blocked, waiting for the thread to complete. But the thread couldn't complete until it could run, and guess what: the SendMessage was a cross-thread SendMessage, which would not return until it was processed. But the only thread that could process it was blocked, and couldn't run until the thread completed. Welcome to the world of deadlock.

The FindWindow problem was quite similar. The programmer had specified finding a window with a particular caption. This meant that the thread running FindWindow had to SendMessage a WM_GETTEXT message to the window whose handle it had just found via EnumWindows. This message could not be processed until the thread that owned the window could execute. But it couldn't, because it was blocked waiting for the thread to finish. Deadlock. So note that although you should not touch the GUI thread explicitly, you must also not touch it implicitly, through innocuous-seeming operations such as FindWindow, or you can, and will, experience deadlock. Under these conditions, by the way, there is no recovery. You can kill the process, which explicitly terminates all threads, but this is neither elegant nor, as the warning box tells us, necessarily safe.

How do you get around these problems?

In my case, I had just been sloppy. I actually know better, and have written as much. So much for being an expert. The workaround in almost all cases is that you must never do a SendMessage to a GUI object. In very rare cases, you can do a PostMessage, although this usually won't work because you need to pass in a pointer to a string, and cannot tell when the operation has finished so you can't tell when to release the string. Any string you pass in must be allocated either in static storage (as a constant), or on the heap. If it is allocated in writeable static storage, or on the heap, you need to know when the value can be reused or freed. PostMessage does not allow you to do this.

Therefore, the only solution is to use a user-defined message, and post it to the main GUI thread (usually the main frame, but frequently a CView-derived object). The main GUI thread then handles the SendMessage to the GUI object, and, knowing then when the operation has completed, is able to free up any resources.

You must not send a pointer to a stack-allocated object in a PostMessage call. By the time the operation begins to execute, the chances are excellent that the object will have been removed from the stack. The results of this are Not A Pretty Sight.

This essay will not go into describing how to manage user-defined messages. That is covered in my essay on Message Management. We will assume that there are some user-defined messages, with names like UWM_ADDSTRING, that are already defined.

Here's some code that adds window names to a list box:

void CMyDialog::AddNameToControl(const CString & name)
    {
     CString * s = new CString(name);
     PostMessage(UWM_ADDSTRING, 0, (LPARAM)s);
    }

To the collection of handlers in CMyDialog.h, I add the declaration

afx_msg LRESULT OnAddString(WPARAM, LPARAM);

To the MESSAGE_MAP of my class, I add

    ON_REGISTERED_MESSAGE(UWM_ADDSTRING, OnAddString)

or

    ON_MESSAGE(UWM_ADDSTRING, OnAddString)

(If you're curious about the distinction, read my essay on Message Management).

The handler then looks like this:

LRESULT CMyDialog::OnAddString(WPARAM, LPARAM lParam)
    {
     CString * s = (CString *)lParam;
     c_ListBox.AddString(*s);
     delete s;
     return 0;
    }

The sender of the message must allocate the value being sent on the heap, which it does via the new operator. The message is posted to the main GUI thread, which eventually routes it to the OnAddString handler. This particular handler knows that the message is destined for a particular control, c_ListBox (if you don't understand how to get control variables for controls, read my essay on the subject). Note that we could have put the destination control ID, or the destination CWnd-class variable, in wParam, and made this more general. The handler calls the AddString method. When this call completes, it is now known that the string value is no longer required (this would be different if we had an owner-draw listbox without LBS_HASSTRINGS, but if you already know how to do that, the solution should be evident). Therefore, we can now delete the heap-allocated object, which for us is a CString.
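The ownership handoff described here (sender allocates on the heap, receiver frees) can be sketched in portable C++. This is an analogy, not MFC code: the hypothetical MessageQueue stands in for the GUI thread's message queue, post plays the role of PostMessage, and drainInto plays the role of OnAddString. Using std::unique_ptr makes the transfer of ownership explicit.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the GUI thread's message queue.
class MessageQueue {
public:
    // Called from the worker: the queue takes ownership of the string,
    // much as the posted message carries the CString * in lParam.
    void post(std::unique_ptr<std::string> s) {
        std::lock_guard<std::mutex> lock(m);
        q.push(std::move(s));
    }

    // Called from the receiving thread: each string is copied into the
    // "list box" and only then freed.
    void drainInto(std::vector<std::string>& listBox) {
        std::lock_guard<std::mutex> lock(m);
        while (!q.empty()) {
            listBox.push_back(*q.front());
            q.pop();            // the unique_ptr destructor frees the string
        }
    }

private:
    std::mutex m;
    std::queue<std::unique_ptr<std::string>> q;
};

// Worker side: never passes the address of a local (stack) variable.
void addNames(MessageQueue& mq, const std::vector<std::string>& names)
{
    for (const std::string& n : names)
        mq.post(std::unique_ptr<std::string>(new std::string(n)));
}
```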

Worker Threads and the GUI III: Dialogs and MessageBoxes

You must not try to launch any GUI window from a worker thread. This means a worker thread cannot call on MessageBox, DoModal, Create of a modeless dialog, and so on. The only way you can handle this is to post a user-defined message back to the main GUI loop to perform this service for you.

If you attempt to do this, you will get various odd failures. You will get ASSERT errors from MFC, dialogs will fail to come up, and essentially you will get all sorts of antisocial behavior from your application. If you really, really need to have GUI objects operating from a thread, you must not use a worker thread. You must use a user-interface thread, which has a message pump. I'm not an expert on these, although I've done a brief article on what I discovered about them while doing a SAPI application, and this may be of some assistance in using them.

If you must pause the thread until the dialog or MessageBox completes, you should probably use an Event for synchronization. What you would do is create, for example, an auto-reset Event, which is created in the Cleared state. Post a user-defined message (see my article on Message Management) to the main GUI thread, which will then launch the dialog. When the dialog completes, it calls ::SetEvent to set the event. Meanwhile, the thread, after posting the message, uses ::WaitForSingleObject to wait for the event to be set. The thread will block until the dialog completes and sets the event.
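As a rough illustration of that handshake, here is a portable sketch. It does not use the Win32 calls: AutoResetEvent is a hypothetical analog of an auto-reset Event built from a mutex and a condition variable, and in askAndWait a plain std::thread stands in for the GUI thread that would receive the posted message and run the dialog.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical portable analog of an auto-reset Win32 Event.
class AutoResetEvent {
public:
    // Analog of ::SetEvent.
    void set() {
        {
            std::lock_guard<std::mutex> lock(m);
            signaled = true;
        }
        cv.notify_one();
    }

    // Analog of ::WaitForSingleObject: returns true if the event was set,
    // false on timeout. A successful wait clears the event (auto-reset).
    bool wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m);
        if (!cv.wait_for(lock, timeout, [this] { return signaled; }))
            return false;
        signaled = false;
        return true;
    }

private:
    std::mutex m;
    std::condition_variable cv;
    bool signaled = false;      // created in the cleared state
};

// Worker-side usage: hand the dialog to the other thread, then block until
// that thread reports that the dialog has completed.
bool askAndWait(const std::function<void()>& showDialog,
                std::chrono::milliseconds timeout)
{
    AutoResetEvent dialogDone;
    std::thread gui([&] { showDialog(); dialogDone.set(); });
    bool completed = dialogDone.wait(timeout);  // the worker blocks here
    gui.join();
    return completed;
}
```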

I really should check the VC++ 6.0 implementation of CEvent. The VC++ 4.2 implementation was so full of errors as to be totally unusable. You Have Been Warned.

Worker Threads and the GUI IV: AfxGetMainWnd

Well, I've shown how to post a message to the window for the class. But what if you want to post a message to the main window? This is obvious, right? Just do

AfxGetMainWnd()->PostMessage(...)

and you're done! Right? Of course not. It should be obvious by now that I wouldn't pose such a question if the answer was the obvious one. 

What will happen is either an ASSERT failure or an access fault. If you exercise a bit of debugging cleverness, you will find that the pointer to the main window is NULL! But how can this be! The application clearly has a main window...

Well, the answer is, yes, the application has a main window. But that's not what AfxGetMainWnd is defined as returning! Read the documentation carefully, and you will see that it says:

If AfxGetMainWnd is called from the application's primary thread, it returns the application's main window according to the above rules. If the function is called from a secondary thread in the application, the function returns the main window associated with the thread that made the call. [emphasis added]

A worker thread has no main window. Therefore, the call will return NULL. The workaround is to obtain a pointer to the application's main window by calling AfxGetMainWnd in the primary thread, and store it in a place (such as a member variable of the class) where the worker thread can find it.
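The effect is easy to reproduce in portable C++ with a thread_local pointer. This is only an analogy for per-thread state, not a description of how MFC is implemented; Window, tlsMainWnd, getMainWnd and Job are hypothetical names.

```cpp
#include <cassert>
#include <thread>

struct Window { int id; };

// Hypothetical analog of a per-thread "main window": every thread has its
// own copy, and a newly created thread starts out with nullptr.
thread_local Window* tlsMainWnd = nullptr;

Window* getMainWnd() { return tlsMainWnd; }   // stand-in for AfxGetMainWnd

struct Job {
    Window* mainWnd = nullptr;       // captured in the primary thread
    Window* seenByLookup = nullptr;  // what the worker got by asking
    Window* seenByMember = nullptr;  // what the worker got from the member

    void runInWorker() {
        std::thread t([this] {
            seenByLookup = getMainWnd();  // per-thread lookup: nullptr here
            seenByMember = mainWnd;       // stored pointer: still valid
        });
        t.join();
    }
};
```

In the primary thread you would set job.mainWnd = getMainWnd() before starting the worker, which is the workaround described above.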

Waiting for a thread to start (27-Jan-01)

I recently processed a newsgroup message which was discussing the problem of waiting for thread startup. The proposed mechanism looked like this:

BOOL waiting; // flag (this is one bug right here!)
void CMyClass::OnStartThread()
   {
    waiting = TRUE;
    AfxBeginThread(myThread, something);
    while(waiting) /* spin */ ;   // horrible!
   }

UINT CMyClass::myThread(LPVOID whatever) // *** static *** member function
   {
    ... initialization sequence here
    waiting = FALSE; // allow initiator to continue (probably won't work)
    ... thread computations
    return 0;
   }

This code has several problems. First, it is a terrible implementation; it requires that the parent thread run until its timeslice ends, which means that it delays the startup of the created thread. This alone is enough reason to throw the code out. But even if it were acceptable, it is still wrong. The waiting variable must be declared as volatile, because under the optimizing compiler, the while loop which spins may or may not be executed, and if it is executed, it will probably never exit. I discuss this in more detail in my essay on "Surviving the Release Build". But the bottom line is that when you have a variable of any sort that can be modified from one thread and whose modification must be detected in another, you must declare it as volatile.

The busy-wait is the serious disaster here. Since a timeslice, or quantum, is 200 ms in NT, it means that the thread can waste up to 200 ms, doing nothing, before allowing the spawned thread to run. If the spawned thread blocks on something like an I/O operation, and control returns to the creator, each time control returns to the creator, it will burn up another 200ms (the kernel doesn't care that it is doing nothing but polling a Boolean variable that can never change while it is polling it on a uniprocessor; it only knows that the thread is executing). As you can see, it doesn't take very many I/O operations in the thread to add many seconds to the perceived thread startup time.

The correct solution is shown below. In this solution, a manual-reset Event is used. To simplify the code, we create it just before it is needed, and destroy it immediately afterwards; in the case where the thread may be started up several times, the optimization to move this out to a member variable should be obvious. Note that doing this as a member variable suggests that the Event would be created in the class's constructor and destroyed in its destructor.

class CMyClass : public CView { // or something like this...
     protected:
         HANDLE startupEvent;
 };

void CMyClass::OnStartThread()
   {
     startupEvent = ::CreateEvent(NULL, // no security attributes
                                  TRUE, // manual-reset
                                  FALSE,// initially non-signaled
                                  NULL);// anonymous
     AfxBeginThread(myThread, this);
     switch(::WaitForSingleObject(startupEvent, MY_DELAY_TIME))
         { /* waitfor */
          case WAIT_TIMEOUT:
              ... report problem with thread startup
              break;
          case WAIT_OBJECT_0:
              ... possibly do something to note thread is running
              break;
         } /* waitfor */
     CloseHandle(startupEvent);
     startupEvent = NULL; // make nice with handle var for debugging
   } 

UINT CMyClass::myThread(LPVOID me)
   {
    CMyClass * self = (CMyClass *)me;
    self->run();
    return 0;
   }

void CMyClass::run( )
   {
    ... long initialization sequence
    ::SetEvent(startupEvent);
    ... loop computations
   }

Note that I haven't done anything about dealing with the fact that the startup timeout may be incorrect and the thread is still trying to start; this could be handled by, for example, attempting to ::WaitForSingleObject on the thread handle with a wait time of 0; if it times out, the thread is running; if it returns with WAIT_OBJECT_0 the thread has halted. This requires that you deal with the issues of the CWinThread object possibly being deleted before you can get the handle. No, I'm not going to try to write every possible line of code.
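The zero-timeout test has a direct analog in standard C++, where a std::future takes the place of the thread handle. The helper below is a minimal sketch with a hypothetical name, hasFinished; it assumes the work was launched with std::launch::async (a deferred task would report neither state).

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Analog of ::WaitForSingleObject(threadHandle, 0): a zero-length wait that
// only asks "has it finished yet?" and never blocks.
template <typename T>
bool hasFinished(const std::future<T>& f)
{
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

A false result corresponds to the timeout case (the thread is still running); a true result corresponds to WAIT_OBJECT_0 (the thread has halted).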

Actually, it is rare that I would do something like this. I'd be more inclined to use messages posted to the main window to establish state for the GUI: the thread is starting, the thread has started, the thread has terminated (note that it may be terminated without being started). This avoids the issues about the GUI blocking until the thread has actually completed the startup sequence, or dealing with the timeout issue if the thread died somehow before doing the ::SetEvent.

void CMyClass::OnStartThread( )
   {
    AfxBeginThread(myThread, this);
    PostMessage(UWM_THREAD_STARTING);
   }

UINT CMyClass::myThread(LPVOID me) // *** static *** member function
   {
    CMyClass * self = (CMyClass *)me;
    self->run( );
    return 0;
   }

void CMyClass::run( )
   {
    ... lengthy startup sequence
    PostMessage(UWM_THREAD_STARTED);
    ... long thread computation
    PostMessage(UWM_THREAD_STOPPING);
    ... long thread shutdown
    PostMessage(UWM_THREAD_TERMINATED);
   }

I use the above paradigm in many contexts. Note that it completely eliminates the need for synchronization, but adds some complexity to the GUI. For example, imagine that I have in the GUI (in this case, since I'm posting to the view, it is view-specific state) a member variable that encodes the current state: terminated-or-never-started, starting, stopping, and running. I might have menu items called Start Computation, Pause Computation, Cancel Computation. I would create ON_UPDATE_COMMAND_UI handlers that respond as follows:

void CMyClass::OnUpdateStart(CCmdUI * pCmdUI)
   {
    pCmdUI->Enable(threadstate == MY_THREAD_STOPPED);
   }
void CMyClass::OnUpdatePause(CCmdUI * pCmdUI)
   {
    pCmdUI->Enable(threadstate == MY_THREAD_RUNNING);
   }
void CMyClass::OnUpdateCancel(CCmdUI * pCmdUI)
   {
    pCmdUI->Enable(threadstate == MY_THREAD_RUNNING);
   }
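Stripped of MFC, the bookkeeping behind these handlers is a small state machine. The sketch below is one possible shape for it; the enum names and onThreadMessage are hypothetical, and the guard on the Starting notification is an assumption added here so that a "starting" message arriving after "started" does not move the state backwards.

```cpp
#include <cassert>

// Hypothetical state held by the view; the four states mirror the text.
enum class ThreadState { Stopped, Starting, Running, Stopping };

// Hypothetical codes standing in for the UWM_THREAD_* messages.
enum class ThreadMsg { Starting, Started, Stopping, Terminated };

// What the message handlers would do to the state variable.
ThreadState onThreadMessage(ThreadState current, ThreadMsg msg)
{
    switch (msg) {
    case ThreadMsg::Starting:
        // Ignore a late "starting" if the thread has already reported in.
        return current == ThreadState::Stopped ? ThreadState::Starting : current;
    case ThreadMsg::Started:    return ThreadState::Running;
    case ThreadMsg::Stopping:   return ThreadState::Stopping;
    case ThreadMsg::Terminated: return ThreadState::Stopped;
    }
    return current;
}

// The update handlers reduce to pure predicates on the state.
bool canStart(ThreadState s)  { return s == ThreadState::Stopped; }
bool canPause(ThreadState s)  { return s == ThreadState::Running; }
bool canCancel(ThreadState s) { return s == ThreadState::Running; }
```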

Providing I didn't really need to wait (and I find that I rarely do), I have now avoided the need to introduce a blocking synchronization event in the main GUI thread, which could potentially lock up the GUI. Note also that I might change the Cancel case to allow for the thread to be cancelled even if it is in the middle of starting, providing that this makes sense in the context of the thread computation. In this case, I'd have to "poll" the cancel flag during the startup, for example, by splitting out the startup into a separate function:

void CMyClass::run( )
   {
    BOOL byCancel = FALSE;
    if(!startMe( ))
     { 
      PostMessage(UWM_THREAD_STOPPED, TRUE); // stopped by cancel
      return;
     }
    PostMessage(UWM_THREAD_STARTED);
    while(!cancelFlag)
      { /* thread loop */
    ...lengthy thread computation
      } /* thread loop */
    
    byCancel = cancelFlag;
    PostMessage(UWM_THREAD_STOPPING);
    ...lengthy thread shutdown
    PostMessage(UWM_THREAD_STOPPED, byCancel);
   }

BOOL CMyClass::startMe( )
   {
    ...do something
    if(cancelFlag)
      return FALSE;
    ...open the file on the server
    if(cancelFlag)
       {
        ...close the file on the server
        return FALSE;
       }
    ... more stuff, following above idea
    return TRUE;
   }

As usual in such cases, it is important to undo everything you have done if you detect the cancelFlag has been set during startup. I've also defined WPARAM of the message to include a flag that indicates if the thread stopped because it stopped normally or stopped because the user cancelled it (which I might use to display in the log that the thread was stopped by user request). I might also extend this to a set of enumerated types to communicate back error codes in case the thread decided to terminate because of some problem. I might even use LPARAM to hold the ::GetLastError code. You see, there are many themes-and-variations of this basic scheme.
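One way the enumerated variation might be laid out is sketched below in portable C++. Everything here is hypothetical: StopReason and StoppedMessage are illustrative names, and the two struct fields merely stand in for WPARAM and LPARAM, which are pointer-sized integers on Windows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reason codes carried in the WPARAM slot of the "stopped" message.
enum class StopReason : std::uintptr_t { Normal = 0, Cancelled = 1, Error = 2 };

// Stand-ins for the two message parameters.
struct StoppedMessage {
    std::uintptr_t wParam;  // a StopReason value
    std::intptr_t  lParam;  // error code (e.g. from ::GetLastError), 0 if none
};

// Sender side: pack the reason and optional error code into the message.
StoppedMessage makeStopped(StopReason why, std::uint32_t errorCode = 0)
{
    return StoppedMessage{ static_cast<std::uintptr_t>(why),
                           static_cast<std::intptr_t>(errorCode) };
}

// Receiver side: unpack them again in the message handler.
StopReason reasonOf(const StoppedMessage& m)
{
    return static_cast<StopReason>(m.wParam);
}

std::uint32_t errorOf(const StoppedMessage& m)
{
    return static_cast<std::uint32_t>(m.lParam);
}
```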

Pausing a Thread and Thread Shutdown

A thread may have to stop and wait for some reason. Perhaps the user has clicked a "pause" check box or pushed a "stop" button. Perhaps the thread has nothing to do, and is waiting for some information, such as a request packet, to process. The problem is that you need to shut down all the threads before a process exits (note: in Windows CE, shutting down the main thread shuts down the process, and all threads owned by the process. This is not true in Win9x, Windows NT, or Windows 2000). A typical bug encountered is that you shut down your program, recompile it, and get an error that it is unable to write the executable file. Why? Because the program is still running. But you don't see it on the taskbar. So you bring up the Task Manager (via Ctrl-Alt-Del) and it isn't there, either. But something has got your executable tied up! The answer is that if you look in the NT Task Manager under the processes tab, or use pview to look at processes, you will indeed find that your program is still running. This is usually because you failed to shut down one or more worker threads. As long as any one thread exists, even if it is blocked, your process is still alive. Of course, you've killed the GUI thread, so the main window is gone and nothing is visible. But it is still lurking. Of course, you can use the NT Task Manager, or pview, to kill the process by terminating all of its threads, but this is a little bit like using dynamite to lift your car to change the tire. Sure, it lifts the car, but there are a few other things that go wrong which are considered undesirable side effects.

Three functions immediately present themselves for purposes of pausing or shutting down a thread: the SuspendThread and ResumeThread methods (and their underlying API calls, ::SuspendThread and ::ResumeThread) and ::TerminateThread. Assume, for all practical purposes, except in some very limited contexts, these functions do not exist. Using them will almost always get you in trouble.

The limited contexts in which these can be used are

Note that it is not a good idea to have a thread call TerminateThread on itself, because this will mean that it terminates instantly. The implication of this is that the DLL_THREAD_DETACH events for various DLLs will not be executed, which can lead to the misbehavior of a DLL you didn't even know you were using! (If you don't understand what this means, take it as meaning: bypassing DLL_THREAD_DETACH is a Very Bad Thing). Instead, if a thread wants to kill itself, it should call ExitThread, which guarantees the correct notification of all the DLLs. A thread that is started by _beginthread should call _endthread to terminate itself (which calls ExitThread), and a thread that is started by AfxBeginThread should call AfxEndThread to terminate itself (which again calls ExitThread).

Note that you should not substitute SuspendThread/ResumeThread for the proper use of synchronization objects such as Events and Mutexes.

To illustrate why it is a Bad Idea to let one thread suspend another, let's take a simple case: you have a worker thread off doing something, and the something involves memory allocation. You click the "Pause" button on your GUI, which immediately calls SuspendThread. What happens? The worker thread stops. Right now. Immediately. No matter what it is doing. If your worker thread happens to be in the storage allocator, you have just shut down your entire application. Well, not quite--it won't shut down until the next time you try to allocate memory from the GUI thread. But the MFC library is fairly liberal in using memory allocation, so there is an excellent chance that the next call to the MFC library from any thread will simply stop that thread dead in its tracks. If it is your GUI thread (the most likely one) your app appears to hang.

Why is this?

The storage allocator is designed to be thread-safe. This means that at most one thread at a time is permitted to be executing it. It is protected by a CRITICAL_SECTION, and each thread which attempts to enter the allocator blocks if there is already a thread in the allocator. So what happens if you do SuspendThread? The thread stops dead, in the middle of the allocator, with the critical section lock held. This lock will not be released until the thread resumes. Now, if it happens that your GUI requires an allocation as part of resuming the thread, an attempt to resume the thread will block, producing classic deadlock. And if you did a ::TerminateThread, then there is no way the lock will ever be released. And without SuspendThread, there is no need for ResumeThread.

Ah, you say, but I know I don't do any memory allocation either in the worker thread or the GUI. So I don't need to worry about this!

You're wrong.

Remember that the MFC library does allocations you don't see. And allocation is only one of many critical sections that exist inside the MFC library to make it thread-safe. And stopping in any of them will be fatal to your app.

Ignore the existence of these functions.

So how do you suspend or terminate a thread?

The problems of shutting down a thread and pausing a thread are closely related, and my solution is the same in both cases: I use a synchronization primitive to effect the pause by suspending the thread, and use timeouts on the primitive to allow me to poll for shutdown.

Sometimes the synchronization primitive is a simple event, such as in the example below where I wish to be able to pause the thread. In other cases, particularly where I'm using the thread to service a queue of events, I will use a synchronization primitive such as a semaphore. You can also read my essay on GUI Thread Techniques.

Typically, I use a worker thread in the background, and it simply polls some state (such as a Boolean) to determine what it should be doing. For example, to pause a thread, it looks at the Boolean that says "Paused", which is set when (for example) a checkbox is set:

// thread body:
while(running)
   { /* loop */
    if(paused)
       switch(::WaitForSingleObject(event, time))
          {
           case WAIT_OBJECT_0:
              break;
           case WAIT_TIMEOUT:
              continue;
          }
     // rest of thread
    } /* loop */

The trick of doing the continue for the timeout means that the thread will regularly poll for the running flag being clear, simplifying your shutdown of the application. I typically use 1000 for the time, polling once a second for shutdown. 

Why did I do the apparently redundant test of the paused Boolean variable before doing the ::WaitForSingleObject? Wouldn't waiting on the object be sufficient?

Yes, the paused Boolean is an optimization hack. Because I use ::WaitForSingleObject, each pass through this loop involves a moderately heavy-duty operation to test for continuance. In a high-performance loop this would introduce a completely unacceptable performance bottleneck. By using a simple Boolean I can avoid the heavy-duty kernel call when I don't need it. If performance is not an issue, you can eliminate this extra test.
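For comparison, the same pause-and-poll loop can be sketched in standard C++. This is a portable analog under assumptions, not the Win32 code: a condition variable with a timed wait replaces the Event and ::WaitForSingleObject, std::atomic replaces the plain Boolean flags, and PausableWorker is a hypothetical name. The 50 ms poll interval is arbitrary.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical portable analog of the pause/shutdown loop above.
class PausableWorker {
public:
    ~PausableWorker() { shutdown(); }

    void start() {
        shutdown();                 // reap any earlier thread first
        running = true;
        worker = std::thread([this] { loop(); });
    }

    void pause() { paused = true; }

    void resume() {
        {
            std::lock_guard<std::mutex> lock(m);
            paused = false;
        }
        cv.notify_all();            // analog of SetEvent
    }

    void shutdown() {
        running = false;            // noticed within one timeout, even if paused
        if (worker.joinable())
            worker.join();
    }

    long iterations() const { return count; }

private:
    void loop() {
        while (running) {
            if (paused) {           // cheap test first; no lock unless paused
                std::unique_lock<std::mutex> lock(m);
                // Wake on resume, or time out and go re-check `running`.
                if (!cv.wait_for(lock, pollInterval, [this] { return !paused; }))
                    continue;
            }
            ++count;                // "rest of thread"
            std::this_thread::yield();
        }
    }

    const std::chrono::milliseconds pollInterval{50};
    std::atomic<bool> running{false};
    std::atomic<bool> paused{false};
    std::atomic<long> count{0};
    std::mutex m;
    std::condition_variable cv;
    std::thread worker;
};
```

As in the original loop, the timeout is what lets a paused thread notice a shutdown request without anyone having to resume it first.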

The code in the main GUI thread that sets these variables is as shown below:

void CMyDialog::OnPause()
   {
    if(paused && c_Pause.GetCheck() != BST_CHECKED)
       { /* resume */
        paused = FALSE;
        SetEvent(event);
       } /* resume */
    else
    if(!paused && c_Pause.GetCheck() == BST_CHECKED)
       { /* pause */
        paused = TRUE;
        ResetEvent(event);
       } /* pause */
   }

where event is a handle from ::CreateEvent. Avoid CEvent--at least the last time I tried it, it was so unbelievably brain-dead buggy that it was unusable. I haven't checked the VC++ 6.0 implementation, so it may have been fixed, but the bottom line is that the MFC interface to synchronization primitives has been deeply suspect, and gains nothing over using the actual primitives.

There's another, slightly more complex, mechanism for doing a thread shutdown without polling, which I discuss in a later section.

Shutting down a thread from a view or main frame

There is sometimes a problem in shutting down a thread. If you don't do things in the right order, you could even shut down your GUI thread while the worker thread is still running, which can lead to all sorts of interesting problems. Interesting, as in the traditional Chinese curse. So here's a method I've used to shut down a thread and be sure it is shut down before the view is shut down.

First, you must store a pointer to the CWinThread object in your view, so declare

CWinThread * myWorkerThread;

in your view. When you create the worker thread, create it as

     myWorkerThread = AfxBeginThread(run, this);

You will need this variable to synchronize the shutdown with the view termination.

void CMyView::OnClose()
    {
     // ... determine if we want to shut down
     // ... for example, is the document modified?
     // ... if we don't want to shut down the view, 
     // ... just return
  
     // If we get here, we are closing the view
     // note that we had to have previously set m_bAutoDelete
     // in the thread object myWorkerThread to FALSE
     running = FALSE;
     WaitForSingleObject(myWorkerThread->m_hThread, INFINITE);
     delete myWorkerThread;
     CView::OnClose(); // or whatever the superclass is
    }

The only odd thing that appears in the previous function is the explicit setting of the m_bAutoDelete flag to FALSE. This is because the deletion of the CWinThread-derived object can close the handle of the thread, rendering the subsequent WaitForSingleObject invalid. By inhibiting the auto-deletion, we can wait on the thread handle. We then do the explicit deletion of the CWinThread-derived object ourselves, since it is now no longer useful.

Special thanks to Charles Doucette for pointing out a flaw in my original article which he found in another essay by Doug Harrison: there was a race condition; I had previously stored the handle and shut down the thread. But the auto-delete invalidated the handle which could lead to incorrect behavior.  

By keeping the CWinThread pointer in a member variable, we can then do a WaitForSingleObject on the thread's handle. The close operation then blocks until the thread terminates. Once the thread has terminated, we can proceed with the close by calling our superclass OnClose handler (in this example, we are a derived class of CView).

There is a caution here: this assumes that the thread will actually terminate "within a reasonable time". If you have a thread that is blocked on I/O or a synchronization object, you will need to add a timeout mechanism as I have already described. Note that this prohibits the use of CRITICAL_SECTION as a synchronization object since they don't have a timeout capability. If you're blocked on a CRITICAL_SECTION you're stuck forever.
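To illustrate the difference a timeout makes on a lock, here is a small sketch in portable C++11. It is an analogy, not Win32 code: std::timed_mutex plays the role of a lock that can be waited on with a timeout (on Windows that would be a mutex handle passed to ::WaitForSingleObject, rather than a CRITICAL_SECTION). The names lock_or_shutdown and blocked_worker_can_exit_demo are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Tries to take the lock, but gives up as soon as 'running' is cleared.
// Returns true if the lock was acquired (the caller must then unlock it),
// false if shutdown was requested first.
bool lock_or_shutdown(std::timed_mutex& m,
                      const std::atomic<bool>& running,
                      std::chrono::milliseconds poll) {
    while (running) {
        if (m.try_lock_for(poll))
            return true;
        // timed out: loop around and re-test the running flag
    }
    return false;
}

// The main thread holds the lock for the whole demo, so the worker can
// never acquire it. Returns true if the worker nevertheless terminated
// without the lock once running was cleared.
bool blocked_worker_can_exit_demo() {
    std::timed_mutex m;
    std::atomic<bool> running{true};
    bool acquired = true;

    m.lock();                                   // simulate a lock that stays busy
    std::thread worker([&] {
        acquired = lock_or_shutdown(m, running, std::chrono::milliseconds(10));
        if (acquired)
            m.unlock();
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(30));
    running = false;                            // request shutdown
    worker.join();                              // returns within about one poll interval
    m.unlock();
    return !acquired;
}
```

Had the worker used a plain blocking lock, the join would wait until the lock was released, which is exactly the "stuck forever" case described above.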

Of course, in the general case you may have several synchronization mechanisms that are necessary to ensure the thread will terminate within a reasonable period of time. A serious design flaw in the whole AfxBeginThread mechanism is that it doesn't allow me to create a CWinThread-derived subclass of my own which is the object created. In this case, I've sometimes subclassed CWinThread and bypassed the AfxBeginThread by doing my own thread creation inside my subclass, and exporting methods such as CMyWinThread::shutdown that do whatever is needed to make the thread shut down cleanly and quickly.

Thread Shutdown Without Polling (20-Jan-01)

The technique I use that polls every few seconds does have two implications: it makes the thread active every few seconds, and it sets a limit on responsiveness on a shutdown. The shutdown of the application becomes limited by the maximum time it takes to get out of the thread-wait operation. There is an alternative implementation I have also used, which involves using a second Event. 

HANDLE ShutdownEvent;

This should be initialized via CreateEvent. What I do when I'm using this technique is include it in a class derived from CWinThread, which makes the thread creation slightly trickier. This is because AfxBeginThread always creates a new CWinThread object, but if you need your own CWinThread-derived class, you can't use AfxBeginThread. The technique shown below generalizes this. Note that if I wanted to be really general, I would create a template class. I leave that as an Exercise For The Reader.

/***********************************************************************
*                                class CMyThread
***********************************************************************/
class CMyThread : public CWinThread {
     public:
        CMyThread(AFX_THREADPROC proc, LPVOID p);
        virtual ~CMyThread( );
        static CMyThread * BeginThread(AFX_THREADPROC proc, LPVOID p);
        DWORD Wait( );
        // Named ShutdownThread because a method called Shutdown would
        // collide with the Shutdown enumerator below
        void ShutdownThread( );
        enum { Error, Running, Shutdown, Timeout };
     protected: // data
        HANDLE ShutdownEvent;
        HANDLE PauseEvent;
};
/**********************************************************************
*                        CMyThread::CMyThread
* Inputs:
*        AFX_THREADPROC proc: Function to be called
*        LPVOID p: Parameter passed to proc
***********************************************************************/
CMyThread::CMyThread(AFX_THREADPROC proc, LPVOID p ) : CWinThread(proc, p)
   {
     m_bAutoDelete = FALSE;
     ShutdownEvent = ::CreateEvent(NULL,   // security
                                   TRUE,   // manual-reset
                                   FALSE,  // not signaled
                                   NULL);  // anonymous

     PauseEvent = ::CreateEvent(NULL,      // security
                                TRUE,      // manual-reset
                                TRUE,      // signaled
                                NULL);     // anonymous
   }

/**********************************************************************
*                         CMyThread::~CMyThread
**********************************************************************/
CMyThread::~CMyThread( )
   {
    ::CloseHandle(ShutdownEvent);
    ::CloseHandle(PauseEvent);
   }

/*********************************************************************
*                        CMyThread::BeginThread
* Inputs:
*        AFX_THREADPROC proc: Function to be called
*        LPVOID p: Parameter passed to proc
* Result: CMyThread *
*        Newly-created CMyThread object, or NULL if creation failed
*********************************************************************/
CMyThread * CMyThread::BeginThread(AFX_THREADPROC proc, LPVOID p)
   {
    CMyThread * thread = new CMyThread(proc, p);
    if(!thread->CreateThread( ))
        { /* failed */
         delete thread;
         return NULL;
        } /* failed */
    return thread;
   }
/*********************************************************************
*                         CMyThread::Wait
* Result: DWORD
*       CMyThread::Shutdown if shutting down
*       CMyThread::Running if not paused
* Notes:
*       The shutdown *must* be the 0th element, since the normal
*       return from an unpaused event will be the lowest value OTHER
*       than the shutdown index
*********************************************************************/
DWORD CMyThread::Wait( )
   {
    HANDLE objects[2];
    objects[0] = ShutdownEvent;
    objects[1] = PauseEvent;
    DWORD result = ::WaitForMultipleObjects(2, objects, FALSE, INFINITE);
    switch(result)
      { /* result */
       case WAIT_TIMEOUT:          // cannot occur while the wait is INFINITE
           return Timeout;
       case WAIT_OBJECT_0:
           return Shutdown;
       case WAIT_OBJECT_0 + 1:
           return Running;
       default:
           ASSERT(FALSE); // unknown error
           return Error;
      } /* result */
   }

/********************************************************************
*                        CMyThread::ShutdownThread
* Effect:
*        Sets the shutdown event, then waits for the thread to shut
*        down
********************************************************************/
void CMyThread::ShutdownThread( )
   {
    ::SetEvent(ShutdownEvent);
    ::WaitForSingleObject(m_hThread, INFINITE);
   }

Note that I don't make provision here for the full set of options for CreateThread, since the threads I create do not need flags, stack size, or security attributes; you would need to make the obvious extensions if you need these features.

To call it from an application, I do something like the following. In the declaration of the class in which the thread will run, such as a view class, I add declarations like

CMyThread * thread; // worker thread
static UINT MyComputation(LPVOID me);
void ComputeLikeCrazy( );

Then I add methods, such as this one that responds to a menu item or pushbutton in a view:

void CMyView::OnComputationRequest( )
   {
    thread = CMyThread::BeginThread(MyComputation, this);
   }

UINT CMyView::MyComputation(LPVOID me) // static method!
   {
    CMyView * self = (CMyView *)me;
    self->ComputeLikeCrazy( );
    return 0; // thread exit code
   }

The code below then shows how I implement a "pause" capability. Alternatively, the PauseEvent variable could represent a Semaphore on a queue or some other synchronization mechanism. Note, however, that it is more complex if you want to wait for a semaphore, a shutdown, or a pause. In this case, because you can only wait on "or" or "and" of the events, and not more complex relationships, you will probably need to nest two WaitForMultipleObjects, one for the semaphore-or-shutdown combination and one for the pause-or-shutdown combination. Although I don't show it below, you can additionally combine this technique with a timeout. Note that in the example below, the running flag is actually local, rather than being a class member variable, and is implicitly handled by the case decoding the ShutdownEvent.

void CMyView::ComputeLikeCrazy( )
   {
    BOOL running = TRUE;

    while(running)
      { /* loop */
       DWORD result = thread->Wait( );
       switch(result)
          { /* result */
           case CMyThread::Timeout:   // if you want a timeout...
              continue;
           case CMyThread::Shutdown:  // shutdown event
              running = FALSE;
              continue;
           case CMyThread::Running:   // unpaused
              break;
          } /* result */
       // ...
       // ... compute one step here
       // ...
      } /* loop */
   }

Note that I make provision for a timeout case, even though the current implementation does not provide for it (an Exercise For The Reader).
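On the Win32 side, the timeout case only needs a finite timeout passed to ::WaitForMultipleObjects in place of INFINITE, at which point WAIT_TIMEOUT can actually be returned. The sketch below shows the same "shutdown or unpaused, with a timeout" decision in portable C++11 so it can be run anywhere. ThreadGate and its members are hypothetical names for this illustration, and a single std::condition_variable with two flags stands in for the two event handles.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical portable analog of a wait on "shutdown event OR pause event".
class ThreadGate {
public:
    enum Result { Error, Running, Shutdown, Timeout };

    void pause() {
        std::lock_guard<std::mutex> lock(mutex_);
        paused_ = true;
    }
    void resume() {
        { std::lock_guard<std::mutex> lock(mutex_); paused_ = false; }
        cv_.notify_all();
    }
    void shutdown() {
        { std::lock_guard<std::mutex> lock(mutex_); shutdown_ = true; }
        cv_.notify_all();
    }

    // Blocks until shutdown is requested, the gate is unpaused, or the
    // timeout expires. Shutdown is tested first, mirroring the rule that
    // the shutdown event must be element 0 of the handle array.
    Result wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        bool ready = cv_.wait_for(lock, timeout,
                                  [this] { return shutdown_ || !paused_; });
        if (!ready)
            return Timeout;
        return shutdown_ ? Shutdown : Running;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool paused_ = false;    // like PauseEvent starting out signaled
    bool shutdown_ = false;  // like ShutdownEvent starting out not signaled
};
```

A worker loop would call wait with its polling interval and handle the three results the same way the switch in ComputeLikeCrazy does.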

Synchronization

Any time you have state shared between two threads it is essential that you provide synchronization on accesses. I discuss a fair amount of this in our book, Win32 Programming, and don't plan to replicate that discussion here. What is odd, however, is the fact that for variables such as paused I don't provide any synchronization. How can I get away with this?

The answer is that synchronization is not required provided that only one thread ever modifies the variable, at least in some restricted cases. In our examples, the main GUI thread modifies the variable paused, but all other threads, such as the worker thread, only read it. It is true that the worker thread might, by a single instruction, miss detecting it, but the idea here is that one additional loop of the worker thread won't matter anyway, because the user might have missed it by tens or hundreds of milliseconds.

It has been pointed out to me that even if only one thread modifies the variable (although several threads may use it), if it takes more than one instruction (or one memory cycle) to do it, synchronization is required. For example, if the value is a 64-bit value and two 32-bit instructions are used to store it (because you did not compile for a native Pentium instruction set), you could have the modifying thread preempted after it has stored the first 32 bits (whichever, high or low, the compiler has chosen to do first) but not the second 32 bits. This is, in fact, correct. If you are modifying a scalar value of more than 32 bits, and the generated code requires more than one instruction to store the value, you must still do synchronization between the modifier and users of the value to ensure that you have not been victimized by this anomaly. Note that if the compiler generates a 64-bit store, there might not be a problem. The Pentium bus is 64 bits wide, and synchronization is done at the hardware level. But if the value is not aligned so that a single memory cycle can store it (for example, a 32-bit value split across a 64-bit boundary), two memory cycles are required to complete the store, making it risky for a multiprocessor environment. Therefore, you should be careful about taking advantage of this feature. A Tip of the Flounder Fin to Chris Bond for pointing this out.
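One way to sidestep the torn-store problem for a single 64-bit value is to make every access atomic. The sketch below uses std::atomic from C++11, which postdates this essay; on Win32 the interlocked functions (for example InterlockedExchange64) serve a similar purpose. The function name count_torn_reads and the two bit patterns are hypothetical choices for the illustration. A writer alternates between all-zeros and all-ones while a reader checks that it never sees a half-written mixture.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Returns how many "torn" values the reader observed. With std::atomic
// every load sees a complete store, so the count should be zero.
int count_torn_reads(int iterations) {
    assert(iterations > 0);
    const std::uint64_t A = 0x0000000000000000ULL;
    const std::uint64_t B = 0xFFFFFFFFFFFFFFFFULL;
    std::atomic<std::uint64_t> shared{A};
    std::atomic<bool> done{false};

    // Single writer: the only thread that ever modifies 'shared'.
    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i)
            shared.store((i & 1) ? A : B);
        done = true;
    });

    // Reader: any value other than A or B would be half of one store
    // combined with half of another.
    int torn = 0;
    while (!done) {
        std::uint64_t v = shared.load();
        if (v != A && v != B)
            ++torn;
    }
    writer.join();
    return torn;
}
```

Replacing the atomic with a plain 64-bit variable would make this a data race, and on a 32-bit build the reader could then observe values such as 0x00000000FFFFFFFF.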

So what about that case where I set running to be TRUE before AfxBeginThread and set it FALSE just as the thread exited? That just violated my previous statement, didn't it? Well, yes. But note that in this case the synchronization still exists. The thread is terminating, and therefore any computation left to be done in the thread is about to complete. No other work will be done in the thread. The GUI thread will not start a new thread until the running flag is FALSE. Unless you've set the m_bAutoDelete member of the CWinThread to be FALSE, all cleanup including deleting the object will be handled automatically. So we can actually "get away with it".

If you want to be totally correct and precise, the only valid solution is to have yet another thread waiting on your first worker thread, and when the first worker thread completes, the second worker thread starts up, sets running to FALSE, and then terminates itself. This is a little clumsy and is essentially overkill, but is formally correct.
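The "second thread that waits on the first" arrangement can be sketched in portable C++11 as follows. This is an illustration under assumptions, not MFC code: Job, start, and finish are hypothetical names, std::thread::join stands in for waiting on the worker's thread handle, and the computation is a placeholder.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Job {
    std::atomic<bool> running{false};
    int result = 0;
    std::thread worker;
    std::thread watcher;

    void start() {
        assert(!running);
        running = true;                       // set by the launcher, before the worker exists
        worker = std::thread([this] {
            result = 6 * 7;                   // placeholder computation; never touches running
        });
        watcher = std::thread([this] {
            worker.join();                    // wait for the worker to terminate
            running = false;                  // cleared only after the worker is really gone
        });
    }

    // Waits for the watcher, and therefore for the worker as well.
    void finish() { watcher.join(); }
};
```

Because the flag is cleared only after the join, the launcher can never observe running == false while the worker is still executing.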

Setting m_bAutoDelete

This is not as easy as it sounds. Consider the various scenarios:

Scenario 1: Setting it in the launching thread (incorrect!)

CWinThread * thread = AfxBeginThread(proc, this);
thread->m_bAutoDelete = FALSE;  // can crash!

Why is this wrong? Well, because upon successful initiation of the thread, the thread that did the launch could be preempted. Then, the newly-launched thread could run. It might even finish within its timeslice. So it terminates, and the CWinThread object that represented it would be deleted. By the time you get around to setting the m_bAutoDelete variable, you are using an illegal pointer. You might crash, you might corrupt some other object, you will probably get an ASSERT from the storage allocator that you used a dangling pointer. And if you are unlucky, the damage might not show up for hours...or days!

Note this error is timing dependent and may not show up during testing, and may not be reproducible when you get reports of failures in the field.

Scenario 2: Setting it in the launched thread (incorrect!)

OK, so you realize Scenario 1 is wrong. And you know why! So instead, you set it in the worker thread itself. Note we are passing this in as the parameter, so we have something like the following. Here, you have decided to move the thread variable from being a local variable to being a class member variable:

UINT proc(LPVOID me)
   {
    CMyClass * self = (CMyClass *)me;
    self->thread->m_bAutoDelete = FALSE;
    return 0;
   }

What's wrong here? Well, look at the launching code. It is

     thread = AfxBeginThread(proc, this);

Now, suppose you have a multiprocessor, or suppose that the launching thread is preempted before it can return. This means you can be trying to execute the assignment

   self->thread->m_bAutoDelete = FALSE;

when the thread variable is still uninitialized! It might be NULL (if you did that in the constructor), or if you didn't, in the debug version, it is probably 0xCDCDCDCD, but in the release version, it could be anything, including a valid pointer left over from some previous allocation. So if you are lucky, you take the access fault immediately; if you are unlucky, you corrupt some random piece of memory. Note this error is timing dependent and may not show up during testing, and may not be reproducible when you get reports of failures in the field.

Scenario 3: The Only Right Way

   thread = AfxBeginThread(proc, this, 
                           THREAD_PRIORITY_NORMAL, // default: use it
                           0,     // default stack size 
                           CREATE_SUSPENDED);
   thread->m_bAutoDelete = FALSE;
   thread->ResumeThread();
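The same "create it stopped, configure it, then let it go" ordering can be demonstrated in portable C++11. Standard threads have no CREATE_SUSPENDED flag, so this sketch, under that assumption, holds the new thread at a start gate built from std::promise and std::future. GatedWorker, auto_delete, and observed are hypothetical names; auto_delete merely stands in for m_bAutoDelete.

```cpp
#include <cassert>
#include <future>
#include <thread>

class GatedWorker {
public:
    bool auto_delete = true;   // stand-in for a setting that must be made before the thread runs

    // Creates the thread "suspended": it blocks at the gate immediately.
    GatedWorker() : gate_(go_.get_future()) {
        thread_ = std::thread([this] {
            gate_.wait();                  // held here until resume() is called
            observed_ = auto_delete;       // sees whatever the launcher configured
        });
    }

    void resume() { go_.set_value(); }     // analog of ResumeThread
    void join()   { thread_.join(); }
    bool observed() const { return observed_; }

private:
    std::promise<void> go_;                // declared before gate_, which is built from it
    std::future<void>  gate_;
    bool observed_ = true;
    std::thread thread_;
};
```

The launcher sets auto_delete between construction and resume(), so the thread cannot run ahead of the configuration and the launcher cannot touch an object that has already gone away.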

Summary

Working with threads introduces some complications, but compared to the problems of dealing with PeekMessage, threading is a much better generalization of the notion of parallel computation. The amount of care that has to be exercised is startling at first, but after you've done a few multithreaded applications it becomes almost a reflex. Learn to use threads. You will be better off in the long run.


The views expressed in these essays are those of the author, and in no way represent, nor are they endorsed by, Microsoft.

Send mail to newcomer@flounder.com with questions or comments about this web site.
Copyright © 1999, 2000, 2001, The Joseph M. Newcomer Co. All Rights Reserved.
Last modified: May 14, 2011