[ wxwindows-Patches-1518719 ] FIX critical threadpsx.cpp race condition

SourceForge.net noreply at sourceforge.net
Fri Jul 7 06:03:36 PDT 2006


Patches item #1518719, was opened at 2006-07-07 15:03
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=309863&aid=1518719&group_id=9863

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Thread
Group: bug fix
Status: Open
Resolution: None
Priority: 5
Submitted By: Andreas Mohr (andim2)
Assigned to: Nobody/Anonymous (nobody)
Summary: FIX critical threadpsx.cpp race condition

Initial Comment:
Hello all,

I've got an app which has to exhibit extreme amounts of
stability, as such it has to run and terminate
successfully for many thousands of runs.
Unfortunately this isn't the case at all...
(one in 300 runs bombs)

My app is using threading very heavily (it creates
many, many threads on a whim), and this exposed a
*severe* problem in the main wxThread startup/shutdown
code:
in src/unix/threadpsx.cpp, the static variable
gs_allThreads isn't protected at all against concurrent
access.
This led to the following ASSERT in a debug build:

#0  0xb6fb8bb6 in raise () from /lib/tls/libpthread.so.0
#1  0xb72cc5d3 in wxTrap ()
    at
/home/amoh/wxWidgets-2.6.3/src/common/appbase.cpp:599
#2  0xb72ccbc7 in ShowAssertDialog (
    szFile=0xb735ae94
"/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp",
    nLine=103, szCond=0xb735ae83 "wxAssertFailure",
    szMsg=0xb735b88c "removing inexisting element in
wxArray::Remove",
    traits=0x81aa018) at
/home/amoh/wxWidgets-2.6.3/src/common/appbase.cpp:823
#3  0xb72cc32d in wxAppConsole::OnAssert (this=0x814fc90,
    file=0xb735ae94
"/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp",
    line=103, cond=0xb735ae83 "wxAssertFailure",
    msg=0xb735b88c "removing inexisting element in
wxArray::Remove")
    at
/home/amoh/wxWidgets-2.6.3/src/common/appbase.cpp:457
#4  0xb74e19ff in wxApp::OnAssert (this=0x814fc90,
    file=0xb735ae94
"/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp",
    line=103, cond=0xb735ae83 "wxAssertFailure",
    msg=0xb735b88c "removing inexisting element in
wxArray::Remove")
    at /home/amoh/wxWidgets-2.6.3/src/gtk/app.cpp:728
#5  0xb72cc68a in wxOnAssert (
    szFile=0xb735ae94
"/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp",
    nLine=103, szCond=0xb735ae83 "wxAssertFailure",
    szMsg=0xb735b88c "removing inexisting element in
wxArray::Remove")
    at
/home/amoh/wxWidgets-2.6.3/src/common/appbase.cpp:645
#6  0xb72cc606 in wxAssert (cond=0,
    szFile=0xb735ae94
"/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp",
    nLine=103, szCond=0xb735ae83 "wxAssertFailure",
    szMsg=0xb735b88c "removing inexisting element in
wxArray::Remove")
    at
/home/amoh/wxWidgets-2.6.3/src/common/appbase.cpp:612
#7  0xb73491e0 in wxArrayThread::Remove
(this=0xb738c8c0, lItem=0x9f020d8)
    at
/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp:103
#8  0xb7347a19 in ~wxThread (this=0x9f020d8)
    at
/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp:1548
#9  0xb71e5c70 in ~wxPostEvtThread (this=0x9f020d8) at
cop_wnd.cpp:553
#10 0xb73482ed in DeleteThread (This=0x9f020d8)
    at
/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp:1692
#11 0xb7347817 in wxThread::Exit (this=0x9f020d8,
status=0x0)
    at
/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp:1486
#12 0xb73460b1 in wxThreadInternal::PthreadStart
(thread=0x9f020d8)
    at
/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp:797
#13 0xb7345e7a in wxPthreadStart (ptr=0x9f020d8)
    at
/home/amoh/wxWidgets-2.6.3/src/unix/threadpsx.cpp:715
#14 0xb6fb2dec in start_thread () from
/lib/tls/libpthread.so.0
#15 0xb6f52a2a in clone () from /lib/tls/libc.so.6

I can only guess (i.e. it sounds very likely that this
was the problem) that this was produced by the thread
to be removed trying to locate its array slot while at
the same time another thread got created and registered
itself, thus causing an array realloc due to exceeding
allocated resources.
Problem is that src/common/dynarray.cpp/Realloc()
allocates new array storage, then copies over the data
to the new location, then *deletes* the old location
and *later* updates the array pointer to indicate the
new location. IOW, we have a tiny race window after
deletion and before pointer updating where another
thread wouldn't be able to re-identify its slot in case
the deleted old data (which is still being referenced
by not-yet-updated m_pItems!) has already been
overwritten with random data...

IOW, wxArray is absolutely not thread-safe.
Which is not a problem since most likely this is by design.
However the wxArray docs don't mention this (should I
add this?).

The patch I attached uses a pthreads mutex (and not a
full-blown bloaty wxMutex due to very time-critical
thread-launch hotpath) to properly protect
gs_allThreads against concurrent messup.
Using a lowlevel pthread mutex is not a problem since
we're in platform-dependent threadpsx.cpp here which
directly implements stuff via those pthread mutexes anyway.

Now that this is out of the way, I'm almost happy
again, but what about wxArray?
We could actually improve Realloc() thread-safety by
introducing a temporary pointer to hold the old array
location, *then* immediately updating m_pItems to the
new location and *then* destroying the old location
that's been remembered in the temporary pointer.
However IMHO this dosn't make much sense from a
philosophical point of view: wxArray isn't designed to
be thread-safe, period. So it better really, really
shouldn't pretend to be, to show people with race
conditions immediately where they are.

Patch against current CVS.

Comments?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=309863&aid=1518719&group_id=9863




More information about the wx-dev mailing list