  1. Qt
  2. QTBUG-17014

Qt event delivery is unreliable on Windows platforms

Details

    • Type: Bug
    • Resolution: Done
    • Priority: P1: Critical
    • Fix Version: 4.7.4
    • Affects Version: 4.7.2
    • Component: Core: Event loop
    • Labels: None
    • Environment: Tested on Windows Vista.
    • Commit: 220efa578b7d032257c7fa95aaf1295330fd474e

    Description

      The program is simple: it uses thread-to-thread messaging to update a counter displayed in the window. Run it with the cursor entirely outside the window boundaries. The test is equally simple: the counter should never stop, so if it does, something is broken. If the counter restarts when you move the cursor back into the app's window area, that confirms it: a message was stuck because of a failed wakeUp(), and the mouse message got the pump moving again.
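The shape of such a test can be sketched in portable C++. This is not the attached threadtest.cpp; it is a minimal stand-in that assumes a worker thread posting increments and a receiving thread that plays the role of the GUI thread. The names `runCounterTest`, `CounterResult`, and `stallLimit` are hypothetical, and a `std::condition_variable` stands in for the dispatcher's wakeUp().

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

struct CounterResult {
    int counter;   // last value delivered to the receiving thread
    bool stalled;  // true if no update arrived within the stall limit
};

// Hypothetical stand-in for the test: a worker posts `updates` increments,
// and the calling thread consumes them as the GUI thread would.
CounterResult runCounterTest(int updates, std::chrono::milliseconds stallLimit)
{
    std::mutex m;
    std::condition_variable wake;  // plays the role of wakeUp()
    std::deque<int> posted;        // plays the role of the posted-event queue

    std::thread worker([&] {
        for (int i = 1; i <= updates; ++i) {
            {
                std::lock_guard<std::mutex> lock(m);
                posted.push_back(i);
            }
            wake.notify_one();     // wake the receiver after every post
        }
    });

    CounterResult result{0, false};
    while (result.counter < updates) {
        std::unique_lock<std::mutex> lock(m);
        // A timeout means the counter stopped although more updates were due.
        if (!wake.wait_for(lock, stallLimit, [&] { return !posted.empty(); })) {
            result.stalled = true;
            break;
        }
        while (!posted.empty()) {  // drain everything that has been posted
            result.counter = posted.front();
            posted.pop_front();
        }
    }
    worker.join();
    return result;
}
```

Calling `runCounterTest(1000, std::chrono::milliseconds(5000))` should report the full count and no stall when the wake-up path is sound. In the real test the stall is observed visually, as a counter that stops until a mouse message arrives.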

      PROBLEM HISTORY:

      1. Qt 4.5 - Used a simple SetEvent mechanism in QEventDispatcherWin32::wakeUp(). We have no record of seeing these freeze problems before the subsequent Qt update.

      2. Qt 4.6 - wakeUp() switched to using a Windows message to awaken threads, along with an atomic switch used to suppress redundant messages. Our users started observing the original "AppShare Freeze" problem in our HPVR client. Luckily, we were able to isolate the problem in a small sample program, which was submitted with bug report #12721.

      3. A patch for 4.6.2 was generated in response to 12721 (I received it from Jervey Kong, via Tony, Sep/2010) which appeared to fix the sample program. We deployed the patched libraries and had fairly good success. We've had only one verified instance of the AppShare freeze since that patch. Our production application is still running with this patch.

      4. During development of a new version of our MyRoom product (with the patched version of 4.6.2), we started to experience the freeze in a completely different area: the video processing module, where three threads are directly involved, was missing notifications. We started calling it the "Video Freeze". At this point it started to look more like a fundamental problem in the thread-to-thread messaging design/implementation within the Qt libraries. The code can be changed around to get varying levels of success, but ultimately the possibility still exists for a wakeUp() to fail, leaving a message stuck in the queue; eventually, one will get stuck. Unfortunately, we were unable to construct a sample program that demonstrated this new problem, but we could reproduce it regularly in our application.

      5. We received a report that 4.7.2 had a new fix that applied to "failed event delivery", and since this sounded like the right area, we gave it a try by testing against the 4.7.2 dev branch. In our testing it seemed to alleviate the Video Freeze problem, so we requested a patch to 4.6.2 to use in our production release, thinking that the root cause may finally have been addressed.

      6. That brings us to the present. While building 4.6.2 with the latest patch, I also ran the old sample programs (from the original AppShare problem) and discovered that they now fail under the new patch. This prompted me to run the same test on 4.7.2, where it also fails.

      Our conclusion is that the message-based wakeUp() mechanism, introduced in 4.6, introduces a possibility of event delivery failure - a failure that is quite difficult to mitigate (both of these patches, the size of the Windows event loop, and the complexity of the GetMessage hooking mechanism make this quite clear). The defect is also next to impossible to pin down in a reproducible way, since seemingly unrelated changes can either mask the problem or produce another instance of it.

      So, the question remains, what was the motivation for this change and will the Qt lab reconsider returning to the SetEvent mechanism that was used in 4.5? From our perspective, SetEvent is a very reliable, and very simple mechanism, so it's a mystery to us why the decision was made to move away from it.
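To make the contrast between the two wake-up schemes in the history concrete, here is a small single-threaded model. It is not Qt code and does not claim to reproduce the actual defect. `EventWake`, `MessageWake`, `dropMessage`, and `kWakeMessage` are hypothetical names, and the "drop" step is an assumed failure in which the wake message leaves the queue without the suppression flag being cleared. The point is only that the message-based scheme keeps two pieces of state (the flag and the queue) that can fall out of step, while the event-style scheme keeps one.

```cpp
#include <atomic>
#include <cassert>
#include <deque>

constexpr int kWakeMessage = 1;  // hypothetical id for the wake-up message

// Models an auto-reset event: signalling is idempotent and the
// "wake pending" state lives in a single place.
struct EventWake {
    bool signaled = false;
    void wakeUp() { signaled = true; }
    bool consume() { bool was = signaled; signaled = false; return was; }
};

// Models "post a message unless one is already pending".
struct MessageWake {
    std::atomic<bool> pending{false};  // suppresses redundant messages
    std::deque<int> queue;             // stand-in for the thread's message queue

    void wakeUp() {
        // Post only if no wake-up message is believed to be pending.
        if (!pending.exchange(true))
            queue.push_back(kWakeMessage);
    }
    // Normal path: the dispatcher takes the message and clears the flag.
    bool consume() {
        if (queue.empty()) return false;
        queue.pop_front();
        pending.store(false);
        return true;
    }
    // Assumed failure path: the message is removed, the flag stays set.
    void dropMessage() {
        if (!queue.empty()) queue.pop_front();
    }
};

// Returns true if a later wakeUp() is suppressed after a dropped message.
bool staysStuckAfterDrop() {
    MessageWake w;
    w.wakeUp();
    w.dropMessage();
    w.wakeUp();           // suppressed: flag still says "pending"
    return !w.consume();  // nothing left in the queue to wake the thread
}
```

In this model, once the flag and the queue disagree, every later `wakeUp()` is suppressed until something unrelated gets the loop moving, which matches the symptom of the counter restarting on a mouse message.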

      Attachments

        1. threadtest.zip
          2 kB
        2. threadtest.cpp
          3 kB
        3. qeventdispatcher_win.cpp
          38 kB
          People

            bhughes Bradley T. Hughes (closed Nokia Identity) (Inactive)
            jokeman Gary Hynes