Web Worker: All existing and newly created workers become unresponsive to messages after both an XHR object and a function are initialized through a closure inside a worker

Issue #9243268 • Assigned to Steve B.

Details

Author
Rotem D.
Created
Oct 7, 2016
Privacy
This issue is public.
Found in
  • Microsoft Edge
  • Internet Explorer
Found in build #
14.14936
Reports
Reported by 6 people


Steps to reproduce

This reduced test case took a full day of work to create (it was extremely difficult to isolate the particular causes of this).

I believe this issue should have a major impact on many web applications and libraries making use of web workers, and it renders some of my own projects mostly unusable in both IE11 and Edge (although in Edge the occurrence rate seems to be somewhat lower, especially when the console window is not visible). I have reproduced it in both IE10 and Edge in a clean VM install of Win10 Pro (latest fast ring build - 14936, and also the current stable one - 14393).

Based on testing done over a period of months, I believe it started around the time the Anniversary Update was introduced. It never happened before around August 2016: I have tried exact snapshots of previous builds of my library, whose test suite was successfully run hundreds of times in both IE and Edge without this problem ever occurring. Also, this problem does not occur on IE10 (in the Microsoft-provided Win7 VM).

test.js:

///
/// This test runs as a back and forth series of messages from the main thread to a set of workers
///
/// For the test to succeed, 100 workers should be created (and destroyed) and be able to receive and 
/// return a single message each
///

if (typeof window === "object") {
    ///
    /// Main thread code, this is only used to test the issue:
    ///
    let workerNum = 1;

    function startNewWorker() {
        var myWorker = new Worker("test.js");
        console.log("Worker "+ workerNum + " created"); 

        myWorker.addEventListener("message", function (event) {
            console.log("Worker sent message: '" + event.data + "'");
            
            myWorker.terminate();

            if (workerNum <= 100) {
                setTimeout(function () {
                    startNewWorker();
                }, 100);
            } else {
                console.log("100 workers created, test passed!");
            }
        });
        
        myWorker.postMessage("Message to worker " + workerNum);
        workerNum++;
    }

    startNewWorker();
} else {
    /// 
    /// Worker code, causes the issue itself to occur:
    ///
    
    var obj; // This object must be positioned outside the scope of the event handler

    self.addEventListener("message", function(event) {
        console.log("Worker received message: '" + event.data + "'");
        
        var oReq; // The XHR request object must be accessed through a closure

        function test() {
            
            // Commenting out the following line causes the problem not to occur. Try it!:
            obj = function() {};                    // If a number or string is assigned instead, the problem doesn't occur
            
            oReq = new XMLHttpRequest();
            // Commenting out the following line causes the problem not to occur. Try it!:
            oReq.open("GET", "http://microsoft.com/");  // Note: this XHR request is never actually sent!
                                                        // However, calling 'open' seems essential to reproduce the problem.
            
            postMessage("OK"); // This is not essential for reproduction, only for the execution of the test itself
        }

        test();
    });
}

index.html:

Attachments

Comments and activity

  • Correction: In the statement “I have reproduced it in both IE10 and Edge” I meant “I have reproduced it in both IE11 and Edge”. It occurs most frequently in IE11, but never on IE10. That was an unintentional typo.

  • Some instructions on how to run the test:

    1. Download ‘test.js’ and ‘index.html’ and put them in a path accessible through a web server like IIS.

    2. Open ‘index.html’ with the browser console visible.

    3. The test should produce console messages of the form:

    Worker x created
    Worker received message: 'Message to worker x'
    Worker sent message: 'OK'
    

    If this set of messages doesn’t repeat 100 times in the console, it means the test has failed.

    If the last series of messages was cut like:

    Worker 5 created
    

    It means that the 5th worker was created, was sent a message by the main thread, but did not actually receive it in practice.

    In IE11 the test usually stops this way at worker 2 or 3 (or even at the very first worker), and sometimes more intermittently. In Edge the number of successful rounds can reach 100 at times, but sometimes only a few tens.

    This test case doesn’t allow testing the problem when the console window is closed, but my previous personal testing shows it happens just as frequently in this scenario in IE11, though more rarely in Edge.
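
The pass/fail rule described above can be checked mechanically. The following is only a rough sketch, assuming the console output has been copied into an array of strings; `summarizeRounds` is a hypothetical helper and is not part of test.js.

```javascript
// Hypothetical helper: summarizes a console transcript produced by test.js.
// It only recognizes the three message shapes listed in the instructions.
function summarizeRounds(lines) {
    var lastCreated = 0; // number from the most recent "Worker x created" line
    var received = 0;    // count of "Worker received message: ..." lines
    var replied = 0;     // count of "Worker sent message: ..." lines

    lines.forEach(function (line) {
        var text = line.trim();
        var match = /^Worker (\d+) created$/.exec(text);
        if (match) {
            lastCreated = Number(match[1]);
        } else if (text.indexOf("Worker received message:") === 0) {
            received++;
        } else if (text.indexOf("Worker sent message:") === 0) {
            replied++;
        }
    });

    return {
        completedRounds: replied,
        // A worker that was created but never logged receipt is the stuck one.
        stalledWorker: lastCreated > received ? lastCreated : null,
        passed: replied >= 100
    };
}

var summary = summarizeRounds([
    "Worker 1 created",
    "Worker received message: 'Message to worker 1'",
    "Worker sent message: 'OK'",
    "Worker 2 created"
]);
console.log(summary); // completedRounds: 1, stalledWorker: 2, passed: false
```

Here the transcript is cut after "Worker 2 created", so worker 2 is reported as the one that never received its message.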

  • Microsoft Edge Team

    Changed Assigned To to “Ibrahim O.”

    Changed Status to “Fixed”

  • We appreciate your feedback and your effort on the repro sample. It appears this issue is already fixed, and the fix is available in the latest insider preview, which will be available in a public build soon (I have tested on 14942). If you see the issue persist in future builds, please feel free to reactivate this item; we will be happy to assist you.

    Lastly, we are no longer accepting IE bugs unless they are security related.

    All the best,
    The MS Edge Team

  • Thanks for fixing this on Edge.

    I understand the IE11 “security update only” policy but I’m not sure if it applies to regressions introduced due to the security updates themselves. I may be mistaken, but if it is really the case that the bug only appeared in IE11 as late as August 2016, then there might be the possibility it was unintentionally introduced due to a security update? (there is no way for me to know… but I’ve never encountered that behavior before that time period).

    It would be somewhat of a tragedy if a bug that was apparently introduced long after IE11 was announced as mostly unmaintained were then to be permanently engraved into the browser.

  • I understand your concern, and we do still support IE11. If you have a premier support contract, you can visit https://premier.microsoft.com, open a support incident, and work with an engineer to address this issue. If you are not a premier customer, I would suggest you go to support.microsoft.com and file this issue as a regression; this will channel the issue to the right team.

    Best regards,
    The MS Edge Team

  • Dear MS Edge team,

    I’ve installed the original Win10 Pro x64 RTM (build 10240) in a VM and disabled all updates. The issue does not occur there in either Edge or IE11. This reinforces the suspicion that the problem was most likely introduced by a security update, at least for IE11, though also likely for Edge.

    I believe members of the Edge QA team are the best candidates to identify the exact update that triggered the problem, since you have easy access to many previous builds and the source code itself. I don’t have a premier support contract and I don’t see the need to issue further inquiries to Microsoft.

    I worked very hard to build a test case that should probably be included, in some form, in the Edge and IE automated test suite. I feel I have done my part. If someone else (maybe even within Microsoft) cares enough to try to convince Microsoft to take responsibility for the consequences of their updates to legacy products, please do.

  • We appreciate the update. As you mentioned, the issue did not reproduce on earlier builds, so it is possible we introduced a new bug; that regression will most likely be fixed or is already being investigated. Could you please keep us posted with your results on 14393.321 regarding this issue? It should be working and should not reproduce for either IE or Edge on that update, which is the latest stable Windows update.

    Please let us know your test results, we will pursue from there.

    All the best,
    The MS Edge Team

  • I’m currently using 14393.321 and the issue occurs there in both Edge and IE11. I recorded screen captures for both:
    https://youtu.be/ExKy-MPJnes
    https://youtu.be/PcPrtFpUT2g

    I’m currently updating the fast ring preview VM from 14936 to 14946 and will test it as well when it completes.

  • Could you please also test the issue on 14393.321 after changing the src path for the JS file to an absolute path instead of a relative one, and see if that affects your result? In your case I assume it will be <script src="http://10.0.0.100/edgebugtest/test.js"></script>

    Best regards,
    The MS Edge Team

  • I’ve tried using the absolute URL in both the script tag and the path to the worker itself (individually and together), and also cleared the cache in both browsers. It didn’t seem to make a difference in 14393.321.

    I also tested in 14946: so far I couldn’t reproduce it in Edge, but in IE11 it occurred as frequently as it did in 14393.321 and 14936. I’ve tested absolute URLs in 14946 as well and it didn’t seem to make any difference.
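
For anyone repeating the absolute-URL variant, the worker path can be resolved from the page address rather than typed by hand. This is only a sketch: the page location below is an assumption derived from the script URL quoted earlier in the thread, `absoluteWorkerUrl` is a hypothetical helper, and the `URL` constructor is, as far as I know, not available in IE11, where the absolute string would have to be written out manually.

```javascript
// Assumption: index.html is served from the same folder as test.js.
var pageUrl = "http://10.0.0.100/edgebugtest/index.html";

// Resolve the worker script relative to the page, yielding an absolute URL.
function absoluteWorkerUrl(relativePath, base) {
    return new URL(relativePath, base).href;
}

var workerUrl = absoluteWorkerUrl("test.js", pageUrl);
console.log(workerUrl); // http://10.0.0.100/edgebugtest/test.js

// In the browser, the main-thread code would then use:
//   var myWorker = new Worker(workerUrl);
```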

  • I also see that the issue reproduces on a local or remote server, but behaves sporadically when running the file directly on 14393.321 (it mostly doesn’t reproduce, which is why I was having trouble recreating the issue on Edge). Having said that, I see that the issue no longer reproduces on the insider preview, which I tested on 14947, whether on localhost, on a remote server, or running the file directly. I will also reactivate the bug and escalate the issue for further review in IE11, but we can’t guarantee that it will be fixed. Thanks again for your assistance in the investigation.

    All the best,
    The MS Edge Team

  • Microsoft Edge Team

    Changed Status from “Fixed”

  • Thanks for trying to help.

    Here are screen recordings for 10240 (the original July 2015 Win10 RTM version), where it doesn’t seem to reproduce in either Edge or IE11:

    https://youtu.be/XAAjGlDHEbY
    https://youtu.be/Rjl1GOu0Kc0

    I’m not really in a position to test this on arbitrary historical Win10 builds but my first instinct would be to try it in the anniversary update and compare it to the one just before it. Perhaps if I find some way to do that in the future I will, when time allows.

    If the browsers weren’t tied to the OS it seems like it would have been possible to run some sort of an automated regression analysis tool like a ‘git bisect’ (or equivalent):

    https://git-scm.com/docs/git-bisect

    Though I guess that would probably take a very long time and be very resource-intensive.

  • Microsoft Edge Team

    Changed Assigned To to “Rico M.”

  • I’ve reproduced the issue in 14393.0 in both Edge and IE11. Based on the information in Wikipedia, this build was published on July 18, 2016 to the fast ring and on July 20, 2016 to the slow ring. Screen recordings:

    https://youtu.be/vutmxElG0lk
    https://youtu.be/6oUMHuX5ClI

    Since it reproduced relatively infrequently in Edge I also tried reducing the delay between worker creation from 100ms to 10ms and the frequency of occurrence of the issue seemed to increase:
    https://youtu.be/20dcQA0YGKQ

    (I tried the lower delay version in 14946 and it still didn’t reproduce)
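
To switch between the 100ms and 10ms variants without editing the loop each time, the main-thread driver can take the delay as a parameter. This is a minimal sketch and not the code used for the recordings: `runRounds` and its option names are hypothetical, and it assumes it runs as a separate main-thread script while test.js is loaded only as the worker.

```javascript
// Sketch only: runRounds, createWorker, schedule and log are hypothetical
// names, not part of the original test.js.
function runRounds(options) {
    var total = options.total;        // number of workers to create (100 in the report)
    var delayMs = options.delayMs;    // pause between workers (100 or 10 in the report)
    var createWorker = options.createWorker;
    var schedule = options.schedule;  // setTimeout in the browser
    var log = options.log;
    var workerNum = 1;

    function startNewWorker() {
        var current = workerNum++;
        var worker = createWorker();
        log("Worker " + current + " created");

        worker.addEventListener("message", function (event) {
            log("Worker sent message: '" + event.data + "'");
            worker.terminate();

            if (current < total) {
                schedule(startNewWorker, delayMs);
            } else {
                log(total + " workers created, test passed!");
            }
        });

        worker.postMessage("Message to worker " + current);
    }

    startNewWorker();
}

// Browser usage (assumes test.js is served next to the page):
if (typeof window === "object" && typeof Worker === "function") {
    runRounds({
        total: 100,
        delayMs: 10,
        createWorker: function () { return new Worker("test.js"); },
        schedule: function (fn, ms) { setTimeout(fn, ms); },
        log: function (msg) { console.log(msg); }
    });
}
```

Passing the worker factory and the scheduler in as options also makes it possible to exercise the loop with stand-ins outside a browser.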

    I’ll try to find a build of the November update (10586.XXX) and test it there as well.

  • The issue doesn’t seem to occur on 10586.420 (June 14, 2016 stable build):

    https://youtu.be/FgCBhs841u4
    https://youtu.be/dP-gN6Fssj8

    I doubt (though I can’t be 100% certain) it would reproduce in 10586.494 because I believe I’ve personally used it. Perhaps it’s still worth checking though.

    The next logical step would probably be to try some of the insider builds for the anniversary update. According to Wikipedia, build 14295.1005 (released to the slow ring on April 22, 2016) included some security updates to IE11. However, I simply don’t have access to it or to other such old builds.

  • Microsoft Edge Team

    Changed Assigned To from “Rico M.” to “Josh P.”

    Changed Status to “Confirmed”

  • https://developer.microsoft.com/en-us/microsoft-edge/platform/issues/9545866/ describes a similar issue, and provides an even smaller test case.

  • The complexity of the test case here is because:

    1. It was refined to demonstrate the problem occurs not only for existing workers, but for new ones as well!
    2. The issue occurs more intermittently on Edge than on IE11, so more trials were needed to reliably reproduce it.
    3. There’s a 100ms delay between each worker creation iteration to demonstrate this doesn’t just happen due to a ‘burst’ of some sort.
    4. The code highlights, and allows testing of, a particularly enigmatic aspect of the problem: removing a blank function assignment within the scope of the XMLHttpRequest object initialization causes the issue not to happen for some reason.

    I wasn’t aware of the `

  • Microsoft Edge Team

    Changed Assigned To from “Josh P.” to “Venkat K.”

    Changed Status from “Confirmed”

    Changed Assigned To from “Venkat K.” to “Steve B.”

  • My bug report is being closed as a duplicate of this one. Yet I cannot see the correlations between F12 Developer Tools and this bug.

    Would anyone care to explain?

  • I believe Edge may use either a slightly different JS engine, say an interpreter instead of a JIT (perhaps this is no longer true, I have no idea), or disable some JIT optimizations when the developer tools are open. This is just speculation, though. I’m not a Microsoft person.

  • Microsoft Edge Team

    Changed Title from “All existing and newly created workers become unresponsive to messages after both an XHR object and a function are initialized through a closure inside a worker” to “Web Worker: All existing and newly created workers become unresponsive to messages after both an XHR object and a function are initialized through a closure inside a worker”

  • This test stops just after a couple of messages in both the latest stable IE (14.393.000) and Edge 38.14393.1066.0

  • The problem still persists for me, running

    • Edge v40.15063.0.0
    • EdgeHtml v15.15063
    • Windows 10 Enterprise version 10.0.15063 Build 15063

    This bug unfortunately breaks docfx’s search for edge users. We use docfx to build our documentation, and (being an ms internal product) most of our users use edge.
    If this isn’t fixed (which it doesn’t look like it will be), we will need to either completely rewrite the search using a less performant method, or tell our users to use a different browser.

  • +1 to Carson T. – We’re also an internal Microsoft team using DocFx and find this issue (which has started appearing once we’ve upgraded to Creator’s Update) extremely frustrating. (I personally installed Chrome just so that I could search our internal DocFx site.)

  • +1 We have the same bug in our application.

    I can only reproduce this with the test_blobs.js, which additionally sends large typed arrays to the main program. This is exactly what we do in our application (loading binary geometry files, decoding them, sending them to the main program, and rendering them with WebGL).
    Instead of 100 workers/messages, several tests resulted in 70, 37, 1, 1, 24 workers/messages.

    Same versions as Carson T. has:
    EdgeHtml 15.15063 / Windows 10 Enterprise version 10.0.15063 Build 15063.

    In an older version, the test runs fine (EdgeHtml 13.10586).

    Since working as a web developer, I spent a large amount of time fixing IE and Edge quirks :-(
    When can we expect this to be fixed?

  • Note that in test_blobs.js, I added the oReq.send( ), and post the message after the response has been received. For me, this was necessary to reproduce the issue.
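
Since test_blobs.js itself is not shown in this thread, the following is only a guess at the shape of the worker side it describes: the request is actually sent, and a large typed array is posted back once the response arrives. The names `handleMessage` and `deps`, the resource name, and the array size are all assumptions made for illustration.

```javascript
// Sketch only: test_blobs.js is not shown in this thread, so every name and
// size below is an assumption.
var obj; // kept outside the handler, as in test.js

function handleMessage(event, deps) {
    console.log("Worker received message: '" + event.data + "'");

    var oReq; // accessed through the closure below, as in test.js

    function test() {
        obj = function () {};

        oReq = new deps.XMLHttpRequest();
        oReq.open("GET", deps.url);
        oReq.responseType = "arraybuffer"; // set after open()
        oReq.addEventListener("load", function () {
            // Stand-in for decoded binary data: a large typed array.
            var payload = new Float32Array(deps.elementCount);
            deps.postMessage(payload);
        });
        oReq.send(); // unlike test.js, the request is actually sent
    }

    test();
}

// Worker wiring (only runs inside a worker):
if (typeof self === "object" && typeof importScripts === "function") {
    self.addEventListener("message", function (event) {
        handleMessage(event, {
            XMLHttpRequest: XMLHttpRequest,
            postMessage: function (data) { self.postMessage(data); },
            url: "geometry.bin",       // hypothetical resource
            elementCount: 1024 * 1024  // hypothetical size
        });
    });
}
```

The browser objects are passed in through `deps` so the handler can be run against stand-ins; in a real worker the wiring at the bottom supplies the actual `XMLHttpRequest` and `postMessage`.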

  • We’ve been waiting 9 months for a resolution in IE 11. It’s necessary for us to inform customers not to use IE until there is a resolution. We had believed it would be resolved this fall, but the latest word is that the fix was pulled from the schedule.
