Implement infinite loop detection of child process #98

make-github-pseudonymous-again · 2014-08-26T22:16:41Z

It is not very clear as if there is a way to limit the execution time of a test run. It would be useful for example if the tested code contains an infinite loop.

kof · 2014-08-28T21:31:54Z

You mean per test or per file?

kof · 2014-08-28T21:43:58Z

There is QUnit.config.testTimeout but I didn't expose this setting to be configurable via runner api. Currently its accessible within tests.

What is your use case?

make-github-pseudonymous-again · 2014-08-29T18:48:22Z

I would like to try the code mutation methods provided by https://github.com/shybyte/grunt-mutation-testing on some JavaScript project. I would then test the mutated code with node-qunit. Mutated code could produce non terminating runs, but in this case a non terminating run would mean that the mutated code is buggy and thus that the test was successful since mutated code is not supposed to pass all the tests.

kof · 2014-08-29T18:51:35Z

I am not sure I understand you. If its an async code and callback never gets called, qunit will trigger a timeout error anyways.

kof · 2014-08-29T18:52:36Z

Maybe just give me an example of your code and test you would have ... and a concrete problem with the current version.

make-github-pseudonymous-again · 2014-08-29T19:09:04Z

Concrete problem is that I don't know how to set a timeout on this testrun.js script (see code below).

If its an async code and callback never gets called, qunit will trigger a timeout error anyways.

How much time will qunit wait?

lib.js

exports.version = "1.0.0"

infinite-loop.js

test("infinite-loop", function(){ for(;;); });

runtest.js

var path = require('path');
var testrunner = require('qunit');

testrunner.options.coverage = true;
testrunner.options.log.tests = false;
testrunner.options.log.testing = false;
testrunner.options.log.assertions = false;
testrunner.options.log.summary = false;

var n = 1;
var failed = 0;
var done = 0;

var cb = function(err, report) {
    if (err) {
        console.log('error :  ' + err.message);
        process.exit(-1);
    }

    failed += report.failed;
    ++done;
    if (done === n) process.exit(failed);
};

testrunner.run(
    {
        code : {
            path : path.normalize("lib.js"),
            namespace: "lib"
        },
        tests :  path.normalize("infinite-loop.js")
    }, cb
);

kof · 2014-08-29T19:14:00Z

Concrete problem is that I don't know how to set a timeout on this testrun.js script (see code below).
For me this is not the problem, this is what you want to be able to do, I am still trying to understand your motivation.

http://api.qunitjs.com/QUnit.config/

Just found out that the default number is undefined, so it will not, I was wrong. And we have currently no api to pass it. Do you want to submit a pull request for this?

kof · 2014-08-29T19:14:18Z

Basically we can allow setting any of this options.

kof · 2014-08-29T19:15:04Z

But also you can set it as I said within the test file itself, thats why I am trying to understand what you do.

kof · 2014-08-29T19:15:29Z

You have a direct access to QUnit within your test file via QUnit namespace.

make-github-pseudonymous-again · 2014-08-29T19:17:15Z

You mean putting QUnit.config.testTimeout = 12345 at the beginning of the test file? What if there are multiple test files?

kof · 2014-08-29T19:19:13Z

yes, every test file is runnning in own spawned process, so you need to do that in every test file then.

make-github-pseudonymous-again · 2014-08-29T19:26:17Z

Well, putting QUnit.config.testTimeout = 1 doesn't seem to work so...

kof · 2014-08-29T19:29:52Z

you also used an asyncTest or stopped the test?

make-github-pseudonymous-again · 2014-08-29T19:34:16Z

QUnit.config.testTimeout = 1;
asyncTest("infinite-loop", function(){ for(;;); });

kof · 2014-08-29T19:36:44Z

lol dude, you blocked completely that process. In this case there needs to be a per process timeout in testrunner or blocked loop detector.

make-github-pseudonymous-again · 2014-08-29T19:39:57Z

yes :)

kof · 2014-08-29T19:41:16Z

so, lots of things can be done, I personally never needed them, but you seem to. I still don't really understand your usecase and which of all possible fixes you require.

make-github-pseudonymous-again · 2014-08-29T19:48:57Z

So, of course I'm not writing obviously failing tests containing infinite loops. What I want to do is

test the original lib code L with test code T
create mutant code L' from lib code L and test with test code T
repeat 2. for each mutant

The point is that the mutated lib code can make the test code run into an infinite loop, and if it does so it means that the test succeeded since the mutated code is supposed to fail at least one of the tests.
So it would be great if infinite loop failure could be handled seamlessly.

kof · 2014-08-29T19:50:58Z

infinite loop would block all tests in that file .... basically all tests will fail ....

kof · 2014-08-29T19:51:26Z

we can add this infinite loop detection but I am not sure it will help you

make-github-pseudonymous-again · 2014-08-29T19:53:41Z

If they all fail then that's no problem because I don't want to know where it failed. The only information I want is when a mutant passes all the tests.

kof · 2014-08-29T19:55:27Z

If you are willing to implement the fix we can talk abut details. Are you?

make-github-pseudonymous-again · 2014-08-29T19:56:54Z

I could trigger a pull request when I have time, and then we discuss? Or we discuss before?

kof · 2014-08-29T19:57:43Z

We need to agree on basic technics.

How would you do that detection?

make-github-pseudonymous-again · 2014-08-29T20:00:43Z

Mmh like this ?

kof · 2014-08-29T20:04:02Z

So you are talking about a global timeout for the whole file, which you do in testrunner, not in the child right?

The downside of this is you need to know how long file could run, this can variate between different needs, so the default would be very high, like 5 minutes or so ?

The other approach could be a ping tiime setting defined in testrunner and passed to the child, child sends a ping every X ms and if it doesn't happen it means process is blocked ... so we could have a way faster time period for the block detection. f..e. 20sec

make-github-pseudonymous-again · 2014-08-29T20:07:53Z

So you are talking about a global timeout for the whole file, which you do in testrunner, not in the child right?

For my use case there is no difference between global or local timeout setup. So global is fine.

The downside of this is you need to know how long file could run, this can variate between different needs, so the default would be very high, like 5 minutes or so ?

Default behaviour could be no timeout. No?

The other approach could be a ping tiime setting defined in testrunner and passed to the child, child sends a ping every X ms and if it doesn't happen it means process is blocked ... so we could have a way faster time period for the block detection. f..e. 20sec

Does that mean the user would have to make a ping call every <20 seconds? I'm not sure I completely understand.

kof · 2014-08-29T20:10:18Z

Yeah, no timeout by default would be also an option.

Does that mean the user would have to make a ping call every <20 seconds? I'm not sure I completely understand.

No child process would send a ping message to the parrent every X seconds automatically. So that parent knows the child is not blocked.

kof · 2014-08-29T20:12:01Z

I believe having this kind of detector by default is a valuable feature in some edge cases. Will probably save some people a bit of pain.

make-github-pseudonymous-again · 2014-08-29T21:41:39Z

ah, understood

make-github-pseudonymous-again · 2014-08-29T21:45:47Z

Yep everything crystal clear

kof · 2014-08-29T21:46:44Z

great, don'f forget to write tests for this feature ;)

make-github-pseudonymous-again · 2014-08-29T21:52:18Z

OK

make-github-pseudonymous-again · 2014-09-13T22:00:01Z

Old tests seem to pass with current draft implementation. What kind of scenario should I consider in the new tests for this feature?

kof · 2014-09-14T00:41:30Z

Ideally a test with code which has an infinite loop and test runner exists with an error f.e. https://github.com/kof/node-qunit/blob/master/test/testrunner.js#L146

make-github-pseudonymous-again · 2014-09-14T09:12:59Z

Each child takes some time to start up. This can cause problems with really short maxBlockDuration values (e.g. 500ms on my computer).
Should I add a child 'start' message handler that will start the interval in the parent?
Is the error where the parent is unable to spawn a new child already handled?

kof · 2014-09-14T10:13:49Z

we can just have a higher value for maxBlockDuration, f.e. 5000ms ... how much time does it take on your machine, 100ms?

make-github-pseudonymous-again · 2014-09-14T10:17:21Z

I get a ~510ms delay. Implementing this start event would ensure it works with all (> 0) maxBlockDuration values.

kof · 2014-09-14T10:19:34Z

wow pretty slow, but ok, lets do that with start event.

make-github-pseudonymous-again · 2014-09-14T12:25:16Z

There seems to be a huge delay between the moment the child stops executing the interval code that sends the ping and the moment the parent receives the 'done' event. I see there is a debounced callback for QUnit.done but normally this should not interrupt the interval, right?

kof · 2014-09-14T12:30:24Z

not sure what you mean, you don't need to wait for done event in master if ping is not received at time, you just kill the child and pass an error along.

make-github-pseudonymous-again · 2014-09-14T12:39:04Z

What I mean is that when using a very small maxBlockDuration (i.e. 1000ms) I get ping errors on the first test, for example

Testing  /home/genius/dev/node/node-qunit/test/fixtures/testrunner-code.js ... 
start 1410696389688
1410696390691 1410696389688 1003 1000

assert.js:92
  throw new assert.AssertionError({
        ^
AssertionError: no errors
    at /home/genius/dev/node/node-qunit/test/testrunner.js:49:11
    at /home/genius/dev/node/node-qunit/lib/testrunner.js:186:24
    at null.<anonymous> (/home/genius/dev/node/node-qunit/lib/testrunner.js:84:21)
    at wrapper (timers.js:252:14)
    at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

The start 1410696389688 line is emitted only once the child interval has been setup. Here the child interval use a delay value of .5 * maxBlockDuration so it should definitely send 'ping' messages every ~500ms.

So it looks like the child interval stops some time before the 'done' event is sent. Or there is a lot of background work going on in QUnit at the end of the test, and this prevents the callback of the interval from being executed.

kof · 2014-09-14T12:42:55Z

I need to see what you do. Maybe you start sending ping before requiring all dependencies, which is sync and blocks the process. I don't think anything in qunit with just some simple dummy test would block for that period of time.

make-github-pseudonymous-again · 2014-09-14T12:49:15Z

child.js

// start sending ping
setInterval(function() {
    console.log('send ping');
    process.send({
        event: 'ping'
    });
}, Math.floor(.5 * options.maxBlockDuration));

// send start event
process.send({
    event: 'start'
});

// require deps
options.deps.forEach(function(dep) {
    _require(dep, true);
});

// require code
_require(options.code, true);

// require tests
options.tests.forEach(function(test) {
    _require(test, false);
});

testrunner.js

if (msg.event === 'start') {
    lastPing = Date.now();
    console.log('start', lastPing);

    pingInterval = setInterval(function(){
        var now;
        now = Date.now();
        if (now - lastPing > opts.maxBlockDuration) {
            console.log(now, lastPing, now - lastPing, opts.maxBlockDuration);
            callback(new Error('ping error'));
            kill();
        }
    }, opts.maxBlockDuration);
}
...
} else if (msg.event === 'ping') {
    console.log(Date.now() - lastPing);
    lastPing = Date.now();
    console.log('recv ping');
}

make-github-pseudonymous-again · 2014-09-14T12:51:09Z

and there is a clearInterval(pingInterval) in kill()

kof · 2014-09-14T12:59:50Z

try to find which calls in child takes this time. f.e. you are sending start event before requiring dependencies which blocks the process, but actually shouldn't take that much time.

make-github-pseudonymous-again · 2014-09-14T13:10:14Z

Adding console.log(...) calls at appropriate places I get

Testing  /home/genius/dev/node/node-qunit/test/fixtures/testrunner-code.js ... 
deps loaded
start 1410700108743
{ path: '/home/genius/dev/node/node-qunit/test/fixtures/testrunner-code.js' } true
code loaded
{ path: '/home/genius/dev/node/node-qunit/test/fixtures/testrunner-tests.js' } false
no more tests
1410700109746 1410700108743 1003 1000

assert.js:92
  throw new assert.AssertionError({
        ^
AssertionError: no errors
    at /home/genius/dev/node/node-qunit/test/testrunner.js:49:11
    at /home/genius/dev/node/node-qunit/lib/testrunner.js:185:24
    at null.<anonymous> (/home/genius/dev/node/node-qunit/lib/testrunner.js:84:21)
    at wrapper (timers.js:252:14)
    at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

which doesn't make it look like something takes time to load..

make-github-pseudonymous-again · 2014-09-17T20:19:10Z

could you explain me why you need to debounce the QUnit.done callback?

kof · 2014-09-17T20:25:36Z

I believe there are cases where the callback has been called more than once.

make-github-pseudonymous-again · 2014-09-17T21:18:11Z

There seems to be some kind of inherent limit on the minimum maxBlockDuration I can use with my implementation. You could take a look at it. It breaks somewhere between 500ms and 600ms on my computer. I will soon push all my recent changes on my fork.

…equire call in child.js, set ping period to half the check period (qunitjs#98)

make-github-pseudonymous-again · 2014-09-17T21:28:53Z

Voilà -> https://github.com/aureooms/node-qunit. If it's ok then I'll clean the console.log calls. I suppose I will also have to add a reference to maxBlockDuration inside the doc and the cli.

make-github-pseudonymous-again · 2014-10-02T19:37:38Z

up?

kof · 2014-10-02T22:23:37Z

ok did it, everything considered, works also with very small maxBlockDuration. Starting child process takes 400ms on my machine, 113ms just the child, the rest javascript libs.

Checkout the new version, its already published.

kof added the Enhancement label Aug 31, 2014

kof changed the title ~~Set a timeout on a test run~~ Aug 31, 2014

make-github-pseudonymous-again added a commit to aureooms-contrib/node-qunit that referenced this issue Sep 13, 2014

draft implementation of maxBlockDuration feature (qunitjs#98)

b961c4c

make-github-pseudonymous-again added a commit to aureooms-contrib/node-qunit that referenced this issue Sep 14, 2014

test draft for maxBlockDuration feature (qunitjs#98)

77da5fc

make-github-pseudonymous-again added a commit to aureooms-contrib/node-qunit that referenced this issue Sep 17, 2014

add start event in testrunner.js and force to send ping after each _r…

8caafce

…equire call in child.js, set ping period to half the check period (qunitjs#98)

kof closed this as completed in 9d0f841 Oct 2, 2014

Implement infinite loop detection of child process #98

Implement infinite loop detection of child process #98

Comments

make-github-pseudonymous-again commented Aug 26, 2014

kof commented Aug 28, 2014

kof commented Aug 28, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

kof commented Aug 29, 2014

kof commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

kof commented Aug 29, 2014

make-github-pseudonymous-again commented Aug 29, 2014

make-github-pseudonymous-again commented Sep 13, 2014

kof commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 14, 2014

kof commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 14, 2014

kof commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 14, 2014

kof commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 14, 2014

kof commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 14, 2014

kof commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 14, 2014

make-github-pseudonymous-again commented Sep 17, 2014

kof commented Sep 17, 2014

make-github-pseudonymous-again commented Sep 17, 2014

make-github-pseudonymous-again commented Sep 17, 2014

make-github-pseudonymous-again commented Oct 2, 2014

kof commented Oct 2, 2014