Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement infinite loop detection of child process #98

Closed
make-github-pseudonymous-again opened this issue Aug 26, 2014 · 77 comments
Closed

Comments

@make-github-pseudonymous-again

It is not very clear as if there is a way to limit the execution time of a test run. It would be useful for example if the tested code contains an infinite loop.

@kof
Copy link
Contributor

kof commented Aug 28, 2014

You mean per test or per file?

@kof
Copy link
Contributor

kof commented Aug 28, 2014

There is QUnit.config.testTimeout but I didn't expose this setting to be configurable via runner api. Currently its accessible within tests.

What is your use case?

@make-github-pseudonymous-again
Copy link
Author

I would like to try the code mutation methods provided by https://github.com/shybyte/grunt-mutation-testing on some JavaScript project. I would then test the mutated code with node-qunit. Mutated code could produce non terminating runs, but in this case a non terminating run would mean that the mutated code is buggy and thus that the test was successful since mutated code is not supposed to pass all the tests.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

I am not sure I understand you. If its an async code and callback never gets called, qunit will trigger a timeout error anyways.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

Maybe just give me an example of your code and test you would have ... and a concrete problem with the current version.

@make-github-pseudonymous-again
Copy link
Author

Concrete problem is that I don't know how to set a timeout on this testrun.js script (see code below).

If its an async code and callback never gets called, qunit will trigger a timeout error anyways.

How much time will qunit wait?

lib.js

exports.version = "1.0.0"

infinite-loop.js

test("infinite-loop", function(){ for(;;); });

runtest.js

var path = require('path');
var testrunner = require('qunit');

testrunner.options.coverage = true;
testrunner.options.log.tests = false;
testrunner.options.log.testing = false;
testrunner.options.log.assertions = false;
testrunner.options.log.summary = false;

var n = 1;
var failed = 0;
var done = 0;

var cb = function(err, report) {
    if (err) {
        console.log('error :  ' + err.message);
        process.exit(-1);
    }

    failed += report.failed;
    ++done;
    if (done === n) process.exit(failed);
};

testrunner.run(
    {
        code : {
            path : path.normalize("lib.js"),
            namespace: "lib"
        },
        tests :  path.normalize("infinite-loop.js")
    }, cb
);
@kof
Copy link
Contributor

kof commented Aug 29, 2014

Concrete problem is that I don't know how to set a timeout on this testrun.js script (see code below).
For me this is not the problem, this is what you want to be able to do, I am still trying to understand your motivation.

http://api.qunitjs.com/QUnit.config/

Just found out that the default number is undefined, so it will not, I was wrong. And we have currently no api to pass it. Do you want to submit a pull request for this?

@kof
Copy link
Contributor

kof commented Aug 29, 2014

Basically we can allow setting any of this options.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

But also you can set it as I said within the test file itself, thats why I am trying to understand what you do.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

You have a direct access to QUnit within your test file via QUnit namespace.

@make-github-pseudonymous-again
Copy link
Author

You mean putting QUnit.config.testTimeout = 12345 at the beginning of the test file? What if there are multiple test files?

@kof
Copy link
Contributor

kof commented Aug 29, 2014

yes, every test file is runnning in own spawned process, so you need to do that in every test file then.

@make-github-pseudonymous-again
Copy link
Author

Well, putting QUnit.config.testTimeout = 1 doesn't seem to work so...

@kof
Copy link
Contributor

kof commented Aug 29, 2014

you also used an asyncTest or stopped the test?

@make-github-pseudonymous-again
Copy link
Author

QUnit.config.testTimeout = 1;
asyncTest("infinite-loop", function(){ for(;;); });
@kof
Copy link
Contributor

kof commented Aug 29, 2014

lol dude, you blocked completely that process. In this case there needs to be a per process timeout in testrunner or blocked loop detector.

@make-github-pseudonymous-again
Copy link
Author

yes :)

@kof
Copy link
Contributor

kof commented Aug 29, 2014

so, lots of things can be done, I personally never needed them, but you seem to. I still don't really understand your usecase and which of all possible fixes you require.

@make-github-pseudonymous-again
Copy link
Author

So, of course I'm not writing obviously failing tests containing infinite loops. What I want to do is

  1. test the original lib code L with test code T
  2. create mutant code L' from lib code L and test with test code T
  3. repeat 2. for each mutant

The point is that the mutated lib code can make the test code run into an infinite loop, and if it does so it means that the test succeeded since the mutated code is supposed to fail at least one of the tests.
So it would be great if infinite loop failure could be handled seamlessly.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

infinite loop would block all tests in that file .... basically all tests will fail ....

@kof
Copy link
Contributor

kof commented Aug 29, 2014

we can add this infinite loop detection but I am not sure it will help you

@make-github-pseudonymous-again
Copy link
Author

If they all fail then that's no problem because I don't want to know where it failed. The only information I want is when a mutant passes all the tests.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

If you are willing to implement the fix we can talk abut details. Are you?

@make-github-pseudonymous-again
Copy link
Author

I could trigger a pull request when I have time, and then we discuss? Or we discuss before?

@kof
Copy link
Contributor

kof commented Aug 29, 2014

We need to agree on basic technics.

How would you do that detection?

@make-github-pseudonymous-again
Copy link
Author

Mmh like this ?

@kof
Copy link
Contributor

kof commented Aug 29, 2014

So you are talking about a global timeout for the whole file, which you do in testrunner, not in the child right?

The downside of this is you need to know how long file could run, this can variate between different needs, so the default would be very high, like 5 minutes or so ?

The other approach could be a ping tiime setting defined in testrunner and passed to the child, child sends a ping every X ms and if it doesn't happen it means process is blocked ... so we could have a way faster time period for the block detection. f..e. 20sec

@make-github-pseudonymous-again
Copy link
Author

So you are talking about a global timeout for the whole file, which you do in testrunner, not in the child right?

For my use case there is no difference between global or local timeout setup. So global is fine.

The downside of this is you need to know how long file could run, this can variate between different needs, so the default would be very high, like 5 minutes or so ?

Default behaviour could be no timeout. No?

The other approach could be a ping tiime setting defined in testrunner and passed to the child, child sends a ping every X ms and if it doesn't happen it means process is blocked ... so we could have a way faster time period for the block detection. f..e. 20sec

Does that mean the user would have to make a ping call every <20 seconds? I'm not sure I completely understand.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

Yeah, no timeout by default would be also an option.

Does that mean the user would have to make a ping call every <20 seconds? I'm not sure I completely understand.

No child process would send a ping message to the parrent every X seconds automatically. So that parent knows the child is not blocked.

@kof
Copy link
Contributor

kof commented Aug 29, 2014

I believe having this kind of detector by default is a valuable feature in some edge cases. Will probably save some people a bit of pain.

@make-github-pseudonymous-again
Copy link
Author

ah, understood

@make-github-pseudonymous-again
Copy link
Author

Yep everything crystal clear

@kof
Copy link
Contributor

kof commented Aug 29, 2014

great, don'f forget to write tests for this feature ;)

@make-github-pseudonymous-again
Copy link
Author

OK

@kof kof changed the title Set a timeout on a test run Aug 31, 2014
@make-github-pseudonymous-again
Copy link
Author

Old tests seem to pass with current draft implementation. What kind of scenario should I consider in the new tests for this feature?

@kof
Copy link
Contributor

kof commented Sep 14, 2014

Ideally a test with code which has an infinite loop and test runner exists with an error f.e. https://github.com/kof/node-qunit/blob/master/test/testrunner.js#L146

@make-github-pseudonymous-again
Copy link
Author

Each child takes some time to start up. This can cause problems with really short maxBlockDuration values (e.g. 500ms on my computer).
Should I add a child 'start' message handler that will start the interval in the parent?
Is the error where the parent is unable to spawn a new child already handled?

make-github-pseudonymous-again added a commit to aureooms-contrib/node-qunit that referenced this issue Sep 14, 2014
@kof
Copy link
Contributor

kof commented Sep 14, 2014

we can just have a higher value for maxBlockDuration, f.e. 5000ms ... how much time does it take on your machine, 100ms?

@make-github-pseudonymous-again
Copy link
Author

I get a ~510ms delay. Implementing this start event would ensure it works with all (> 0) maxBlockDuration values.

@kof
Copy link
Contributor

kof commented Sep 14, 2014

wow pretty slow, but ok, lets do that with start event.

@make-github-pseudonymous-again
Copy link
Author

There seems to be a huge delay between the moment the child stops executing the interval code that sends the ping and the moment the parent receives the 'done' event. I see there is a debounced callback for QUnit.done but normally this should not interrupt the interval, right?

@kof
Copy link
Contributor

kof commented Sep 14, 2014

not sure what you mean, you don't need to wait for done event in master if ping is not received at time, you just kill the child and pass an error along.

@make-github-pseudonymous-again
Copy link
Author

What I mean is that when using a very small maxBlockDuration (i.e. 1000ms) I get ping errors on the first test, for example

Testing  /home/genius/dev/node/node-qunit/test/fixtures/testrunner-code.js ... 
start 1410696389688
1410696390691 1410696389688 1003 1000

assert.js:92
  throw new assert.AssertionError({
        ^
AssertionError: no errors
    at /home/genius/dev/node/node-qunit/test/testrunner.js:49:11
    at /home/genius/dev/node/node-qunit/lib/testrunner.js:186:24
    at null.<anonymous> (/home/genius/dev/node/node-qunit/lib/testrunner.js:84:21)
    at wrapper (timers.js:252:14)
    at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

The start 1410696389688 line is emitted only once the child interval has been setup. Here the child interval use a delay value of .5 * maxBlockDuration so it should definitely send 'ping' messages every ~500ms.

So it looks like the child interval stops some time before the 'done' event is sent. Or there is a lot of background work going on in QUnit at the end of the test, and this prevents the callback of the interval from being executed.

@kof
Copy link
Contributor

kof commented Sep 14, 2014

I need to see what you do. Maybe you start sending ping before requiring all dependencies, which is sync and blocks the process. I don't think anything in qunit with just some simple dummy test would block for that period of time.

@make-github-pseudonymous-again
Copy link
Author

child.js

// start sending ping
setInterval(function() {
    console.log('send ping');
    process.send({
        event: 'ping'
    });
}, Math.floor(.5 * options.maxBlockDuration));

// send start event
process.send({
    event: 'start'
});

// require deps
options.deps.forEach(function(dep) {
    _require(dep, true);
});

// require code
_require(options.code, true);

// require tests
options.tests.forEach(function(test) {
    _require(test, false);
});

testrunner.js

if (msg.event === 'start') {
    lastPing = Date.now();
    console.log('start', lastPing);

    pingInterval = setInterval(function(){
        var now;
        now = Date.now();
        if (now - lastPing > opts.maxBlockDuration) {
            console.log(now, lastPing, now - lastPing, opts.maxBlockDuration);
            callback(new Error('ping error'));
            kill();
        }
    }, opts.maxBlockDuration);
}
...
} else if (msg.event === 'ping') {
    console.log(Date.now() - lastPing);
    lastPing = Date.now();
    console.log('recv ping');
}
@make-github-pseudonymous-again
Copy link
Author

and there is a clearInterval(pingInterval) in kill()

@kof
Copy link
Contributor

kof commented Sep 14, 2014

try to find which calls in child takes this time. f.e. you are sending start event before requiring dependencies which blocks the process, but actually shouldn't take that much time.

@make-github-pseudonymous-again
Copy link
Author

Adding console.log(...) calls at appropriate places I get

Testing  /home/genius/dev/node/node-qunit/test/fixtures/testrunner-code.js ... 
deps loaded
start 1410700108743
{ path: '/home/genius/dev/node/node-qunit/test/fixtures/testrunner-code.js' } true
code loaded
{ path: '/home/genius/dev/node/node-qunit/test/fixtures/testrunner-tests.js' } false
no more tests
1410700109746 1410700108743 1003 1000

assert.js:92
  throw new assert.AssertionError({
        ^
AssertionError: no errors
    at /home/genius/dev/node/node-qunit/test/testrunner.js:49:11
    at /home/genius/dev/node/node-qunit/lib/testrunner.js:185:24
    at null.<anonymous> (/home/genius/dev/node/node-qunit/lib/testrunner.js:84:21)
    at wrapper (timers.js:252:14)
    at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

which doesn't make it look like something takes time to load..

@make-github-pseudonymous-again
Copy link
Author

could you explain me why you need to debounce the QUnit.done callback?

@kof
Copy link
Contributor

kof commented Sep 17, 2014

I believe there are cases where the callback has been called more than once.

@make-github-pseudonymous-again
Copy link
Author

There seems to be some kind of inherent limit on the minimum maxBlockDuration I can use with my implementation. You could take a look at it. It breaks somewhere between 500ms and 600ms on my computer. I will soon push all my recent changes on my fork.

make-github-pseudonymous-again added a commit to aureooms-contrib/node-qunit that referenced this issue Sep 17, 2014
…equire call in child.js, set ping period to half the check period (qunitjs#98)
@make-github-pseudonymous-again
Copy link
Author

Voilà -> https://github.com/aureooms/node-qunit. If it's ok then I'll clean the console.log calls. I suppose I will also have to add a reference to maxBlockDuration inside the doc and the cli.

@make-github-pseudonymous-again
Copy link
Author

up?

@kof kof closed this as completed in 9d0f841 Oct 2, 2014
@kof
Copy link
Contributor

kof commented Oct 2, 2014

ok did it, everything considered, works also with very small maxBlockDuration. Starting child process takes 400ms on my machine, 113ms just the child, the rest javascript libs.

Checkout the new version, its already published.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
2 participants