Test Runner add support for retries #48754

Filipoliko · 2023-07-13T06:45:16Z

What is the problem this feature will solve?

If the test runner is used for some integration, or end-2-end testing, there tends to be some flakiness because of the more complex setup relying on the stability of not just the tests, but also the environment that tests are running against.

What is the feature you are proposing to solve the problem?

Implement support for retries configuration. If a test fails and there is a configuration retries: 1, it will re-run the test one more time and if the second run succeeds, the test is considered as successful. Also all surrounding beforeEach and afterEach scripts should be also executed again as they can contain some necessary setups and cleanups. By default retries should be disabled as it is not something most users would like to have configured by default.

Reporters should also consider this possibility and could report on flaky/retried tests.

Inspiration can be taken from playwright documentation on retries.

CLI
Usage: node --test --test-retries=1 ./tests/test.js

TestRunner.run

import { run } from 'node:test';
import path from 'node:path';

run({
  files: [path.resolve('./tests/test.js')],
  retries: 1
})

The same configuration can be also useful in describe and test/it options object.

What alternatives have you considered?

I considered extending Node Test Runner with my own implementation of retries just for my use-case. But there does not seem to be an elegant solution how to accomplish this right now as the Test Runner does not seem to be very well extensible in this way.

The solution that would work, but wouldn't be very nice, is implementing wrapper function for the tests, that would handle retrying. But this would require wrapping all the test functions to be able to accomplish some global behaviour. Also Test Reporters would not display nice information about flakiness.

The text was updated successfully, but these errors were encountered:

benjamingr · 2023-07-14T08:18:57Z

I wonder if it's not enough to put it on the describe context:

describe("something", (t) => {
  t.setRetries(4);
 // the rest
});

Which would enable per suite/test configuration and programmatic access from within the test file.

MoLow · 2023-07-16T06:49:00Z

I wonder if it's not enough to put it on the describe context:
describe("something", (t) => {
  t.setRetries(4);
 // the rest
});
Which would enable per suite/test configuration and programmatic access from within the test file.

I agree, and also personally prefer to just use a userland retry implementation, since:

I consider that a bad practice for testing
it is trivial to implement

const retry = (fn, retries = 5) => {
	try {
		await fn();
	} catch {
		if (retries > 0) {
			return rety(fn, retries - 1);
		}
	}
}
it("something", retry(() => {
 // the rest
}));

benjamingr · 2023-07-16T07:03:35Z

I agree, and also personally prefer to just use a userland retry implementation, since:

That implementation seems problematic to me because:

The failures are not reported so the user isn't aware of flaky tests
The failures can't be easily turned off globally or for a whole suite to detect flaky tessts.

As for:

I consider that a bad practice for testing

I mostly agree but I think this is a common enough use case that other test runners support and it's useful to run tests with retries as long as you go back and fix the flakiness.

If you look at our own code for example - we rerun flaky tests and fix them after all the time when we "resume ci". We just do it in a very manual/slow way.

cjihrig · 2023-07-16T14:36:09Z

I proposed this as a potential feature when originally writing the test runner, but received push back. I think this is one of those things that in theory most people agree you shouldn't do. The tests should be fixed instead. In practice though, as @benjamingr points out, it happens all the time. I'm not opposed to adding this feature.

Filipoliko · 2023-09-29T10:55:31Z

Regarding the user implementation of retries.. it seems trivial at first, but there are some problems you can run into, especially when implementing some long time running tests (i.e. with some browser setup and navigation). Some of them are mentioned by @benjamingr and I ran into some more.

Test timeout is shared for all retries (so if timeout for the test is 10 sec and the first run takes up that time, the second try is never executed)
Test fail error causes from previous retries are lost (which is very important to have to debug some flakiness)
beforeEach/afterEach scripts are not executed for each retry (this can make some retries doomed to fail before they even started)

This all can be still somehow implemented on the user side, but it is getting more and more complex and it is no longer the straightforward simple implementation mentioned by @MoLow

ErickWendel · 2023-10-04T08:05:13Z

I think this might be useful as a tool to check if a test suite is flaky.

Brainstorm - put a retry as 1000 times and output how many times it has failed - this could be amazing for debugging but I'm not sure about the effort for it

MoLow · 2023-10-04T08:21:33Z

I think this might be useful as a tool to check if a test suite is flaky.

that is a valid use case, but my guess is retries would be used as a footgun to just ignore/introduce flakiness not to debug it.
if it was totally impossible, I would agree adding it to core makes sense, but I still believe running this on user land is possible and not to complicated.

I am happy with this anty-pattern not being easy to use out of the box as long as there are decent alternatives in userland

benjamingr · 2023-10-04T11:59:58Z

@MoLow how can we say that while doing it ourselves in our own tests?

What about resume-ci like functionality where the test runner can take a list of failed tests in some format (tap or junit or whatever) and re-run just those failed tests?

MoLow · 2023-10-05T09:23:28Z

@benjamingr we don't use reruns in core in our tests. I'm just saying its a footgun that has many implementations in userland (some of we use)

benjamingr · 2023-10-05T10:25:33Z

@MoLow no, but we resume-ci a ton which is basically the same thing but more manual

When a PR fails for flakiness as most do we just rerun. We don't stop everything and look at the flakes they are gathered in the ci-reliability repo and dealt with asynchronously usually.

domenic · 2024-01-05T13:46:57Z

This is one of the few missing features required to move jsdom from Mocha to Node's built-in test runner.

benjamingr · 2024-01-05T15:18:11Z

@MoLow WDYT?

MoLow · 2024-01-06T21:28:17Z

I still think this is an anty-pattern and would personally use a userland implementation for retrying failures, but if one would implement a PR I won't block

github-actions · 2024-07-05T01:27:36Z

There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the never-stale Mark issue so that it is never considered stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment.
For more information on how the project manages feature requests, please consult the feature request management document.

benjamingr · 2024-07-05T18:59:23Z

not stale, worked on

Filipoliko added the feature request Issues that request new features to be added to Node.js. label Jul 13, 2023

debadree25 added the test_runner Issues and PRs related to the test runner subsystem. label Jul 13, 2023

mingjunlu mentioned this issue Feb 15, 2024

chore: migrate several tests to node:test withastro/astro#10133

Merged

RedYetiDev linked a pull request Apr 21, 2024 that will close this issue

test_runner: allow retryable tests #52626

Open

github-actions bot added the stale label Jul 5, 2024

benjamingr removed the stale label Jul 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test Runner add support for retries #48754

Test Runner add support for retries #48754

Filipoliko commented Jul 13, 2023 •

edited

Loading

benjamingr commented Jul 14, 2023

MoLow commented Jul 16, 2023 •

edited

Loading

benjamingr commented Jul 16, 2023

cjihrig commented Jul 16, 2023

Filipoliko commented Sep 29, 2023

ErickWendel commented Oct 4, 2023

MoLow commented Oct 4, 2023

benjamingr commented Oct 4, 2023

MoLow commented Oct 5, 2023 •

edited

Loading

benjamingr commented Oct 5, 2023 •

edited

Loading

domenic commented Jan 5, 2024

benjamingr commented Jan 5, 2024

MoLow commented Jan 6, 2024 •

edited

Loading

github-actions bot commented Jul 5, 2024

benjamingr commented Jul 5, 2024

Test Runner add support for retries #48754

Test Runner add support for retries #48754

Comments

Filipoliko commented Jul 13, 2023 • edited Loading

What is the problem this feature will solve?

What is the feature you are proposing to solve the problem?

What alternatives have you considered?

benjamingr commented Jul 14, 2023

MoLow commented Jul 16, 2023 • edited Loading

benjamingr commented Jul 16, 2023

cjihrig commented Jul 16, 2023

Filipoliko commented Sep 29, 2023

ErickWendel commented Oct 4, 2023

MoLow commented Oct 4, 2023

benjamingr commented Oct 4, 2023

MoLow commented Oct 5, 2023 • edited Loading

benjamingr commented Oct 5, 2023 • edited Loading

domenic commented Jan 5, 2024

benjamingr commented Jan 5, 2024

MoLow commented Jan 6, 2024 • edited Loading

github-actions bot commented Jul 5, 2024

benjamingr commented Jul 5, 2024

Filipoliko commented Jul 13, 2023 •

edited

Loading

MoLow commented Jul 16, 2023 •

edited

Loading

MoLow commented Oct 5, 2023 •

edited

Loading

benjamingr commented Oct 5, 2023 •

edited

Loading

MoLow commented Jan 6, 2024 •

edited

Loading