Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_runner: allow retryable tests #52626

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

RedYetiDev
Copy link
Member

@RedYetiDev RedYetiDev commented Apr 21, 2024

Adds a new retries(X) method to the test_runner, which allows users to run tests multiple times (in case of failure). This value can also be hardcoded via the { retries } option.

(Fixes #48754)

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/test_runner
@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem. labels Apr 21, 2024
@RedYetiDev RedYetiDev added the notable-change PRs with changes that should be highlighted in changelogs. label Apr 21, 2024
Copy link
Contributor

The notable-change PRs with changes that should be highlighted in changelogs. label has been added by @RedYetiDev.

Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment. Otherwise, the commit will be placed in the Other Notable Changes section.

@cjihrig
Copy link
Contributor

cjihrig commented Apr 21, 2024

This seems like the wrong design for this feature. We shouldn't be creating a new class. I would expect this to be an option when users create a test like:

test('foo', { retries: 5 }, () => {});

I also don't think this is something we should support on the test context object. This also needs to take test hooks into consideration. I would double check with other frameworks, but there is an expectation that certain hooks run again when a test is retried.

@RedYetiDev
Copy link
Member Author

This seems like the wrong design for this feature. We shouldn't be creating a new class. I would expect this to be an option when users create a test like:

test('foo', { retries: 5 }, () => {});

I also don't think this is something we should support on the test context object. This also needs to take test hooks into consideration. I would double check with other frameworks, but there is an expectation that certain hooks run again when a test is retried.

Everything you said makes sense, except why wouldn't it be supported on the test context object?

@cjihrig
Copy link
Contributor

cjihrig commented Apr 21, 2024

Everything you said makes sense, except why wouldn't it be supported on the test context object?

Actually, let's survey how other frameworks implement this before committing to a design.

@RedYetiDev
Copy link
Member Author

RedYetiDev commented Apr 21, 2024

AFAIK,
Mocha: this.retries(5); (On the TestContext)
Jest: jest.retryTimes(5);

But I do think you are right about the attempt() method being unneeded and the use of the test() method for both is better.

@cjihrig
Copy link
Contributor

cjihrig commented Apr 21, 2024

I'd be OK with following Mocha on this.

@RedYetiDev
Copy link
Member Author

I'd be OK with following Mocha on this.

I don't know, I think a config option { retries } seems more like the rest of Node.js's testings

@cjihrig
Copy link
Contributor

cjihrig commented Apr 21, 2024

I personally prefer that style. One nice thing about having it on the test context is that you can dynamically set it if you detect something happening while the test is running (like encountering a specific error). At the end of the day, both APIs would work the same way under the hood, so they could both be supported.

@RedYetiDev
Copy link
Member Author

I personally prefer that style. One nice thing about having it on the test context is that you can dynamically set it if you detect something happening while the test is running (like encountering a specific error). At the end of the day, both APIs would work the same way under the hood, so they could both be supported.

True, I can implement it :-).

@RedYetiDev RedYetiDev marked this pull request as draft April 21, 2024 20:09
@RedYetiDev
Copy link
Member Author

Moving to draft for development

@RedYetiDev RedYetiDev marked this pull request as ready for review April 21, 2024 20:41
@RedYetiDev RedYetiDev requested a review from cjihrig April 21, 2024 20:42
@RedYetiDev
Copy link
Member Author

@cjihrig, I've made the requested changes.

@RedYetiDev
Copy link
Member Author

RedYetiDev commented Apr 21, 2024

I need to address a few errors.

@MoLow
Copy link
Member

MoLow commented Apr 21, 2024

I am -1 on adding this to node as it is an anti-pattern, and it can be achieved with a few lines of code, or a npm package

});

test('dynamic retries', (t) => {
t.retries(5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be worth documenting whether this can be called multiple times

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I'll add that documentation tomorrow!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unresolving because I don't think this was addressed.

test/fixtures/test-runner/output/retryable.js Show resolved Hide resolved
@juliangruber
Copy link
Member

I am -1 on adding this to node as it is an anti-pattern, and it can be achieved with a few lines of code, or a npm package

Why is it considered an anti pattern?

@MoLow
Copy link
Member

MoLow commented Apr 22, 2024

Why is it considered an anti pattern?

since it is mostly abused by encouraging tests to be non-deterministic and non-predictable (in the majority of cases i've seen). such features are often used to hide flakiness instead of fixing it.

@RedYetiDev
Copy link
Member Author

RedYetiDev commented Apr 22, 2024

since it is mostly abused by encouraging tests to be non-deterministic and non-predictable (in the majority of cases i've seen). such features are often used to hide flakiness instead of fixing it.

I think it'll be helpful when users know that some tests fail and want to give it a second (or third) chance to succeed.

We can see lots of instances where this will be helpful just by looking at implementations using Jest: https://github.com/search?q=%2FJest%5C.retryTimes%5C%28%5Cd%2F&type=code

Or, like @cjihrig mentioned, you could increment the retries by +1 when you encounter a specific error, such as a network issue.

@targos targos added the semver-minor PRs that contain new features and should be released in the next minor version. label Apr 22, 2024
@targos
Copy link
Member

targos commented Apr 22, 2024

It's fine to add the notable-change label, but most important for the changelog is semver-minor PRs that contain new features and should be released in the next minor version.

@RedYetiDev
Copy link
Member Author

RedYetiDev commented Apr 22, 2024

It's fine to add the notable-change label, but most important for the changelog is semver-minor PRs that contain new features and should be released in the next minor version.

Thanks for the info! I think it is a notable change because we are adding functionality to the test runner (and the functionality exists in other test runners)

@targos
Copy link
Member

targos commented Apr 22, 2024

Yes I understand, but semver-minor changes are automatically considered notable by the release tool. The notable-change label is only really useful for notable semver-patch changes.

@RedYetiDev RedYetiDev removed the notable-change PRs with changes that should be highlighted in changelogs. label Apr 22, 2024
@RedYetiDev
Copy link
Member Author

Yes I understand, but semver-minor changes are automatically considered notable by the release tool. The notable-change label is only really useful for notable semver-patch changes.

That's good to know! we learn something new every day :-)

@RedYetiDev
Copy link
Member Author

Would it be worth it to also add a reruns value, to rerun the test despite failure/success?

MoLow
MoLow previously requested changes Apr 26, 2024
Copy link
Member

@MoLow MoLow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding an explicit request changes as I think this should be added

@atlowChemi
Copy link
Member

atlowChemi commented Apr 26, 2024

I am -1 on adding this to node as it is an anti-pattern, and it can be achieved with a few lines of code, or a npm package

I second this. I am -1 for adding this as well.

@benjamingr
Copy link
Member

Basically #48754 (comment) - I'm +1 because I think that this can't be done well in userland (it's not "just a few lines of code" as implied) and otoh it's hypocritical to say it's bad while doing it ourselves in our own CI since that's a purity argument (sure, retries and flakey tests are bad but they're a fact of life and fixing tests while retrying seems like something that's pretty common)

@RedYetiDev
Copy link
Member Author

since it is mostly abused by encouraging tests to be non-deterministic and non-predictable (in the majority of cases i've seen). such features are often used to hide flakiness instead of fixing it.

There also cases where tests have to be non-predictible. If a test is `fetching a website, and that site returns a strange response, this feature would let the test re-try to succeed (for example). There are thousands of examples of use-cases for this feature.

@cjihrig
Copy link
Contributor

cjihrig commented Jun 28, 2024

@benjamingr are you able to find a path forward for this? There seems to be support, but also at least two -1's. I think we should decide if this is something we will implement, or close the PR. (I'm trying to get through some of my PR review requests)

@benjamingr
Copy link
Member

I think we wait for Moshe to change his mind (enough users asking for this for example) or for us to change our mind 😅

@RedYetiDev RedYetiDev requested a review from MoLow June 28, 2024 15:05
@cjihrig
Copy link
Contributor

cjihrig commented Jun 28, 2024

I just noticed this comment 😆 - #48754 (comment)

@RedYetiDev
Copy link
Member Author

I still think this is a step in the right direction. Test failures are unavoidable, but handling them isn't impossible.

@benjamingr
Copy link
Member

@cjihrig yeah I pointed that out to him (that he said he wouldn't block, forgot about that and blocked :D)

@MoLow MoLow dismissed their stale review June 30, 2024 06:48

removing my block. I still think it is bad practice but I see a lot of people asking for it

@MoLow
Copy link
Member

MoLow commented Jun 30, 2024

@benjamingr @cjihrig I have removed my block

@@ -209,6 +210,19 @@ class TestContext {
return subtest.start();
}

retries(number) {
validateNumber(number, 'number', 0, NumberMAX_SAFE_INTEGER);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use validateInteger() here.

@@ -209,6 +210,19 @@ class TestContext {
return subtest.start();
}

retries(number) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name this parameter count.

@@ -209,6 +210,19 @@ class TestContext {
return subtest.start();
}

retries(number) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All three of these functions are missing proper API docs.

- REPLACEME
-->

Failed tests can be retried via a hardcoded `{ retries }` option, or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Failed tests can be retried via a hardcoded `{ retries }` option, or
Failing tests can be automatically retried via the `retries` test option, or

Note, retries needs API docs and we can link to them from here.

-->

Failed tests can be retried via a hardcoded `{ retries }` option, or
via the dynamic `retries()` TestContext prototype method.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
via the dynamic `retries()` TestContext prototype method.
via the `retries()` `TestContext` prototype method.

Please also link from here.

- REPLACEME
-->

Failed tests can be retried via a hardcoded `{ retries }` option, or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also mention that hooks are re-run for each retry.

Failed tests can be retried via a hardcoded `{ retries }` option, or
via the dynamic `retries()` TestContext prototype method.

Via the `.currentAttempt` property, the current attempt (the number of times
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove the leading . from the property names.

Also, currentAttempt is a bit confusing IMO. I would expect it to be one-based and not only specific to retries. For example, if I run a test with no retries, it will run once. I would expect currentAttempt to be one on that single run.

kCallbackAndPromisePresent,
));
await SafePromiseRace([ret, stopPromise]);
while (true) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an infinite loop here makes me nervous. Would a for loop work:

for (this.currentAttempt = 1; this.currentAttempt <= this.retries + 1; this.currentAttempt++)

You could still break out of the loop if the test passes.

test('should call before/after hooks (success)', {
retries: 3,
}, (t) => {
if (t.attempts < 3) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be currentAttempt?

const common = require('../../../common');
const test = require('node:test');
const assert = require('node:assert');

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a test case where a test (not suite) has retries set and there are several subtests that also have retries, but the last subtest never passes. We need to make sure there is no duplicated reporting.

@benjamingr
Copy link
Member

removing my block. I still think it is bad practice but I see a lot of people asking for it

We can discuss an alternative design like --test-flaky that outputs (or tags) tests that passed with retries as flaky or something similar.

My point is both these things are true at the same time:

  • Retries are useful as a mechanism to not block CI while you have flaky tests
  • Retries are a very bad way to deal with flaky tests

So I'm all ears for a different design for retries while acknowledging that some way is needed and that it benefits from being in core.

@RedYetiDev
Copy link
Member Author

RedYetiDev commented Jul 1, 2024

I'm not opposed to that idea, maybe there should be a discussion on how to best achieve this?

Maybe the TSC could decide whether to use CLI flags or programatic methods (or another way)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-ci PRs that need a full CI run. semver-minor PRs that contain new features and should be released in the next minor version. test_runner Issues and PRs related to the test runner subsystem.
8 participants