`waitForReport` YAML instruction is racy #19861

mrjerryjohns · 2022-06-22T18:04:55Z

Problem

CI Tests have been failing a lot more on Darwin recently. One example of that is here.

In that test, there is CPU starvation/context switching happening that's exposing some racy test logic.

That test in particular does a few things:

Sends UpOrOpen command to get the window blind 'moving'. That should change the OperationalStatus value immediately to 0x21.
Waits for 2s to get some initial movement.
Subscribes to the OperationalStatus attribute with a min-interval of 4s and a max-interval of 5s, which returns the value 0x21. At this point, however, the test app side shows a big, unexplained jump in timestamps:

2022-06-21T19:30:30.6266150Z 2022-06-21 19:30:30.626 ERROR   19:29:08.528 - TEST OUT  : [1655839748528] [28482:124272] CHIP: [DMG] 				DataVersion = 0x7da58729,
2022-06-21T19:30:30.6267080Z 2022-06-21 19:30:30.626 ERROR   19:29:08.528 - TEST OUT  : [1655839748528] [28482:124272] CHIP: [DMG] 				AttributePathIB =
2022-06-21T19:30:30.6267950Z 2022-06-21 19:30:30.626 ERROR   19:29:19.707 - TEST OUT  : [1655839748528] [28482:124272] CHIP: [DMG] 				{
2022-06-21T19:30:30.6269390Z 2022-06-21 19:30:30.626 ERROR   19:29:19.707 - TEST OUT  : [1655839748528] [28482:124272] CHIP: [DMG] 					Endpoint = 0x1,
2022-06-21T19:30:30.6270010Z 2022-06-21 19:30:30.626 ERROR   19:29:19.707 - TEST OUT  : [1655839748528] [28482:124272] CHIP: [DMG] 					Cluster = 0x102,

This time jump of roughly 11s (19:29:08.528 to 19:29:19.707 in the log above) happens right while it's parsing the data.

Then, we call StopMotion on the target. That immediately generates a report to the client containing a status value of 0, because on the server the minimum interval requirement of 4s has already been satisfied.
The test then waits 3s in Step 2b, for reasons that are unclear, and only then checks whether a report has been received. The report had already arrived earlier, however, so the test keeps waiting for a change that never comes and eventually times out.

This test only passes when there is no CPU starvation happening, because it relies on the min interval of 4s holding off the report generation till we've gotten to the right test stage. That is really racy.
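The timing dependence described above can be sketched with a small model. This is a simplified illustration, not the harness's actual logic: the function names, the 2s subscribe time, and the assumption that the wait step only sees reports arriving after it starts listening are all hypothetical choices made for this sketch.

```python
def report_arrival_time(subscribe_time, stop_time, min_interval):
    """Time at which the server emits the report for the StopMotion change.

    Assumes the priming report goes out at subscribe_time, so the next
    report cannot be sent before subscribe_time + min_interval.
    """
    return max(stop_time, subscribe_time + min_interval)


def wait_step_sees_report(stall, min_interval=4.0, delay_before_wait=3.0):
    """Return True if the report arrives after the wait step starts listening.

    `stall` is the extra delay (for example, CPU starvation) between the
    subscription completing and StopMotion being sent.
    """
    subscribe_time = 2.0                      # after the initial 2s wait
    stop_time = subscribe_time + stall        # StopMotion is sent here
    listen_start = stop_time + delay_before_wait
    arrival = report_arrival_time(subscribe_time, stop_time, min_interval)
    return arrival >= listen_start


if __name__ == "__main__":
    for stall in (0.0, 1.0, 1.5, 10.0):
        print(f"stall={stall:>4}s -> report seen: {wait_step_sees_report(stall)}")
```

In this model the test passes only while the stall stays at or below `min_interval - delay_before_wait` (1s with the values above); any longer stall lets the report arrive before the wait step begins.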

Proposal

The waitForReport YAML command should really check whether a report has been received at any point since the previous call to subscribeAttribute, not only from the moment that test step is executed.
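One way to picture the proposal is a buffer that starts collecting reports as soon as the subscription is established, so that a later wait step can consume a report that arrived early. The sketch below is a minimal illustration under that assumption; `ReportBuffer`, `on_report`, and `wait_for_report` are hypothetical names and not part of the actual test harness, and the priming report is assumed to be handled by the subscribe step rather than queued here.

```python
import queue


class ReportBuffer:
    """Hypothetical buffer holding reports received since subscription."""

    def __init__(self):
        self._reports = queue.Queue()

    def on_report(self, value):
        # Assumed to be called by the subscription callback for every
        # report after the priming one, whether or not a wait step is
        # currently running.
        self._reports.put(value)

    def wait_for_report(self, timeout):
        # Returns immediately if a report is already queued; otherwise
        # blocks for up to `timeout` seconds.
        try:
            return self._reports.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no report received since subscription")


if __name__ == "__main__":
    buf = ReportBuffer()
    buf.on_report(0)                         # report arrives early
    print(buf.wait_for_report(timeout=0.1))  # still delivered to the wait step
```

With this shape, the outcome of the wait step no longer depends on whether the report arrives before or after the step begins.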


mrjerryjohns · 2022-06-22T18:05:33Z

FYI @vivien-apple @woody-apple @bzbarsky-apple

stale · 2023-04-25T19:00:08Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

ericzijian1994 · 2023-09-01T07:27:16Z

Think the problem is still happening...

bzbarsky-apple · 2023-09-02T04:11:30Z

Yes, #28257 is not merged yet.

mrjerryjohns added testing V1.0 labels Jun 22, 2022

jmeg-sfy self-assigned this Aug 23, 2022

bzbarsky-apple added the yaml Missing features or bugs in the YAML test harness label Oct 6, 2022

franck-apple added the p1 priority 1 work label Oct 24, 2022

stale bot added the stale Stale issue or PR label Apr 25, 2023

bzbarsky-apple removed the stale Stale issue or PR label Apr 25, 2023

hare-siterwell mentioned this issue Jul 22, 2023

Delay processing test event in Smoke CO Alarm #28044

Merged

vivien-apple linked a pull request Jul 25, 2023 that will close this issue

[chiptool.py] Ensure async report that came in before the test got a … #28257

Open

ericzijian1994 mentioned this issue Aug 25, 2023

Modify the label for the manual step of the TC-SMOKECO-2.4 #28876

Merged

raju-apple mentioned this issue Oct 19, 2023

[SMOKECO] Fails in TH waits for a report from DUT with a timeout of 300 seconds project-chip/certification-tool#41

Open
