
User agent does not include "Chrome Lighthouse" anymore? #14917

Open
hjelmdal opened this issue Mar 22, 2023 · 21 comments

@hjelmdal

hjelmdal commented Mar 22, 2023

What happened?

The user agent of the PageSpeed API no longer seems to include "Chrome Lighthouse", so libraries like isBot cannot detect when a Google Lighthouse test is running. As a result, we cannot control the loading of viewport-blocking third-party content such as cookie policy overlays, which is then treated as the LCP and may obscure potential CLS elements.

User agent is now:
"Mozilla/5.0 (Linux; Android 11; moto g power (2022)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36"

What did you expect?

User agent before v10:

"Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse"
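For context, bot detection against the pre-v10 UA amounted to a substring match on that trailing marker. A minimal sketch (the helper name is hypothetical; isBot's real implementation matches a much larger pattern list):

```javascript
// Hypothetical helper mirroring how isBot-style libraries could match
// the pre-v10 Lighthouse user agent via its "Chrome-Lighthouse" suffix.
function isLighthouseUA(userAgent) {
  return /Chrome-Lighthouse/.test(userAgent);
}

const oldUA =
  "Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse";
const newUA =
  "Mozilla/5.0 (Linux; Android 11; moto g power (2022)) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36";

console.log(isLighthouseUA(oldUA)); // true
console.log(isLighthouseUA(newUA)); // false — the marker is gone in v10+
```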

@hjelmdal hjelmdal added the bug label Mar 22, 2023
@hjelmdal hjelmdal changed the title User agent does not include "lighthouse" anymore? Mar 22, 2023
@paulirish
Member

paulirish commented Mar 22, 2023

Correct. We removed it from the useragent in #14384

@ejerskov

ejerskov commented Mar 23, 2023

@paulirish, I understand that gaming the performance score is bad, but what exactly is the gain, and why do you want to prevent it? Those sites get an incorrectly higher score, but their users are going to have a bad time anyway; that's their choice, right? If they're already gaming the score, they're actively ignoring their users' experience, and CrUX data will reveal the real user experience anyway.
Unless the PSI score is used for SEO ranking or gives some other advantage, I really can't see the problem.

We now have a huge issue with the cookie popup, which takes over as the LCP and makes the full-page screenshot useless. Most of our users are returning visitors who have already interacted with the cookie popup, so the best "real" case we can measure for our visitors is with the cookie popup blocked.

The CLS score (which now carries a 25% weight) is also hard to measure accurately with the cookie popup blocking everything.

@TomaszBudny

TomaszBudny commented Mar 23, 2023

Unless the PSI score is used for SEO ranking or gives some other advantage, I really can't see the problem.

Exactly, this is an auditing tool, so what's the point?

I used this detection to display the high-contrast version of my website for audit purposes (I have a contrast toggle in accordance with https://www.w3.org/WAI/WCAG21/Techniques/general/G174.html), but now you're reporting issues that don't exist because you're incorrectly evaluating the standard-contrast version.

Since you check for contrast issues, at least set prefers-contrast in the audit so you can evaluate contrast where it's needed, because in my case you're reporting bad results.


@BuslikDrev

Correct. We removed it in #14384

We still need to make random devices.

@hjelmdal
Author

hjelmdal commented Mar 28, 2023

I understand there are a number of different reasons and motivations for wanting to detect the user agent in this case. My motivation is certainly not to trick the audit, because I would only be fooling myself. But to give end users the best experience, and to use the tool as a true auditing tool, I need to be able to toggle things like cookie overlays that block the entire viewport. On top of that, since the update I suddenly see a lot of new traffic in my analytics polluting my overview of where my users come from, which is not ideal either.

If the Lighthouse team is using the same PSI API to "score" websites officially for their organic search rank, I suggest those audits be "stealth" if really needed (assuming those audits don't amount to thousands of requests to each origin audited?). For all other purposes, it would be really helpful to still know the user agent of PSI API visits from our own audits. At a minimum, I think we should be able to control whether to expose the user agent via an option in the payload.

I really hope you will consider this, as our audits for debugging and fixing CLS are useless right now, and that is the very metric whose weight increased to 25% of the total score in the same update.

As an example, I have attached a screenshot from before detecting user agent, and then what I am getting now - I hope you see my point.
Before v10: Link
After v10: Link

@hjelmdal
Author

hjelmdal commented Mar 30, 2023

Can you please share your point of view on this issue or paradox, @connorjclark?

@hjelmdal
Author

It's getting kind of critical to get a resolution on this. It's been almost a month now with useless screenshots and CLS that cannot be debugged properly because of this. @connorjclark or @paulirish, would you please give an update, and at least consider adding an API option to expose the user agent until you decide whether it should be fully exposed again?

Thanks a lot in advance 🙏

@paulirish
Member

I'm currently pursuing a solution that would restore the signal, but even if it works, it's still going to take some time.

In the meantime, here's what I suggest:

  • You can add a custom ?queryParam or #hash (e.g. ?isLighthouse) to the URLs that you feed to the PSI API. On the backend you'll have to check for it and adapt the response accordingly.

I know isBot is not going to detect this today, but you can certainly account for it separately. I'm aware this is not as convenient as a UA check, but hopefully it's of some use.
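For anyone implementing the suggestion above, the backend check is essentially a one-liner on the request URL. A minimal sketch, assuming the hypothetical ?isLighthouse=true marker (the parameter name is an example; pick any value you control):

```javascript
// Returns true when the request URL carries the agreed-upon Lighthouse
// marker. Accepts absolute or path-relative URLs; the base is only used
// to resolve relative paths and is an illustrative placeholder.
function isLighthouseRequest(requestUrl) {
  const url = new URL(requestUrl, "https://example.com");
  return url.searchParams.get("isLighthouse") === "true";
}
```

In an Express-style handler you would call this on `req.originalUrl` and, for example, skip rendering the cookie overlay when it returns true.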

@alekseykulikov
Contributor

+1 for restoring Chrome-Lighthouse. I know many teams that use it to remove LH traffic from analytics.

@gilbertococchi

gilbertococchi commented May 11, 2023

To debug CLS, I believe the best solution is to use timespan mode and user flows (https://web.dev/lighthouse-user-flows/), not simply skipping the CMP consent.

A scripted Lighthouse test would allow measuring whether the CMP causes an INP issue (really important in the INP debugging workflow), and would also allow scrolling the page or performing multiple interactions that may trigger CLS and other INP issues.

A page load with CMP consent emulation would still not be sufficient to surface all possible CLS issues, in my opinion; more developers should look at post-load behavior and emulate interactions.

@trinhpham

@paulirish
My 2 cents:

  • If you don't want to distinguish the UA of LH from other traffic, don't hard-code and override the browser UA. Let the browser use its own if emulatedUserAgent is not set.
  • Allow users to dynamically modify emulatedUserAgent at runtime, so people can solve their own needs:
    • Allow users to configure a UA suffix, so we can restore the old behavior by setting it to Chrome-Lighthouse
    • Allow reading the version from the launched browser, so people stuck with a given version of LH have more room to keep their browser up to date.
@prathamesh-gharat

@hjelmdal, from what I understand, Google does not do "stealth" Lighthouse checks. My understanding is that the Core Web Vitals (CWV) metrics are captured from your visitors' browsers themselves (and reported in Search Console's Core Web Vitals too). Check the following link to understand the Chrome UX Report: https://developer.chrome.com/docs/crux/

If you show your normal users cookie policy popups/overlays, then they contribute to LCP/CLS too, depending on how they are displayed to first-time visitors. If you want to test what your CWV metrics would be, you could create parameterized URLs to toggle certain features and test your scores (for your own understanding only).

Use approaches that do not cover the full screen (don't make the popup a large element for LCP), and don't move elements on screen (causing CLS) without user interaction (https://web.dev/cls/#expected-vs-unexpected-layout-shifts).

@hjelmdal
Author

@prathamesh-gharat, I am not really sure what you mean. You should never mix up CWV and Lighthouse: CWV is collected from RUM, whereas Lighthouse is a simulated test conducted in a controlled environment, suitable for comparing results against one another. I was more curious to know why it is important to the Google team to run Lighthouse tests in stealth, without even giving the option to choose.

The parameterized solution has been mentioned several times, but it requires a code change to every solution already running, which will be cumbersome. Again, what is the purpose of Lighthouse? Is it to help debug and improve performance, which will then be reflected in CWV? Or is there another hidden purpose we don't know about, given that the UA footprint reportedly had to be removed because someone was "trying to game the system"? Who are they fooling but themselves, unless Lighthouse is secretly used as a scoring parameter in Google's algorithms across their search tools?

About your CLS point: I am solely talking about being able to efficiently debug CLS in an automated setup. Most of the sites I monitor use Cookieinformation as the third-party cookie compliance vendor, which presents a full-screen overlay. When that is rendered in an automated Lighthouse test, it's impossible to evaluate CLS, and any other visual evaluation is useless because the screenshot only shows the cookie popup. Obviously, you can debug manually, but that is not really the point here :-)

@Olegt0rr

Olegt0rr commented Jun 21, 2023

Same problem.

After the User-Agent change, Lighthouse's bot looks like a real user to Yandex.Metrica, so the metric registers bounced visits :(

@BuslikDrev

Same problem.

After the User-Agent change, Lighthouse's bot looks like a real user to Yandex.Metrica, so the metric registers bounced visits :(

Why do you load it at all right when the page starts loading? Load it only once there has been an interaction with the site (mouseover, scroll, click); on page-load start, run only the function that collects timing.
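The deferral described above can be sketched as a small helper that runs the analytics loader only on the first real interaction. The event list and the `loadAnalytics` callback are illustrative; in the browser you would pass `window` and a function that injects the Metrica/GTM tag:

```javascript
// Run `loadAnalytics` once, on the first user interaction with `target`.
// A Lighthouse page load performs no interactions, so the tag never fires.
function loadOnFirstInteraction(target, loadAnalytics) {
  const events = ["mouseover", "scroll", "click", "touchstart", "keydown"];
  const onFirst = () => {
    // Detach all listeners so the loader runs exactly once.
    events.forEach((e) => target.removeEventListener(e, onFirst));
    loadAnalytics();
  };
  events.forEach((e) => target.addEventListener(e, onFirst, { passive: true }));
}

// Usage in the browser (hypothetical tag injection):
// loadOnFirstInteraction(window, () => { /* append the analytics <script> */ });
```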

@upnishad

This could be a huge setback for our development process. We were using this detection to exclude the GTM scripts, which are provided by Google itself yet add a huge load time and act as a blocking resource, pushing us toward dropping GTM altogether. How would you have us proceed? First you make us use your scripts, then you give us a mountain to climb with them on our back?

@hjelmdal
Author

I'm currently pursuing a solution that would restore the signal, but even if it works, it's still going to take some time.

In the meantime, here's what I suggest:

  • You can add a custom ?queryParam or #hash (e.g. ?isLighthouse) to the URLs that you feed to the PSI API. On the backend you'll have to check for it and adapt the response accordingly.

I know isBot is not going to detect this today, but you can certainly account for it separately. I'm aware this is not as convenient as a UA check, but hopefully it's of some use.

@paulirish, can you share some info on the timeline here? It's been well over 6 months and a major version since, with no update. This still makes the Lighthouse tool useless in some respects, forcing us to use alternative tools, hacks, and methods to debug and get useful information for improving our sites.

@benschwarz
Contributor

benschwarz commented Nov 23, 2023

@hjelmdal I have some suggested workarounds for you:

PageSpeed Insights, PSI API, or Chrome Devtools

  • Use a query string parameter (as @paulirish suggested above) - e.g.: my-domain.tld/page?isLighthouse=true

Lighthouse command line interface

  • Set --emulatedUserAgent or --extra-headers flags

Other paid tools such as Calibre (disclaimer: this is my company) have dedicated IPv4 addresses for all testing, so you can add them to allow lists. Alternatively, you can set custom HTTP headers or cookies. If you require greater control over the testing environment, you may want to consider an external service, or run your own Lighthouse service.
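Building on the --emulatedUserAgent flag mentioned above, one option for self-run Lighthouse is to append the old Chrome-Lighthouse marker yourself. A sketch, assuming the Lighthouse Node API; the `lighthouseFlags` helper and the base UA string are illustrative, not part of Lighthouse:

```javascript
// Example mobile UA taken from the thread above; substitute whatever base
// UA your Lighthouse version emulates.
const BASE_UA =
  "Mozilla/5.0 (Linux; Android 11; moto g power (2022)) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36";

// Build a flags object whose emulated UA carries the old marker, so
// isBot-style checks on your backend can detect the run again.
function lighthouseFlags(extra = {}) {
  return {
    emulatedUserAgent: `${BASE_UA} Chrome-Lighthouse`,
    ...extra,
  };
}

// Hypothetical usage with the Node API (requires the `lighthouse` package
// and a debuggable Chrome on the given port):
// const lighthouse = require("lighthouse");
// await lighthouse("https://example.com", lighthouseFlags({ port: 9222 }));
```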

@pedddro

pedddro commented Dec 15, 2023

Looks like people are now using userAgentData to do the trickery, and this only hurts legitimate users.

@jodylecompte

To add to the list of use cases for having the user agent, or some general way of detecting a Lighthouse test:

My use case is a popup with an overlay. In the screenshots for accessibility testing, the overlay is present and disrupts color-contrast detection. On desktop, this results in an artificially lower score, because elements that evaluate as AA or AAA conformant are flagged as non-passing once the overlay is factored in; on mobile, the test outright fails and I can no longer measure those metrics via Lighthouse at all.

The workaround is to disable it in our staging environment, but that does not produce an accurate score for real users, because we have now created two sources of truth.

To be honest, I don't really understand the rationale of stopping people from gaming their results, because that's only an internal matter of someone misleading clients or stakeholders. The actual SEO rankings are based on real data, not lab data, is that correct?
