Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LiveMetric: realtime statistics for debug UI. #405

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

hysw
Copy link
Collaborator

@hysw hysw commented Jan 16, 2024

This PR is a fork of #377 .

Adding real time statistics into the metric framework so it is easier to read.

@hysw hysw requested review from Keenuts and mdagois January 16, 2024 00:21
@hysw hysw force-pushed the tianc/metric-live branch 2 times, most recently from c875639 to 55f4a42 Compare January 16, 2024 00:28
@mdagois
Copy link
Collaborator

mdagois commented Jan 16, 2024

Could you please add a quick summary of what problem the PR solves?
I am not extremely familiar with the benchmark, and I have not touched BigWheels for a few months.

@hysw
Copy link
Collaborator Author

hysw commented Jan 16, 2024

Could you please add a quick summary of what problem the PR solves?
I am not extremely familiar with the benchmark, and I have not touched BigWheels for a few months.

This gives a readable statistics for cpu submission time / gpu work duration / etc. i.e. the leading digit does not jump around every frame. This version is due to suggestion of @Keenuts 's suggestion on have it in the metric framework. The stand alone version is #377 .

Copy link
Member

@Keenuts Keenuts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together. A few comments 😊

src/ppx/metrics.cpp Outdated Show resolved Hide resolved
src/ppx/metrics.cpp Outdated Show resolved Hide resolved
include/ppx/metrics.h Outdated Show resolved Hide resolved
include/ppx/metrics.h Outdated Show resolved Hide resolved
include/ppx/metrics.h Outdated Show resolved Hide resolved
@@ -299,6 +415,12 @@ class Manager final
// Get Gauge Basic Statistics, only works for type GAUGE
GaugeBasicStatistics GetGaugeBasicStatistics(MetricID id) const;

// Get Live Statistics for a LiveMetric, only works for type GAUGE
LiveStatistics GetLiveStatistics(MetricID id) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall this be renamed to GetGaugeLiveStatistics to match the function above?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not report MetricGauge's live statistics, so I'll leave Gauge out of the name.

@mdagois
Copy link
Collaborator

mdagois commented Jan 17, 2024

Could you please add a quick summary of what problem the PR solves?
I am not extremely familiar with the benchmark, and I have not touched BigWheels for a few months.

This gives a readable statistics for cpu submission time / gpu work duration / etc. i.e. the leading digit does not jump around every frame. This version is due to suggestion of @Keenuts 's suggestion on have it in the metric framework. The stand alone version is #377 .

It is a kind of tracker of statistics for an underlying gauge metrics, and you reset it (by clearing history) whenever you want. Is that correct?

@Keenuts
Copy link
Member

Keenuts commented Jan 17, 2024

Could you please add a quick summary of what problem the PR solves?
I am not extremely familiar with the benchmark, and I have not touched BigWheels for a few months.

This gives a readable statistics for cpu submission time / gpu work duration / etc. i.e. the leading digit does not jump around every frame. This version is due to suggestion of @Keenuts 's suggestion on have it in the metric framework. The stand alone version is #377 .

It is a kind of tracker of statistics for an underlying gauge metrics, and you reset it (by clearing history) whenever you want. Is that correct?

The initial idea is to have a layer on top of the raw metrics that averages it over X samples.
Precision is not important since it's only used for UI display. What's more important is to be able to eyeball the metrics quickly (hence the window average).
Because this is for quick eyeballing investigation, it's a "nice to have" to be able to disable recording.

That's for the goal/idea I have in mind.

Copy link
Member

@Keenuts Keenuts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! Added comments, but LGTM for the direction. Addind second owner.

include/ppx/application.h Outdated Show resolved Hide resolved
src/test/metrics_test.cpp Outdated Show resolved Hide resolved
@Keenuts Keenuts requested a review from apazylbe January 17, 2024 13:34
@hysw
Copy link
Collaborator Author

hysw commented Jan 17, 2024

It is a kind of tracker of statistics for an underlying gauge metrics, and you reset it (by clearing history) whenever you want. Is that correct?

Addition to Nathan's comment. The use case is to tracks a set of gauge metrics that we want to see on screen, not necessary the ones we want to record to a profiling run (e.g. cpu time spent on input processing). The samples are $2^{-t_{elapsed}}$ weighted to make sure we are showing recent information (also this formulation allows constant time update). Please note this is not the last X samples since interval between sample can vary from $<0.1ms$ to $100^+ ms$ . The clearing of history is somewhat unnecessary, since old data's weight will decay to almost zero in a few seconds, it's mainly done to since if we know parameter has changed, immediate update is better than waiting a few seconds.

@hysw
Copy link
Collaborator Author

hysw commented Jan 17, 2024

Adding Angela & Ran since they have reviewed #377 .

@mdagois
Copy link
Collaborator

mdagois commented Jan 18, 2024

Could you please add a quick summary of what problem the PR solves?
I am not extremely familiar with the benchmark, and I have not touched BigWheels for a few months.

This gives a readable statistics for cpu submission time / gpu work duration / etc. i.e. the leading digit does not jump around every frame. This version is due to suggestion of @Keenuts 's suggestion on have it in the metric framework. The stand alone version is #377 .

It is a kind of tracker of statistics for an underlying gauge metrics, and you reset it (by clearing history) whenever you want. Is that correct?

The initial idea is to have a layer on top of the raw metrics that averages it over X samples. Precision is not important since it's only used for UI display. What's more important is to be able to eyeball the metrics quickly (hence the window average). Because this is for quick eyeballing investigation, it's a "nice to have" to be able to disable recording.

That's for the goal/idea I have in mind.

Ok, thx. Makes sense.

@mdagois
Copy link
Collaborator

mdagois commented Jan 18, 2024

LGTM from a metrics perspective. I'll defer approval to people that own the project.

Copy link
Collaborator

@apazylbe apazylbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review, only left a couple of comments cause I need to take another look fully.
Personally though, I find the metrics framework more confusing with these changes, and not sure if the complexity is justified. But if I am in minority, I am also fine with this.
Also I would like for Application to contain less code that doesn't need to be there, I understand we have some wrappers there to add additional metrics, but do the others functions need to be there?

bool BindMetric(metrics::MetricID metricID, const metrics::MetricMetadata& metadata);

// Bind a live metric, return true on success.
bool BindLiveMetric(metrics::MetricID metricID);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate here? What does "binding" a metric mean? Why do we need to call that? What happens if we don't?
I would ideally like to know these things without reading the code.

@@ -523,13 +524,28 @@ class Application
// See StartMetricsRun for why this wrapper is necessary.
virtual bool HasActiveMetricsRun() const;

// Allocate a metric id to be used for a combined live/recorded metric.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to have all of these metric related functions in Application?

@hysw
Copy link
Collaborator Author

hysw commented Jan 18, 2024

Personally though, I find the metrics framework more confusing with these changes, and not sure if the complexity is justified.

My preference is keeping it simple and separate, however Nathan have some good point, so I also made this version to see if integration makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
4 participants