
Problem: When using the multi-table delta streamer for Kafka ingestion across many tables, the job still succeeds even if ingestion fails for one of the tables. There is no way to check success or failure for a particular table other than going through the logs.

I want to use Hudi metrics to push events to Prometheus and create alarms that fire when ingestion fails for any table. However, I couldn't find any combination of metrics that detects this.

Note: I tried to build a combination that shows the job successfully fetched events from Kafka but made no commits in Hudi (perhaps using _metadata_deltacommit_totalBytesWritten), which would mean there was data to ingest but the ingestion failed.

However, I could not find a metric for the first part, i.e. that the job was able to fetch events from Kafka.
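
To make the intended check concrete, here is a minimal sketch in Python of the rule I have in mind. It is only an illustration: the function name and the two input dictionaries are hypothetical, and the per-table count of events fetched from Kafka is exactly the value I have not found a Hudi metric for. The bytes-written value is assumed to come from something like _metadata_deltacommit_totalBytesWritten.

```python
from typing import Dict, List


def tables_with_suspected_failures(
    fetched_events: Dict[str, int],
    bytes_written: Dict[str, int],
) -> List[str]:
    """Return tables that fetched events from Kafka but wrote nothing to Hudi.

    Both inputs are hypothetical per-table metric snapshots for one run:
    fetched_events -> number of events read from Kafka (metric not yet found)
    bytes_written  -> bytes committed to Hudi (assumed to be a metric such as
                      _metadata_deltacommit_totalBytesWritten)
    """
    suspects = []
    for table, fetched in fetched_events.items():
        # A table missing from bytes_written is treated as having written nothing.
        written = bytes_written.get(table, 0)
        if fetched > 0 and written == 0:
            suspects.append(table)
    return sorted(suspects)


if __name__ == "__main__":
    # Made-up sample values, only to show the shape of the check.
    fetched = {"orders": 120, "users": 45, "payments": 0}
    written = {"orders": 2048, "users": 0, "payments": 0}
    print(tables_with_suspected_failures(fetched, written))  # ['users']
```

In this sample, "payments" is not flagged because it had nothing to ingest, while "users" is flagged because events were fetched but nothing was written. In practice I would want to express the same rule as a Prometheus alert rather than in application code.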

Infra: AWS EMR cluster, Apache Kafka, Hudi
