I'm working with a raw event table that has two columns, `date` and `metadata`. `metadata` holds a raw JSON dump of all the event attributes. Note that each event sends a different set of attributes. I then need to ingest this data into Looker.
Raw table:
timestamp | metadata |
---|---|
2024-04-1 | {"type":"created","title":"test1","due":"2024-04-02","id":12345} |
2024-04-1 | {"type":"confirmed","id":12345} |
2024-04-1 | {"type":"completed","id":12345, "completedby":"johndoe"} |
Now I need to normalize it.
OPTION A:
timestamp | type | title | due_date | id | completed_by |
---|---|---|---|---|---|
2024-04-1 | created | test1 | 2024-04-02 | 12345 | |
2024-04-1 | confirmed | | | 12345 | |
2024-04-1 | completed | | | 12345 | johndoe |
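As a rough illustration of what Option A amounts to, here is a minimal Python sketch that parses each `metadata` value and leaves unsent attributes as `None`. The function name `flatten_event` and the key-to-column mapping `COLUMN_FOR_KEY` are assumptions made for this example, not part of any existing pipeline.

```python
import json

# Hypothetical mapping from JSON keys to output column names (an assumption).
COLUMN_FOR_KEY = {
    "type": "type",
    "title": "title",
    "due": "due_date",
    "id": "id",
    "completedby": "completed_by",
}


def flatten_event(timestamp, metadata):
    """Build one Option A row; attributes the event did not send stay None."""
    attrs = json.loads(metadata)
    row = {"timestamp": timestamp}
    for key, column in COLUMN_FOR_KEY.items():
        row[column] = attrs.get(key)  # .get returns None for a missing key
    return row


# The three rows from the raw table above.
raw_rows = [
    ("2024-04-1", '{"type":"created","title":"test1","due":"2024-04-02","id":12345}'),
    ("2024-04-1", '{"type":"confirmed","id":12345}'),
    ("2024-04-1", '{"type":"completed","id":12345, "completedby":"johndoe"}'),
]

option_a = [flatten_event(ts, md) for ts, md in raw_rows]
for row in option_a:
    print(row)
```

Each output row records only what that single event carried, so the `id` is always present while `title`, `due_date`, and `completed_by` are null whenever the event did not send them.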
OPTION B:
timestamp | type | title | due_date | id | completed_by |
---|---|---|---|---|---|
2024-04-1 | created | test1 | 2024-04-02 | 12345 | johndoe |
2024-04-1 | confirmed | test1 | 2024-04-02 | 12345 | johndoe |
2024-04-1 | completed | test1 | 2024-04-02 | 12345 | johndoe |
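To make Option B concrete, here is a minimal Python sketch that starts from sparse rows and fills every null with a value known for the same `id`. The helper `fill_by_id` and the list `ENTITY_COLUMNS` are hypothetical names chosen for this example, and the rule "last non-null value per id wins" is an assumption.

```python
# Sparse rows: None marks an attribute that the event did not send.
sparse_rows = [
    {"timestamp": "2024-04-1", "type": "created", "title": "test1",
     "due_date": "2024-04-02", "id": 12345, "completed_by": None},
    {"timestamp": "2024-04-1", "type": "confirmed", "title": None,
     "due_date": None, "id": 12345, "completed_by": None},
    {"timestamp": "2024-04-1", "type": "completed", "title": None,
     "due_date": None, "id": 12345, "completed_by": "johndoe"},
]

# Columns treated as describing the item rather than one event (assumption).
ENTITY_COLUMNS = ["title", "due_date", "completed_by"]


def fill_by_id(rows):
    """Return copies of rows with nulls filled from any event sharing the id."""
    # First pass: collect the known value of each column per id.
    known = {}
    for row in rows:
        attrs = known.setdefault(row["id"], {})
        for col in ENTITY_COLUMNS:
            if row[col] is not None:
                attrs[col] = row[col]  # a later event overwrites an earlier one
    # Second pass: fill the gaps without mutating the input rows.
    filled = []
    for row in rows:
        new_row = dict(row)
        for col in ENTITY_COLUMNS:
            if new_row[col] is None:
                new_row[col] = known[row["id"]].get(col)
        filled.append(new_row)
    return filled


option_b = fill_by_id(sparse_rows)
for row in option_b:
    print(row)
```

Note that this fills in both directions: the `created` row ends up with `completed_by` set to `johndoe`, a value that only became known at the later `completed` event, which matches the Option B table above.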
How should I design the table? Should I fill every row with all the information known for that id (Option B), or leave the attributes that an event did not send as nulls (Option A)?
Are the `created`, `confirmed`, and `completed` types different events? As in: John Doe created something at one moment in time, then confirmed it, and finally completed it? Do you want to treat these as different events?