We use PostgreSQL as our database. For backups, we store full cluster backups and WAL files to allow point-in-time recovery of our database.

Our WAL files take up quite a bit of space compared to our full backups, so we would like to inspect their contents and, more precisely, see which tables contribute the most to the WAL volume.

Question

Is there a way to inspect which tables/relations are targeted by the actions stored in a specific WAL file, and how many rows or records are affected?

2 Answers

pg_waldump is the tool to use for inspecting the contents of a WAL file. It lists the individual WAL records, that is, the low-level changes produced by DML and DDL statements, along with other events such as commits and checkpoints.

One example is below:

-bash-4.2$ pgbench -i
dropping old tables...
creating tables...
generating data (client-side)...
100000 of 100000 tuples (100%) done (elapsed 0.27 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 0.77 s (drop tables 0.03 s, create tables 0.04 s, client-side generate 0.34 s, vacuum 0.21 s, primary keys 0.15 s).
-bash-4.2$ psql -c "select * from pg_class where relname= 'pgbench_accounts'"
  oid  |     relname      | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | relallvisible | reltoastrelid | relhasindex | relisshared | relpersistence | relkind | relnatts | relch
ecks | relhasrules | relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated | relreplident | relispartition | relrewrite | relfrozenxid | relminmxid | relacl |    reloptions    | relpartbound 
-------+------------------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+------
-----+-------------+----------------+----------------+----------------+---------------------+----------------+--------------+----------------+------------+--------------+------------+--------+------------------+--------------
 16434 | pgbench_accounts |         2200 |   16436 |         0 |       10 |     2 |       16440 |             0 |     1640 |    100000 |          1640 |             0 | t           | f           | p              | r       |        4 |      
   0 | f           | f              | f              | f              | f                   | t              | d            | f              |          0 |          514 |          1 |        | {fillfactor=100} | 
(1 row)

-bash-4.2$ pg_waldump 00000001000000000000000* | grep 16440
rmgr: Storage     len (rec/tot):     42/    42, tx:        514, lsn: 0/02EC03B0, prev 0/02EC0380, desc: CREATE base/13255/16440
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC0CC0, prev 0/02EC0C78, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 0
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC2650, prev 0/02EC0CC0, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 1
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC3FC8, prev 0/02EC2650, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 2
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC5958, prev 0/02EC3FC8, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 3
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC72E8, prev 0/02EC5958, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 4
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC8C78, prev 0/02EC72E8, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 5
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02ECA608, prev 0/02EC8C78, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 6
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02ECBF80, prev 0/02ECA608, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 7
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02ECD910, prev 0/02ECBF80, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 8
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02ECF2A0, prev 0/02ECD910, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 9
rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02ED0C30, prev 0/02ECF2A0, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 10
<snip>

Bear in mind that WAL refers to tables by their relfilenode, not by their oid (look for 16440 in the example). In the blkref field, rel 1663/13255/16440 reads as tablespace OID/database OID/relfilenode.
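To get from this listing to a per-table volume, one option is to sum the total record length for each relation that appears in the output. The following is a minimal sketch, not part of pg_waldump itself: the function name `wal_bytes_per_relation` is made up for illustration, the sample lines are copied from the output above, and the regular expressions assume the text format shown there. Records without a block reference (the Storage CREATE record, commits, and so on) are skipped, and a record that touches several relations is attributed to the first one only.

```python
import re
from collections import defaultdict

# Matches the record length field, e.g. "len (rec/tot):   6515/  6515"
LEN_RE = re.compile(r"len \(rec/tot\):\s*(\d+)/\s*(\d+)")
# Matches a block reference, e.g. "rel 1663/13255/16440"
REL_RE = re.compile(r"rel (\d+)/(\d+)/(\d+)")


def wal_bytes_per_relation(lines):
    """Sum the total record length per (tablespace, database, relfilenode)."""
    totals = defaultdict(int)
    for line in lines:
        length = LEN_RE.search(line)
        rel = REL_RE.search(line)
        if not length or not rel:
            continue  # no length or no block reference: nothing to attribute
        key = tuple(int(part) for part in rel.groups())
        totals[key] += int(length.group(2))  # group(2) is the "tot" length
    return dict(totals)


# Sample lines taken from the pg_waldump output shown above.
SAMPLE = [
    "rmgr: Storage     len (rec/tot):     42/    42, tx:        514, lsn: 0/02EC03B0, prev 0/02EC0380, desc: CREATE base/13255/16440",
    "rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC0CC0, prev 0/02EC0C78, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 0",
    "rmgr: Heap2       len (rec/tot):   6515/  6515, tx:        514, lsn: 0/02EC2650, prev 0/02EC0CC0, desc: MULTI_INSERT+INIT 61 tuples flags 0x00, blkref #0: rel 1663/13255/16440 blk 1",
]

# Print relations sorted by WAL bytes, largest first.
for rel, size in sorted(wal_bytes_per_relation(SAMPLE).items(), key=lambda kv: -kv[1]):
    print("/".join(map(str, rel)), size)
```

On real data, the same function could be fed the lines of a saved pg_waldump output file. On a live server where the files have not been rewritten since, the resulting relfilenodes can usually be mapped back to names with `pg_filenode_relation(tablespace, filenode)`.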

More information about pg_waldump can be found in the documentation

  • thanks, your pointer on how to make sense of the rel xxx/yyy/zzz part definitely helped.
    – LeGEC
    Commented Oct 20, 2023 at 8:37
  • Isn't there a built-in option that translates those numbers to human-readable output by connecting to a known server? Do I have to roll out a script of my own?
    – LeGEC
    Commented Oct 20, 2023 at 8:38
  • No, you cannot automatically translate the WAL information to tables. That's what logical decoding does, but it cannot be done later. Commented Oct 20, 2023 at 8:41
  • @LaurenzAlbe: noted. In your answer you mention TRUNCATE, which will make a specific oid for a table obsolete, for example. Are there other regular actions which can change the oid of a table? Or can I consider that, if I don't truncate any tables, their oid will remain stable over time?
    – LeGEC
    Commented Oct 20, 2023 at 10:45
  • It doesn't change the OID, but the file. Table/index rewrite also happens during VACUUM (FULL), CLUSTER, REINDEX and some ALTER TABLE statements. Commented Oct 20, 2023 at 11:04

I would use a radically different approach to solve your problem. WAL is binary information, and the connection between WAL records and tables is tenuous: for example, if you TRUNCATE a table, it gets a different file, and you cannot figure out any more what table an old WAL entry belongs to.

But you can easily see how many rows get inserted, updated and deleted in each table:

SELECT relid::regclass, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables;

These counts are cumulative and include all activity since the last statistics reset or crash, so you would run the same query on two consecutive days (or weeks) and calculate the difference to get an idea of how many data modifications each table receives. The more data modifications, the more WAL.
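The difference between two runs can be computed in any language once the query results are saved. Here is a minimal sketch in Python, assuming each snapshot has been loaded into a dict that maps a table name to its (n_tup_ins, n_tup_upd, n_tup_del) tuple. The function name and the numbers in the example are made up for illustration.

```python
def modification_deltas(before, after):
    """Return per-table growth of (n_tup_ins, n_tup_upd, n_tup_del).

    Each snapshot maps a table name to its cumulative counters.
    Negative values indicate that the statistics were reset in between.
    """
    deltas = {}
    for table, new_counts in after.items():
        # A table created between the two snapshots starts from zero.
        old_counts = before.get(table, (0, 0, 0))
        deltas[table] = tuple(new - old for new, old in zip(new_counts, old_counts))
    return deltas


# Hypothetical snapshots taken a day apart.
day1 = {"pgbench_accounts": (100000, 0, 0)}
day2 = {"pgbench_accounts": (100000, 250, 10), "pgbench_history": (250, 0, 0)}

for table, (ins, upd, dele) in modification_deltas(day1, day2).items():
    print(table, ins, upd, dele)
```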

This is of course a very coarse estimate. For one, it will depend on the row size. You could estimate the row size with

SELECT schemaname, tablename, sum(avg_width) + 23 AS row_size
FROM pg_stats
GROUP BY schemaname, tablename;

You can factor that size into the calculation. Again, that row size is a very coarse estimate.
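One way to factor the row size in is to multiply the number of modified rows per table by the estimated row size and rank the tables by the result. This is only a sketch of that idea under simplifying assumptions: inserts, updates and deletes are weighted equally, although a delete writes far less WAL than an insert, and index and full-page-image overhead are ignored. The function name and the figures in the example are hypothetical.

```python
def rank_by_estimated_volume(modifications, row_sizes):
    """Rank tables by (modified rows * estimated row size), largest first.

    modifications: table name -> (inserts, updates, deletes) over some period
    row_sizes:     table name -> estimated row size in bytes
    """
    estimates = {}
    for table, counts in modifications.items():
        size = row_sizes.get(table)
        if size is None:
            continue  # no pg_stats entry, e.g. the table was never analyzed
        estimates[table] = sum(counts) * size
    return sorted(estimates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs: a few wide rows can outweigh many narrow ones.
modifications = {"wide_table": (10, 5, 0), "narrow_table": (100, 0, 0)}
row_sizes = {"wide_table": 200, "narrow_table": 20}

for table, estimate in rank_by_estimated_volume(modifications, row_sizes):
    print(table, estimate)
```

The ranking is more meaningful than the absolute numbers, which should not be read as actual WAL bytes.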

  • thanks for your input, we will start monitoring these statistics
    – LeGEC
    Commented Oct 20, 2023 at 12:51