Peertube log rotation does not release deleted files, inducing an infinite disk usage grow #6041

Open
JohnXLivingston opened this issue Nov 20, 2023 · 5 comments
Labels
Status: Blocked ✋ Somehow, somewhere *else*, something has gone very wrong. Until they fix it we're stuck. Type: Bug 🐛 Confirmed bug, at least replicated once by another contributor

Comments

@JohnXLivingston
Contributor

Describe the current behavior

Hi.

On one of my servers, I saw a huge disk usage leak. See the /srv/ usage:

[image: /srv disk usage]

After some searching, I saw that there was a huge difference between df and du: df reported 369G of used storage, but du only found 215G.
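The df/du comparison above can be scripted as a quick check. This is only a minimal sketch, not something from the original report: the function name `usage_gap_kib` is hypothetical, and it assumes the argument is the mount point of its own filesystem (such as /srv here) and that the caller can read the whole tree.

```shell
# Hypothetical helper: compare what the filesystem reports as used (df)
# with what is reachable through the directory tree (du), in KiB.
# Assumption: "$1" is a mount point, so both numbers cover the same filesystem.
usage_gap_kib() {
    dir=$1
    # df -P guarantees one output line per filesystem; column 3 is "Used".
    df_used=$(df -Pk "$dir" | awk 'NR==2 {print $3}')
    # du -x stays on this filesystem; unreadable entries are silently skipped.
    du_used=$(du -skx "$dir" 2>/dev/null | cut -f1)
    echo "df=${df_used}K du=${du_used}K gap=$((df_used - du_used))K"
}
```

For example, `usage_gap_kib /srv` run as root. A small gap is normal (filesystem metadata, reserved blocks); a gap of the size seen here, roughly 369G against 215G, points to space held by files that no longer appear in the tree.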

This kind of difference can be caused by deleted files that are still held open by processes.
I then remembered that whenever this server was rebooted, I had already seen Peertube release some space. I thought it was because of deleted temp files... until today. I only restarted Peertube, and this happened:

root@xxx:~# df -h
Filesystem             Size    Used Avail Use% Mounted on
/dev/mapper/vg1-srv    404G    369G   19G  96% /srv
root@xxx:~# systemctl restart peertube
root@xxx:~# df -h
Filesystem             Size    Used Avail Use% Mounted on
/dev/mapper/vg1-srv    404G    215G  173G  56% /srv

More than 150G were released!

Unfortunately, I did not think to check which deleted files were open before restarting...
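For anyone wanting to run that check before restarting, here is a rough sketch that lists the deleted files a process still holds open, with their sizes. It is an illustration under assumptions, not an official tool: the function name `deleted_open_files` is hypothetical, it assumes Linux with /proc and a `stat` that supports `-c` (GNU coreutils or BusyBox), and it must run as root or as the user owning the process.

```shell
# Hypothetical helper: print "fd<TAB>size<TAB>path (deleted)" for every
# file descriptor of PID "$1" that points to a deleted file.
# Assumption: Linux /proc layout, where the link target of a descriptor
# on an unlinked file ends with " (deleted)".
deleted_open_files() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        target=$(readlink "$fd") || continue
        case $target in
            *' (deleted)')
                # stat -L follows the descriptor to the still-allocated inode.
                size=$(stat -L -c '%s' "$fd" 2>/dev/null) || continue
                printf '%s\t%s\t%s\n' "${fd##*/}" "$size" "$target"
                ;;
        esac
    done
}
```

Usage would be `deleted_open_files THE_PID`. Several descriptors can point to the same deleted file, so the sizes should not be summed blindly.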

I checked on another Peertube instance:

root@yiny:~# lsof -p 2577772  | grep deleted
peertube 2577772 peertube   20w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube   28w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube   30w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube  126w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube  127w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube  145w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube  152w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube  161w      REG              253,0 12604465    432496 /var/www/peertube/storage/logs/peertube.log (deleted)
peertube 2577772 peertube  183w      REG              253,0 12583136    395665 /var/www/peertube/storage/logs/peertube17.log (deleted)

So I assume it is the log rotation system that keeps files open after rotation.
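To estimate how much space such descriptors pin, the lsof output can be summarized. The sketch below is an assumption-laden illustration: the function name `sum_deleted_lsof` is hypothetical, and it assumes the nine-column layout shown above (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME). It counts each file once, identified by device and inode.

```shell
# Hypothetical helper: read `lsof -p PID` output on stdin and report how many
# descriptors point to deleted files, how many distinct files that is,
# and how many bytes those files occupy.
# Assumption: default lsof columns, so $6=DEVICE, $7=SIZE/OFF, $8=NODE.
sum_deleted_lsof() {
    awk '
        / \(deleted\)$/ {
            fds++
            key = $6 ":" $8            # device + inode identify one file
            if (!(key in seen)) { seen[key] = 1; files++; bytes += $7 }
        }
        END { printf "%d fds, %d files, %.0f bytes\n", fds, files, bytes }
    '
}
```

Usage would be `lsof -p THE_PID | sum_deleted_lsof`. On the nine lines quoted above it reports `9 fds, 2 files, 25187601 bytes`: eight descriptors share the inode of the deleted peertube.log, and one points to peertube17.log.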

Note: the server where I freed 150G is a pre-prod instance, and its log level is debug.

Steps to reproduce

  1. Open a terminal on a production instance that has been running for several days
  2. Get the process ID of the Peertube process
  3. Run `lsof -p THE_PID | grep deleted`

Describe the expected behavior

No response

Additional information

  • PeerTube instance:
    • Version: 5.2
    • NodeJS version: 18.18.2
@Chocobozzz
Owner

Thanks for the issue. Seems to be a winston bug: winstonjs/winston#2100

@Chocobozzz Chocobozzz added Status: Blocked ✋ Somehow, somewhere *else*, something has gone very wrong. Until they fix it we're stuck. Type: Bug 🐛 Confirmed bug, at least replicated once by another contributor labels Nov 20, 2023
@JohnXLivingston
Contributor Author

> Thanks for the issue. Seems to be a winston bug: winstonjs/winston#2100

Makes sense. 1 year, and no fix :/

@JohnXLivingston
Contributor Author

For the record, I just noticed that on one of my instances, Peertube seems to be writing to several log files at the same time.

Sorting log files by last modification date:

-rw-r--r--  1 peertube peertube 12582943 23 nov.  20:13 peertube5.log
-rw-r--r--  1 peertube peertube 12583146 24 nov.  02:17 peertube6.log
drwxr-xr-x  2 peertube peertube     4096 24 nov.  02:17 .
-rw-r--r--  1 peertube peertube 12592284 24 nov.  09:48 peertube4.log
-rw-r--r--  1 peertube peertube  7190255 24 nov.  10:12 peertube7.log

As you can see, the modification times do not follow the file numbering: peertube4.log was written to after peertube5.log and peertube6.log.

Moreover, using lsof:

peertube 753 peertube  157w      REG              253,0 12582943  391734 /var/www/peertube/storage/logs/peertube5.log
peertube 753 peertube  163w      REG              253,0 12592284  393432 /var/www/peertube/storage/logs/peertube4.log
peertube 753 peertube  172w      REG              253,0 12592284  393432 /var/www/peertube/storage/logs/peertube4.log
peertube 753 peertube  176w      REG              253,0  7190255  504354 /var/www/peertube/storage/logs/peertube7.log

3 different files are still open.
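The same check can be scripted by counting write descriptors per log file. This is a sketch under assumptions: the function name `log_writers` is hypothetical, and it relies on the lsof column layout shown above, where the FD column ends in `w` (write) or `u` (read/write) and the file name is the ninth field.

```shell
# Hypothetical helper: read `lsof -p PID` output on stdin and print
# "count path" for every *.log file opened for writing.
# Assumption: default lsof columns, so $4=FD (e.g. "157w") and $9=NAME.
log_writers() {
    awk '
        $4 ~ /^[0-9]+[wu]$/ && $9 ~ /\.log$/ { n[$9]++ }
        END { for (f in n) printf "%d %s\n", n[f], f }
    ' | sort -k2
}
```

Usage would be `lsof -p THE_PID | log_writers`. On the four lines above it reports two write descriptors on peertube4.log and one each on peertube5.log and peertube7.log; more than one line of output means the process is writing to several log files at once.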

On this server, I use NodeJS v16 (it was v18 on the other instance).

@JohnXLivingston
Contributor Author

@Chocobozzz: FYI, it seems there is a fix that might help, and it ships with Peertube 6.0.x.
More info: winstonjs/winston#2100 (comment)

I will let you know if it works (I have to wait a few hours/days to test).

@JohnXLivingston
Contributor Author

@Chocobozzz, it seems the bug we have is not exactly the same as the one in winstonjs/winston#2100.
Or maybe there were 2 bugs: one that was fixed in Winston 3.11.0 (Peertube v6.0.x), and another that might be related to multithreading in Peertube.

As discussed with a Winston maintainer, I opened a new issue: winstonjs/winston#2393
If you have any clues or information, don't hesitate to comment.
