Handling medic-sentinel log backups

Describe the Bug
Our medic-os instance, especially the sentinel service, has been creating log backups that are filling up the servers (medic-data volume). Is there a standard way or service to purge these logs?

Expected Behavior
Logs should be rotated and compressed frequently. Alternatively, there should be a way to get rid of old logs when the volume usage reaches a certain threshold.

Environment

  • Client platform: Linux, Docker
  • App: sentinel
  • Version: 3.9.0

Hi @drono! Welcome to our community and thanks for posting this issue.

First, let’s take a look at what is flooding the logs. Can you run:
tail -200 /srv/storage/medic-sentinel/logs/medic-sentinel.log
and ensure Sentinel isn’t erring with the same output?
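
If the tail output is hard to scan, a rough way to see which lines repeat most is to count duplicates. This is only a sketch: the function name top_repeated_lines is made up for illustration, and it assumes repeated messages are identical line for line. If every line starts with a timestamp, you would need to strip that prefix first (for example with cut or sed) before counting.

```shell
# Hypothetical helper: show the 5 most repeated lines among the last N lines of a log.
# Assumes repeated messages are identical line for line.
top_repeated_lines() {
  file="$1"
  lines="${2:-200}"   # how many trailing lines to inspect (default 200)
  tail -n "$lines" "$file" | sort | uniq -c | sort -rn | head -n 5
}

# Example:
# top_repeated_lines /srv/storage/medic-sentinel/logs/medic-sentinel.log 200
```

Each output line is a count followed by the repeated text, with the most frequent line first.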

We use logrotate inside the container to compress and rotate logs. Let’s ensure that’s running appropriately:
/sbin/logrotate -v /etc/logrotate.conf

Logrotate should compress logs and keep 21 days of rotations, so I suspect the service was corrupted during a container restart. Some of your logs are far older than 21 days and very large, and it’s worth investigating why Sentinel is continually flooding the logs. Please review the verbose output of the logrotate command above closely, fix any errors it reports, and re-run the command.
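
To see which files have escaped the 21-day rotation, something like the following could help. It is a sketch, not part of medic-os: list_stale_logs is a hypothetical name, and the example directory is taken from the tail command earlier in this post.

```shell
# Hypothetical helper: list files under a directory last modified more than N days ago.
list_stale_logs() {
  dir="$1"
  days="${2:-21}"   # 21 matches the expected logrotate retention
  find "$dir" -type f -mtime +"$days"
}

# Example:
# list_stale_logs /srv/storage/medic-sentinel/logs 21
#
# A dry run of logrotate (debug mode, makes no changes) can also be useful:
# /sbin/logrotate -d /etc/logrotate.conf
```

Anything this prints is older than the retention window and is a sign that rotation has not been cleaning up as expected.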

While investigating the Sentinel error, you should shut the service off:
/boot/svc-stop medic-sentinel

For immediate resolution, here are the recommended steps:

  • Turn off medic-sentinel
  • Archive or delete the 10+ GB of old medic-sentinel log files
  • Run logrotate in verbose mode and watch the output for its run on the medic-sentinel logs
  • Fix any logrotate errors regarding permissions/ownership. Logrotate will save a new statefile and continue its 21-day rotation.
  • Turn medic-sentinel back on
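
The steps above can be sketched roughly as follows. Treat this as an illustration under assumptions rather than a tested procedure: list_large_logs is a hypothetical helper, the 1024 MB threshold is arbitrary, and /boot/svc-start is assumed to be the counterpart of the /boot/svc-stop command shown earlier. The commands that change anything are left commented out so nothing runs by accident.

```shell
# Hypothetical helper: list files under a directory larger than a size given in MB.
list_large_logs() {
  dir="$1"
  min_mb="$2"
  # Convert MB to KB because the "k" size suffix is widely supported by find.
  find "$dir" -type f -size +"$((min_mb * 1024))"k
}

# Example sequence:
# /boot/svc-stop medic-sentinel
# list_large_logs /srv/storage/medic-sentinel/logs 1024
#   ... archive or delete the files listed above ...
# /sbin/logrotate -v /etc/logrotate.conf
# /boot/svc-start medic-sentinel   # assumption: counterpart of svc-stop
```

Listing the files before deleting anything gives you a chance to archive whatever you still need for the Sentinel investigation.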

@hareet Thank you very much for the feedback. I managed to spot some outbound errors in the logs, which I think contributed most to the huge log files. I deleted the files and tested rotation, and it seems to work well.

For reference, we reduced the size of outbound error logging in 3.10.0 (see "The outbound error response logging is needlessly verbose", Issue #6024, medic/cht-core on GitHub).