Bug 274194 - sysutils/loki: New rc.d file for promtail
Summary: sysutils/loki: New rc.d file for promtail
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Only Me
Assignee: Robert Clausecker
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-10-01 08:16 UTC by Eren Türkay
Modified: 2023-10-19 07:18 UTC (History)
3 users (show)

See Also:
freebsd: maintainer-feedback-


Attachments
0001-sysutils-loki-add-promtail-rc.d-script (6.88 KB, patch)
2023-10-07 06:56 UTC, Eren Türkay
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eren Türkay 2023-10-01 08:16:31 UTC
Hello,

Thank you for packaging loki in bug #256030. I realized that loki and promtail binaries are installed with the package but only loki has an rc.d script. For most cases, when you install loki as a centralized log server, you only configure and run promtail to read/send logs from other nodes.

I would love to take initiative on this issue and this will be my first contribution to FreeBSD community. I can simply copy .in file for promtail and send a patch but I wanted to make sure we discuss other considerations such as default configuration file and user to run.


Default Files Provided by Grafana
=================================
On official release page, Grafana team provides pre-built binaries and configuration files. For Debian/Ubuntu systems [0], the default configuration file looks like this:


```
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
- job_name: system
  static_configs:
  - targets:
      - localhost
    labels:
      job: varlogs
      __path__: /var/log/*log
```

Which I believe it's OK configuration file. When you run promtail, it will start reading logs out of the box and you will be able to extend it. However, the problem is that their systemd file runs with user `promtail`. As you can guess, this user will not be able to read /var/log/. Whenever I installed promtail, I always changed systemd service to run as root because I need logs in /var/log/ along with application specific logs.

The application does not crash. You simply see a number of error messages in promtail logs. But generating error messages after the installation is not a good default behavior.

Also, `positions.yml` is too important to put into /tmp/. After a system reboot, promtail will not be able to know which line it read last. So, putting it into /usr/local/etc/promtail/ makes more sense.


[0] https://github.com/grafana/loki/releases/download/v2.9.1/promtail_2.9.1_amd64.deb


Which User To Run On
====================
I do not see a problem with providing default configuration like above (with fixes) and running log aggregator as root because you mostly want to run as root. Usually, there are different users for different applications, each writing to different directory, and you need to read all application logs, label them, and send it.

However, if that causes a security concern within FreeBSD, we can simply comment the example configuration file and put a note in there. Alternatively, we can write a message after package install, reminding that `promtail_user`, `promtail_group` is present.

The configuration file above is only an example, provided by official grafana team. We are not bound to use it as-is, we can also think of adding additional defaults based on user needs.

Thank you for your time reading the report.

Best,
Eren
Comment 1 Christopher Beppler 2023-10-03 18:49:58 UTC
Hi Eren,

Thank you for taking initiative!

I like the idea of also providing a rc script for promtail. Since we shouldn't deviate from upstream too much, I'd propose to use https://github.com/grafana/loki/blob/v2.9.1/clients/cmd/promtail/promtail-local-config.yaml as default configuration (adapted to FreeBSD paths where necessary, see below)

I'd see positions.yaml somewhere around /usr/local/var/db/loki. You're right, it's a best a huge performance penalty if positions.yaml state file gets lost on reboot.

If promtail is later configured by the user to run as root is up to the user. Alternatively, the loki user could be put into the right groups to get read access to the log files. That's not in scope of this port - we try to stay as close as possible to upstream. If we can get the upstream promtail-local-config.yaml running, we're good. The user will have to read through the documentation to configure it properly anyways :-).

I plan to work on the version bump in the next days. If you attach a patch for the current ports version (2.8.1), I am happy to submit it all together.

Thanks
Christopher
Comment 2 Christopher Beppler 2023-10-03 18:51:18 UTC
* /var/db/loki or /var/db/promtail - the /usr/local is not correct.. Sorry.
Comment 3 Eren Türkay 2023-10-04 13:48:25 UTC
Hey Christopher,

Thanks for the reply. I agree with staying close to upstream and using the default configuration file. In that case, I'm planning on following:

- Add default configuration file with /var/db/promtail/positions.yml
- Add promtail user/group
- Add rc.d script
- Run promtail under promtail user (upstream system.d file also runs with promtail)

The only downside is that promtail will not be able to read /var/log/ as provided by default but it will not crash. As you mentioned this will not be the scope of the work, I will leave that as-is and we can later improve the default behavior.

I can send this patch in a few days, probably on the weekend. If you happen to bump the version earlier, we can add it later.

Best,
Eren
Comment 4 Eren Türkay 2023-10-07 06:56:40 UTC
Created attachment 245475 [details]
0001-sysutils-loki-add-promtail-rc.d-script

Hello,

The patch is ready and tested but I've came across a few issues. 2 of them are port related (I used portfmt and portlint), and the last one is configuration/daemon behavior.



SHA2 Mismatch (hashicorp-consul-v1.5.1_GH0.tar.gz)
=============
I got checksum error on this file. I do not think it should change but I had to build the package with `NO_CHECKSUM=yes` to proceed. When I manually download the file, the hash in `distinfo` does not match the one I download. Since I do not know what's going on with the hash, I did not include it in the patch.

```
$ wget 'https://codeload.github.com/hashicorp/consul/tar.gz/v1.5.1?dummy=/hashicorp-consul-v1.5.1_GH0.tar.gz' -O hashicorp-consul-v1.5.1_GH0.tar.gz
$ sha256sum hashicorp-consul-v1.5.1_GH0.tar.gz
7d9a318ca5fe3c5f35adb950750e92dff45a67d5fe37d447f173b4f0f99c095f  hashicorp-consul-v1.5.1_GH0.tar.gz
```


Portlint Messages
=================
I generated the patch using `make makepatch` but I still get a warning. I know portlint can give some false positives but I still wanted to let you know:


```
WARN: Makefile: possible use of absolute pathname "/seed".
WARN: Makefile: possible use of absolute pathname "/var/db/${PORTNAME}".
WARN: Makefile: possible use of absolute pathname "/var/db/promtail".
WARN: /usr/home/ec2-user/ports/sysutils/loki/files/patch-cmd-loki-loki-local-config.yaml: patch was not generated using ``make makepatch''.  It is recommended to use ``make makepatch'' when you need to [re-]generate a patch to ensure proper patch format.
WARN: /usr/home/ec2-user/ports/sysutils/loki/files/patch-promtail-local-config.yaml: patch was not generated using ``make makepatch''.  It is recommended to use ``make makepatch'' when you need to [re-]generate a patch to ensure proper patch format.
0 fatal errors and 5 warnings found.
```


Promtail Stop Behavior
=====================
This one is interesting. Since we include upstream default configuration and I do not have loki server installed, promtail has a long retry configuration. Once it reads the file, it cannot find the loki server and starts retrying. When you want to stop the daemon, you wait on `waiting for PIDs` message.

I waited for a couple of minutes at first, the daemon did not stop and I interrupted it to stop it again. I thought it was a problem with the rc.d script but it turns out it's related with retry configuration.

The defaults are described here: https://grafana.com/docs/loki/latest/send-data/promtail/configuration/#clients

min_period: 500ms
max_period: 5m
max_retries: 10

This means you need to wait at least 5 minutes before promtail finishes retries. I have confirmed it by lowering those values and seeing that max_retries need to be reached before stopping the server. So, if for any reason promtail cannot reach loki and retries, it should finish first. The sad thing is that you do not see how many retries are left in the log file, you just see `will retry` message and it can be confusing.

As someone worked on this package, I know this information but daemon stop behavior can be hard to spot. Do you think it's appropriate to add a note somewhere and refer to the official documentation?

Regards,
Eren
Comment 5 Christopher Beppler 2023-10-14 20:25:00 UTC
Hi Eren,

Thank you very much. I am right now working on the 2.9.1 port. This should at least get rid of the SHA256SUM errors you mentioned.

Will let you know once I incorporated your patch for 2.9.1.

Best,
Christopher
Comment 6 Christopher Beppler 2023-10-14 22:44:56 UTC
Hi Eren,

Thank you very much. I am right now working on the 2.9.1 port. This should at least get rid of the SHA256SUM errors you mentioned.

Will let you know once I incorporated your patch for 2.9.1.

Best,
Christopher
Comment 7 Christopher Beppler 2023-10-14 22:45:15 UTC
Hi Eren,

I've made a few small Makefile adjustments (2.9.1 related) to the patch that you submitted and added it to the version bump to 2.9.1 in bug #274473.
It worked like a charm.

Thanks again for your contribution.

Christopher

This bug can be closed.
Comment 8 Robert Clausecker freebsd_committer freebsd_triage 2023-10-16 05:20:09 UTC
This will be fixed through the patches attached to bug #274473.
Comment 9 commit-hook freebsd_committer freebsd_triage 2023-10-19 07:09:32 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=b16c0ca7d04e67ba0dd9e2ff7a5cde372a79171b

commit b16c0ca7d04e67ba0dd9e2ff7a5cde372a79171b
Author:     Eren Türkay <turkay.eren@gmail.com>
AuthorDate: 2023-10-06 09:46:13 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2023-10-19 07:07:26 +0000

    sysutils/loki: add promtail rc.d script

    Promtail is a log reader and it can be run as a service to gather the
    logs to be sent to remote loki server. For nodes that need to send logs
    only, promtail service and configuration file is needed

    PR:             274194, 274473
    Signed-off-by:  Christopher Beppler <freebsd@funzi.org>
    Approved by:    Chirstopher Beppler <freebsd@funzi.org> (maintainer)

 UIDs                                               |  2 +-
 sysutils/loki/Makefile                             | 13 +++-
 .../files/patch-promtail-local-config.yaml (new)   | 11 ++++
 sysutils/loki/files/promtail.in (new)              | 70 ++++++++++++++++++++++
 sysutils/loki/pkg-plist                            |  4 ++
 5 files changed, 96 insertions(+), 4 deletions(-)
Comment 10 Robert Clausecker freebsd_committer freebsd_triage 2023-10-19 07:18:05 UTC
Thank you for your contribution.