Summary: | prometheus sysctl exporter asserts with nvidia driver loaded | |
---|---|---|---
Product: | Base System | Reporter: | Nikolai Lifanov <lifanov>
Component: | bin | Assignee: | Ed Schouten <ed>
Status: | Closed FIXED | |
Severity: | Affects Only Me | CC: | emaste
Priority: | --- | |
Version: | CURRENT | |
Hardware: | Any | |
OS: | Any | |
Attachments: | | |

Description
Nikolai Lifanov
2017-07-26 19:37:01 UTC
Thanks for reporting! I suspect that the nvidia kernel module (which I don't use myself, unfortunately), exports a sysctl that has a weird character (e.g., a dash) in its name. These metrics cannot be exported as proper metrics. Could you send me the output of 'sysctl -a' prior to and after loading nvidia.ko? If you're not comfortable with sharing all of that info, the diff between the output is sufficient. Thanks!

Actually, it doesn't seem to be nvidia at all:

    # sysctl -Na | grep -vE '^([a-z]|[A-Z]|[0-9]|_|\.|\%)+$'
    kern.timecounter.tc.ACPI-fast.quality
    kern.timecounter.tc.ACPI-fast.frequency
    kern.timecounter.tc.ACPI-fast.counter
    kern.timecounter.tc.ACPI-fast.mask
    kern.timecounter.tc.TSC-low.quality
    kern.timecounter.tc.TSC-low.frequency
    kern.timecounter.tc.TSC-low.counter
    kern.timecounter.tc.TSC-low.mask

Yes, but for those specific metrics we already have hints in place to map the timer names (e.g., "ACPI-fast") into labels, meaning they are allowed to contain dashes. That said, could you please attach a diff between 'sysctl -a' output before/after loading nvidia.ko?

I just tested it without nvidia kernel module loaded and it's still failing:

    $ prometheus_sysctl_exporter -dgh
    Assertion failed: (name[strspn(name, "abcdefghijklmnopqrstuvwxyz" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "0123456789_")] == '\0'), function oidname_print, file /usr/src/usr.sbin/prometheus_sysctl_exporter/prometheus_sysctl_exporter.c, line 390.
    Abort trap (core dumped)

I'm going to attach sysctl -Na.

Created attachment 184807 [details]
without nvidia.ko
Created attachment 184808 [details]
with nvidia.ko
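
The condition in the assertion quoted above can be read as a standalone check. The following is a minimal sketch, not code taken from prometheus_sysctl_exporter.c: the helper name `name_is_clean` and the constant `ALLOWED_CHARS` are hypothetical, and only the character set and the `strspn` expression come from the assertion message.

```c
#include <stdbool.h>
#include <string.h>

/* Character set copied from the assertion message in this report. */
static const char ALLOWED_CHARS[] =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789_";

/*
 * Hypothetical helper (not part of the exporter): returns true when every
 * character of name is in ALLOWED_CHARS. strspn() returns the length of the
 * leading run of allowed characters, so the name is clean exactly when that
 * run ends at the terminating NUL.
 */
static bool
name_is_clean(const char *name)
{
	return (name[strspn(name, ALLOWED_CHARS)] == '\0');
}
```

Under this check, a component such as `%domain` or `ACPI-fast` is rejected at its first disallowed character, which is the situation in which the assertion would abort.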
I made a mistake previously: it fails both with and without nvidia module loaded.

Hmmm... Odd. Looking at the sysctl output, I can't think of any sysctls that would cause this. You're running "prometheus_sysctl_exporter -dgh". The disadvantage of the -g and -h flags is that it causes the prometheus_sysctl_exporter to buffer output prior to printing it. If you were to run "prometheus_sysctl_exporter -d", it should still crash, but that allows you to get the name of the sysctl entry right before the one causing the crash. Could you please give me the output of that?

Here are the last few lines of just -d:

    # HELP sysctl_kstat_zfs_misc_zio_trim_bytes Number of bytes successfully TRIMmed
    sysctl_kstat_zfs_misc_zio_trim_bytes 0
    sysctl_kstat_zfs_misc_metaslab_trace_stats_metaslab_trace_over_limit 0
    Assertion failed: (name[strspn(name, "abcdefghijklmnopqrstuvwxyz" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "0123456789_")] == '\0'), function oidname_print, file /usr/src/usr.sbin/prometheus_sysctl_exporter/prometheus_sysctl_exporter.c, line 390.
    sysctl_dev_umsAbort trap (core dumped)

This is the output from the one just after this one:

    $ sysctl hptmv.status
    hptmv.status: RocketRAID 18xx SATA Controller driver Version v1.16

Thanks for pasting the output. That was very helpful. In your case, it tried to export dev.${driver}.${index}.%domain, which fails due to the % being present. I've just committed a fix to convert such characters to underscores. Can you let me know whether >=r321678 works for you?

It works for me now. Thank you!

Awesome! Enjoy!
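
For readers curious what "convert such characters to underscores" might look like, here is a minimal sketch. It is an assumption-laden illustration, not the change committed in r321678: the function name `sanitize_component` is hypothetical, and the allowed character set is simply the one from the assertion message.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the committed fix): rewrites name in place,
 * replacing every character outside [A-Za-z0-9_] with an underscore so
 * that the result would pass the strspn()-based assertion.
 */
static void
sanitize_component(char *name)
{
	static const char allowed[] =
	    "abcdefghijklmnopqrstuvwxyz"
	    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	    "0123456789_";
	size_t i;

	/* The loop stops at the NUL, so strchr() never sees '\0' here. */
	for (i = 0; name[i] != '\0'; i++) {
		if (strchr(allowed, name[i]) == NULL)
			name[i] = '_';
	}
}
```

With this approach a component such as `%domain` would become `_domain`. One trade-off of any such mapping is that two distinct sysctl names can collapse to the same metric name.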