Bug 255405 - EventStats macros: unknown error handler name 'fallback:iso-8859-1'
Status: In Progress
Alias: None
Product: Services
Classification: Unclassified
Component: Wiki
Version: unspecified
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Kubilay Kocak
Depends on:
Reported: 2021-04-26 01:05 UTC by Jethro Nederhof
Modified: 2021-05-17 10:43 UTC

script to populate hitcount caches for pages (2.92 KB, text/plain)
2021-04-26 12:10 UTC, Jethro Nederhof
Updated cold start script. (3.80 KB, text/plain)
2021-05-17 10:14 UTC, Jethro Nederhof

Description Jethro Nederhof 2021-04-26 01:05:24 UTC
Moin has a feature to keep track of views of each page and store stats for them.
This enables producing reports like the following:

On the equivalent FreeBSD wiki pages (https://wiki.freebsd.org/PageHits, https://wiki.freebsd.org/EventStats/HitCounts), the requests seem to time out and an error is produced:
<<StatsChart: execution failed [unknown error handler name 'fallback:iso-8859-1'] (see also the log)>> 

This data would be useful for populating the "Popular" section on the wiki homepage at https://wiki.freebsd.org/
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2021-04-26 01:31:01 UTC
werkzeug behaviour change, fixed in moin (upstream didn't resolve it there [1]) 

[1] https://github.com/pallets/werkzeug/issues/1706#issuecomment-578552492
Comment 2 Kubilay Kocak freebsd_committer freebsd_triage 2021-04-26 02:25:28 UTC
Error is now sorted out (pkg upgrade on wiki instance).

However, using the macro now results in gateway timeouts. Need to investigate potential workarounds or solutions (if there are any).
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2021-04-26 02:43:40 UTC
See Also:


Relevant part:

The statistics stuff (EventStats, PageHits) is reading data/event-log. That file is growing over time and big event-logs slow down the statistics stuff. So if you are not interested in the stats from 2 years ago, you maybe want to rotate that log for performance reasons. You could even just truncate event-log to 0 bytes if you don't mind your statistics stuff starting from scratch. 

Our event-log is not small.

I wonder if there's a way to process this offline (cli)
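
A minimal sketch of what trimming the log offline could look like, in plain Python with no moin imports. It assumes each event-log line is tab-separated and starts with a timestamp in microseconds since the epoch; that layout and the name trim_event_log are assumptions for illustration and should be checked against the real file first.

```python
import time

USEC = 1_000_000  # assumed unit: microseconds per second


def trim_event_log(lines, keep_days, now=None):
    """Yield only the lines whose leading timestamp is within keep_days."""
    now = time.time() if now is None else now
    cutoff_usecs = int((now - keep_days * 86400) * USEC)
    for line in lines:
        stamp = line.split("\t", 1)[0]
        # Lines that cannot be parsed are kept rather than silently dropped.
        if not stamp.isdigit() or int(stamp) >= cutoff_usecs:
            yield line
```

The kept lines would be written to a new file and swapped in for data/event-log while the wiki is not writing to it.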
Comment 4 Jethro Nederhof 2021-04-26 12:10:52 UTC
Created attachment 224440 [details]
script to populate hitcount caches for pages

I haven't been able to test it properly, but if I'm reading the moin code right, something like this should do that.
Comment 5 Kubilay Kocak freebsd_committer freebsd_triage 2021-05-05 03:41:00 UTC
The stats gathering mechanism processes the hits/stats file backward in time until the last cached timestamp is reached. We ran a test of attachment 224440 and it failed at the aggregation stage. It needs a tweak so we can complete a full run.
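
The backward scan described above can be sketched roughly as follows. This is an illustration only, not moin's actual code: the function name new_entries, the tab-separated layout, and the integer timestamp in the first column are all assumptions.

```python
def new_entries(lines, cached_usecs):
    """Return the rows newer than cached_usecs, newest first.

    `lines` must be ordered oldest to newest, as in an append-only log.
    """
    fresh = []
    for line in reversed(lines):
        fields = line.rstrip("\n").split("\t")
        if not fields[0].isdigit():
            continue  # skip malformed rows
        if int(fields[0]) <= cached_usecs:
            break  # everything older is already covered by the cache
        fresh.append(fields)
    return fresh
```

With no cached timestamp (represented here as 0) the loop never breaks and reads the whole log, which would be consistent with the timeouts seen on a large event-log.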
Comment 6 Jethro Nederhof 2021-05-17 10:14:26 UTC
Created attachment 225020 [details]
Updated cold start script.

- Skip older entries so that only the last 90 days are processed
- Only count stats for pages that still exist
- Handle the error case where the TSV row randomly has additional columns
- Print some progress messages
- Force some garbage collection in case that helps the memory issue
- Add a report of popular pages that aren't found any more (dead links = good redirect candidates)
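
For readers without access to the attachment, here is a rough standalone sketch of the counting steps listed above. It is not the attached script. It assumes rows of the form timestamp, event type, URL-encoded key/value pairs, separated by tabs, with page views logged as VIEWPAGE and the page stored under a pagename key; count_hits and existing_pages are hypothetical names.

```python
from collections import Counter
from urllib.parse import parse_qs


def count_hits(lines, cutoff_usecs, existing_pages):
    """Count VIEWPAGE events at or after cutoff_usecs.

    Returns (hits, missing): counts for pages in existing_pages, and
    counts for pages that are no longer found.
    """
    hits, missing = Counter(), Counter()
    for number, line in enumerate(lines, 1):
        if number % 100000 == 0:
            print("processed", number, "lines")  # progress message
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # malformed row
        # Use only the first three columns and ignore any extras.
        stamp, event, pairs = fields[:3]
        if int(stamp) < cutoff_usecs or event != "VIEWPAGE":
            continue
        names = parse_qs(pairs).get("pagename")
        if not names:
            continue
        target = hits if names[0] in existing_pages else missing
        target[names[0]] += 1
    return hits, missing
```

Calling missing.most_common(20) on the second result would give the list of popular pages that no longer exist.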

Totally forgot about this, apologies for the delay!
Comment 7 Kubilay Kocak freebsd_committer freebsd_triage 2021-05-17 10:43:16 UTC
No apologies necessary, thank you Jethro :)