Bug 193694 - non-ASCII characters lost on the way from commit-hook into Bugzilla
Status: Open
Alias: None
Product: Services
Classification: Unclassified
Component: Core Infrastructure
Version: unspecified
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Peter Wemm
URL:
Keywords: easy, needs-patch, needs-qa
Depends on:
Blocks:
 
Reported: 2014-09-16 21:24 UTC by Matthias Andree
Modified: 2019-06-15 03:08 UTC
Description Matthias Andree freebsd_committer 2014-09-16 21:24:57 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193682#c4 contains a comment added by the ports commit hook.  The changelog contained one non-ASCII character, encoded as UTF-8 (the "ö" in "Siebörger" on the Submitted by: line). It is present and properly encoded in the commit log mailed out to the lists, but has been replaced by a question mark in the bug's comment.  The bug page rendered by Bugzilla also claims to be encoded as UTF-8, so it's not clear why the umlaut got lost.

Please change the bug tracker so that UTF-8 characters get properly recorded and rendered.
Comment 1 Matthias Andree freebsd_committer 2014-09-16 21:25:22 UTC
That should have been https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193682#c3
Comment 2 Marcus von Appen freebsd_committer freebsd_triage 2014-09-18 09:21:48 UTC
This might be caused by several issues.
The notifier script, for example, does not declare a content encoding in the mail it generates, so the mail might simply be re-encoded to play it safe when it is picked up.
I'm adding portmgr@ to keep them informed about the test outcome.
Comment 3 Marcus von Appen freebsd_committer freebsd_triage 2014-09-22 10:00:50 UTC
Local tests show that it is related to the (missing) content encoding.

@portmgr: Can we assume commit messages to be UTF-8 encoded and add a 

  echo 'Content-Type: text/plain; charset="UTF-8"'

into hooks/scripts/notify_bz.sh? Otherwise the content encoding is guessed randomly, with a fallback to the executing user's locale.
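
For illustration, a minimal sketch of where such a line would sit when a script prints a mail message. The actual contents of notify_bz.sh are not shown in this report, so the function name `build_mail`, its arguments, and the example addresses are hypothetical. The `MIME-Version` and `Content-Transfer-Encoding` headers go beyond the single line proposed above; they are included on the assumption that the receiving side expects a complete MIME declaration.

```shell
#!/bin/sh
# Hypothetical sketch only: the real hooks/scripts/notify_bz.sh is not shown
# in this report. build_mail and its arguments are made-up names.

# Write a complete mail message to stdout, declaring the body as UTF-8.
# The subject is assumed to be plain ASCII; a non-ASCII subject would need
# RFC 2047 encoding, which this sketch does not attempt.
build_mail() {
    from=$1
    to=$2
    subject=$3
    body=$4
    printf '%s\n' "From: ${from}"
    printf '%s\n' "To: ${to}"
    printf '%s\n' "Subject: ${subject}"
    printf '%s\n' 'MIME-Version: 1.0'
    printf '%s\n' 'Content-Type: text/plain; charset="UTF-8"'
    printf '%s\n' 'Content-Transfer-Encoding: 8bit'
    # An empty line separates the headers from the body.
    printf '\n'
    printf '%s\n' "${body}"
}
```

Calling `build_mail 'hook@example.org' 'bugzilla@example.org' 'commit notification' "$log"` prints the message with the charset declared, so the reader of the mail no longer has to guess the encoding of the log text.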
Comment 4 Matthias Andree freebsd_committer 2014-09-22 18:33:16 UTC
Internally, Subversion stores everything as UTF-8-encoded Unicode, or so the SVN book (svnbook.red-bean.com) claims:

<http://svnbook.red-bean.com/en/1.7/svn.tour.importing.html#svn.tour.importing.naming>

Now the client (svn) will re-encode according to the locale setting; I'm not sure what svnlook does.  So in order to play it safe, you'll probably want to add

  export LANG=en_US.UTF-8

or, more radically and to the point,

  export LC_ALL=en_US.UTF-8

to notify_bz.sh to enforce the declared encoding (C.UTF-8 or POSIX.UTF-8 makes svn complain that it cannot set LC_CTYPE).

Please test what svnlook renders; I don't currently have a server-side repo at hand.
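
One way to run that test is to force the locale and then check whether the log text is valid UTF-8. The following is a rough sketch under stated assumptions: the helper name `is_utf8` is made up, and using `iconv` as a validator is a choice made here, not something notify_bz.sh is known to do.

```shell
#!/bin/sh
# Sketch only: is_utf8 is a hypothetical helper, not part of notify_bz.sh.

# Force a UTF-8 locale, as suggested above, before any svn tool runs.
export LC_ALL=en_US.UTF-8

# Succeed if the data on stdin is valid UTF-8, fail otherwise.
# Converting UTF-8 to UTF-8 makes iconv reject invalid byte sequences.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}
```

On a server-side repository, something like `svnlook log -r "$REV" "$REPO" | is_utf8 && echo valid` would then report whether the log message comes out as UTF-8; `REV` and `REPO` are placeholders for a revision number and a repository path.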
Comment 5 Marcus von Appen freebsd_committer freebsd_triage 2014-10-15 06:25:42 UTC
What's the status of this? Did anyone from portmgr@ look into the necessary adjustments for notify_bz.sh?
Comment 6 Marcus von Appen freebsd_committer freebsd_triage 2014-11-13 06:52:16 UTC
portmgr@: is there any progress on this issue?
Comment 7 Matthias Andree freebsd_committer 2014-11-18 22:14:53 UTC
If it's script adjustments, we need to get bugmeister on the hook.  Not sure if we can expect much help from portmgr@, so let's just try bugmeister.
Comment 8 Matthias Andree freebsd_committer 2014-11-18 22:16:40 UTC
Sorry - I see that bugmeister reassigned to portmgr in September already. Reverting my changes.
Comment 9 Marcus von Appen freebsd_committer freebsd_triage 2014-12-14 09:04:51 UTC
Any news on this?
Comment 10 Matthias Andree freebsd_committer 2014-12-14 12:45:53 UTC
Is this an area where non-maintainer commits get reverted? 

Else it's time for someone else to invoke maintainer timeout and take action.
Comment 11 Antoine Brodin freebsd_committer 2014-12-14 12:48:25 UTC
I don't see any patch provided in this bug report, so why would a timeout be invoked?
Comment 12 Antoine Brodin freebsd_committer 2014-12-14 13:09:22 UTC
Also, this probably affects base and docs; it's not specific to ports, so I'm not sure portmgr is the right contact (maybe svnadm@ / peter@).
Comment 13 Marcus von Appen freebsd_committer freebsd_triage 2014-12-14 20:01:32 UTC
(In reply to Antoine Brodin from comment #12)
> Also, this probably affects base and docs; it's not specific to ports, so
> I'm not sure portmgr is the right contact (maybe svnadm@ / peter@).

Yes, it will affect all commits that also write something into Bugzilla. So the problem will be the same for all source trees, be it ports, doc or base. The issue is (in my opinion) simple to fix (see comment #3), and the fix would "only" have an impact on Bugzilla comments. Should we give a fix a go?
Comment 14 Mathieu Arnold freebsd_committer 2015-01-21 13:39:55 UTC
Got bitten by it in https://bugs.freebsd.org/196964, so adding myself to the CC.
Comment 15 Kubilay Kocak freebsd_committer freebsd_triage 2015-11-02 07:24:00 UTC
@Marcus, is your suggested code addition in comment 3 still valid? If so, I can add a patch here. If not, it would be great if someone else could.

Who is the maintainer/owner of the SVN hook scripts? We should assign the Product/Component/Assignee accordingly.
Comment 16 Kubilay Kocak freebsd_committer freebsd_triage 2015-11-02 07:28:44 UTC
Spoke to Peter on IRC, over to him (and clusteradm). Thanks Pete!
Comment 17 Peter Wemm freebsd_committer freebsd_triage 2015-11-02 07:30:03 UTC
I was planning to make some adjustments to the way Bugzilla receives email; I'll take care of this as well.
Comment 18 Yuri Victorovich freebsd_committer 2019-06-15 03:08:32 UTC
This is still a problem.