When a file contains an invalid UTF-8 character, for instance sys/dev/ata/atapi-cd.c (from stable/9), vi(1) reports:
    sys/dev/ata/atapi-cd.c: unmodified: line 1; Conversion error on line 2
Searching for "geomf" in the file then gives no results, although it should.
(In reply to Xin LI from comment #0) We have known about this type of issue for quite a long time, but I don't have a desired behavior in mind. For now, the workaround is to set the correct locale:
    env LC_CTYPE=en_US.ISO8859-1 nvi /usr/src/sys/dev/ata/atapi-cd.c
or to make use of the 8-bit mode:
    env LC_CTYPE=C nvi /usr/src/sys/dev/ata/atapi-cd.c
A specific encoding can be set after the file is loaded, with ":se fe=iso-8859-1", but 8-bit mode cannot be enabled that way (unfortunately, because of a display-related bug that I cannot solve).
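For readers unfamiliar with why the locale matters here, the following is a minimal sketch in Python, not nvi code. It assumes a hypothetical header line containing an "o" with umlaut stored as the single ISO-8859-1 byte 0xF6 (the sample text and the helper name try_decode are made up for illustration). The byte sequence is not valid UTF-8, so a UTF-8 conversion fails, while ISO-8859-1 maps every byte to a character and therefore always succeeds.

```python
# Hypothetical sample line: "o" with umlaut stored as one ISO-8859-1 byte (0xF6).
raw_line = b"# Author: J\xf6rg\n"


def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Under UTF-8 the line cannot be converted; under ISO-8859-1 it can.
print(try_decode(raw_line, "utf-8"))
print(try_decode(raw_line, "iso-8859-1"))
```

Because ISO-8859-1 is a one-byte-per-character encoding, decoding and re-encoding returns the original bytes unchanged, which is why choosing it (or an 8-bit locale) avoids the conversion error for this kind of file.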
(In reply to Xin LI from comment #0) I forgot to address your original question: why the search does not go past the first defective line. The whole story is much worse than this: if you write the file, nothing is written after line 2: https://github.com/lichray/nvi2/issues/12 For now I have left both behaviors "consistently" awful.
(In reply to lichray from comment #2) Well, data corruption is much more serious than merely not having search working. I think vi should probably ask the user whether they want to reload the file in the 'C' locale when it encounters a conversion error, and quit if the user chooses not to.
(In reply to Xin LI from comment #3) It's not quite "data corruption", since an error is shown and your data is not immediately lost: just switch to the correct encoding (at runtime, after the error is shown) with ":se fe=iso-8859-1" and write again, and your data is back. Reload the file, as if with ":e"? Sounds interesting. Added to the GitHub issue. But I need to implement the raw-write first, otherwise the data really is lost. The change itself does not solve the problem. For example, when you open a slightly larger file and the conversion error is close to the end of the file, no error is shown during editing (a line is only converted when it is needed); the error only happens when writing.
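Since the conversion error may not show up until the file is written, one way to catch it earlier is to scan the file before opening it. The following is a minimal sketch in Python under that assumption; it is not part of nvi or nvi2, and the function name first_bad_line is hypothetical. It reads the file as raw bytes and reports the first line that does not decode in the given encoding.

```python
def first_bad_line(path, encoding="utf-8"):
    """Return the 1-based number of the first line that fails to decode, or None."""
    with open(path, "rb") as f:
        # Iterating a binary file splits on b"\n", so each item is one raw line.
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError:
                return lineno
    # Every line decoded cleanly in the requested encoding.
    return None
```

Calling first_bad_line() on a file path returns a line number if the file would trigger a conversion error in a UTF-8 locale, in which case the file can be opened with one of the locale workarounds given earlier in this thread.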
FYI: although it does not resolve this issue, the patch to prevent file truncation on write has been merged: https://github.com/lichray/nvi2/commit/310d1e86c0b3db7f7e025e3092afc78e3d906fa2
This should have been addressed in 281373. Over to bapt@.
(In reply to Xin LI from comment #6) But don't close this bug. The bug reported here is not resolved. I plan to work on that later.
(In reply to lichray from comment #4) If you don't notice the "conversion error" message, or aren't aware of its ramifications, and you continue editing and write the file, then it does become "data corruption". This is made worse because the later text is still visible in the editing buffer. Only after you save and quit is the missing data evident. I just hit this with the nvi in FreeBSD 10 (version 2.1.2). There was a Makefile where someone had entered their name, with an o-umlaut encoded in ISO-8859-1, in the header block on the first line. I have LC_CTYPE=en_US.UTF-8 and edited the file without noticing the "conversion error" message. The resulting file after save/quit was empty. There is no such problem on FreeBSD 9 (older nvi). Good to hear a fix is available upstream (untested by me). We should import the update into FreeBSD.
Batch change. For bugs that match the following:
- Status is In Progress, AND
- untouched since 2018-01-01, AND
- affects Base System OR Documentation
DO: reset to Open status.
Note: I did a quick pass, but if you are getting this email it might be worthwhile to double-check whether this bug ought to be closed.