I was editing a file with vi, and got
Error: ?!: Illegal byte sequence; ?!: WARNING: FILE TRUNCATED.
After this, it refused to save the file. In the middle of the file there was a ~ on one line. However, any attempt to edit that line caused the error
Error: unable to retrieve line 7
The line could not be removed or edited.
This is nasty as it destroys the file being edited.
I recovered the file from backup, and I get
paypal: unmodified: line 1; Conversion error on line 7
I might have missed that error when starting to edit.
This is a plain text file. If vi has some magic for UTF-8 or whatever, it should
never get confused, and should simply switch the locale to C with an appropriate warning message.
Hi, I created https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203040 a while back. This looks very similar.
I've downgraded some machines to nvi 1.79 from the FreeBSD 9.3 source tree, which doesn't have the iconv dependency.
(In reply to heikki from comment #0)
I see something similar in 10.3-RELEASE-p7 after playing around with Greek Unicode characters in zsh. I have the following line in my histfile, which contains a few Greek characters at the beginning, then the byte 0xb1, which I don't remember how to type, and then only ASCII characters.
% sed -n -e 837p histfile | od -c
0000000 ρ ** θ ** θ ** σ ** 261 g h f g
% sed -n -e 837p histfile | hexdump -C
00000000  cf 81 ce b8 ce b8 cf 83  b1 20 20 20 67 68 66 67  |.........   ghfg|
00000010  0a                                                |.|
When writing the file, vi truncates it right at this point.
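For reference, here is a minimal C sketch that locates the first byte in the line above that cannot belong to a structurally valid UTF-8 sequence. It is not taken from nvi. The names `first_invalid_utf8` and `sample_line` are made up for illustration, and the check is deliberately simplified (it looks only at lead and continuation bit patterns, not at overlong forms or surrogates).

```c
#include <stddef.h>

/* The 17 bytes of histfile line 837, copied from the hexdump above. */
static const unsigned char sample_line[] = {
    0xcf, 0x81, 0xce, 0xb8, 0xce, 0xb8, 0xcf, 0x83,
    0xb1, 0x20, 0x20, 0x20, 0x67, 0x68, 0x66, 0x67,
    0x0a
};

/*
 * Hypothetical helper: return the offset of the first byte that does not
 * start or complete a structurally valid UTF-8 sequence, or n if every
 * byte is accounted for. Simplified: overlong encodings and surrogate
 * code points are not rejected.
 */
size_t first_invalid_utf8(const unsigned char *buf, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned char c = buf[i];
        size_t need;                       /* continuation bytes expected */

        if (c < 0x80)
            need = 0;                      /* 0xxxxxxx: ASCII */
        else if ((c & 0xe0) == 0xc0)
            need = 1;                      /* 110xxxxx */
        else if ((c & 0xf0) == 0xe0)
            need = 2;                      /* 1110xxxx */
        else if ((c & 0xf8) == 0xf0)
            need = 3;                      /* 11110xxx */
        else
            return i;                      /* stray 10xxxxxx or invalid lead */

        if (need > n - i - 1)
            return i;                      /* sequence runs past the buffer */

        for (size_t k = 1; k <= need; k++)
            if ((buf[i + k] & 0xc0) != 0x80)
                return i;                  /* continuation is not 10xxxxxx */

        i += need + 1;
    }
    return n;
}
```

Calling `first_invalid_utf8(sample_line, sizeof sample_line)` returns 8, the offset of the lone 0xb1: it has the 10xxxxxx continuation pattern but follows a complete two-byte sequence, so nothing is left for it to continue.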
Also seen in 11.0-RELEASE-p8, resulting in data loss.
(In reply to Michael Dexter from comment #3)
I have used this patch for 11.0 (from https://lists.freebsd.org/pipermail/freebsd-bugs/2015-August/063464.html), but note a couple more matches in the FreeBSD bug list:
New | 202740 | vi/ex string substitution problem when there is m
New | 202290 | /usr/bin/vi conversion error on valid character
--- contrib/nvi/common/encoding.c (revision 292832)
+++ contrib/nvi/common/encoding.c (working copy)
@@ -96,7 +96,7 @@
if (i >= nbytes)
- if (buf[i] & 0x40) /* 10xxxxxx */
+ if ((buf[i] & 0xc0) != 0x80) /* 10xxxxxx */
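To make the difference between the two conditions concrete, here is a small C sketch that treats each one as a standalone predicate. It assumes that the branch guarded by the `if` rejects the byte as a continuation byte, which the excerpt does not show. The function names are hypothetical and do not appear in nvi.

```c
#include <stdbool.h>

/* Pre-patch condition: true only when bit 6 of the byte is set. */
bool old_check_rejects(unsigned char b)
{
    return (b & 0x40) != 0;
}

/* Patched condition: true for every byte that is not of the form 10xxxxxx. */
bool new_check_rejects(unsigned char b)
{
    return (b & 0xc0) != 0x80;
}

/* Count the byte values on which the two conditions disagree. */
int count_disagreements(void)
{
    int count = 0;

    for (int b = 0; b < 256; b++)
        if (old_check_rejects((unsigned char)b) !=
            new_check_rejects((unsigned char)b))
            count++;
    return count;
}
```

The two conditions disagree on exactly the 64 values 0x00 through 0x3f. For example, a space (0x20) passes the old test as if it were a continuation byte, while the patched test rejects it. A genuine continuation byte such as 0x81 is accepted by both.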
(In reply to Bjorn Robertsson from comment #4)
There are even more bugs which are probably related to the same problem:
Bug 196447 - vi(1) misbehavior when encountered invalid Unicode character
Bug 203040 - Nvi truncates files with non-ASCII characters
It seems to be possible to get the data back in a number of cases, even after having written a corrupted file to disk:
The issue as a whole seems to be quite involved, however: