Bug 205697 - vi gets confused and corrupts file being edited
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.2-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
Keywords: patch
Depends on:
Reported: 2015-12-29 15:35 UTC by heikki
Modified: 2017-03-24 08:28 UTC

Description heikki 2015-12-29 15:35:49 UTC
I was editing a file with vi, and got

Error: ?!: Illegal byte sequence; ?!: WARNING: FILE TRUNCATED.

After this, it refused to save the file.  In the middle of the file there was a ~ on one line.  However, any attempt to edit that line caused the error

Error: unable to retrieve line 7

The line could not be removed or edited. 

This is nasty as it destroys the file being edited.

I recovered the file from backup, and I get 

paypal: unmodified: line 1; Conversion error on line 7

I might have missed that error when starting to edit.

This is a plain text file.  If vi has some magic for UTF-8 or whatever, it should
never get confused, and should simply switch the locale to C with an appropriate warning message.
Comment 1 Bjorn Robertsson 2015-12-29 17:00:28 UTC
(In reply to heikki from comment #0)

Hi, I created https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203040 a while back. This looks very similar.

I've downgraded some machines to nvi 1.79 from the FreeBSD 9.3 source tree, which doesn't have the iconv dependency.
Comment 2 Alexander Klein 2016-08-17 09:20:13 UTC
I see something similar in 10.3-RELEASE-p7 after playing around with Greek Unicode characters in zsh. I have the following line in my histfile, which contains a few Greek characters at the beginning, then the byte 0xb1, which I don't remember how to type, and then only ASCII characters.

% sed -n -e 837p histfile | od -c
0000000    ρ  **   θ  **   θ  **   σ  ** 261               g   h   f   g
0000020   \n                                                            

% sed -n -e 837p histfile | hexdump -C
00000000  cf 81 ce b8 ce b8 cf 83  b1 20 20 20 67 68 66 67  |.........   ghfg|
00000010  0a                                                |.|

When writing the file, vi truncates it right at this point.
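
To make the dump above concrete, here is a minimal C sketch that walks the same 17 bytes and reports the first offset that is not well-formed UTF-8. This is not nvi code: the names first_invalid_utf8 and line837 are made up for illustration, and the check is simplified (it looks only at lead and continuation byte structure, not at overlong forms or surrogates).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset of the first byte that breaks the UTF-8 lead/continuation
 * structure, or -1 if the buffer passes this (simplified) check.
 * Hypothetical helper for illustration; not taken from nvi.
 */
long
first_invalid_utf8(const unsigned char *buf, size_t nbytes)
{
    size_t i = 0;

    assert(buf != NULL);
    while (i < nbytes) {
        unsigned char c = buf[i];
        size_t extra;               /* continuation bytes expected */

        if (c < 0x80)
            extra = 0;              /* 0xxxxxxx: ASCII */
        else if ((c & 0xe0) == 0xc0)
            extra = 1;              /* 110xxxxx */
        else if ((c & 0xf0) == 0xe0)
            extra = 2;              /* 1110xxxx */
        else if ((c & 0xf8) == 0xf0)
            extra = 3;              /* 11110xxx */
        else
            return (long)i;         /* stray 10xxxxxx or invalid lead */

        if (extra > nbytes - i - 1)
            return (long)i;         /* sequence cut off by end of buffer */

        for (size_t k = 1; k <= extra; k++)
            if ((buf[i + k] & 0xc0) != 0x80)
                return (long)(i + k);   /* not a 10xxxxxx byte */

        i += extra + 1;
    }
    return -1;
}

/* Bytes of histfile line 837, copied from the hexdump output above. */
const unsigned char line837[] = {
    0xcf, 0x81, 0xce, 0xb8, 0xce, 0xb8, 0xcf, 0x83,
    0xb1, 0x20, 0x20, 0x20, 0x67, 0x68, 0x66, 0x67, 0x0a
};
```

Called as first_invalid_utf8(line837, sizeof line837), this stops at offset 8, the 0xb1 byte: it has the 10xxxxxx continuation pattern but no lead byte in front of it. The four two-byte sequences before it are well formed.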
Comment 3 Michael Dexter freebsd_committer 2017-03-22 16:45:48 UTC
Also seen in 11.0-RELEASE-p8, resulting in data loss.
Comment 4 Bjorn Robertsson 2017-03-23 10:09:59 UTC
(In reply to Michael Dexter from comment #3)

I have used this patch for 11.0 (from https://lists.freebsd.org/pipermail/freebsd-bugs/2015-August/063464.html). Note that there are a couple more matches in the FreeBSD bug list:
New         |    202740 | vi/ex string substitution problem when there is m 
New         |    202290 | /usr/bin/vi conversion error on valid character   

Index: contrib/nvi/common/encoding.c
--- contrib/nvi/common/encoding.c       (revision 292832)
+++ contrib/nvi/common/encoding.c       (working copy)
@@ -96,7 +96,7 @@
                                if (i >= nbytes)
                                        goto done;

-                               if (buf[i] & 0x40)      /* 10xxxxxx */
+                               if ((buf[i] & 0xc0) != 0x80)    /* 10xxxxxx */
                                        return -1;
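
For readers who want to see what the one-line change does, here is a small C sketch comparing the two conditions from the diff in isolation. The function names are made up for illustration and the surrounding nvi code is not reproduced, so this shows only how the two bit tests differ, not how nvi behaves as a whole.

```c
#include <stdbool.h>

/* Condition before the patch: rejects a byte only if bit 6 is set. */
bool
old_check_rejects(unsigned char b)
{
    return (b & 0x40) != 0;
}

/* Condition after the patch: rejects unless the top two bits are 10. */
bool
new_check_rejects(unsigned char b)
{
    return (b & 0xc0) != 0x80;
}

/* Count the byte values on which the two conditions disagree. */
int
count_disagreements(void)
{
    int n = 0;

    for (int b = 0; b < 256; b++)
        if (old_check_rejects((unsigned char)b) !=
            new_check_rejects((unsigned char)b))
            n++;
    return n;
}
```

The two conditions disagree on exactly the 64 values 0x00 through 0x3f. The old test lets those through as if they were continuation bytes, so a lead byte followed by a space (0x20) or a newline (0x0a) would not be rejected at this point. The new test accepts only 0x80 through 0xbf, which matches the 10xxxxxx comment.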
Comment 5 Alexander Klein 2017-03-24 08:28:00 UTC
(In reply to Bjorn Robertsson from comment #4)

There are even more bugs which are probably related to the same problem:

Bug 196447 - vi(1) misbehavior when encountered invalid Unicode character
Bug 203040 - Nvi truncates files with non-ASCII characters

It seems to be possible to get the data back in a number of cases, even after having written a corrupted file to disk:


The issue as a whole seems to be quite involved, however: