sed doesn't appear to be doing the right thing with escape chars (in this case '\t'); it's not properly reinterpreting '\t' as \011, but is instead interpreting it was 't': $ echo "foot " | sed -e 's/[\\t ]*$//' | hexdump -C 00000000 66 6f 6f 0a |foo.| 00000004 $ echo "foot " | sed -E -e 's/[\\t ]*$//' | hexdump -C 00000000 66 6f 6f 0a |foo.| 00000004 GNU sed does do the right thing with escape chars (verified on Fedora 17): # cat /etc/redhat-release Fedora release 17 (Beefy Miracle) # echo foot | sed -e 's/[\t ]*$//' | hexdump -C 00000000 66 6f 6f 74 0a |foot.| 00000005 # echo "foot " | sed -e 's/[\t ]*$//' | hexdump -C 00000000 66 6f 6f 74 0a |foot.| 00000005 How-To-Repeat: echo foot | sed -e 's/[\t ]*$//'
Hi. This should be files as a Question not a Bug. You need to investigate things before claiming "bug". > $ echo "foot " | sed -e 's/[\\t ]*$//' | hexdump -C What I see is your using [] wrong. '\t' doesn't go inside it for many WELL documented reasons. Furthermore sed mathes '\t' without [] so why bother filing a bug against [] ? '\t' has been around more years than your were born and probably sed too... anyhow. "ediquitte" is you ask questions to users and file bugs about only if your very as professor sure and also know how to fix it (pref. wtih code patch filed which absolutely CANNOT cause past software to stop working). p.s. So, when you see a "majorly relied upon" unix tool doing "something stupid" it's very likely you have more learning and asking to do. 1) you don't want people hacking sed to make it work your way it would break past software 2) you don't want BSD-SED to have false bugs filed it will make BSD look worse than it is 3) since early stable versions of free bsd, bsd has accumulated way too many bugs and fixes and hardware compat changes - about a suspicious ammount - and still isn't as stable as it had been )
On Sat, Oct 20, 2012 at 5:37 PM, John D. Hendrickson and Sara Darnell <johnandsara2@cox.net> wrote: ... > What I see is your using [] wrong. '\t' doesn't go inside it for many WELL > documented reasons. Cite documentation please and explain where I got it wrong. > Furthermore sed mathes '\t' without [] so why bother filing a bug against [] > ? The point was to match one or the other as this was something that someone else wrote that's a valid regular expression in bracket notation that I executed on both FreeBSD and Linux. '\040' != '\011' and as such a space shouldn't be a drop in replacement for a horizontal tab. The PR was filed to note the discrepancy. Thanks, -Garrett
Despite what comment #1 says, sed does NOT match literal "\t" to a tab character outside [] as well -- quoting http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html: 9.3.2 BRE Ordinary Characters The interpretation of an ordinary character preceded by an unescaped <backslash> ('\\') is undefined. 9.4.2 ERE Ordinary Characters The interpretation of an ordinary character preceded by an unescaped <backslash> ( '\\' ) is undefined. There's an exception that is important here that <backslash> loses it's special meaning inside the bracket expression, so in your examples the bracket expression "[\t ]" correctly matches the 't' character and whitespace. Given the above, GNU sed actually violates the standard which defines the 's' command as the following: [2addr]s/BRE/replacement/flags ...so all BRE (ERE with -E) rules apply here. I would agree that *outside* of the bracket expression we could make "\t" match the tab character (that's what libtre does apparently) as it's more readable than inserting literal tab characters in RE, but inside the bracket expression GNU sed is clearly wrong, and our sed (through regex(3)) is doing the right thing. (I just hope I'm understanding everything correctly here, of course)
For bugs matching the following conditions: - Status == In Progress - Assignee == "bugs@FreeBSD.org" - Last Modified Year <= 2017 Do - Set Status to "Open"
\t (and \r and \n) was properly implemented in r355590 and MFC'd to stable/12 in r355902. (In reply to Yuri Pankov from comment #3) I tend to agree that it's a standards violation based on my re-reading of the spec, but I would propose that it's an OK violation in the name of reasonable expectations and not making users' lives harder. \t => t in a bracketed expression serves little value and doesn't accomplish anything you can't do otherwise, but being able to [ \t] to capture the subset of [:space:] that might represent leading indentation you want to normalize.