Summary: | awk(1) fails to treat var as integer | ||
---|---|---|---|
Product: | Base System | Reporter: | Steffen "Daode" Nurpmeso <sdaoden> |
Component: | bin | Assignee: | Warner Losh <imp> |
Status: | Open --- | ||
Severity: | Affects Only Me | CC: | imp, jwb, nosuw, syzosab, zaqi |
Priority: | Normal | ||
Version: | Unspecified | ||
Hardware: | Any | ||
OS: | Any |
Description
Steffen "Daode" Nurpmeso
2013-07-05 18:20:01 UTC
----- Forwarded message from Steffen Daode Nurpmeso <sdaoden@gmail.com> -----

Date: Fri, 05 Jul 2013 23:52:45 +0200
From: Steffen Daode Nurpmeso <sdaoden@gmail.com>
To: freebsd-bugs@FreeBSD.org
Subject: Re: bin/180328: awk(1) fails to treat var as integer
User-Agent: s-nail s-nail-14.3.2-20-g1f64075

Hello.

uwe@netbsd prodded that i dig a bit deeper and so here is the thing a bit narrowed down. Sorry.

 | Please, can you minimize the test case? As far as I understand it
 | should be reducible to the script and to a single line of input that
 | triggers the problem.

Hmmm.

    cat > test.sh <<\!
    printf '1 '; printf "F0000\n" | awk '{r2 = r1 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
    printf '2 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = r1; while (r1 <= r2) {print r1; ++r1}}'
    printf '3 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); while (r1 <= 983040) {print r1; ++r1}}'
    printf '4 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
    printf '5 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
    printf '6 '; printf "F0000 F0001\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 < r2) {print r1; ++r1}}'
    !
    sh ./test.sh

results in

    1 983040
    2 983040
    3 983040
    4 983040
    5 983040
    6

So -- indeed. Sorry.

 | -uwe

--steffen

But

    $ make ucd; ll test/sa/t_props.dat; make ucd-clean;\
      sed -e 40d -i '' tools/t-base.t; make ucd; ll test/sa/t_props.dat

becomes (when i strip all the other messages)

    ucd: ok
    4956 -rw-rw-r-- 1 steffen staff 5071362 5 Jul 23:40 test/sa/t_props.dat
    ucd-clean: ok
    ...
    ucd: ok
    4188 -rw-rw-r-- 1 steffen staff 4284954 5 Jul 23:40 test/sa/t_props.dat

----- End forwarded message -----

Hello, i'm forwarding one more. (This time to bug-followup@ -- hello, Mark Linimon!)

-------- Original Message --------

Date: Wed, 10 Jul 2013 10:53:13 +0200
From: Steffen "Daode" Nurpmeso <sdaoden@gmail.com>
To: gnats-bugs@NetBSD.org
Subject: Re: bin/48017: awk(1) fails to treat var as integer (may be related to #47840)

David Holland <dholland-bugs@netbsd.org> wrote:

 | sprintf with %d doesn't produce a number value; it produces a
 | string value, which you have to coerce to a number by adding zero to
 | it to get it to behave like a number.

(Adding +0 was my final solution too, because GNU awk(1) didn't make it by the (presumably more expensive, too) sprintf("%X") call just as all other tested awk(1)s did.)

So there is a problem with the implicit type conversion, since

    echo f001 f00d |\
    awk '{ a=sprintf("%d", "0x" $1); b=sprintf("%d", "0x" $2); while (a < b) { print a; a++; }}'

works just fine?!?

I think the relevant parts from POSIX are

    the value of an expression shall be implicitly converted to the type needed for the context in which it is used.
    [.]
    A numeric value that is exactly equal to the value of an integer (see Concepts Derived from the ISO C Standard) shall be converted to a string by the equivalent of a call to the sprintf function (see String Functions) with the string "%d" as the fmt argument and the numeric value being converted as the first and only expr argument.
    [.]
    This volume of POSIX.1-2008 specifies no explicit conversions between numbers and strings.
    An application can force an expression to be treated as a number by adding zero to it, or can force it to be treated as a string by concatenating the null string ( "" ) to it.
    [.]
    A string value shall be considered a numeric string if it comes from one of the following:
    [.]
     1. Field variables
    [.]
     8. Variable assignment from another numeric string variable
    [...]
    and an implementation-dependent condition corresponding to either case (a) or (b) below is met.
    [.]
     b. After all the following conversions have been applied, the resulting string would lexically be recognized as a NUMBER token as described by the lexical conventions in Grammar:
    [.]
    Whether or not a string is a numeric string shall be relevant only in contexts where that term is used in this section.

And because the `Table: Expressions in Decreasing Precedence in awk' contains the line

    expr < expr    Less than    Numeric    None

i believe it's a bug. (That hopefully gets fixed by someone who yet has some experience with the awk codebase.)

 | David A. Holland
 | dholland@netbsd.org

--steffen

For bugs matching the following criteria:

    Status: In Progress
    Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped

I ran into something similar, also solved using

    if ( var1 + 0 < var2 )

For data read using getline, the behavior differs from mawk and gawk from ports. I'm not sure if this should be regarded as a bug, but it should at least be documented.

Here's a minimal test case:

    BEGIN {
        x="10"
        y="9"
        printf("%s\n", x < y);      # Always 1
        x=10
        y=9
        printf("%s\n", x < y);      # Always 0
        getline x < "xy.txt"
        getline y < "xy.txt"
        printf("%s %s\n", x, y);    # Prove we're using values from getline
        printf("%s\n", x < y);      # awk 1, mawk and gawk 0
    }

xy.txt:

    11
    8

This bug has a couple of different bugs mixed together. I'll sort out what's really a bug and what's not.
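
The workaround mentioned in the comments above (adding zero to force a numeric comparison, or concatenating the null string to force a string comparison) can be sketched as a small sh wrapper around awk. This is only an illustration under assumptions: the function names numeric_less, string_less and getline_less are made up for this sketch and do not appear in the report, and mktemp(1) is assumed to be available.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not from the report.

# Numeric comparison: adding zero forces both operands into numeric context,
# independent of how the awk implementation classified the values.
numeric_less() {
    awk -v a="$1" -v b="$2" 'BEGIN { print ((a + 0) < (b + 0)) }'
}

# String comparison: concatenating the null string forces string context.
string_less() {
    awk -v a="$1" -v b="$2" 'BEGIN { print ((a "") < (b "")) }'
}

# The same numeric comparison for values read with getline from a file,
# which is the case where the awk implementations in the thread disagree.
getline_less() {
    tmp=$(mktemp) || return 1          # assumption: mktemp(1) exists
    printf '%s\n%s\n' "$1" "$2" > "$tmp"
    awk -v f="$tmp" 'BEGIN {
        getline x < f
        getline y < f
        print ((x + 0) < (y + 0))      # explicit coercion on both sides
    }'
    rm -f "$tmp"
}
```

With these helpers, numeric_less 10 9 prints 0, while string_less 10 9 prints 1, because "10" sorts before "9" when compared as a string. Coercing both operands explicitly avoids relying on the numeric-string classification that differs between implementations.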