Note first that this problem also occurs on Mac OS X Snow Leopard and NetBSD current. I have not yet tested GNU awk.

I use awk(1) to generate test data from Unicode text files. i think it is best if i just show it:

    ## Input producers
    io_unicode_data() {
        < unicode/UnicodeData.txt ${TAWK} '
            BEGIN {FS = ";" ; OFS = ";"}
            # There are no comments in this, but..
            /^[[:space:]]*[^#]+$/ {
                i = $2
                # Ranges must become unrolled, otherwise step on
                if (i !~ /, First>/) {
                    $2 = ""
                    print
                    next
                }
                r1 = sprintf("%d", "0x" $1)
                getline
                r2 = sprintf("%d", "0x" $1)
                $2 = ""
                # This gets around a bug in at least "awk version 20070501" as found
                # on Slow Leopard: there the range F0000-FFFFD, and only that one,
                # will *not* be evaluated unless we do this (once property test came)
                # XXX presumably the type system is a bit weird; check other AWKs!
                sprintf("%X %X", r1, r2)

    [ this is it; UnicodeData.txt contains multiple ranges, but only this
      one will be "omitted" without sprintf(), the while() will simply not
      execute otherwise. ]

                while (r1 <= r2) {
                    $1 = sprintf("%X", r1)
                    printf "%s\n", $0
                    ++r1
                }
            }
        '
    }

How-To-Repeat:
well..; git clone my S-CText and run `make ucd' with and without the line `sprintf("%X %X", r1, r2)', then compare the resulting `test/sa/t_props.dat' files.
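A minimal sketch of the same range unrolling that does not depend on how a given awk treats the result of sprintf("%d", "0x" ...). It is not taken from the original script: unroll_range and hex2dec are hypothetical names, and the two input records are made-up stand-ins for a First/Last pair. The bounds are kept as numbers, so the while() comparison should be numeric in any awk.

```shell
# Sketch only: unroll_range and hex2dec are hypothetical names, and the
# sample records fed in below are made up for illustration.
unroll_range() {
    awk '
    # Convert a hexadecimal string to a number without relying on 0x parsing.
    function hex2dec(h,    i, n) {
        n = 0
        h = toupper(h)
        for (i = 1; i <= length(h); i++)
            n = n * 16 + index("0123456789ABCDEF", substr(h, i, 1)) - 1
        return n
    }
    BEGIN { FS = ";"; OFS = ";" }
    $2 ~ /, First>/ {
        r1 = hex2dec($1)      # numeric lower bound
        getline               # read the matching Last record
        r2 = hex2dec($1)      # numeric upper bound
        $2 = ""
        while (r1 <= r2) {    # both operands are numbers
            $1 = sprintf("%X", r1)
            print
            ++r1
        }
    }'
}

# Hypothetical range whose decimal bounds differ in digit count (65534..65537).
printf '%s\n' 'FFFE;<Example, First>;Co' '10001;<Example, Last>;Co' | unroll_range
```

The sample range is chosen so that its decimal bounds have different lengths, which is the case where a string comparison and a numeric comparison can disagree.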
----- Forwarded message from Steffen Daode Nurpmeso <sdaoden@gmail.com> -----

Date: Fri, 05 Jul 2013 23:52:45 +0200
From: Steffen Daode Nurpmeso <sdaoden@gmail.com>
To: freebsd-bugs@FreeBSD.org
Subject: Re: bin/180328: awk(1) fails to treat var as integer
User-Agent: s-nail s-nail-14.3.2-20-g1f64075

Hello.

uwe@netbsd prodded that i dig a bit deeper and so here is the thing a bit narrowed down. Sorry.

 | Please, can you minimize the test case?  As far as I understand it
 | should be reducible to the script and to a single line of input that
 | triggers the problem.

Hmmm.

    cat > test.sh <<\!
    printf '1 '; printf "F0000\n" | awk '{r2 = r1 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
    printf '2 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = r1; while (r1 <= r2) {print r1; ++r1}}'
    printf '3 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); while (r1 <= 983040) {print r1; ++r1}}'
    printf '4 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
    printf '5 '; printf "F0000\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
    printf '6 '; printf "F0000 F0001\n" | awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 < r2) {print r1; ++r1}}'
    !

sh ./test.sh results in

    1 983040
    2 983040
    3 983040
    4 983040
    5 983040
    6

So -- indeed. Sorry.

 | -uwe

--steffen

But

    $ make ucd; ll test/sa/t_props.dat; make ucd-clean;\
      sed -e 40d -i '' tools/t-base.t; make ucd; ll test/sa/t_props.dat

becomes (when i strip all the other messages)

    ucd: ok
    4956 -rw-rw-r--  1 steffen  staff  5071362  5 Jul 23:40 test/sa/t_props.dat
    ucd-clean: ok
    ...
    ucd: ok
    4188 -rw-rw-r--  1 steffen  staff  4284954  5 Jul 23:40 test/sa/t_props.dat

----- End forwarded message -----
Hello,

i'm forwarding one more.
(This time to bug-followup@ -- hello, Mark Linimon!)

-------- Original Message --------
Date: Wed, 10 Jul 2013 10:53:13 +0200
From: Steffen "Daode" Nurpmeso <sdaoden@gmail.com>
To: gnats-bugs@NetBSD.org
Subject: Re: bin/48017: awk(1) fails to treat var as integer (may be related to #47840)

David Holland <dholland-bugs@netbsd.org> wrote:
 | sprintf with %d doesn't produce a number value; it produces a
 | string value, which you have to coerce to a number by adding zero to
 | it to get it to behave like a number.

(Adding +0 was my final solution too, because the (presumably also more expensive) sprintf("%X") call did not do the trick for GNU awk(1), as it did for all the other awk(1)s tested.)

So there is a problem with the implicit type conversion, since

    echo f001 f00d |\
    awk '{ a=sprintf("%d", "0x" $1); b=sprintf("%d", "0x" $2); while (a < b) { print a; a++; }}'

works just fine?!?
I think the relevant parts from POSIX are

    the value of an expression shall be implicitly converted to the
    type needed for the context in which it is used.
    [.]
    A numeric value that is exactly equal to the value of an integer
    (see Concepts Derived from the ISO C Standard) shall be converted
    to a string by the equivalent of a call to the sprintf function
    (see String Functions) with the string "%d" as the fmt argument
    and the numeric value being converted as the first and only expr
    argument.
    [.]
    This volume of POSIX.1-2008 specifies no explicit conversions
    between numbers and strings. An application can force an
    expression to be treated as a number by adding zero to it, or can
    force it to be treated as a string by concatenating the null
    string ( "" ) to it.
    [.]
    A string value shall be considered a numeric string if it comes
    from one of the following:
    [.]
     1. Field variables
    [.]
     8. Variable assignment from another numeric string variable
    [...]
    and an implementation-dependent condition corresponding to either
    case (a) or (b) below is met.
    [.]
     b. After all the following conversions have been applied, the
        resulting string would lexically be recognized as a NUMBER
        token as described by the lexical conventions in Grammar :
    [.]
    Whether or not a string is a numeric string shall be relevant only
    in contexts where that term is used in this section.

And because the `Table: Expressions in Decreasing Precedence in awk' contains the line

    expr < expr     Less than     Numeric     None

i believe it's a bug.
(That hopefully gets fixed by someone who already has some experience with the awk codebase.)

 | David A. Holland
 | dholland@netbsd.org

--steffen
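Holland's point about string values can be illustrated without any hex parsing. The sketch below is not from the thread: compare_bounds is a hypothetical name, and the two values are simply the decimal forms of F0000 and FFFFD, assigned as strings the way sprintf() would return them. It may also hint at why only that one range was affected (its decimal bounds differ in length, while f001 and f00d do not), but that reading is an inference rather than something the thread confirms.

```shell
# Sketch only: compare_bounds is a hypothetical name.
# 983040 and 1048573 are the decimal values of F0000 and FFFFD.
compare_bounds() {
    awk 'BEGIN {
        a = "983040"; b = "1048573"        # string values, as sprintf() returns
        print "string:", (a <= b)          # compared as strings: "9" sorts after "1"
        print "number:", (a + 0 <= b + 0)  # adding zero forces a numeric comparison
    }'
}

compare_bounds
# prints:
#   string: 0
#   number: 1
```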
For bugs matching the following criteria: Status: In Progress Changed: (is less than) 2014-06-01 Reset to default assignee and clear in-progress tags. Mail being skipped
I ran into something similar, also solved using

    if ( var1 + 0 < var2 )

For data read using getline, the behavior differs from mawk and gawk from ports. I'm not sure if this should be regarded as a bug, but it should at least be documented. Here's a minimal test case:

    BEGIN {
        x="10"
        y="9"
        printf("%s\n", x < y);     # Always 1
        x=10
        y=9
        printf("%s\n", x < y);     # Always 0
        getline x < "xy.txt"
        getline y < "xy.txt"
        printf("%s %s\n", x, y);   # Prove we're using values from getline
        printf("%s\n", x < y);     # awk 1, mawk and gawk 0
    }

xy.txt:

    11
    8
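The workaround mentioned above can be sketched against the same two-line file. This is an illustration under stated assumptions: getline_compare is a hypothetical name, and xy.txt is recreated in a temporary directory so the snippet is self-contained. With zero added to both operands, the comparison is numeric regardless of how a particular awk types values read by getline.

```shell
# Sketch only: getline_compare is a hypothetical name; xy.txt mirrors the
# two-line file from the test case and is created in a temporary directory.
getline_compare() {
    dir=$(mktemp -d) || return 1
    printf '11\n8\n' > "$dir/xy.txt"
    awk -v f="$dir/xy.txt" 'BEGIN {
        getline x < f
        getline y < f
        # Adding zero coerces both values to numbers before comparing.
        printf("%s\n", (x + 0 < y + 0))
    }'
    rm -rf "$dir"
}

getline_compare    # prints 0, since 11 < 8 is false numerically
```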
This bug has a couple of different bugs mixed together. I'll sort out what's really a bug and what's not.