Bug 210837 - lang/perl5.22 (and related?): ext/re/re_exec.c has long long format matched up with long argument
Summary: lang/perl5.22 (and related?): ext/re/re_exec.c has long long format matched u...
Status: Closed Works As Intended
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-perl (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-05 07:39 UTC by Mark Millard
Modified: 2016-08-01 22:30 UTC (History)
0 users

See Also:
bugzilla: maintainer-feedback? (perl)


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Millard 2016-07-05 07:39:02 UTC
ext/re/re_exec.c has Perl_re_intuit_start(. . .) code that looks like:

            DEBUG_EXECUTE_r(PerlIO_printf(Perl_debug_log,
                "  Found /%s^%s/m, rescanning for anchored from offset %ld (rx_origin now %"IVdf")...\n",
                PL_colors[0], PL_colors[1],
                (long)(rx_origin - strbeg + prog->anchored_offset),
                (long)(rx_origin - strbeg)
            ));

but the IVdf vs. (long)(rx_origin - strbeg) do not match types. Other example code around seems to have (IV) for the cast when IVdf is in use.

The compiler doing the build [targeting armv6 (with -mcpu=cortex-a7)] reported the problem.
Comment 1 Mathieu Arnold freebsd_committer freebsd_triage 2016-07-05 07:48:32 UTC
Have you reported this to the right people ? I mean people who understands what you say, and can do something about it, like the Perl 5 Porters list http://lists.perl.org/list/perl5-porters.html
Comment 2 Mark Millard 2016-07-05 08:39:03 UTC
(In reply to Mathieu Arnold from comment #1)

The problem may well need to be reported upstream and any fix in FreeBSD's port(s) may well wait on upstream updating. That is not for me to say.

I did not build perl directly but it was involved via a dependency. (I build ports based on source.) I'm not familiar with perl's upstream at all. Your question told me more about upstream than I previously knew.

"A list for people interested in Perl5 core development. High volume.": So far I'm just reporting what the C/C++ compiler complained about and am not likely to be active on such a list.

Looking around it appears that actual bug tracking is tied to: https://rt.perl.org/Public/ but as of yet I do not have an account there.

There seems to be something called perlbug (see http://perldoc.perl.org/perlbug.html ). I'm not set up currently to try the latest stable release (apparently 5.24.0) but they request such testing. It may be a while before I've done enough to make a reasonable upstream submittal.
Comment 3 Mark Millard 2016-07-05 09:54:43 UTC
(In reply to Mathieu Arnold from comment #1)

Looks like perl-5.22.2/ext/re/re_exec.c is a copy made of perl-5.22.2/regexec.c .

It also appears that the 5.24.0 source has a fix in perl-5.22.2/regexec.c :

            DEBUG_EXECUTE_r(Perl_re_printf( aTHX_
                "  Found /%s^%s/m, rescanning for anchored from offset %"IVdf" (rx_origin now %"IVdf")...\n",
                PL_colors[0], PL_colors[1],
                (IV)(rx_origin - strbeg + prog->anchored_offset),
                (IV)(rx_origin - strbeg)
            ));

This is from:

author	David Mitchell <davem@iabyn.com>	
Wed, 9 Sep 2015 12:02:40 +0000 (13:02 +0100)
committer	David Mitchell <davem@iabyn.com>	
Fri, 2 Oct 2015 10:18:17 +0000 (11:18 +0100)
commit	73e8ff0004522621dfb42f01966853b51d5522a6
tree	1e2b70f790f84ab21c55ecfc85291511408af7ed
parent	0fa1f7e4e66a455cab6ccf1f9c49f2373c1ced80

as part of:

http://perl5.git.perl.org/perl.git/blobdiff/0fa1f7e4e66a455cab6ccf1f9c49f2373c1ced80..73e8ff0004522621dfb42f01966853b51d5522a6:/regexec.c



So how to classify this defect depends on the intent for "bug-for-bug compatible with upstream vs. not" for specific older perl5 versions (here 5.22.2) that are still in the ports tree. It does look like 5.24.0 would not have the problem.
Comment 4 Mark Millard 2016-07-05 10:48:42 UTC
(In reply to Mathieu Arnold from comment #1)

Looks like v5.23.4 is the first tag to have picked up the relevant change that I indicated earlier is (also) in v5.24.0 .
Comment 5 Mathieu Arnold freebsd_committer freebsd_triage 2016-07-06 10:33:26 UTC
Thank you for looking into this.

So, this is has already been fixed upstream, is just a warning and does not break anything, no need to add patches.
Comment 6 Mark Millard 2016-07-06 10:59:19 UTC
(In reply to Mathieu Arnold from comment #5)

It is fixed upstream in v5.23.4 and later tags but it was fixed because it is broken in v5.22 (and up until the fix) for some types of target architectures where long and the IVdf format do not match --and where long was too short and IV was needed in the casts and IVdf was needed for both formats.

Looks like I should possibly switch to v5.24 since I use TARGET_ARCH's for which v5.22 is broken here. (I'm not so sure that I'm likely to hit the broken code.)

It "works as intended" only in the sense of bug-for-bug 5.22 compatible since upstream is not updating things that old for such issues for targeting such architectures (no new 5.22.x tag that includes the fix).

Too bad "Closed" does not have a "bug-for-bug compatible with upstream's version" as an option. None of the existing options for Closed are a good fit for this specific defect. [In another environment I was in once there was a "no plan to fix" option for closing things. It avoided being misleading about if there was an actual issue present and made finding such things easier.]

I do think that "works as intended" is strongly misleading here: the complier's warning was not a false-positive.
Comment 7 Mathieu Arnold freebsd_committer freebsd_triage 2016-07-06 11:16:37 UTC
You never said anything was broken before this.

You said your compiler gave you a warning when building for arm, then went at length saying where it was, why it was, and that it had been fixed since.

Reading that, I understood that you wanted me to add a patch to fix a compiler warning, which I will not do.  it is only a warning and I don't want to have to maintain a patch until 5.22 gets removed (so, about 2.5 years) to silence a compiler.
Comment 8 Mathieu Arnold freebsd_committer freebsd_triage 2016-07-06 11:18:56 UTC
Commit 73e8ff0004522621dfb42f01966853b51d5522a6 applies fine on 5.22, does it fixes the breakage you encounter ?  If not, are there other changes needed ?
Comment 9 Mark Millard 2016-07-06 11:46:23 UTC
(In reply to Mathieu Arnold from comment #7)

I quoted the code in question and then I said of it:

"the IVdf vs. (long)(rx_origin - strbeg) do not match types"

That is a (too short?) description of a way of being broken: it indicates that the compiler's report was not a false positive.

In 32-bit long / 64-bit long long contexts the IVdf format uses more bytes (for long long) then are put out on the stack for the value (long), picking out some other bytes from memory as well. The details of the bad bytes use depends on big-endian vs. little-endian (or other memory layouts for such values). For little-endian if the extra bytes are zero bytes then the values would look okay. Otherwise not.

The operational difficulty for live testing comes in knowing how to reach the problem code in order to see the bad result. I'm not so sure that I'm likely to hit the broken code in the implicit use of perl5 in my context.

(perl5 was built because of dependencies, not because I directly built it for my own use.)

amd64 and powerpc64 would not be observably broken (long is also 64 bits). powerpc would be broken (long is 32-bits and poewrpc is big-endian). armv6 would depends on if the extra bytes at the time happened to be zeros (long is 32 bits little-endian).
Comment 10 Mark Millard 2016-07-06 12:13:14 UTC
(In reply to Mathieu Arnold from comment #8)

I have not reviewed the all the other 32-bit vs. 64-bit issues addressed by 73e8ff0004522621dfb42f01966853b51d5522a6 yet.

But it should help all TARGET_ARCH's with long being 32-bits and long long being 64-bits.

I'll note that building for i386 should be much like building for armv6 for these issues: 32-bit long's and 64-bit long long's, all little endian. I'd expect the same warnings-status form the 2 targets.

They are not as extreme as powerpc (big-endian) operationally but I'd expect powerpc (non-64) builds to also get the same warnings status as i386 and armv6.

If you want to see warnings generation yourself and you can target i386, that should work for seeing the 32-bit long vs. 64-bit long long sorts of issues in compiler reports. No armv6 or powerpc needed.

If I get some time I'll see if I can try a build on armv6 with the update applied somehow. [I'll not have powerpc access for weeks/months.]
Comment 11 Mark Millard 2016-07-25 09:20:21 UTC
(In reply to Mathieu Arnold from comment #8)

I still have more to do but I got a little time to see what the compile messages were like on armv6 (an ILP32 FreeBSD architecture) after applying the patch to the 5.22 source.

I did not see any warnings about odd mixes of 32-bit types and 64-bit types between formats and values/storage or other such.


BUT:

For armv6 there are format warnings for 32-bit types int vs. unsigned long (31-bit precision and sign bit vs. 32-bit precision and no sign bit). Various files get the same messages. I list the message a little later below.

If the types stay int and unsigned long on amd64 it would end up being 32-bit (int) vs. 64-bit (unsigned long). So this could then be more than a signed vs. not issue. But something might auto adjust so I need to try amd64 directly. That will not be tonight.

./const-xs.inc:1451:21: warning: format specifies type 'int' but the argument has type 'line_t' (aka 'unsigned long') [-Wformat]
                          COP_FILE(cop), CopLINE(cop));
                                         ^~~~~~~~~~~~
../../cop.h:529:21: note: expanded from macro 'CopLINE'
#define CopLINE(c)              ((c)->cop_line)
                                ^~~~~~~~~~~~~~~
./const-xs.inc:1456:50: warning: format specifies type 'int' but the argument has type 'line_t' (aka 'unsigned long') [-Wformat]
                          COP_FILE_F " line %d\n", sv, COP_FILE(cop), CopLINE(cop));
                                            ~~                        ^~~~~~~~~~~~
                                            %lu
../../cop.h:529:21: note: expanded from macro 'CopLINE'
#define CopLINE(c)              ((c)->cop_line)
                                ^~~~~~~~~~~~~~~



Side note:

There was the message:

op.c:13436:35: warning: comparison of constant 2251799813685247 with expression of type 'PADOFFSET' (aka 'unsigned long') is always false [-Wtautological-consta
nt-out-of-range-compare]
                if (intro && base >
                             ~~~~ ^

I do not know if 5.24 gets this or not. Nor if this is an okay "always false" or not.
Comment 12 Mark Millard 2016-08-01 22:20:25 UTC
(In reply to Mark Millard from comment #11)

[Note: Escape sequences hand edited out. Watch for editing errors in the deletions.]

amd64 11.0-BETA3 without the patch got:

# grep -h arning: ~/ports_typescripts/perl5_22_initial_build.typescript
op.c:3397:26: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:3445:15: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:9084:8: warning: nonnull parameter 'name' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:3397:26: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:3445:15: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:9084:8: warning: nonnull parameter 'name' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
inflate.c:1507:61: warning: shifting a negative signed value is undefined [-Wshift-negative-value]
ListUtil.xs:688:17: warning: unused variable 'a' [-Wunused-variable]
ListUtil.xs:689:17: warning: unused variable 'b' [-Wunused-variable]
ListUtil.xs:730:17: warning: unused variable 'a' [-Wunused-variable]
ListUtil.xs:731:17: warning: unused variable 'b' [-Wunused-variable]
Opcode.xs:230:5: warning: unused variable 'my_cxtp' [-Wunused-variable]
./sdbm.h:54:20: warning: 'extern' variable has an initializer [-Wextern-initializer]
Piece.xs:688:18: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
Piece.xs:782:18: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
Piece.xs:1059:29: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
Piece.xs:350:13: warning: unused variable 'copyright' [-Wunused-variable]
Piece.xs:352:13: warning: unused variable 'sccsid' [-Wunused-variable]
<command line>:11:9: warning: 'TIME_HIRES_STAT' macro redefined [-Wmacro-redefined]


With the patch amd64 got those too, including the int vs. unsigned long notices that mean 32-bit vs. 64-bit in addition to signed vs. unsigned.

# grep arning: ~/ports_typescripts/perl5_22_patched_build.typescript | more
op.c:3397:26: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:3445:15: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:9084:8: warning: nonnull parameter 'name' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:3397:26: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:3445:15: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
op.c:9084:8: warning: nonnull parameter 'name' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
inflate.c:1507:61: warning: shifting a negative signed value is undefined [-Wshift-negative-value]
ListUtil.xs:688:17: warning: unused variable 'a' [-Wunused-variable]
ListUtil.xs:689:17: warning: unused variable 'b' [-Wunused-variable]
ListUtil.xs:730:17: warning: unused variable 'a' [-Wunused-variable]
ListUtil.xs:731:17: warning: unused variable 'b' [-Wunused-variable]
Opcode.xs:230:5: warning: unused variable 'my_cxtp' [-Wunused-variable]
./sdbm.h:54:20: warning: 'extern' variable has an initializer [-Wextern-initializer]
<command line>:11:9: warning: 'TIME_HIRES_STAT' macro redefined [-Wmacro-redefined]
Piece.xs:688:18: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
Piece.xs:782:18: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
Piece.xs:1059:29: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
Piece.xs:350:13: warning: unused variable 'copyright' [-Wunused-variable]
Piece.xs:352:13: warning: unused variable 'sccsid' [-Wunused-variable]

So for amd64 with and without the patch does not change the warnings. This is likely a expected/hoped-for result.

The nonnull parameter will evaluate to true warnings are:

--- op.o ---
op.c:3397:26: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
    SV * const stashsv = stash ? newSVhek(HvNAME_HEK(stash)) : &PL_sv_no;
                         ^~~~~ ~
op.c:3445:15: warning: nonnull parameter 'stash' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
    stashsv = stash ? newSVhek(HvNAME_HEK(stash)) : &PL_sv_no;
              ^~~~~ ~
op.c:9084:8: warning: nonnull parameter 'name' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
        name, name ? strlen(name) : 0, subaddr, NULL, NULL, NULL, 0
              ^~~~ ~
./embed.h:1274:69: note: expanded from macro 'newXS_len_flags'
#define newXS_len_flags(a,b,c,d,e,f,g)  Perl_newXS_len_flags(aTHX_ a,b,c,d,e,f,g)
                                                                     ^


The inflate.c shifting of a negative signed value being undefined is from:

--- inflate.o ---
cc -c  -I./zlib-src  -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -Wall -Werror=declaration-after-statement -Wextra -Wc++-compat -Wwrite-strings -pipe -g -fstack-protector -fno-strict-aliasing    -DVERSION=\"2.068\"  -DXS_VERSION=\"2.068\" -DPIC -fPIC "-I../.."  -DNO_VIZ -DZ_SOLO   -DGZIP_OS_CODE=3 inflate.c
--- lib/auto/DB_File/DB_File.so ---
LD_LIBRARY_PATH=/usr/obj/portswork/usr/ports/lang/perl5.22/work/perl\-5.22.2 ./miniperl -Ilib make_ext.pl lib/auto/DB_File/DB_File.so  MAKE="/usr/bin/make" LIBPERL_A=libperl.so.5.22.2 LINKTYPE=dynamic
--- lib/auto/Compress/Raw/Zlib/Zlib.so ---
inflate.c:1507:61: warning: shifting a negative signed value is undefined [-Wshift-negative-value]
    if (strm == Z_NULL || strm->state == Z_NULL) return -1L << 16;
                                                        ~~~ ^

The extern variable with an initializer is for:

--- sdbm.o ---
cc -c  -I../..  -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -Wall -Werror=declaration-after-statement -Wextra -Wc++-compat -Wwrite-strings -pipe -g -fstack-protector -fno-strict-aliasing    -DVERSION=\"1.13
\"  -DXS_VERSION=\"1.13\" -DPIC -fPIC "-I../.."  -DSDBM -DDUFF sdbm.c
In file included from sdbm.c:15:
./sdbm.h:54:20: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern const datum nullitem
                   ^


The TIME_HIRES_STAT notices are:

--- HiRes.o ---
cc -c    -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -Wall -Werror=declaration-after-statement -Wextra -Wc++-compat -Wwrite-strings -pipe -g -fstack-protector -fno-strict-aliasing    -DVERSION=\"1.9726\"  -DXS_VERSION=\"1.9726\" -DPIC -fPIC "-I../.."  -DTIME_HIRES_NANOSLEEP -DTIME_HIRES_CLOCK_GETTIME -DTIME_HIRES_CLOCK_GETRES -DTIME_HIRES_CLOCK -DTIME_HIRES_STAT=1 -DTIME_HIRES_STAT=4 -DATLEASTFIVEOHOHFIVE HiRes.c
In file included from <built-in>:319:
<command line>:11:9: warning: 'TIME_HIRES_STAT' macro redefined [-Wmacro-redefined]
#define TIME_HIRES_STAT 4
        ^
<command line>:10:9: note: previous definition is here
#define TIME_HIRES_STAT 1
        ^

which are from the command-line -D's.



The (signed) int vs. unsigned long warnings are for:

--- lib/auto/Time/Piece/Piece.so ---
Piece.xs:688:18: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
                        for (i = 0; i < asizeof(Locale->weekday); i++) {
                                    ~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~
Piece.xs:782:18: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
                        for (i = 0; i < asizeof(Locale->month); i++) {
                                    ~ ^ ~~~~~~~~~~~~~~~~~~~~~~
Piece.xs:1059:29: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
. . .
--- lib/auto/Time/Piece/Piece.so ---
        if ((len > 0 && len < sizeof(tmpbuf)) || (len == 0 && *fmt == '\0'))
                        ~~~ ^ ~~~~~~~~~~~~~~
Comment 13 Mark Millard 2016-08-01 22:30:29 UTC
(In reply to Mark Millard from comment #12)

I should have been more explicit for:

    So for amd64 with and without the patch does
    not change the warnings. This is likely a
    expected/hoped-for result.

I meant that for a patch designed to clean up ILP32 issues with 64-bit use, not from a overall correctness viewpoint.



The int vs. unsigned long issues for amd64 may well be a LP64 error that should be addressed -- or if they are not program errors then the source should try to avoid generating the warnings by being explicit in some way that silences compilers generally.