Bug 241773

Summary: lldb does not display external variables properly.
Product: Base System Reporter: bc979
Component: binAssignee: freebsd-toolchain (Nobody) <toolchain>
Status: Open ---    
Severity: Affects Many People CC: FreeBSD, arrowd, emaste, lwhsu
Priority: ---    
Version: 12.1-RELEASE   
Hardware: Any   
OS: Any   

Description bc979 2019-11-07 05:56:05 UTC
Here is an issue that has plagued me for some time:

Using FreeBSD 12.0:


testlib.c:
#include <sys/stat.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

char id[4];
int sock;

void testfunc() {
 struct stat sb;
 stat("testlib.c", &sb);
 strcpy (id, "aa");
 sock = 5;
 printf("Size of testlib.c is %i bytes.\n", (int)sb.st_size);
}


testprog.c:
#include <stdio.h>

extern char id[4];
extern int sock;

void testfunc(void);
int main(int argc, char **argv) {
 testfunc();
 printf ("id = %s\n", id);
 printf ("sock = %d\n", sock);
 return 0;
}


Makefile:
all:    clean testprog run

testprog:
       cc -Wall -g -c -fPIC -o testlib.o testlib.c
       cc  -shared -Wl,-export-dynamic -o testlib.so testlib.o
       cc -Wall -g -o testprog ./testlib.so testprog.c

clean:
       rm -f testlib.o testlib.so testprog

run:
       ./testprog


Using make:
rm -f testlib.o testlib.so testprog
cc -Wall -g -c -fPIC -o testlib.o testlib.c
cc  -shared -Wl,-export-dynamic -o testlib.so testlib.o
cc -Wall -g -o testprog ./testlib.so testprog.c
./testprog
Size of testlib.c is 268 bytes.
id = aa
sock = 5


Running lldb:
master# lldb testprog
(lldb) target create "testprog"
Current executable set to 'testprog' (x86_64).
(lldb) b main
Breakpoint 1: where = testprog`main + 22 at testprog.c:8, address = 0x0000000000201366
(lldb) r
Process 34787 launching
Process 34787 launched: '/home/doug/zzz/testprog' (x86_64)
Process 34787 stopped
* thread #1, name = 'testprog', stop reason = breakpoint 1.1
   frame #0: 0x0000000000201366 testprog`main(argc=1, argv=0x00007fffffffeb38) at testprog.c:8
  5   	
  6   	void testfunc(void);
  7   	int main(int argc, char **argv) {
-> 8   	  testfunc();
  9   	  printf ("id = %s\n", id);
  10  	  printf ("sock = %d\n", sock);
  11  	  return 0;
(lldb) n
Size of testlib.c is 268 bytes.
Process 34787 stopped
* thread #1, name = 'testprog', stop reason = step over
   frame #0: 0x000000000020137f testprog`main(argc=1, argv=0x00007fffffffeb38) at testprog.c:9
  6   	void testfunc(void);
  7   	int main(int argc, char **argv) {
  8   	  testfunc();
-> 9   	  printf ("id = %s\n", id);
  10  	  printf ("sock = %d\n", sock);
  11  	  return 0;
  12  	}
(lldb) p id
error: use of undeclared identifier 'id'
(lldb) p sock
error: Couldn't materialize: couldn't get the value of variable sock: testlib.so[0x4004] can't be resolved, testlib.so is not currently loaded
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression
(lldb) c
id = aa
sock = 5
Process 34787 resuming


You notice that lldb cannot display values for id or sock.  It even gives quite different messages about them.  However the program can access the values and it prints them out properly.  Why can't lldb see them?  How can that be corrected?

What is even more interesting is that in the real application there are quite a few of these global variables and lldb can display some of them, just not all.  Possibly it has to do with the specific names as DATE generally works.  sock and id never seem to work.

Now using FreeBSD 12.1:

We now have 12.1 with an updated compiler and linker.  Things have changed.  I added a bit to testprog.c:

master# more testprog.c
#include <stdio.h>

extern char id[4];
extern int sock;
int unknown;

void testfunc(void);
int main(int argc, char **argv) {
 testfunc();
 unknown = 9;
 printf ("id (%x) = %s\n", (int)&id, id);
 printf ("sock(%x)  = %d\n", (int)&sock, sock);
 printf ("unknown (%x) = %d\n", (int)&unknown, unknown);
 return 0;
}


running it yields:
master# ./testprog
Size of testlib.c is 268 bytes.
id (20400c) = aa
sock(204010)  = 5
unknown (204008) = 9

That appears to be correct.  Everything worked.  However, with lldb, instead of error messages I get wrong values:

master# lldb testprog
(lldb) target create "testprog"
Current executable set to 'testprog' (x86_64).
(lldb) b main
Breakpoint 1: where = testprog`main + 22 at testprog.c:9:3, address = 0x00000000002012f6
(lldb) r
Process 6039 launching
Process 6039 launched: '/home/doug/zzz/testprog' (x86_64)
Process 6039 stopped
* thread #1, name = 'testprog', stop reason = breakpoint 1.1
   frame #0: 0x00000000002012f6 testprog`main(argc=1, argv=0x00007fffffffeb38) at testprog.c:9:3
  6   	
  7   	void testfunc(void);
  8   	int main(int argc, char **argv) {
-> 9   	  testfunc();
  10  	  unknown = 9;
  11  	  printf ("id (%x) = %s\n", (int)&id, id);
  12  	  printf ("sock(%x)  = %d\n", (int)&sock, sock);
(lldb) n
Size of testlib.c is 268 bytes.
Process 6039 stopped
* thread #1, name = 'testprog', stop reason = step over
   frame #0: 0x0000000000201307 testprog`main(argc=1, argv=0x00007fffffffeb38) at testprog.c:10:11
  7   	void testfunc(void);
  8   	int main(int argc, char **argv) {
  9   	  testfunc();
-> 10  	  unknown = 9;
  11  	  printf ("id (%x) = %s\n", (int)&id, id);
  12  	  printf ("sock(%x)  = %d\n", (int)&sock, sock);
  13  	  printf ("unknown (%x) = %d\n", (int)&unknown, unknown);
(lldb) 
Process 6039 stopped
* thread #1, name = 'testprog', stop reason = step over
   frame #0: 0x0000000000201312 testprog`main(argc=1, argv=0x00007fffffffeb38) at testprog.c:11:3
  8   	int main(int argc, char **argv) {
  9   	  testfunc();
  10  	  unknown = 9;
-> 11  	  printf ("id (%x) = %s\n", (int)&id, id);
  12  	  printf ("sock(%x)  = %d\n", (int)&sock, sock);
  13  	  printf ("unknown (%x) = %d\n", (int)&unknown, unknown);
  14  	  return 0;
(lldb) p id
(char [4]) $0 = ""
(lldb) p &id
(char (*)[4]) $1 = 0x000000080024d000
(lldb) p unknown
(int) $2 = 9
(lldb) p &unknown
(int *) $3 = 0x0000000000204008
(lldb) 

For the variable unknown lldb displays the correct address and value.  For id and sock, it displays the wrong address and the value at the wrong address.  The correct values are in the proper addresses:

x 0x204008
0x00204008: 09 00 00 00 61 61 00 00 05 00 00 00 00 00 00 00  ....aa..........
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2019-11-07 06:44:39 UTC
What happens if you examine such a variable after a program accesses it at least once?
It could be that the shared library is loaded lazily, so those symbols are not actually resolved until they are accessed.
Comment 2 bc979 2019-11-07 06:53:27 UTC
I modified testlib to be:

master# vi testlib.c

#include <sys/stat.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

char id[4];
int sock;

void testfunc() {
  struct stat sb;
  stat("testlib.c", &sb);
  strcpy (id, "aa");
  sock = 5;
  printf ("sock = %d, id = %s\n", sock, id);
  printf("Size of testlib.c is %i bytes.\n", (int)sb.st_size);
}


The results are exatly the same other than there is the extra line of output.

(lldb) p id
(char [4]) $0 = ""
(lldb) p sock
(int) $1 = 0
(lldb) p &id
(char (*)[4]) $2 = 0x000000080024d000
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2019-11-08 08:30:39 UTC
I meant that the program (executable) accesses a variable, not an intra-library access.
Comment 4 bc979 2019-11-08 09:33:47 UTC
Apparently I misunderstood.  Here is a lldb session with changes to testprog:

   8   	int main(int argc, char **argv) {
   9   	  strcpy (id, "yy");
   10  	  sock = 88;
-> 11  	  testfunc();
   12  	  unknown = 9;
   13  	  printf ("id (%x) = %s\n", (int)&id, id);
   14  	  printf ("sock(%x)  = %d\n", (int)&sock, sock);
(lldb) p id
(char [4]) $0 = ""
(lldb) p sock
(int) $1 = 0
(lldb) p &sock
(int *) $2 = 0x000000080024d004
(lldb) p & id
(char (*)[4]) $3 = 0x000000080024d000
(lldb) p *id
(char) $4 = '\0'

Even after setting the variables, lldb doesn't find the right values although it seems to get the right addresses.  After testfunc is called, nothing has changed from before.
Comment 5 Andriy Gapon freebsd_committer freebsd_triage 2019-11-08 12:05:28 UTC
Thank you for the test.
So, the problem is not what I thought it might be.
That's a very strange problem.
Comment 6 Ed Maste freebsd_committer freebsd_triage 2020-11-02 17:43:51 UTC
Tried to reproduce on -current with

lldb version 11.0.0 (git@github.com:llvm/llvm-project.git revision llvmorg-11.0.0-rc2-0-g414f32a9e86)
  clang revision llvmorg-11.0.0-rc2-0-g414f32a9e86
  llvm revision llvmorg-11.0.0-rc2-0-g414f32a9e86

$ make
rm -f testlib.o testlib.so testprog
cc -Wall -g -c -fPIC -o testlib.o testlib.c
cc  -shared -Wl,-export-dynamic -o testlib.so testlib.o
cc -Wall -g -o testprog ./testlib.so testprog.c
./testprog
sock = 5, id = aa
Size of testlib.c is 311 bytes.
id = aa
sock = 5

$ lldb ./testprog
(lldb) target create "./testprog"
Current executable set to '/home/emaste/bugs/pr241773/testprog' (x86_64).
(lldb) b main
Breakpoint 1: where = testprog`main + 22 at testprog.c:8:2, address = 0x0000000000201976
(lldb) run
Process 2340 launching
Process 2340 launched: '/home/emaste/bugs/pr241773/testprog' (x86_64)
Process 2340 stopped
* thread #1, name = 'testprog', stop reason = breakpoint 1.1
    frame #0: 0x0000000000201976 testprog`main(argc=1, argv=0x00007fffffffe720) at testprog.c:8:2
   5   
   6    void testfunc(void);
   7    int main(int argc, char **argv) {
-> 8     testfunc();
   9     printf ("id = %s\n", id);
   10    printf ("sock = %d\n", sock);
   11    return 0;
(lldb) n
sock = 5, id = aa
Size of testlib.c is 311 bytes.
Process 2340 stopped
* thread #1, name = 'testprog', stop reason = step over
    frame #0: 0x000000000020197b testprog`main(argc=1, argv=0x00007fffffffe720) at testprog.c:9:2
   6    void testfunc(void);
   7    int main(int argc, char **argv) {
   8     testfunc();
-> 9     printf ("id = %s\n", id);
   10    printf ("sock = %d\n", sock);
   11    return 0;
   12   }
(lldb) p id
(char [4]) $0 = ""
(lldb) p sock
(int) $1 = 0
(lldb) c
id = aa
sock = 5
Process 2340 resuming
Process 2340 exited with status = 0 (0x00000000) 
(lldb) ^D
Comment 7 Ed Maste freebsd_committer freebsd_triage 2020-11-02 17:48:24 UTC
You can turn on logging in lldb to help figure out what is going on - sorry it has been some time since I've been actively involved in lldb and do not recall the specific channels and categories.

To see all of the channels and categories:
(lldb) log list

For example, run
(lldb) log enable posix all
then run
(lldb) p id
Comment 8 bc979 2020-11-03 00:39:00 UTC
Having upgraded to 12.2, I ran the lldb session again.  Lldb had degraded significantly.  Stepping through the call to testfunc yields the following:

   6   	
   7   	void testfunc(void);
   8   	int main(int argc, char **argv) {
-> 9   		testfunc();
   10  		unknown = 9;
   11  		printf ("id (%x) = %s\n", (int)&id, id);
   12  		printf ("sock(%x)  = %d\n", (int)&sock, sock);
(lldb) n
Size of testlib.c is 262 bytes.
id (203d74) = aa
sock(203d78)  = 5
unknown (203d70) = 9
Process 26735 exited with status = 0 (0x00000000) 

lldb no longer stops on the next statement but runs to completion.  The output is correct as always.

Stepping into testfunc works however a print inside testfunc yields the following:

Process 26847 stopped
* thread #1, name = 'testprog', stop reason = step over
    frame #0: 0x000000080024d6e2 testlib.so`testfunc at testlib.c:14:53
   11  	 stat("testlib.c", &sb);
   12  	 strcpy (id, "aa");
   13  	 sock = 5;
-> 14  	 printf("Size of testlib.c is %i bytes.\n", (int)sb.st_size);
   15  	}
   16  	
(lldb) p sock
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).
Stack dump:
0.	Program arguments: lldb testprog 
1.	HandleCommand(command = "p sock")
#0 0x0000000003b15cae (/usr/bin/lldb+0x3b15cae)
#1 0x0000000003b13f25 (/usr/bin/lldb+0x3b13f25)
#2 0x0000000003b165b0 (/usr/bin/lldb+0x3b165b0)
#3 0x00000008046c9b70 (/lib/libthr.so.3+0x14b70)
Abort (core dumped)

Basically, 12.2 lldb is not suitable for development work.  It looks like I am going to have to revert back to 12.1.
Comment 9 bc979 2020-11-03 00:44:48 UTC
Using 

(lldb) log enable posix all
then run
(lldb) p id


generates a lot of ptrace entries, however after the p id command, I get:

Process 26878 stopped
* thread #1, name = 'testprog', stop reason = step in failed (Could not create return address breakpoint. Return address (0x80024d6ae) permissions not found.)
    frame #0: 0x0000000800404dd0 libc.so.7`stat
libc.so.7`stat:
->  0x800404dd0 <+0>: pushq  %rbp
    0x800404dd1 <+1>: movq   %rsp, %rbp
    0x800404dd4 <+4>: pushq  %r15
    0x800404dd6 <+6>: pushq  %r14
(lldb)
Comment 10 Shane 2020-11-04 06:08:03 UTC
The failure to step looks to have been fixed in lldb v11.

I see the failure to step, in 12.2 base lldb as well as lldb10 from ports, using lldb11 from ports steps as expected. (lldb 8/9 also work)


Using 12-stable from Feb and some older ports, I also see another differing result, compiling with clang 9 and using lldb10 from ports.

Using the first version that prints values only, lldb10 gives a permission error

* thread #1, name = 'testprog', stop reason = step over failed (Could not create return address breakpoint. Return address (0x2012db) permissions not found.)

Then using the newer version with printing address as well there is no error, lldb10 fails to step and continues to the end. Just changing the printf lines makes this variation.
Comment 11 bc979 2020-12-13 00:59:49 UTC
With 12.2 this problem has become much worse.  Any p command to lldb causes a "Program aborted due to an unhandled Error:
Error value was Success. "

Interestingly enough, I copied lldb from a 12.1 system onto 12.2 and it seems to print properly for variables defined in the source file.  I don't believe global variables will work though.
Comment 12 Gleb Popov freebsd_committer freebsd_triage 2022-09-09 06:28:42 UTC
The "Return address (0x...) permissions not found" error is still reproducible on 12.3-RELEASE. On all newer FreeBSD versions reproduction steps work without any error.