TypeCodes

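A hands-on gdb debugging session for multi-process and multi-threaded programs on Linux

Debugging multi-process and multi-threaded programs with gdb has always been awkward in Linux C/C++ development. An article on CSDN by 高科, 《gdb调试多进程和多线程命令》 ("gdb commands for debugging multiple processes and threads"), was a useful starting point, so I reorganized its material here and worked through a multi-process/multi-thread GDB session of my own.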

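1 Summary of the original article

By default, gdb debugs only the parent process of a multi-process program. gdb 7 and later (check with gdb --version) can debug several processes at once. You only need to set follow-fork-mode (which process gdb follows after a fork) and detach-on-fork (whether gdb detaches from the other process after a fork).

The two settings are changed with set follow-fork-mode [parent|child] and set detach-on-fork [on|off]. Together they determine how gdb handles a fork:

follow-fork-mode  detach-on-fork    Description
    parent              on          gdb's default: debug only the parent process
    child               on          debug only the child process
    parent              off         debug both processes: gdb follows the parent, the child is held suspended at the fork
    child               off         debug both processes: gdb follows the child, the parent is held suspended at the fork

Check gdb's default settings: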

(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.
(gdb)
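
To make concrete what follow-fork-mode chooses between, here is a minimal C sketch of the two execution paths that come out of a single fork() call. The helper name run_child_and_collect is hypothetical and is not part of the demo program in the next section. It only illustrates that fork() returns 0 in the child and the child's PID in the parent, which are the two processes gdb can follow.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork once and report the child's exit code.
 * Returns -1 on any failure. */
int run_child_and_collect(int child_exit_code)
{
    /* Exit codes are truncated to 8 bits by the kernel. */
    assert(child_exit_code >= 0 && child_exit_code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child path: the process gdb follows with
         * "set follow-fork-mode child". */
        _exit(child_exit_code);
    }

    /* Parent path: fork() returned the child's PID (> 0). This is the
     * process gdb follows by default. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_and_collect(42) from a main function should return 42 in the parent. With detach-on-fork left on, gdb would see only one of these two paths.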

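2 Demo code

The code below forks a child process in main, and the parent then creates a thread. The gdb session that follows debugs the parent while the child is held suspended. Note that the session was recorded before the 7-line header comment was added to the top of the file, so add 7 to the line numbers used in the session when you set breakpoints against the listing as shown here.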

/** 
 * @FileName    gdb_pthread.c
 * @Describe    A simple example for the debug of multiprocess and multithreading using gdb in linux system.
 * @Author      vfhky 2016-02-25 22:48 https://typecodes.com/cseries/multilprocessthreadgdb.html
 * @Compile     gcc gdb_pthread.c -g -o gdb_pthread
 * @Reference   http://blog.csdn.net/pbymw8iwm/article/details/7876797
 */
#include <stdio.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>


//Parent process handle.
void Parent();
//Child process handle.
void Child();
//Parent process handle after generate a thread.
void * ParentDo( char *argv );

int main( int argc, const char **argv )
{
    int pid;
    pid = fork();
    if(pid != 0)        //add the first breakpoint.
        Parent();
    else
        Child();
    return 0;
}

//Parent process handle.
void Parent()
{
    pid_t pid = getpid();
    char cParent[] = "Parent";
    char cThread[] = "Thread";
    pthread_t pt;

    printf( "[%s]: [%d] [%s]\n", cParent, pid, "step1" );

    if( pthread_create( &pt, NULL, (void *)*ParentDo, cThread ) )
    {
        printf( "[%s]: Can not create a thread.\n", cParent );
    }

    ParentDo( cParent );
    sleep(1);
}

void * ParentDo( char *argv )
{
    pid_t pid = getpid();
    pthread_t tid = pthread_self();     //Get the ID of the calling thread.
    char tprefix[] = "thread";

    printf( "[%s]: [%d] [%s] [%lu] [%s]\n", argv, pid, tprefix, tid, "step2" );         //add the second breakpoint.
    printf( "[%s]: [%d] [%s] [%lu] [%s]\n", argv, pid, tprefix, tid, "step3" );

    return NULL;
}

void Child()
{
    pid_t pid = getpid();
    char prefix[] = "Child";
    printf( "[%s]: [%d] [%s]\n", prefix, pid, "step1" );
    return;
}

Running the program directly produces the following output:

[vfhky@typecodes pthread_key]$ gdb_pthread
[Parent]: [22648] [step1]
[Parent]: [22648] [thread] [139722467432256] [step2]
[Parent]: [22648] [thread] [139722467432256] [step3]
[Thread]: [22648] [thread] [139722450630400] [step2]
[Thread]: [22648] [thread] [139722450630400] [step3]
[Child]: [22649] [step1]
[vfhky@typecodes pthread_key]$
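
The demo passes ParentDo to pthread_create through a cast and relies on sleep(1) to give the thread time to finish. As a hedged alternative, not the code used in the session below, the sketch here uses a start routine with the signature pthread_create expects and waits with pthread_join. The names job, worker and run_and_join are illustrative assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Argument block handed to the thread; the names are illustrative. */
struct job {
    const char *label;   /* text to print, e.g. "Thread" */
    int ran;             /* set to 1 by the worker before it returns */
};

/* Start routine with exactly the signature pthread_create() expects. */
static void *worker(void *arg)
{
    struct job *job = arg;              /* recover the real argument type */
    printf("[%s]: step2\n", job->label);
    job->ran = 1;
    return NULL;
}

/* Hypothetical helper: run worker() in a new thread and wait for it.
 * Returns 1 if the worker ran to completion, -1 on failure. */
int run_and_join(const char *label)
{
    struct job job = { label, 0 };
    pthread_t tid;

    assert(label != NULL);
    if (pthread_create(&tid, NULL, worker, &job) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)   /* wait for the thread, no sleep(1) */
        return -1;
    return job.ran;
}
```

Build it with gcc -pthread. Because pthread_join blocks until the thread has returned, the caller never exits while the worker is still running, which sleep(1) cannot guarantee.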

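3 Debugging with gdb

3.1 Set the debugging mode and a catchpoint

Configure gdb to debug both the parent and the child: gdb follows the parent, and the child is held suspended at the fork.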

[vfhky@typecodes pthread_key]$ gdb gdb_pthread
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/vfhky/bin/gdb_pthread...done.
(gdb) set detach-on-fork off
##### catch stops the program when a given event occurs (fork, exception throw, exception catch, shared-library load, and so on)
(gdb) catch fork 
Catchpoint 1 (fork)
(gdb) info b
Num     Type           Disp Enb Address            What
1       catchpoint     keep y                      fork
(gdb)

[Figure: starting the gdb session]

3.2 Start debugging

(gdb) r                         #### run to the breakpoint/catchpoint (the fork call on line 17; 23873 is the child's PID)
Starting program: /home/vfhky/bin/gdb_pthread 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Catchpoint 1 (forked process 23873), 0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
130       pid = ARCH_FORK ();
(gdb) bt                        ##### show the call stack
#0  0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
#1  0x00000000004007b4 in main (argc=1, argv=0x7fffffffe4c8) at gdb_pthread.c:17
(gdb) info threads              ####### list the running threads (23869 is the parent's PID)
  Id   Target Id         Frame 
* 1    Thread 0x7ffff7fe1740 (LWP 23869) "gdb_pthread" 0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
(gdb) info inferiors            ###### list the processes being debugged: the asterisk before 1 marks the current process (PID 23869)
  Num  Description       Executable        
* 1    process 23869     /home/vfhky/bin/gdb_pthread 
(gdb) info b                    ###### list all breakpoints and catchpoints; the catchpoint has been hit 1 time, i.e. one fork event was caught
Num     Type           Disp Enb Address            What
1       catchpoint     keep y                      fork, process 23873                   ##### child process 23873
        catchpoint already hit 1 time
(gdb)

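At this point, check the state of all processes on the CentOS system with the command below. The parent's PID is 23869 and the child created by fork is 23873:

[vfhky@typecodes ~]$ pstree -pul

[Figure: pstree -pul output listing all processes on the CentOS system]

Likewise, cat /proc/23869/status shows the details of the current process: its PID is 23869; its parent (the gdb process) is 23859, which is also the tracer PID; and Threads is 1. Threads is the number of threads sharing this signal descriptor; in a POSIX multi-threaded application, all threads in a thread group share one signal descriptor.

[Figure: process status from /proc/23869/status]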
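
The same fields can be read programmatically. The sketch below is a minimal, assumed helper (proc_status_field is a hypothetical name) that pulls one numeric field such as PPid, TracerPid or Threads out of /proc/<pid>/status. It relies only on the "Name:<tab>value" line layout of that file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the integer value of a "Name:\tvalue" line
 * from /proc/<pid>/status, or -1 if the field is missing or unreadable. */
long proc_status_field(pid_t pid, const char *name)
{
    char path[64];
    char line[256];
    long value = -1;

    assert(name != NULL);
    size_t len = strlen(name);

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Match "<name>:" at the start of the line. */
        if (strncmp(line, name, len) == 0 && line[len] == ':') {
            /* %ld skips the tab that separates name and value. */
            if (sscanf(line + len + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(fp);
    return value;
}
```

For example, proc_status_field(getpid(), "TracerPid") should be nonzero when the calling process is running under gdb, matching the tracer PID observed above.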

3.3 Set the first breakpoint

Set a breakpoint on line 18 of the program:

(gdb) b gdb_pthread.c:18
Breakpoint 2 at 0x4007b7: file gdb_pthread.c, line 18.
(gdb) info b                        ###### list all breakpoints and catchpoints
Num     Type           Disp Enb Address            What
1       catchpoint     keep y                      fork, process 23873                 ######## child process 23873
        catchpoint already hit 1 time
2       breakpoint     keep y   0x00000000004007b7 in main at gdb_pthread.c:18
(gdb)

3.4 Run to the first breakpoint

(gdb) c                ##### continue to the breakpoint on line 18
Continuing.
[New process 23873]                     ##### parent 23869 continues past catchpoint 1; gdb now reports the forked child 23873
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 2, main (argc=1, argv=0x7fffffffe4c8) at gdb_pthread.c:18            ########## the parent stops at the breakpoint on line 18
18              if(pid != 0)
(gdb) info threads                      #### list all threads: the parent 23869 and the child 23873
  Id   Target Id         Frame 
  2    Thread 0x7ffff7fe1740 (LWP 23873) "gdb_pthread" 0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
* 1    Thread 0x7ffff7fe1740 (LWP 23869) "gdb_pthread" main (argc=1, argv=0x7fffffffe4c8) at gdb_pthread.c:18
(gdb) info inferiors                    ##### list the processes being debugged
  Num  Description       Executable        
  2    process 23873     /home/vfhky/bin/gdb_pthread                ######## child
* 1    process 23869     /home/vfhky/bin/gdb_pthread                ######## parent
(gdb) info b                    ####### list all current breakpoints
Num     Type           Disp Enb Address            What
1       catchpoint     keep y                      fork, process 23873 
        catchpoint already hit 1 time
2       breakpoint     keep y   <MULTIPLE>         
        breakpoint already hit 1 time
2.1                         y     0x00000000004007b7 in main at gdb_pthread.c:18 inf 2
2.2                         y     0x00000000004007b7 in main at gdb_pthread.c:18 inf 1
(gdb)

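[Figure: threads and processes at the first breakpoint]

Checking the system's processes at this point shows that there are still only the parent 23869 and the child 23873.

[vfhky@typecodes ~]$ pstree -pul

[Figure: pstree -pul output listing all processes on the CentOS system]

3.5 Switch to child process 23873 while stopped at the first breakpoint
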
(gdb) inferior 2
[Switching to inferior 2 [process 23873] (/home/vfhky/bin/gdb_pthread)]
[Switching to thread 2 (Thread 0x7ffff7fe1740 (LWP 23873))] 
#0  0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
130       pid = ARCH_FORK ();
(gdb) info inferiors                 ##### list the processes being debugged
  Num  Description       Executable        
* 2    process 23873     /home/vfhky/bin/gdb_pthread                ##### child
  1    process 23869     /home/vfhky/bin/gdb_pthread                ##### parent
(gdb)

3.6 Switch back to parent process 23869

(gdb) inferior 1
[Switching to inferior 1 [process 23869] (/home/vfhky/bin/gdb_pthread)]
[Switching to thread 1 (Thread 0x7ffff7fe1740 (LWP 23869))] 
#0  main (argc=1, argv=0x7fffffffe4c8) at gdb_pthread.c:18
18              if(pid != 0)
(gdb) info inferiors                 ##### list the processes being debugged
  Num  Description       Executable        
  2    process 23873     /home/vfhky/bin/gdb_pthread 
* 1    process 23869     /home/vfhky/bin/gdb_pthread 
(gdb)
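
3.7 Set the second breakpoint and continue

Set a breakpoint on line 50 and continue debugging the parent so that it creates the thread. Whether the parent's main thread or the new thread runs first is decided by the kernel scheduler.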

(gdb) b gdb_pthread.c:50
Breakpoint 3 at 0x4008a7: gdb_pthread.c:50. (2 locations)
(gdb) c                    ###### continue to the breakpoint on line 50
Continuing.
[Parent]: [23869] [step1]                              ###### line 33: the parent prints from Parent()
[New Thread 0x7ffff6fdd700 (LWP 24024)]                ###### line 35: the parent creates thread 24024 (LWP means light-weight process)
[Switching to Thread 0x7ffff6fdd700 (LWP 24024)]            ##### gdb has switched to thread 24024 automatically, so it is now debugging that thread rather than the parent's main thread

Breakpoint 3, ParentDo (argv=0x7fffffffe390 "Thread") at gdb_pthread.c:50            ###### thread 24024 is stopped at line 50
50              printf( "[%s]: [%d] [%s] [%lu] [%s]\n", argv, pid, tprefix, tid, "step2" );
(gdb)

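Checking the system's processes now shows the parent 23869, the child 23873, and thread 24024 created by the parent (pstree shows threads in curly braces {}).

[vfhky@typecodes ~]$ pstree -pul

[Figure: pstree -pul output listing all processes on the CentOS system]

Likewise, cat /proc/23869/status shows the details of the current process: its PID is 23869; its parent (the gdb process) is 23859, which is also the tracer PID; and Threads is now 2 (the parent's main thread 23869 plus thread 24024).

[Figure: process status from /proc/23869/status]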
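
The relationship between the PID, the LWP number that gdb prints, and the thread count can be checked from inside a program. The following is a sketch under stated assumptions: probe_thread, probe_worker and count_tasks are hypothetical names, the kernel thread ID is obtained with syscall(SYS_gettid), and threads are counted as entries of /proc/self/task, one per live LWP.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* for syscall() under strict -std modes */
#endif
#include <assert.h>
#include <dirent.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

struct probe {
    sem_t ready;      /* posted by the worker once pid and lwp are set */
    sem_t release;    /* posted by the caller to let the worker exit */
    pid_t pid;        /* getpid() as seen inside the worker */
    pid_t lwp;        /* kernel thread ID, what gdb prints as LWP */
};

static void *probe_worker(void *arg)
{
    struct probe *p = arg;
    p->pid = getpid();
    p->lwp = (pid_t)syscall(SYS_gettid);
    sem_post(&p->ready);
    sem_wait(&p->release);       /* stay alive until the caller has counted */
    return NULL;
}

/* Count the entries of /proc/self/task: one per live thread (LWP). */
static int count_tasks(void)
{
    DIR *dir = opendir("/proc/self/task");
    if (dir == NULL)
        return -1;
    int n = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL)
        if (entry->d_name[0] != '.')     /* skip "." and ".." */
            n++;
    closedir(dir);
    return n;
}

/* Hypothetical helper: start one worker and report what it saw, plus the
 * number of tasks while it was alive. Returns 0 on success, -1 on failure. */
int probe_thread(pid_t *worker_pid, pid_t *worker_lwp, int *tasks_alive)
{
    struct probe p;
    pthread_t tid;

    assert(worker_pid && worker_lwp && tasks_alive);
    if (sem_init(&p.ready, 0, 0) != 0)
        return -1;
    if (sem_init(&p.release, 0, 0) != 0) {
        sem_destroy(&p.ready);
        return -1;
    }
    int rc = pthread_create(&tid, NULL, probe_worker, &p);
    if (rc == 0) {
        sem_wait(&p.ready);
        *tasks_alive = count_tasks();
        sem_post(&p.release);
        pthread_join(tid, NULL);
        *worker_pid = p.pid;
        *worker_lwp = p.lwp;
    }
    sem_destroy(&p.ready);
    sem_destroy(&p.release);
    return rc == 0 ? 0 : -1;
}
```

In an otherwise single-threaded process, the worker should report the same PID as the main thread but a different LWP, and the task count should be 2 while it is alive. That matches the session above, where thread 24024 belongs to process 23869.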

3.8 Inspect the state at the second breakpoint

(gdb) info inferiors                 ##### list the processes being debugged
  Num  Description       Executable        
  2    process 23873     /home/vfhky/bin/gdb_pthread                     ### child
* 1    process 23869     /home/vfhky/bin/gdb_pthread                     ### parent
(gdb) info threads         #### list all threads: parent 23869, child 23873 and thread 24024; the asterisk shows that gdb is now on thread 24024
  Id   Target Id         Frame 
* 3    Thread 0x7ffff6fdd700 (LWP 24024) "gdb_pthread" ParentDo (argv=0x7fffffffe390 "Thread") at gdb_pthread.c:50
  2    Thread 0x7ffff7fe1740 (LWP 23873) "gdb_pthread" 0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
  1    Thread 0x7ffff7fe1740 (LWP 23869) "gdb_pthread" ParentDo (argv=0x7fffffffe3a0 "Parent") at gdb_pthread.c:50
(gdb) info b                ##### list all breakpoints and catchpoints (3 in total):
Num     Type           Disp Enb Address            What
1       catchpoint     keep y                      fork, process 23873 
        catchpoint already hit 1 time
2       breakpoint     keep y   <MULTIPLE>         
        breakpoint already hit 1 time
2.1                         y     0x00000000004007b7 in main at gdb_pthread.c:18 inf 2
2.2                         y     0x00000000004007b7 in main at gdb_pthread.c:18 inf 1
3       breakpoint     keep y   <MULTIPLE>         
        breakpoint already hit 1 time
3.1                         y     0x00000000004008a7 in ParentDo at gdb_pthread.c:50 inf 2
3.2                         y     0x00000000004008a7 in ParentDo at gdb_pthread.c:50 inf 1
(gdb)

3.9 Switch to thread 24024 manually

(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff6fdd700 (LWP 24024))]
#0  ParentDo (argv=0x7fffffffe390 "Thread") at gdb_pthread.c:50
50              printf( "[%s]: [%d] [%s] [%lu] [%s]\n", argv, pid, tprefix, tid, "step2" );
(gdb) info threads                   ##### list all threads
  Id   Target Id         Frame 
* 3    Thread 0x7ffff6fdd700 (LWP 24024) "gdb_pthread" ParentDo (argv=0x7fffffffe390 "Thread") at gdb_pthread.c:50
  2    Thread 0x7ffff7fe1740 (LWP 23873) "gdb_pthread" 0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130
  1    Thread 0x7ffff7fe1740 (LWP 23869) "gdb_pthread" ParentDo (argv=0x7fffffffe3a0 "Parent") at gdb_pthread.c:50
(gdb) info inferiors                 ##### list the processes being debugged
  Num  Description       Executable        
  2    process 23873     /home/vfhky/bin/gdb_pthread 
* 1    process 23869     /home/vfhky/bin/gdb_pthread 
(gdb)

3.10 Continue from the second breakpoint

(gdb) c
Continuing.
[Thread]: [23869] [thread] [140737337218816] [step2]            ##### thread 24024 executes line 50 and prints
[Thread]: [23869] [thread] [140737337218816] [step3]            ##### thread 24024 executes line 51 and prints
[Thread 0x7ffff6fdd700 (LWP 24024) exited]                      ##### thread 24024 exits
[Switching to Thread 0x7ffff7fe1740 (LWP 23869)]                ##### gdb switches to the parent's main thread

Breakpoint 3, ParentDo (argv=0x7fffffffe3a0 "Parent") at gdb_pthread.c:50                ##### the parent now stops at the breakpoint on line 50
50              printf( "[%s]: [%d] [%s] [%lu] [%s]\n", argv, pid, tprefix, tid, "step2" );
(gdb) info inferiors                    ###### list the processes being debugged (parent 23869 and child 23873); the asterisk before 1 marks the current process (parent 23869)
  Num  Description       Executable        
  2    process 23873     /home/vfhky/bin/gdb_pthread 
* 1    process 23869     /home/vfhky/bin/gdb_pthread 
(gdb) info threads                     ###### list all threads
  Id   Target Id         Frame 
  2    Thread 0x7ffff7fe1740 (LWP 23873) "gdb_pthread" 0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130            ##### child 23873
* 1    Thread 0x7ffff7fe1740 (LWP 23869) "gdb_pthread" ParentDo (argv=0x7fffffffe3a0 "Parent") at gdb_pthread.c:50          ##### parent 23869
(gdb)

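Checking the system's processes now shows the parent 23869 and the child 23873; thread 24024 has already finished.

[vfhky@typecodes ~]$ pstree -pul

[Figure: pstree -pul output listing all processes on the CentOS system]

3.11 Continue debugging the parent

Because the thread has exited, gdb automatically selects the parent's main thread as the thread to debug.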

(gdb) c
Continuing.
[Parent]: [23869] [thread] [140737354012480] [step2]        ##### parent 23869 executes line 50
[Parent]: [23869] [thread] [140737354012480] [step3]        ##### parent 23869 executes line 51
[Inferior 1 (process 23869) exited normally]                ##### the parent 23869, which was being debugged, exits
(gdb) info inferiors             ###### list the processes being debugged
  Num  Description       Executable        
  2    process 23873     /home/vfhky/bin/gdb_pthread        ##### child 23873 created by fork
* 1    <null>            /home/vfhky/bin/gdb_pthread        ##### parent 23869 has exited
(gdb) info threads              #### list the running threads: only child 23873 remains; parent 23869 has exited
  Id   Target Id         Frame 
  2    Thread 0x7ffff7fe1740 (LWP 23873) "gdb_pthread" 0x00007ffff709b50c in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/fork.c:130

No selected thread.  See `help thread'.         ##### gdb reports that no thread is currently selected for debugging
(gdb) info b                                    ##### list all breakpoints
Num     Type           Disp Enb Address            What
1       catchpoint     keep y                      fork, process 23873 
        catchpoint already hit 1 time
2       breakpoint     keep y   <MULTIPLE>         
        breakpoint already hit 1 time
2.1                         y     0x00000000004007b7 in main at gdb_pthread.c:18 inf 2
2.2                         y     0x00000000004007b7 in main at gdb_pthread.c:18 inf 1
3       breakpoint     keep y   <MULTIPLE>         
        breakpoint already hit 2 times
3.1                         y     0x00000000004008a7 in ParentDo at gdb_pthread.c:50 inf 2      ##### child 23873
3.2                         y     0x00000000004008a7 in ParentDo at gdb_pthread.c:50 inf 1      ##### parent 23869
(gdb)

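Checking the system's processes now shows only the child 23873 (an orphan that has been adopted by the init process); the parent 23869 has also exited.

[vfhky@typecodes ~]$ pstree -pul

[Figure: pstree -pul output listing all processes on the CentOS system]

Then look at child process 23873 with ps ux:

[Figure: ps ux output for the child process]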
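
The adoption of an orphan can also be observed without gdb. The sketch below is an illustration under assumptions, not part of the session above: observe_orphan is a hypothetical helper that forks an intermediate process, lets it exit immediately, and has the grandchild report the value of getppid() once it changes. On many systems the adopter is PID 1, but it can be another process (for example a designated subreaper), so the sketch only reports the new parent and does not assume which one it is.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: create an orphan and report who adopts it.
 * On success returns 0 and stores the PID of the exited parent and the
 * value the orphan got from getppid() afterwards. */
int observe_orphan(pid_t *dead_parent, pid_t *adopter)
{
    int fds[2];

    assert(dead_parent != NULL && adopter != NULL);
    if (pipe(fds) != 0)
        return -1;

    pid_t middle = fork();
    if (middle < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (middle == 0) {
        /* Intermediate process: fork the future orphan, then exit at once. */
        pid_t self = getpid();
        pid_t grandchild = fork();
        if (grandchild == 0) {
            struct timespec nap = { 0, 10 * 1000 * 1000 };   /* 10 ms */
            int tries = 0;
            close(fds[0]);
            /* Poll until the kernel has re-parented us (at most about 5 s). */
            while (getppid() == self && tries++ < 500)
                nanosleep(&nap, NULL);
            pid_t now = getppid();
            if (write(fds[1], &now, sizeof now) != (ssize_t)sizeof now)
                _exit(1);
            _exit(0);
        }
        _exit(grandchild < 0 ? 1 : 0);
    }

    /* Original process: reap the intermediate, then read the report. */
    close(fds[1]);
    int status = 0;
    waitpid(middle, &status, 0);
    ssize_t n = read(fds[0], adopter, sizeof *adopter);
    close(fds[0]);
    *dead_parent = middle;
    return n == (ssize_t)sizeof *adopter ? 0 : -1;
}
```

After a successful call, *adopter differs from *dead_parent, which is the same effect seen above when parent 23869 exited and child 23873 was left behind.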

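4 Appendix

Commonly used gdb commands for resuming execution and single-stepping:

continue        resume the program until the next breakpoint (like F5 in VS)
next            step over: execute the next line without entering called functions (like F10 in VS)
step            step into: execute the next line, entering called functions (like F11 in VS)
until           run until a source line past the current one is reached; mainly used to get out of a loop
finish          run until the current function returns, and print its return value (like Shift+F11 in VS)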