Bug in the control script's stop() function prevents the Agent process from being stopped #59

Open
xiaowing opened this issue Jul 18, 2016 · 1 comment

xiaowing commented Jul 18, 2016

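The Agent process is started and stopped through the control script, but the script's stop() function is currently rather crude. In the scenario below, the Agent process can no longer be stopped through the control script and has to be killed manually.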

Steps to reproduce

  1. As root, start the Agent with the control script.
[root@testvm agent]# ./control start
falcon-agent started..., pid=19849
[root@testvm agent]# ll var/
-rw-rw-r-- 1 foobar foobar 120 7月 18 15:46 app.log
-rw-r--r-- 1 root root   6 7月 18 15:46 app.pid
  2. As a non-root user, try to stop the Agent with the control script. The kill fails because of insufficient permissions.
[foobar@testvm agent]$ ./control stop

./control: line 52: kill: (19849) - Operation not permitted
falcon-agent stoped...
  3. Check the pid file. Although the kill in the previous step failed, app.pid has still been deleted.
[root@testvm agent]# ll var/
-rw-rw-r-- 1 foobar foobar 120 7月 18 15:46 app.log
  4. Try again as root to stop the Agent with the control script. Because app.pid has been deleted, the script cannot read the process ID and the kill fails.
[root@testvm agent]# ./control stop
cat: var/app.pid: No such file or directory
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
falcon-agent stoped...
  5. Check the agent process: it is still running. At this point the only way to stop it is a manual kill.
[foobar@testvm agent]$ ps -ef | grep falcon-agent
root     19849     1  0 15:46 pts/2    00:00:00 ./falcon-agent -c cfg.json
foobar  19895 19703  0 15:49 pts/0    00:00:00 grep falcon-agent

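Looking at the control script's code, stop() deletes the pid file unconditionally, yet the kill command depends on the contents of that file. The script does run kill first and removes the pid file afterwards, but kill itself can fail, and deleting the pid file regardless of that failure means no later kill through the script can ever succeed. Likewise, unconditionally printing "$app stoped..." is unfriendly, because it can mislead the user into thinking the Agent has stopped.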

function stop() {
    pid=`cat $pidfile`
    kill $pid
    rm -f $pidfile
    echo "$app stoped..."
}
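
One possible way to address the problem described above is to make stop() act on the result of kill. The following is only a sketch, not the maintainers' fix: the default values for `app` and `pidfile` are placeholders, the messages are invented for illustration, and the stale-pid check assumes a Linux `/proc` filesystem.

```shell
#!/bin/bash
# Placeholder values for illustration; the real control script sets its own.
app=${app:-falcon-agent}
pidfile=${pidfile:-var/app.pid}

function stop() {
    # Without a pid file there is no pid to signal.
    if [ ! -f "$pidfile" ]; then
        echo "$app: $pidfile not found, nothing to stop"
        return 1
    fi
    pid=$(cat "$pidfile")
    # Assumption: Linux, where /proc/<pid> exists only while the process does.
    if [ -z "$pid" ] || [ ! -d "/proc/$pid" ]; then
        rm -f "$pidfile"
        echo "$app is not running, removed stale $pidfile"
        return 0
    fi
    # Remove the pid file only when kill accepted the signal.
    if kill "$pid"; then
        rm -f "$pidfile"
        echo "$app stopped..."
    else
        echo "$app could not be stopped (pid=$pid), keeping $pidfile"
        return 1
    fi
}
```

With this shape, the non-root `./control stop` from step 2 would report the failure and leave app.pid in place, so a later `./control stop` as root could still find the process ID. Note that a zero exit status from kill only means the signal was sent, not that the process has already exited.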

super-go commented Apr 7, 2017

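function stop() {
    # Look up the running agent by its command line instead of trusting the pid file
    pid=`ps -ef | grep "falcon-agent -c cfg.json" | grep -v grep | awk '{print $2}'`
    # Only call kill when a matching process was actually found
    if [ -n "$pid" ]; then
        kill -9 $pid
    fi
    # Count the agent processes that are still listed after the kill
    count=`ps -ef | grep "falcon-agent -c cfg.json" | grep -v grep | wc -l`
    if [ "$count" -eq 0 ]; then
        rm -f $pidfile
        echo "$app stoped success..."
    else
        echo "$app stoped fail..."
        exit 1
    fi
}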
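
A process may still be listed for a short moment after a signal has been sent, so checking immediately after kill can report a failure that is not one. The sketch below polls until the process is gone and escalates from SIGTERM to SIGKILL only after a timeout. `wait_for_exit` and `stop_gracefully` are hypothetical helper names, not part of the control script, and the liveness check assumes a Linux `/proc` filesystem.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of the original control script.

# Wait up to $2 seconds (default 5) for process $1 to disappear.
function wait_for_exit() {
    local pid=$1
    local timeout=${2:-5}
    local waited=0
    # Assumption: Linux, where /proc/<pid> exists while the process does.
    while [ -d "/proc/$pid" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Send SIGTERM first; use SIGKILL only if the process is still there
# after $2 seconds (default 5).
function stop_gracefully() {
    local pid=$1
    local timeout=${2:-5}
    kill "$pid" || return 1
    if wait_for_exit "$pid" "$timeout"; then
        return 0
    fi
    kill -9 "$pid" || return 1
    wait_for_exit "$pid" "$timeout"
}
```

A stop() function could call `stop_gracefully "$pid"` and remove the pid file only when it returns 0.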
