Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Openresty 1.15.3 profile issues. #1096

Open
eloycoto opened this issue Jul 19, 2019 · 1 comment
Open

Openresty 1.15.3 profile issues. #1096

eloycoto opened this issue Jul 19, 2019 · 1 comment

Comments

@eloycoto
Copy link
Contributor

eloycoto commented Jul 19, 2019

The issue is when Openresty 1.15.3 is in use, where lj_trace_exit function is
only called on the first 10 request, after that, the function is never called
again.

The result is that profile jobs are not working correctly, due to not all the
request are using luajit at all:

Output of `stapxx ./samples/ngx-lj-trace-exits.sxx -x $(worker PID)

289 out of 19088 requests used compiled traces generated by LuaJIT.
Distribution of LuaJIT trace exit times per request for 289 sample(s):
(min/avg/max: 1/1/9)
value |-------------------------------------------------- count
    0 |                                                     0
    1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@           160
    2 |@@@@@@@@@@@@@@@@@@@@@@@                             92
    4 |@@@@@@@@@                                           36
    8 |                                                     1
   16 |                                                     0
   32 |                                                     0

To validate this behaviour:

nginx.conf

worker_processes  1;
daemon off;
pid       /tmp/nginx.pid;
error_log logs/error.log;

events {
    worker_connections 1024;
}

http {
  server {
  listen 8080;
  access_log  off;
    location / {
      content_by_lua_block {
        function main()
          local j = 0
          for i=0,1000 do
            j = i+j
          end
        end
        main()
        ngx.say("Hello world")
      }
    }
  }
}

Run openresty in debug mode:

openresty-debug -c nginx.conf -p $(pwd)

For this test, we can test with stapxx, but a simpler systemtap file can be
used:

probe process("/usr/local/openresty-debug/nginx/sbin/nginx").function("ngx_http_free_request")
{
    println("Request finish")
}

probe process("/usr/local/openresty-debug/luajit/lib/libluajit-5.1.so.2.1.0").function("lj_trace_exit") {
    println("Trace exit")
}

And if we run this tap, on the first 10 request, lj_trace_exit probe will
work correctly, but after the 10th request there is no longer called, example:

#stap lj.stp
Trace exit
Request finish
Trace exit
Request finish
Trace exit
Request finish
Trace exit
Request finish
Trace exit
Request finish
Trace exit
Request finish
Trace exit
Request finish
Trace exit
Request finish
Trace exit
Request finish
Request finish
Request finish
Request finish
Request finish

Trying to debug why this happens, there are no suspect code in
ngx-lua-module, but Openresty bump the version of Luajit, and this commit
maybe break something:

openresty/luajit2@864c72e#diff-d232eb340601b1fd167c69f58d20f8f1

Also, try to get a proper traceback on this function, but for reasons all the
systemtap traps only return the hex direction of the Luajit, so I couldn't find
any pattern, and Luajit code is quite hard to understand :-(.

Systemtap taps: https://sourceware.org/systemtap/tapsets/

Example to get the caller:

probe process("/usr/local/openresty-debug/luajit/lib/libluajit-5.1.so.2.1.0").function("lj_trace_exit") {
    println("Trace exit")
    println(symfileline(caller_addr()))
}

Also try to get the human-readable code using addr2line but I have no luck.

Moving forward I try to get trace on perf, maybe systemtap issue, but in perf I
did not see the lj_trace_exit call at all:

perf record -a  -p 19191 -g

perf output: https://gist.github.com/eloycoto/60567b0f6de63a7e86792ad79331c6d6
perf data perf.zip

Also, tried with a custom perf probe, but no calls where seen in lj_trace_exit:

# perf probe -x /usr/local/openresty-debug/luajit/lib/libluajit-5.1.so.2.1.0 --funcs | grep lj_trace_exit
lj_trace_exit
# perf probe -x /usr/local/openresty-debug/luajit/lib/libluajit-5.1.so.2.1.0 lj_trace_exit
Failed to get call frame on 0xa6376
Added new event:
  probe_libluajit:lj_trace_exit (on lj_trace_exit in /usr/local/openresty-debug/luajit/lib/libluajit-5.1.so.2.1.0)

You can now use it in all perf tools, such as:

        perf record -e probe_libluajit:lj_trace_exit -aR sleep 1

next steps:

Installation

These are the steps that I follow to install all the things commented above:

yum install yum-utils
yum-config-manager --add-repo https://openresty.org/package/centos/openresty.repo
yum install openresty openresty-debug-debuginfo openresty-debug openresty-pcre-debuginfo openresty-pcre-devel openresty-zlib-debuginfo

yum install -y kernel-devel kernel-debuginfo kernel-debuginfo-common
debuginfo-install kernel-3.10.0-957.12.2.el7.x86_64
yum install -y systemtap systemtap-runtime

Useful information:

@eloycoto
Copy link
Contributor Author

Ok,

I think that I found the problem, I think that it's related to this commit:

openresty/luajit2@0a9ff94

Now If I run this nginx.conf to update some params, things are getting in a good shape:


^C[vagrant@localhost ~]$ cat /vagrant/nginx.conf
worker_processes  1;
daemon off;
pid       /tmp/nginx.pid;
error_log logs/error.log;

events {
    worker_connections 1024;
}

http {
  server {
  listen 8080;
  access_log  off;
    location / {
      content_by_lua_block {
        require "jit.opt".start("hotexit=100")
        function main()
          local j = 0
          for i=0,100 do
            j = i+j
          end
        end
        main()
        ngx.say("Hello world")
      }
    }
  }
}

So, my understanding is that it's a trace issue, not a Luajit compiling issue, maybe the best way is to add to stapxx the following probe:

probe process("/usr/local/openresty-debug/luajit/lib/libluajit-5.1.so.2.1.0").function("lj_vm_exit_interp") {
    println("VM_exit_interp")
}

Also this commit openresty/luajit2@99304a9 can be a problem, but in Openresty 1.13 that commit is in there.

Regards

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant