# Profiling
- gem derailed_benchmarks
- gem memory_profiler
- gem rack-mini-profiler
- gem ruby-prof
- gem pghero
| Tool | Production? | Method | Notes |
|---|---|---|---|
| derailed_benchmarks | No | Requires some setup to use in production-like conditions | |
| memory_profiler | | | |
| rack-mini-profiler | Yes | | |
| ruby-prof | | | |
| pghero | Yes | SQL query profiler | Requires Postgres superuser access |
https://github.com/schneems/derailed_benchmarks
To see memory used by each gem at require time:

```
bundle exec derailed bundle:mem
```

To see objects allocated while requiring the gems:

```
bundle exec derailed bundle:objects
```

Convert the reported figures to MB; for example, crowdAI's gems use 105.59 MB for gems alone.
https://github.com/presidentbeef/brakeman (static security analysis, not a profiler)
Using the derailed_benchmarks gem, run the app with increasing request counts. If memory usage does not stabilize, there is a leak.

```
TEST_COUNT=5000 bundle exec derailed exec perf:mem_over_time
TEST_COUNT=10_000 bundle exec derailed exec perf:mem_over_time
TEST_COUNT=20_000 bundle exec derailed exec perf:mem_over_time
```
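The idea behind `perf:mem_over_time` can be sketched in plain Ruby (this is an illustration, not derailed's actual implementation): do work repeatedly and check whether the count of live objects stabilizes. A leak shows up as unbounded growth.

```ruby
# Count objects that survive garbage collection.
def live_objects
  GC.start # collect first, so the count reflects retained objects only
  counts = ObjectSpace.count_objects
  counts[:TOTAL] - counts[:FREE]
end

baseline = live_objects

retained = [] # simulated leak: every iteration keeps a reference alive
1_000.times { retained << ("x" * 100) }

after = live_objects
puts "live objects grew by #{after - baseline}" # growth reflects the retained strings
```

In a healthy app the count plateaus once caches warm up; here it keeps climbing because `retained` is never released.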
Puma is a multithreaded server, but MRI's global VM lock allows only one thread to execute Ruby code at a time, so Puma's threading delivers the most benefit on Rubinius or JRuby.
- The puma-heroku gem configures a default plugin, but its contents can easily be added to the initializer instead: https://github.com/puma/puma-heroku/blob/master/lib/puma/plugin/heroku.rb
Because MRI threads cannot run Ruby code in parallel, ideally there would be at least one worker per CPU core. Heroku 1X, 2X, and Performance-M dynos each report 8 cores; Performance-L dynos report 2.
It is often not possible to run a worker per core because of memory constraints. For example, a medium-to-large Rails app on a Heroku 2X dyno takes 300-550 MB, which leaves room for only 1-3 workers.
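The worker count is therefore the smaller of the core count and what fits in memory. A toy calculation with hypothetical numbers (a 1 GB dyno, 8 cores, a 400 MB app):

```ruby
# Hypothetical sizing numbers, for illustration only.
dyno_ram_mb = 1024 # total dyno RAM
cores       = 8
app_rss_mb  = 400  # resident size of one worker process

memory_bound_workers = dyno_ram_mb / app_rss_mb # integer division => 2
workers = [cores, memory_bound_workers].min     # memory, not cores, is the limit here

puts workers # => 2
```

In practice, measure the app's resident size under load before picking a number, and leave headroom for memory growth between restarts.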
From the comments in config/puma.rb:

> Specifies the number of `workers` to boot in clustered mode. Workers are forked webserver processes. If using threads and workers together the concurrency of the application would be `max_threads` * `workers`. Workers do not work on JRuby or Windows (both of which do not support processes).
Puma's default options:

```ruby
:min_threads => 0,
:max_threads => 16,
:log_requests => false,
:debug => false,
:binds => ["tcp://#{DefaultTCPHost}:#{DefaultTCPPort}"],
:workers => 0,
:daemon => false,
:mode => :http,
:worker_timeout => DefaultWorkerTimeout,
:worker_boot_timeout => DefaultWorkerTimeout,
:worker_shutdown_timeout => DefaultWorkerShutdownTimeout,
:remote_address => :socket,
:tag => method(:infer_tag),
:environment => lambda { ENV['RACK_ENV'] || "development" },
:rackup => DefaultRackup,
:logger => STDOUT,
:persistent_timeout => Const::PERSISTENT_TIMEOUT
```
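A typical Heroku-style config/puma.rb overriding those defaults might look like the following sketch. The `WEB_CONCURRENCY` and `RAILS_MAX_THREADS` variable names are conventions, not requirements:

```ruby
# config/puma.rb (sketch; env var names are conventional assumptions)
workers Integer(ENV["WEB_CONCURRENCY"] || 2) # forked processes, sized as discussed above

threads_count = Integer(ENV["RAILS_MAX_THREADS"] || 5)
threads threads_count, threads_count # min and max threads per worker

preload_app! # load the app before forking so workers share memory via copy-on-write

port        ENV["PORT"] || 3000
environment ENV["RACK_ENV"] || "development"

on_worker_boot do
  # Re-establish database connections in each forked worker
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
```

`preload_app!` trades a shared memory footprint for the need to reconnect per-worker resources (like the database pool) in `on_worker_boot`.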
https://github.com/puma/puma/blob/master/lib/puma/configuration.rb#L173-L193
https://github.com/ruby-prof/ruby-prof
https://devcenter.heroku.com/articles/dyno-types
The issue with Puma is that, for example, a 1X dyno has 512 MB of RAM and 8 cores, so depending on the size of the Rails app only 1-2 Puma workers can run per dyno, despite the 8 available cores.
The number of cores available on a Heroku dyno is no longer published and is subject to change. You can find the current number of cores for your configuration using nproc:
```
$ heroku run bash --app crowdai-prd
Running bash on ⬢ crowdai-prd... up, run.9093 (Standard-1X)
~ $ nproc
8
```
Some references
https://github.com/puma/puma-heroku
http://julianee.com/rails-sidekiq-and-heroku/
http://stackoverflow.com/questions/8821864/config-assets-compile-true-in-rails-production-why-not