diff --git a/docs/static/config-details.asciidoc b/docs/static/config-details.asciidoc index 0ff63d6da86..226807b4e1a 100644 --- a/docs/static/config-details.asciidoc +++ b/docs/static/config-details.asciidoc @@ -49,9 +49,19 @@ JVM falls in the inclusive range of the two numbers * all other lines are rejected +[[memory-size]] +==== Setting the memory size + +The memory of the JVM executing {ls} can be divided in two zones: heap and off-heap memory. +In the heap refers to Java heap, which contains all the Java objects created by {ls} during its operation, see <> for +description on how to size it. +What's not part of the heap is named off-heap and consists of memory that can be used and controlled by {ls}, generally +thread stacks, direct memory and memory mapped pages, check <> for comprehensive descriptions. +In off-heap space there is some space which is used by JVM and contains all the data structures functional to the execution +of the virtual machine. This memory can't be controlled by {ls} and the settings are rarely customized. [[heap-size]] -==== Setting the JVM heap size +===== Setting the JVM heap size Here are some tips for adjusting the JVM heap size: @@ -77,15 +87,74 @@ process. info, see <>. // end::heap-size-tips[] - [[off-heap-size]] -==== Setting the off-heap size +===== Setting the off-heap size The operating system, persistent queue mmap pages, direct memory, and other processes require memory in addition to memory allocated to heap size. -Keep the overall memory requirements in mind when you allocate memory. +Internal JVM data structures, thread stacks, memory mapped files and direct memory for input/output (IO) operations are all parts of the off-heap JVM memory. +Memory mapped files are not part of the Logstash's process off-heap memory, but consume RAM when paging files from disk. +These mapped files speed up the access to Persistent Queues pages, a performance improvement - or trade off - to reduce expensive disk operations such as read, write, and seek. +Some network I/O operations also resort to in-process direct memory usage to avoid, for example, copying of buffers between network sockets. Input plugins such as Elastic Agent, Beats, TCP, and HTTP inputs, use direct memory. +The zone for Thread stacks contains the list of stack frames for each Java thread created by the JVM; each frame keeps the local arguments passed during method calls. +Read on <> if the size needs to be adapted to the processing needs. + +Plugins, depending on their type (inputs, filters, and outputs), have different thread models. +Every input plugin runs in its own thread and can potentially spawn others. For example, each JDBC input +plugin launches a scheduler thread. Netty based plugins like TCP, Beats or HTTP input spawn a thread pool with 2 * number_of_cores threads. +Output plugins may also start helper threads, such as a connection management thread for each +{es} output instance. +Every pipeline, also, has its own thread responsible to manage the pipeline lifecycle. + +To summarize, we have 3 categories of memory usage, where 2 can be limited by the JVM and the other relies on available, free memory: + +[cols="<,<,<",options="header",] +|===== +| Memory Type | Configured using | Used by +| JVM Heap | -Xmx | any normal object allocation +| JVM direct memory | -XX:MaxDirectMemorySize | beats, tcp and http inputs +| Native memory | N/A | Persistent Queue Pages, Thread Stacks +|===== + +Keep these memory requirements in mind as you calculate your ideal memory allocation. + +[[memory-size-calculation]] +===== Memory sizing + +Total JVM memory allocation must be estimated and is controlled indirectly using Java heap and direct memory settings. By default, a JVM's off-heap direct memory limit is the same as the heap size. Check out <>. -Consider setting `-XX:MaxDirectMemorySize` to half of the heap size. +Consider setting `-XX:MaxDirectMemorySize` to half of the heap size or any value that can accommodate the load you expect these plugins to handle. + +As you make your capacity calculations, keep in mind that the JVM can't consume the total amount of the host's memory available, +as the Operating System and other processes will require memory too. + +For a {ls} instance with persistent queue (PQ) enabled on multiple pipelines, we could +estimate memory consumption using: + +[source,text] +----- +pipelines number * (pipeline threads * stack size + 2 * PQ page size) + direct memory + Java heap +----- + +NOTE: Each Persistent Queue requires that at least head and tail pages are present and accessible in memory. +The default page size is 64 MB so each PQ requires at least 128 MB of heap memory, which can be a significant source +of memory consumption per pipeline. Note that the size of memory mapped file can't be limited with an upper bound. + +NOTE: Stack size is a setting that depends on the JVM used, but could be customized with `-Xss` setting. + +NOTE: Direct memory space by default is big as much as Java heap, but can be customized with the `-XX:MaxDirectMemorySize` setting. + +**Example** + +Consider a {ls} instance running 10 pipelines, with simple input and output plugins that doesn't start additional threads, +it has 1 pipelines thread, 1 input plugin thread and 12 workers, summing up to 14. +Keep in mind that, by default, JVM allocates direct memory equal to memory allocated for Java heap. + +The calculation results in: + +* native memory: 1.4Gb [derived from 10 * (14 * 1Mb + 128Mb)] +* direct memory: 4Gb +* Java heap: 4Gb [[stacks-size]]