Method Throttling #295

Merged


Collaborator

@lukaskabc lukaskabc commented Sep 11, 2024

This PR addresses performance issues from #287 and #285

Identified problems

  • vocabulary validation - it was executed on request from the frontend. Results were cached, but the whole cache was evicted on any modification of any vocabulary. Validation also requires synchronization because the validator is not thread-safe, so each thread handling a validation request was blocked until the previous validation finished (this probably also kept the app from running out of memory, which could happen if multiple validations were processed concurrently). Validation is a time- and resource-consuming task.
  • asynchronous term occurrence saving - Many individual tasks might get scheduled during occurrence saving, but it's not even guaranteed when they will be saved; it could get to the point where the app was doing nothing but catching up on occurrence saving.
  • text analysis - a similar problem as with validation, it's executed on term modification (or by a user on a file, for example); it is time and resource-consuming, and it results in async term occurrence saving. Text analysis of a term was executed before text analysis of all terms from the vocabulary, resulting in duplicated processing.
  • Single core deployment - TermIt is deployed in a single core environment, eliminating the asynchronous processing benefits.

New features

Throttle & Debounce

This PR introduces an option to throttle and debounce method calls.

method throttling & debouncing

Do not review linked test cases before reading about return type support and throttled futures

The goal of method throttling & debouncing is to execute tasks asynchronously on a fixed thread pool, merging frequent method calls into a single task execution that uses the data from the most recent call [test case]. Throttling ensures that if the task was not executed in the last X seconds, it is scheduled for immediate execution [test case]. Otherwise, its execution is delayed (debounced) so that it can be merged with potential future calls (this guarantees that when no further call comes, the task is executed with the data from the last call). Task execution also ensures that when a task is time-consuming and takes longer than the throttle interval, a new call to the throttled method does not result in concurrent execution of the same task [test case].
When a thread is already executing a throttled task, it ignores any further throttling and executes all methods synchronously [test case].
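
The throttle and debounce decision described above can be illustrated with a small, deterministic sketch. It is not the ThrottleAspect from this PR: the class `ThrottleSketch`, its methods, the string keys, and the choice to debounce by one full interval from the newest call are all assumptions made for illustration. Time is passed in explicitly and tasks run on the calling thread so that the behavior is easy to follow; the real implementation runs tasks on a thread pool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of throttle + debounce bookkeeping; not TermIt's ThrottleAspect. */
final class ThrottleSketch {
    private final long intervalMillis;
    private final Map<String, Long> lastRun = new HashMap<>();     // key -> time of last execution
    private final Map<String, Runnable> pending = new HashMap<>(); // key -> newest debounced task
    private final Map<String, Long> dueAt = new HashMap<>();       // key -> time the pending task becomes due

    ThrottleSketch(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** Registers a call at time {@code now}; returns true when the task ran immediately. */
    boolean call(String key, Runnable task, long now) {
        Long last = lastRun.get(key);
        if (last == null || now - last >= intervalMillis) {
            // Not executed within the interval: run right away (throttle).
            pending.remove(key);
            dueAt.remove(key);
            lastRun.put(key, now);
            task.run();
            return true;
        }
        // Executed recently: keep only the newest call and postpone it (debounce).
        pending.put(key, task);
        dueAt.put(key, now + intervalMillis);
        return false;
    }

    /** Runs every pending task whose delay has elapsed; returns how many ran. */
    int flush(long now) {
        List<Runnable> due = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = dueAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= now) {
                String key = entry.getKey();
                it.remove();
                due.add(pending.remove(key));
                lastRun.put(key, now);
            }
        }
        due.forEach(Runnable::run);
        return due.size();
    }
}
```

With a 1000 ms interval, a call at time 0 runs immediately, while calls at 100 and 200 are merged: only the task from the call at 200 runs once `flush` is invoked at or after 1200.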

Throttling & Debouncing is realized with the Throttle annotation which is handled by the Throttle aspect.

The aspect is configured using Spring AOP XML syntax to avoid utilizing AspectJ. Once AspectJ is removed from dependencies, it should be possible to replace the XML configuration with annotations. The aspect is disabled for the test profile.

Throttle annotation supports methods with void return type out of the box.
The whole method is, in that case, considered a task that should be throttled, and the method itself will be executed asynchronously.

There is also support for methods returning a Future. However, the concrete returned object MUST be a ThrottledFuture; otherwise, the aspect throws an appropriate exception on the method call (there is no way to safely check this on application start).
When a method returns a future (ThrottledFuture), the method itself is executed synchronously. This allows it to prepare the task that should be throttled and to provide a cached result, which the caller can acquire through the CacheableFuture interface before the future is resolved. The actual task and the cached result are then provided through the returned ThrottledFuture object.
An example can be seen in the updated result caching validator, where the validate method is executed synchronously by the caller thread. It checks the cache state and either returns an already resolved future when the cache is not dirty, or returns a future that wraps the time-consuming runValidation task and carries the cached result. The runValidation method is executed asynchronously.
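
The shape of such a validator can be sketched as follows. This is only an illustration under assumptions: `CachedFutureSketch` and `CachingValidatorSketch` are hypothetical stand-ins and do not reproduce the actual ThrottledFuture or CacheableFuture API from this PR. The sketch is single-threaded, and the deferred task is started by an explicit `run()` call, which in the real setup would be the job of the throttling scheduler.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical future that carries a deferred task plus a cached result. */
final class CachedFutureSketch<T> {
    private final T cached;            // possibly stale result, may be null
    private final Supplier<T> task;    // null when the future is already resolved
    private final CompletableFuture<T> delegate = new CompletableFuture<>();

    private CachedFutureSketch(T cached, Supplier<T> task) {
        this.cached = cached;
        this.task = task;
    }

    /** Already resolved future, used when the cache is clean. */
    static <T> CachedFutureSketch<T> done(T result) {
        CachedFutureSketch<T> future = new CachedFutureSketch<>(result, null);
        future.delegate.complete(result);
        return future;
    }

    /** Unresolved future wrapping the expensive task and exposing the cached value. */
    static <T> CachedFutureSketch<T> of(Supplier<T> task, T cached) {
        return new CachedFutureSketch<>(cached, task);
    }

    Optional<T> getCachedResult() { return Optional.ofNullable(cached); }

    boolean isDone() { return delegate.isDone(); }

    /** Stand-in for the scheduler executing the task later. */
    void run() {
        if (task != null && !delegate.isDone()) {
            delegate.complete(task.get());
        }
    }

    T join() { return delegate.join(); }
}

/** Hypothetical validator: validate() is cheap, runValidation() is the expensive part. */
final class CachingValidatorSketch {
    private String cachedReport;   // last known result
    private boolean dirty = true;  // true when the cached result may be outdated
    private int runs;

    void markDirty() { dirty = true; }

    CachedFutureSketch<String> validate(String vocabulary) {
        if (!dirty) {
            return CachedFutureSketch.done(cachedReport);
        }
        return CachedFutureSketch.of(() -> runValidation(vocabulary), cachedReport);
    }

    private String runValidation(String vocabulary) {
        runs++;
        cachedReport = "report for " + vocabulary + " #" + runs;
        dirty = false;
        return cachedReport;
    }
}
```

A caller can show `getCachedResult()` right away and wait for the fresh result only when it actually needs it.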

The throttled future also implements a chainable future interface, which allows chaining a task that is executed once the future is resolved.
This, for example, allows the WebSocket controller to respond with the cached result and register a task that sends the new result to the client once new data are available. This prevents the thread from being blocked while awaiting the future's resolution.
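
The same respond-now, push-later pattern can be shown with the JDK's `CompletableFuture.thenAccept`, used here as a stand-in for the chainable future interface from this PR (whose exact methods are not shown in this description). The class `ChainingSketch` and the `sent` list, which plays the role of messages pushed to a WebSocket client, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical controller-like class; `sent` stands in for messages pushed to a client. */
final class ChainingSketch {
    final List<String> sent = new ArrayList<>();

    /** Replies with the cached value immediately and registers a callback for the fresh one. */
    void respond(String cached, CompletableFuture<String> fresh) {
        if (cached != null) {
            sent.add("cached:" + cached);
        }
        // The calling thread does not block here; the callback runs once `fresh` completes.
        fresh.thenAccept(result -> sent.add("fresh:" + result));
    }
}
```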

Scheduling of throttled futures also supports their cancellation based on their group.
This, for example, allows cancelling a scheduled task that analyzes the definition of a single term when an analysis of all terms from the vocabulary is scheduled.
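
One way to picture group-based cancellation is a registry of pending tasks keyed by a hierarchical group name. The sketch below is an assumption about how such grouping could work, not the mechanism implemented in this PR: the class `GroupCancellationSketch`, the `/`-separated group names, and the rule that a parent group supersedes nested groups are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical registry of pending tasks keyed by a "/"-separated group name. */
final class GroupCancellationSketch {
    private final Map<String, Runnable> pending = new LinkedHashMap<>();

    /**
     * Schedules a task under the given group.
     * Returns false when a broader (parent) group is already pending, so the task is not needed.
     * Pending tasks in nested groups are cancelled because the broader task covers them.
     */
    boolean schedule(String group, Runnable task) {
        for (String existing : pending.keySet()) {
            if (group.startsWith(existing + "/")) {
                return false;
            }
        }
        pending.keySet().removeIf(existing -> existing.startsWith(group + "/"));
        pending.put(group, task);
        return true;
    }

    /** Stand-in for the scheduler executing everything that is still pending. */
    void runAll() {
        new ArrayList<>(pending.values()).forEach(Runnable::run);
        pending.clear();
    }
}
```

For instance, scheduling `vocabA` after `vocabA/term1` drops the single-term task, and a later `vocabA/term2` is rejected while `vocabA` is still pending.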

Disadvantages

  • a throttled method can't use the caller's transaction - when the throttled method is called during a transaction, the throttled task is executed asynchronously without access to the original transaction. However, when the Transactional annotation is present on the same method as the Throttle annotation, the task is executed in a transactional context.

Unfortunately, I was not able to make detection of an active transaction context work. It might be a feature missing in JOPA (and TransactionSynchronizationManager), or I might just be missing something; either way, the explicit Transactional annotation is required for the transaction to work.

Long-running tasks

As the application will now run some time-consuming tasks in the background, it will push the status of such tasks to the clients via WebSocket, so that information about the activity can be displayed to the user.

Currently, it's only possible to name the throttled method by a constant. So, the user will know that there is a validation in progress but won't know which vocabulary is being validated. This might be changed by adding a new parameter for additional information (in addition to the name parameter).
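
A minimal sketch of such status tracking, assuming tasks are identified only by a constant name as described above. The class `TaskStatusSketch` and its `publisher` callback (a stand-in for the WebSocket push) are hypothetical and do not mirror the classes added in this PR.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

/** Hypothetical tracker that publishes the names of currently running tasks. */
final class TaskStatusSketch {
    private final Set<String> running = new TreeSet<>();
    private final Consumer<Set<String>> publisher; // stand-in for a WebSocket push

    TaskStatusSketch(Consumer<Set<String>> publisher) {
        this.publisher = publisher;
    }

    /** Runs the task and publishes the running set before it starts and after it ends. */
    void track(String name, Runnable task) {
        update(name, true);
        try {
            task.run();
        } finally {
            update(name, false);
        }
    }

    private synchronized void update(String name, boolean started) {
        if (started) {
            running.add(name);
        } else {
            running.remove(name);
        }
        publisher.accept(new TreeSet<>(running)); // publish a snapshot, not the live set
    }
}
```

Because the set holds only names, a client learns that a validation is running but not which vocabulary it concerns, which matches the limitation noted above.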

Changes

  • Removed periodic task for clearing context scheduled for removal (in case of term occurrence in definitions, removal is fast and will be made synchronously, for files see next point)
  • Concurrent saving and resolving of term occurrences for file analysis. A second thread will be started, one thread will start resolving occurrences, while the second thread will execute the removal of all current ones; after successful removal, it will start to save resolved occurrences from the first thread.
  • Vocabulary validation will be performed asynchronously using throttling.
  • New vocabulary validation results will be pushed to clients via WebSocket.
  • Text analysis will be performed asynchronously using throttling.
  • Clients will be notified about the end of text analysis
  • The frontend is no longer in control when validation and text analysis are performed (unless triggered by an explicit button by a user). The backend will automatically execute these tasks when appropriate (when vocabulary is modified, etc.).
  • The result caching validator was rewritten to provide dirty cached results through throttling and only mark the cache as dirty instead of deleting it. It will also mark only related cache as dirty and won't touch other non-related entries.
  • Removed the option to disable vocabulary analysis as discussed in a meeting
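
The concurrent resolving and removal of occurrences mentioned in the list above can be approximated with `CompletableFuture`. This is a simplified sketch under assumptions: `OccurrencePipelineSketch`, the in-memory `store`, and the whitespace-splitting `resolve` method are hypothetical, and the sketch waits for the whole resolution to finish before saving rather than saving occurrences as they arrive.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical pipeline: resolve new occurrences while old ones are being removed. */
final class OccurrencePipelineSketch {
    // In-memory stand-in for persisted term occurrences.
    final List<String> store = new CopyOnWriteArrayList<>(List.of("old-1", "old-2"));

    /** Stand-in for occurrence resolution; simply splits the text on whitespace. */
    List<String> resolve(String text) {
        return List.of(text.split("\\s+"));
    }

    void reanalyze(String text) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // First thread: resolve occurrences from the analyzed text.
            CompletableFuture<List<String>> resolved =
                    CompletableFuture.supplyAsync(() -> resolve(text), pool);
            // Second thread: remove the existing occurrences.
            CompletableFuture<Void> removed = CompletableFuture.runAsync(store::clear, pool);
            // Save only after the removal succeeded; a failed removal skips the save.
            removed.thenCombine(resolved, (ignored, occurrences) -> store.addAll(occurrences))
                   .join();
        } finally {
            pool.shutdown();
        }
    }
}
```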

Requirements and notes

  • It is expected that TermIt will be deployed in an environment with at least two cores available to benefit from asynchronous processing (more cores would, of course, be beneficial, as we need to handle HTTP, WebSocket, database and background tasks).

  • The annotation is not prepared for AOT; a reflection processor registering runtime hints will probably need to be created if support for AOT is added.

…ulary terms on term edit."

This reverts commit 8c9d086.
…tion event, fix event names, move business vocabulary service test
Contributor

@ledsoft ledsoft left a comment


Haven't tried it yet, so these are purely based on reviewing the code here.

A bunch of inline comments, plus these general remarks:

  • Replace JetBrains annotations (particularly NotNull) with their Spring equivalents (NonNull). I stopped pointing that out after a couple of files. We don't use Nullable; it is assumed by default. I also don't see any reason for using these annotations on fields; use them only on public methods
  • There are several files containing formatting changes only. This makes the already large PR even larger
  • I also have a question regarding the necessity of using an XML-configured aspect, but let's discuss that at our meeting

Contributor

@ledsoft ledsoft left a comment


LGTM, works nicely. I added a small fix for situations where terms are related to each other, as it was throwing an exception due to the analysis of the whole vocabulary running in the same transaction.

@ledsoft ledsoft merged commit dd6aff5 into kbss-cvut:development Sep 13, 2024
2 checks passed
ledsoft pushed a commit that referenced this pull request Sep 13, 2024
ledsoft pushed a commit that referenced this pull request Sep 13, 2024
…er interface and fix faulty throttle aspect test
ledsoft pushed a commit that referenced this pull request Sep 13, 2024
ledsoft pushed a commit that referenced this pull request Sep 13, 2024
ledsoft pushed a commit that referenced this pull request Sep 13, 2024
…olve silenced exception from throttled tasks
Labels
performance Performance issue
Development

Successfully merging this pull request may close these issues.

  • Performance of vocabulary validation
  • Performance when multiple users edit the same vocabulary
2 participants