-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
initialize processor on server boot #2059
base: master
Are you sure you want to change the base?
Conversation
I did a quick manual check. On the master branch the test request lasts around 1.7seconds on my laptop. While this branch delivers the same request in 0.24seconds. A nice gain I think. |
When running a local test on my workstation, your implementation of the singleton pattern processes the JSON request (CH452802850772) in 4.37 seconds, down from 6.26 seconds. Looks good to me. |
@lopo977 Thank you for testing. Did you run the request multiple times? For me, the first was a bit longer, the following were much faster than. Can you confirm that? |
I ran the same request multiple times. The first was longer also for me. I confirm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tested this branch within our local dev environment. Overall I discovered a 9% improvment over 3 different parcels with each 20 requests. However, the improvement is significantly smaller, the larger the parcel and the associated data get. This makes sense.
However, I discovered in general a large variety in response time (this happens also on the master branch).
Even within a small parcel the response time varies up to 2 seconds between the same request. Does anyone else also finds this behaviour on their systems? I did not connect a db on my machine but on our intranet. The network should actually not be a large factor here.
I also am under the impression our extract creation times do vary for the same pdf even between the 2nd and following extracts. I plan to do some more structured testing next week. |
I finally found time to do some tests:
I made all my tests on my development machine (Win10, no Docker, DB remote (but staging DB, so not a lot of load to be expected)). All values were taken from the Pyramid Debug Toolbar. All requested extracts were XML extracts. All in all a very significant gain! |
I compared the branch of this PR with the master branch on our dev instance. The request of a XML extract is faster using the branch of this PR. The gain is in the order as outlined by @peterschaer. That's great! We use the oereblex source for many oereb topics. In our extended oereblex sources we handle the indexing of the legal provisions which are retrieved for the relevant public law restrictions at each request from oereblex. The index is required for the extract and shall be in the range -1000 to 1000. At least in our instance of oereblex, legal provisions do not have an index (only laws have). When all the sources are initialised only at server start up, we will have to change the reset of this index. That should be faesible. Just important to note that request related information attached to sources may have to be handled differently with this changes. |
Thank you all for testing! |
With the new approach, oereblex links are cached from server start-up ( pyramid_oereb/pyramid_oereb/contrib/data_sources/oereblex/sources/plr_oereblex.py Line 99 in 176b4b3
Shall we change the caching behavior? Any opinions? (e.g. @jwkaltz?) |
@michmuel We already have a caching of oereblex links (see #993), this fills the links as needed. I don't think we should cache all links on server start-up. |
Let me know if I do not understand correctly, but I think, today, it works like this:
Changes / corrections in Oereblex are active at least for the next request. With this pull request the behavior changes as the oereblex classes are initialised only on server start:
Changes / corrections in Oereblex are active immediately for lexlinks that have not been requested since server start and after next server start for lexlinks that have been requested since server start. |
@michmuel You were right with your assumptions. It took me a while to follow the path and find a maybe nice solution without changing too much in the code. However, I would be happy if you could check this out first on some real world data. Is that possible? |
Thanks @vvmruder Looking into the logs I found that a specific geolink is retrieved once for every request. If more than one plr with the same geolink is found for a real estate, the geolink is not retrieved again. However, is it robust in an uwsgi-setting with many threads? Can you provide some reasoning? |
@michmuel You made a valid point. I fixed it now. You may have a look if that is how you expected it. |
Fixes #2056
As discussed this PR changes pyramid_oereb behaviour. It loads Processor and therefore all related theme configuration on server boot time. Before the theme configuration was initialized on every process. This should provide a recognizable gain in performance.
pyramid_oereb will fail to start if Processor could not be initialized for any reason.
TODO: