initialize processor on server boot #2059

vvmruder · 2024-10-28T12:59:08Z

Fixes #2056

As discussed this PR changes pyramid_oereb behaviour. It loads Processor and therefore all related theme configuration on server boot time. Before the theme configuration was initialized on every process. This should provide a recognizable gain in performance.

pyramid_oereb will fail to start if Processor could not be initialized for any reason.

TODO:

fix tests
test in example infrastructures
- BS ✅
- BL
- NE ✅
- TI ✅
- BE ✅

vvmruder · 2024-10-28T13:01:41Z

@michmuel @voisardf @svamaa : I think this is worth a try now in your infra. Please let me know if you need anything.

vvmruder · 2024-10-28T17:09:30Z

I did a quick manual check. On the master branch the test request lasts around 1.7seconds on my laptop. While this branch delivers the same request in 0.24seconds. A nice gain I think.

lopo977 · 2024-10-29T07:05:48Z

When running a local test on my workstation, your implementation of the singleton pattern processes the JSON request (CH452802850772) in 4.37 seconds, down from 6.26 seconds. Looks good to me.

vvmruder · 2024-10-29T07:30:46Z

When running a local test on my workstation, your implementation of the singleton pattern processes the JSON request (CH452802850772) in 4.37 seconds, down from 6.26 seconds. Looks good to me.

@lopo977 Thank you for testing. Did you run the request multiple times? For me, the first was a bit longer, the following were much faster than. Can you confirm that?

lopo977 · 2024-10-29T07:43:27Z

When running a local test on my workstation, your implementation of the singleton pattern processes the JSON request (CH452802850772) in 4.37 seconds, down from 6.26 seconds. Looks good to me.

@lopo977 Thank you for testing. Did you run the request multiple times? For me, the first was a bit longer, the following were much faster than. Can you confirm that?

I ran the same request multiple times. The first was longer also for me. I confirm.

svamaa

I have tested this branch within our local dev environment. Overall I discovered a 9% improvment over 3 different parcels with each 20 requests. However, the improvement is significantly smaller, the larger the parcel and the associated data get. This makes sense.

However, I discovered in general a large variety in response time (this happens also on the master branch).
Even within a small parcel the response time varies up to 2 seconds between the same request. Does anyone else also finds this behaviour on their systems? I did not connect a db on my machine but on our intranet. The network should actually not be a large factor here.

voisardf · 2024-11-01T10:21:01Z

I also am under the impression our extract creation times do vary for the same pdf even between the 2nd and following extracts. I plan to do some more structured testing next week.

peterschaer · 2024-11-15T15:49:54Z

I finally found time to do some tests:

EGRID	Description	No. of DB-Queries master	No. of DB-Queries PR	Processing time master	Processing time PR
CH523822463517	parcel with only one PLR	198	157	4.8s	0.7s
CH743546874207	parcel with two topics	208	167	4.9s	1.2s
CH762046353030	parcel with six topics	261	220	9.1s	4.7s

I made all my tests on my development machine (Win10, no Docker, DB remote (but staging DB, so not a lot of load to be expected)). All values were taken from the Pyramid Debug Toolbar. All requested extracts were XML extracts.

All in all a very significant gain!

michmuel · 2024-11-19T10:48:51Z

I compared the branch of this PR with the master branch on our dev instance. The request of a XML extract is faster using the branch of this PR. The gain is in the order as outlined by @peterschaer. That's great!

We use the oereblex source for many oereb topics. In our extended oereblex sources we handle the indexing of the legal provisions which are retrieved for the relevant public law restrictions at each request from oereblex. The index is required for the extract and shall be in the range -1000 to 1000. At least in our instance of oereblex, legal provisions do not have an index (only laws have). When all the sources are initialised only at server start up, we will have to change the reset of this index. That should be faesible.

Just important to note that request related information attached to sources may have to be handled differently with this changes.

svamaa · 2024-11-19T13:35:40Z

Thank you all for testing!
@vvmruder, I think you can start implementing the testing.

michmuel · 2024-11-22T08:39:24Z

With the new approach, oereblex links are cached from server start-up (

pyramid_oereb/pyramid_oereb/contrib/data_sources/oereblex/sources/plr_oereblex.py

Line 99 in 176b4b3

identifier = '{}{}{}'.format(geolink, law_status.code, params.language)

). This means that changes in oereblex for links that have been invoked by pyramid oereb will not take effect in the app till server restart.

Shall we change the caching behavior? Any opinions? (e.g. @jwkaltz?)

jwkaltz · 2024-11-22T09:27:54Z

With the new approach, oereblex links are cached from server start-up (

pyramid_oereb/pyramid_oereb/contrib/data_sources/oereblex/sources/plr_oereblex.py

Line 99 in 176b4b3

identifier = '{}{}{}'.format(geolink, law_status.code, params.language)

). This means that changes in oereblex for links that have been invoked by pyramid oereb will not take effect in the app till server restart.
Shall we change the caching behavior? Any opinions? (e.g. @jwkaltz?)

@michmuel We already have a caching of oereblex links (see #993), this fills the links as needed. I don't think we should cache all links on server start-up.
Maybe I misunderstand?

michmuel · 2024-11-22T10:00:09Z

With the new approach, oereblex links are cached from server start-up (

pyramid_oereb/pyramid_oereb/contrib/data_sources/oereblex/sources/plr_oereblex.py

Line 99 in 176b4b3

identifier = '{}{}{}'.format(geolink, law_status.code, params.language)

). This means that changes in oereblex for links that have been invoked by pyramid oereb will not take effect in the app till server restart.
Shall we change the caching behavior? Any opinions? (e.g. @jwkaltz?)

@michmuel We already have a caching of oereblex links (see #993), this fills the links as needed. I don't think we should cache all links on server start-up. Maybe I misunderstand?

Let me know if I do not understand correctly, but I think, today, it works like this:

Server start
Request for parcel XY -> get public law restrictions -> download corresponding oereblex xmls (different plrs may have the same lexlinks but every lexlink is downloaded only once due to caching)
Request for parcel XY again -> get public law restrictions -> download corresponding oereblex xmls (different plrs may have the same lexlinks but every lexlink is downloaded only once due to caching)

Changes / corrections in Oereblex are active at least for the next request.

With this pull request the behavior changes as the oereblex classes are initialised only on server start:

Server start
Request for parcel XY -> get public law restrictions -> download corresponding oereblex xmls (different plrs may have the same lexlinks but every lexlink is downloaded only once due to caching)
Request for parcel XY again -> get public law restrictions -> no lexlink is downloaded because all lexlinks are cached

Changes / corrections in Oereblex are active immediately for lexlinks that have not been requested since server start and after next server start for lexlinks that have been requested since server start.

…itialize_things_once

vvmruder · 2024-12-11T08:00:53Z

@michmuel You were right with your assumptions.

It took me a while to follow the path and find a maybe nice solution without changing too much in the code. However, I would be happy if you could check this out first on some real world data. Is that possible?

michmuel · 2024-12-12T13:33:16Z

@michmuel You were right with your assumptions.

It took me a while to follow the path and find a maybe nice solution without changing too much in the code. However, I would be happy if you could check this out first on some real world data. Is that possible?

Thanks @vvmruder

Looking into the logs I found that a specific geolink is retrieved once for every request. If more than one plr with the same geolink is found for a real estate, the geolink is not retrieved again.

However, is it robust in an uwsgi-setting with many threads? Can you provide some reasoning?

vvmruder · 2024-12-16T13:18:18Z

@michmuel You made a valid point. I fixed it now. You may have a look if that is how you expected it.

initialize processor on server boot

64288b1

vvmruder requested a review from peterschaer October 28, 2024 12:59

vvmruder assigned voisardf, michmuel and svamaa Oct 28, 2024

vvmruder requested review from voisardf, michmuel and svamaa October 28, 2024 13:00

vvmruder assigned vvmruder and unassigned vvmruder, voisardf, michmuel and svamaa Oct 28, 2024

vvmruder added the enhancement label Oct 28, 2024

svamaa reviewed Oct 31, 2024

View reviewed changes

vvmruder added 2 commits December 11, 2024 08:57

fix oereblex caching issue

70fa4f2

Merge remote-tracking branch 'origin/master' into improvement/2056_in…

c820dc1

…itialize_things_once

fix lint issues

b820592

make oereblex cache thred safe

7135e15

fix lint issue

471706d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

initialize processor on server boot #2059

initialize processor on server boot #2059

vvmruder commented Oct 28, 2024 •

edited

Loading

vvmruder commented Oct 28, 2024

vvmruder commented Oct 28, 2024

lopo977 commented Oct 29, 2024 •

edited

Loading

vvmruder commented Oct 29, 2024

lopo977 commented Oct 29, 2024 •

edited

Loading

svamaa left a comment

voisardf commented Nov 1, 2024

peterschaer commented Nov 15, 2024

michmuel commented Nov 19, 2024

svamaa commented Nov 19, 2024

michmuel commented Nov 22, 2024

jwkaltz commented Nov 22, 2024

michmuel commented Nov 22, 2024

vvmruder commented Dec 11, 2024

michmuel commented Dec 12, 2024

vvmruder commented Dec 16, 2024

initialize processor on server boot #2059

Are you sure you want to change the base?

initialize processor on server boot #2059

Conversation

vvmruder commented Oct 28, 2024 • edited Loading

vvmruder commented Oct 28, 2024

vvmruder commented Oct 28, 2024

lopo977 commented Oct 29, 2024 • edited Loading

vvmruder commented Oct 29, 2024

lopo977 commented Oct 29, 2024 • edited Loading

svamaa left a comment

Choose a reason for hiding this comment

voisardf commented Nov 1, 2024

peterschaer commented Nov 15, 2024

michmuel commented Nov 19, 2024

svamaa commented Nov 19, 2024

michmuel commented Nov 22, 2024

jwkaltz commented Nov 22, 2024

michmuel commented Nov 22, 2024

vvmruder commented Dec 11, 2024

michmuel commented Dec 12, 2024

vvmruder commented Dec 16, 2024

vvmruder commented Oct 28, 2024 •

edited

Loading

lopo977 commented Oct 29, 2024 •

edited

Loading

lopo977 commented Oct 29, 2024 •

edited

Loading