Performance / scalability issues #1927
Replies: 3 comments
-
There are definitely areas where we can improve performance, especially around the queries. However, scalability depends on many factors, and the most important is probably the hardware you are running on. Obviously a sharded, horizontally scaled array of database servers with plenty of RAM will perform better, and it is difficult to create a solution that handles every case, but even vertically scaling a single database should provide some improvement. Usually scaling is outside the scope of the application because it is largely a hardware concern. There are also potential issues with Prisma itself. For instance, if you are using pgbouncer, you need to add extra parameters to your database connection string. That being said, we are looking at some additional solutions like ClickHouse and TimescaleDB.
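For example, a connection string might look like this (hypothetical host and credentials; pgbouncer=true is the parameter the Prisma docs describe for pgBouncer setups, and connection_limit is optional):

```
DATABASE_URL=postgresql://umami:secret@pgbouncer-host:6432/umami?pgbouncer=true&connection_limit=10
```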
-
Update: we are currently working on ClickHouse support.
-
Sounds interesting! Let me know if I can help by being a beta tester.
-
First of all, thanks for providing this excellent tool as open source!
I have been using it for about half a year now, but I'm starting to run into more and more performance issues.
To give some context on the amount of data:
I have multiple websites in the database, but 95%+ of the data is from a single website.
I'm running umami 1.33.1 on a Linode VM with SSD and dedicated 4 core CPU option. Both umami and postgres are running in docker with the docker compose file provided with umami.
I noticed a lot of "Something went wrong" messages in the dashboards, especially if you pick longer periods. Currently, last 7 days works, last 30 days starts to fail on a few of the items, and beyond 30 days most things start to fail.
I checked the server and saw a high load from the postgres database. The connections seem to time out, so I tried adding
pool_timeout=60
to the database connection string. This resulted in a similar situation. With long periods such as all-time, it crashed with an out-of-memory error (using over 2 GB of memory on the umami container). This indicates that, on top of the database query performance issues, scaling also seems to be limited by memory, as part of the data is probably processed in memory in a non-streaming way?
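To illustrate what I mean by non-streaming processing, here is a minimal sketch (hypothetical code, not umami's actual implementation; the Prisma model and column names are assumptions based on the tables discussed below):

```typescript
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Memory-heavy pattern: every pageview row for the period is materialized in
// the Node.js heap before being reduced, so memory grows with traffic volume.
async function pageviewsPerDayInMemory(websiteId: number, start: Date, end: Date) {
  const rows = await prisma.pageview.findMany({
    where: { website_id: websiteId, created_at: { gte: start, lte: end } },
    select: { created_at: true },
  });
  const counts = new Map<string, number>();
  for (const row of rows) {
    const day = row.created_at.toISOString().slice(0, 10);
    counts.set(day, (counts.get(day) ?? 0) + 1);
  }
  return counts;
}

// Database-side aggregation: only one row per day ever reaches the Node.js
// process, so memory use stays flat no matter how many pageviews there are.
async function pageviewsPerDayInDatabase(websiteId: number, start: Date, end: Date) {
  return prisma.$queryRaw`
    select date_trunc('day', created_at) as day, count(*) as pageviews
    from pageview
    where website_id = ${websiteId}
      and created_at between ${start} and ${end}
    group by 1
    order by 1
  `;
}
```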
I took a quick look at the running queries; the first one I saw is:
This query has an inefficient and seemingly superfluous join with the pageview table; removing it makes the query over 10 times faster.
I assume this is the matching nodejs code:
The inefficient IN join seems to be there to allow a generic way of filtering the metrics. Perhaps using a separate query without the join on pageview would be better when no filters are provided?
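To make that concrete, here is a rough sketch of what I have in mind (simplified and hypothetical, not the actual umami code; it reuses the prisma client from the sketch above):

```typescript
// Hypothetical sketch: only add the pageview subquery when a url filter is
// actually set, and skip it entirely for the common unfiltered case.
function getBrowserMetrics(websiteId: number, start: Date, end: Date, url?: string) {
  if (url) {
    // Filtered case: restrict sessions to those that viewed the given url.
    return prisma.$queryRaw`
      select browser, count(*) as count
      from session
      where website_id = ${websiteId}
        and created_at between ${start} and ${end}
        and session_id in (
          select session_id from pageview
          where website_id = ${websiteId}
            and created_at between ${start} and ${end}
            and url = ${url}
        )
      group by 1
      order by 2 desc
    `;
  }
  // Unfiltered case: no reference to pageview at all; this is the variant
  // that ran more than 10 times faster when I tested it by hand.
  return prisma.$queryRaw`
    select browser, count(*) as count
    from session
    where website_id = ${websiteId}
      and created_at between ${start} and ${end}
    group by 1
    order by 2 desc
  `;
}
```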
In any case, this is just the first query I looked at. There seem to be many more queries that don't perform well.
I did see a branch brian/um-13-query-optimizations. I took one of the optimizations from that branch, changing
to:
date_trunc('${unit}', ${field}),
and applied that to one of the other slow queries, which seemed to cut its runtime by 50% or so. So applying that optimization throughout the entire application would certainly be helpful.
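For reference, the shape of the change is roughly the following (illustrative only; I'm paraphrasing the expressions rather than quoting the exact code from the branch):

```typescript
// Illustrative only: not the exact expressions from the umami codebase.
// Grouping by a string-formatted date runs a formatting function on every row
// and produces a text group key:
function getDateQueryOld(field: string, unit: string): string {
  return `to_char(date_trunc('${unit}', ${field}), 'YYYY-MM-DD')`;
}

// Grouping directly by the truncated timestamp skips the per-row formatting
// and lets Postgres group on a native timestamp instead of a string:
function getDateQueryNew(field: string, unit: string): string {
  return `date_trunc('${unit}', ${field})`;
}
```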
But even if all the queries were faster, wouldn't there still be the out-of-memory issue? I didn't manage to trace that down to a specific piece of code, though.
Do you plan to make umami scale to larger volumes of pageviews in the future?
Just a thought: umami has charts based on hours and on days. The day-based charts are currently calculated on demand from all the individual sessions/pageviews. For high-traffic websites this can easily mean hundreds of thousands of records to process for each day in the report. Perhaps it would be an idea to periodically create/update aggregated data at a daily (and perhaps also hourly) granularity, persisted in the database? Reports could then be generated much faster.
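A rough sketch of what I mean (table and column names are made up, not a proposal for the actual schema; it assumes a unique constraint on website_id + day):

```typescript
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Hypothetical rollup: refresh one aggregate row per website per day, e.g.
// from a cron job or right after the day ends. Day-granularity charts could
// then read a handful of pre-aggregated rows instead of scanning every
// individual pageview for the period.
async function refreshDailyPageviewStats(day: Date) {
  await prisma.$executeRaw`
    insert into pageview_daily_stats (website_id, day, pageviews, sessions)
    select website_id,
           date_trunc('day', created_at) as day,
           count(*) as pageviews,
           count(distinct session_id) as sessions
    from pageview
    where created_at >= date_trunc('day', ${day}::timestamptz)
      and created_at <  date_trunc('day', ${day}::timestamptz) + interval '1 day'
    group by 1, 2
    on conflict (website_id, day)
    do update set pageviews = excluded.pageviews,
                  sessions  = excluded.sessions
  `;
}
```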
I'm not very familiar with NodeJS, but I am proficient in JavaScript (I'm an experienced Java / TypeScript developer). I'm willing to help.