Yes, I have searched for similar issues on GitHub and found none.
What did you do?
We are encountering a deadlock error in the Evolution API during query execution. The issue arises due to concurrent updates on certain database records, as shown in the error logs below:
ConnectorError(ConnectorError { user_facing_error: None, kind: QueryError(PostgresError { code: "40P01", message: "deadlock detected", severity: "ERROR", detail: Some("Process 4756 waits for ShareLock on transaction 1029970; blocked by process 4745. Process 4745 waits for ShareLock on transaction 1029972; blocked by process 4744. Process 4744 waits for ShareLock on transaction 1029973; blocked by process 4756."), column: None, hint: Some("See server log for query details.") }), transient: false })
The deadlock occurs when multiple processes each wait for a lock held by another process's transaction, forming a circular wait.
Temporary Solution
To mitigate the issue, we implemented a trigger at the database level to reduce concurrent updates. The trigger restricts frequent updates to certain records based on the updatedAt field, allowing updates only if:
Fields other than updatedAt are modified, or
The last update occurred more than 5 minutes ago.
Here is the trigger function applied:
```sql
CREATE OR REPLACE FUNCTION public.prevent_frequent_updates()
    RETURNS trigger
    LANGUAGE 'plpgsql'
    COST 100 VOLATILE NOT LEAKPROOF
AS $BODY$
BEGIN
    -- Verifies if ONLY the "updatedAt" field is being updated
    IF (NEW."remoteJid" IS DISTINCT FROM OLD."remoteJid"
        OR NEW."labels" IS DISTINCT FROM OLD."labels"
        OR NEW."name" IS DISTINCT FROM OLD."name"
        OR NEW."unreadMessages" IS DISTINCT FROM OLD."unreadMessages"
        OR NEW."instanceId" IS DISTINCT FROM OLD."instanceId") THEN
        -- Allows the update if any field other than "updatedAt" is modified
        RETURN NEW;
    END IF;
    -- Checks if the record was updated within the last 5 minutes
    IF (NEW."updatedAt" <= OLD."updatedAt" + INTERVAL '5 minutes') THEN
        -- Returns the current state without applying the update
        RETURN OLD;
    END IF;
    -- Allows the update
    RETURN NEW;
END;
$BODY$;

ALTER FUNCTION public.prevent_frequent_updates()
    OWNER TO postgres;
```
What did you expect?
Suggested Solution
A more robust solution would involve enqueuing updates targeting the same remoteJid to prevent concurrent updates on the same row in the chat table. This approach would serialize updates to a specific remoteJid, avoiding transaction contention entirely.
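As an illustration of that application-level queuing, here is a minimal sketch (not the Evolution API's actual code) of a per-remoteJid promise chain in TypeScript with Prisma. The `chat` model and the field names are taken from this issue and are assumptions about the real schema.

```typescript
// Minimal sketch of per-remoteJid update serialization. Not Evolution API code:
// the `chat` model and the field names below come from this issue and may not
// match the real Prisma schema.
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Fields mentioned in the trigger above; adjust to the actual Chat model.
type ChatUpdate = { name?: string; unreadMessages?: number; updatedAt?: Date };

// One promise chain per remoteJid: each update waits for the previous one.
const chains = new Map<string, Promise<void>>();

function enqueueChatUpdate(remoteJid: string, data: ChatUpdate): Promise<number> {
  const previous = chains.get(remoteJid) ?? Promise.resolve();

  // Run this update only after every earlier update for the same remoteJid.
  const run = previous.then(() =>
    prisma.chat
      .updateMany({ where: { remoteJid }, data })
      .then((result) => result.count),
  );

  // Keep a never-rejecting tail in the map and clean it up once this update
  // settles, provided no newer update has been queued in the meantime.
  const tail: Promise<void> = run
    .then(() => undefined, () => undefined)
    .then(() => {
      if (chains.get(remoteJid) === tail) chains.delete(remoteJid);
    });
  chains.set(remoteJid, tail);

  return run;
}
```

Because every update for a given remoteJid is chained onto the previous one, at most one transaction touches that chat row at a time, which removes the circular wait.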
Steps to Reproduce
Execute multiple concurrent updates on records with overlapping lock dependencies (a reproduction sketch follows these steps).
Observe deadlock errors in the API logs.
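For reference, the reproduction could be scripted roughly as follows. This is a hypothetical helper, not part of the Evolution API; the `chat` model name is an assumption, and the point is only to create overlapping lock orders across concurrent transactions.

```typescript
// Hypothetical reproduction script (not part of the Evolution API): fire many
// concurrent transactions that update the same chat rows in different orders,
// which produces the overlapping lock dependencies behind the circular wait.
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

async function reproduceDeadlock(remoteJids: string[]): Promise<void> {
  const tasks = Array.from({ length: 20 }, (_, i) => {
    // Alternate the update order between tasks so their row locks interleave.
    const ordered = i % 2 === 0 ? remoteJids : [...remoteJids].reverse();
    return prisma.$transaction(
      ordered.map((remoteJid) =>
        prisma.chat.updateMany({
          where: { remoteJid },
          data: { updatedAt: new Date() },
        }),
      ),
    );
  });

  const results = await Promise.allSettled(tasks);
  const failed = results.filter((r) => r.status === 'rejected').length;
  console.log(`${failed} of ${results.length} transactions failed (expect 40P01 deadlocks)`);
}
```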
Expected Behavior
The system should handle concurrent updates without causing deadlocks, potentially using queueing or transaction management strategies.
Actual Behavior
Deadlock errors prevent some updates from completing successfully.
Possible Solutions
Enqueue updates targeting the same remoteJid, serializing them to ensure only one transaction modifies the row at a time.
Investigate and optimize the Prisma query execution strategy to minimize transaction lock contention.
Implement application-level queuing for updates to avoid simultaneous operations on the same records.
Consider using row-level locks with SELECT FOR UPDATE to prevent deadlocks during concurrent updates.
Introduce retry logic in the application to handle transient deadlock errors gracefully (a retry sketch follows this list).
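A minimal sketch of the retry idea is shown below. Matching the deadlock by the SQLSTATE 40P01 text in the error message is a simplification made for this sketch; a real Prisma setup should inspect the structured error it throws rather than the message string.

```typescript
// Minimal deadlock-retry wrapper, shown only as an outline of the suggestion
// above. Matching "40P01" in the error text is a simplification; a real Prisma
// setup should inspect the structured error it throws instead.
async function withDeadlockRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      const isDeadlock =
        message.includes('40P01') || message.includes('deadlock detected');
      if (!isDeadlock || attempt >= maxAttempts) throw error;
      // Short, jittered backoff so the competing transactions can finish first.
      const delayMs = 50 * attempt + Math.floor(Math.random() * 100);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Hypothetical usage around the update that currently deadlocks:
// await withDeadlockRetry(() =>
//   prisma.chat.updateMany({ where: { remoteJid }, data: { unreadMessages: 0 } }),
// );
```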
What did you observe instead of what you expected?
Expected:
We expected the updates to execute successfully without locking or waiting issues, even with multiple concurrent requests targeting the chat table.
Observed:
Instead, a deadlock occurs during concurrent updates, causing transactions to fail with the deadlock error (40P01) shown above.
Screenshots/Videos
```yaml
version: "3.7"

services:
  evolution_s03:
    image: atendai/evolution-api:v2.2.0

  evolution_s03_postgres:
    image: postgres:14
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - TZ=America/Sao_Paulo
      - PGTZ=America/Sao_Paulo
    networks:
      - evolution
    ports:
      - 12013:5432
    shm_size: 512mb

  evolution_s03_redis:
    image: redis:latest
    command: [
      "redis-server",
      "--appendonly",
      "yes",
      "--port",
      "6379"
    ]
    volumes:
      - redis_data:/data
    networks:
      - evolution
    deploy:
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          cpus: "1"
          memory: 512M

volumes:
  evolution_instances:
    external: true
    name: evolution_s03_data
  postgres_data:
    external: true
    name: evolution_s03_database
  redis_data:
    external: true
    name: evolution_s03_cache

networks:
  evolution:
    external: true
    name: evolution
```
Which version of the API are you using?
2.2.0
What is your environment?
Docker
Other environment specifications
DOCKER:
Server Environment:
OS: Ubuntu
CPU: 8 vCPUs
RAM: 32GB
If applicable, paste the log output
No response
Additional Notes
While investigating the deadlock issue, we observed several logs indicating potential connection resets from the database client. This might be contributing to or exacerbating the problem. Below are the relevant logs:
2024-11-22 20:20:23.876 UTC [2019] LOG: could not receive data from client: Connection reset by peer
2024-11-22 20:20:56.644 UTC [2053] LOG: could not receive data from client: Connection reset by peer
2024-11-22 20:20:56.644 UTC [2051] LOG: could not receive data from client: Connection reset by peer
2024-11-22 20:20:56.644 UTC [1962] LOG: could not receive data from client: Connection reset by peer
2024-11-22 20:28:51.780 UTC [2014] LOG: could not receive data from client: Connection reset by peer