You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
elnyry-sam-k opened this issue
Dec 3, 2024
· 1 comment
Labels
bugSomething isn't working or it has wrong behavior on a Mojaloop Core serviceoss-coreThis is an issue - story or epic related to a feature on a Mojaloop core service or related to it
Summary:
During performance testing by one of the community members, a deadlock in timeout handler was encountered.
Severity:
Low
Priority:
Critical
Expected Behavior
No deadlocks with timeout handler but transfers processed as expected and graceful degradation of service in case of load higher than acceptable with the switch settings and resources.
Steps to Reproduce
Setup a switch with Mojaloop v16
Consistent load of about 200 TPS with simulators and handlers scaled accordingly
Sample logs below:
2024-11-29 11:17:30 2024-11-29T09:17:30.361Z - error: "uncaughtException: insert into transferTimeout (transferId, expirationDate) select t.transferId, t.expirationDatefromtransferastinner join (selecttransferId, max(transferStateChangeId) as maxTransferStateChangeIdfromtransferStateChangewheretransferStateChangeId> 508300 andtransferStateChangeId<= 508776 group bytransferId) as tsonts.transferId=t.transferIdinner jointransferStateChangeastscontsc.transferStateChangeId=ts.maxTransferStateChangeIdleft jointransferTimeoutasttontt.transferId=t.transferIdwherett.transferIdis null andtsc.transferStateIdin ('RECEIVED_PREPARE', 'RESERVED') - ER_LOCK_DEADLOCK: Deadlock found when trying to get lock; try restarting transaction\nFSPIOPError: insert into transferTimeout (transferId, expirationDate) selectt.transferId, t.expirationDatefromtransferastinner join (selecttransferId, max(transferStateChangeId) as maxTransferStateChangeIdfromtransferStateChangewheretransferStateChangeId> 508300 andtransferStateChangeId<= 508776 group bytransferId) as tsonts.transferId=t.transferIdinner jointransferStateChangeastscontsc.transferStateChangeId=ts.maxTransferStateChangeIdleft jointransferTimeoutasttontt.transferId=t.transferIdwherett.transferIdis null andtsc.transferStateId in ('RECEIVED_PREPARE', 'RESERVED') - ER_LOCK_DEADLOCK: Deadlock found when trying to get lock; try restarting transaction\n at createFSPIOPError (/opt/app/node_modules/@mojaloop/central-services-error-handling/src/factory.js:198:12)\n at Object.reformatFSPIOPError (/opt/app/node_modules/@mojaloop/central-services-error-handling/src/factory.js:333:12)\n at /opt/app/src/models/transfer/facade.js:781:36\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\nError: ER_LOCK_DEADLOCK: Deadlock found when trying to get lock; try restarting transaction\n at Sequence._packetToError (/opt/app/node_modules/mysql/lib/protocol/sequences/Sequence.js:47:14)\n at Query.ErrorPacket (/opt/app/node_modules/mysql/lib/protocol/sequences/Query.js:79:18)\n at Protocol._parsePacket (/opt/app/node_modules/mysql/lib/protocol/Protocol.js:291:23)\n at Parser._parsePacket (/opt/app/node_modules/mysql/lib/protocol/Parser.js:433:10)\n at Parser.write (/opt/app/node_modules/mysql/lib/protocol/Parser.js:43:10)\n at Protocol.write (/opt/app/node_modules/mysql/lib/protocol/Protocol.js:38:16)\n at Socket.<anonymous> (/opt/app/node_modules/mysql/lib/Connection.js:88:28)\n at Socket.<anonymous> (/opt/app/node_modules/mysql/lib/Connection.js:526:10)\n at Socket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:324:12)\n --------------------\n at Protocol._enqueue (/opt/app/node_modules/mysql/lib/protocol/Protocol.js:144:48)\n at Connection.query (/opt/app/node_modules/mysql/lib/Connection.js:198:25)\n at /opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/dialects/mysql/index.js:132:18\n at new Promise (<anonymous>)\n at Client_MySQL._query (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/dialects/mysql/index.js:126:12)\n at executeQuery (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/execution/internal/query-executioner.js:37:17)\n at Client_MySQL.query (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/client.js:154:12)\n at /opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/execution/transaction.js:371:24\n at new Promise (<anonymous>)\n at trxClient.query (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/execution/transaction.js:366:12)"
Specifications
Component (if known): Center ledger
Version: v16 of Mojaloop
Platform: on-prem
Subsystem: time-out handler
Type of testing: Performance
Bug found/raised by: Bonaparte and team, @elnyry-sam-k
Notes:
Severity when opened: Low
Priority when opened: Critical
The text was updated successfully, but these errors were encountered:
elnyry-sam-k
added
oss-core
This is an issue - story or epic related to a feature on a Mojaloop core service or related to it
bug
Something isn't working or it has wrong behavior on a Mojaloop Core service
labels
Dec 3, 2024
Here are some related observations, when running the performance tests:
The timeout handler is called by a module named cron with a default to call it every 15 seconds. It looks it is not aware of the fact that this handler is an asynchronous function and it will not await it. This example illustrates that - running it will log every 15 seconds, even though the promise never resolves:
All this means that the timeout handler may be invoked again, before the previous one completed. So if the database is experiencing some higher load and cannot complete all the queries included in the handler, then it will get an additional load every 15 seconds. As the handler at least 4 queries, that can be potentially slow and they are very similar and take similar time to complete, it is enough to hit ~4 seconds execution time per query to cause a problem. The 4 seconds can be hit if the DB is already loaded by something else. It is also likely that the deadlock is caused by invoking the same handler several times, without waiting for it to complete.
The slowness of the queries is probably related to a big number of rows in the transferTimeout table. This is likely related to the fact, that I am running only tests for transfer prepare. So probably all transfers time out and accumulate in that table and contribute to the likelihood of query slowness and a deadlock. But probably something similar can happen in production if some or several of the participants have issues with fulfilling transfers.
When running the performance tests for transfer prepare, it is observed that the time of deadlocks is very close to periods where the processing of transfers seems to freeze or become very slow.
bugSomething isn't working or it has wrong behavior on a Mojaloop Core serviceoss-coreThis is an issue - story or epic related to a feature on a Mojaloop core service or related to it
Summary:
During performance testing by one of the community members, a deadlock in timeout handler was encountered.
Severity:
Low
Priority:
Critical
Expected Behavior
No deadlocks with timeout handler but transfers processed as expected and graceful degradation of service in case of load higher than acceptable with the switch settings and resources.
Steps to Reproduce
2024-11-29 11:17:30 2024-11-29T09:17:30.361Z - error: "uncaughtException: insert into transferTimeout (transferId, expirationDate) select
t.
transferId,
t.
expirationDatefrom
transferas
tinner join (select
transferId, max(
transferStateChangeId) as
maxTransferStateChangeIdfrom
transferStateChangewhere
transferStateChangeId> 508300 and
transferStateChangeId<= 508776 group by
transferId) as
tson
ts.
transferId=
t.
transferIdinner join
transferStateChangeas
tscon
tsc.
transferStateChangeId=
ts.
maxTransferStateChangeIdleft join
transferTimeoutas
tton
tt.
transferId=
t.
transferIdwhere
tt.
transferIdis null and
tsc.
transferStateIdin ('RECEIVED_PREPARE', 'RESERVED') - ER_LOCK_DEADLOCK: Deadlock found when trying to get lock; try restarting transaction\nFSPIOPError: insert into transferTimeout (transferId, expirationDate) select
t.
transferId,
t.
expirationDatefrom
transferas
tinner join (select
transferId, max(
transferStateChangeId) as
maxTransferStateChangeIdfrom
transferStateChangewhere
transferStateChangeId> 508300 and
transferStateChangeId<= 508776 group by
transferId) as
tson
ts.
transferId=
t.
transferIdinner join
transferStateChangeas
tscon
tsc.
transferStateChangeId=
ts.
maxTransferStateChangeIdleft join
transferTimeoutas
tton
tt.
transferId=
t.
transferIdwhere
tt.
transferIdis null and
tsc.
transferStateIdin ('RECEIVED_PREPARE', 'RESERVED') - ER_LOCK_DEADLOCK: Deadlock found when trying to get lock; try restarting transaction\n at createFSPIOPError (/opt/app/node_modules/@mojaloop/central-services-error-handling/src/factory.js:198:12)\n at Object.reformatFSPIOPError (/opt/app/node_modules/@mojaloop/central-services-error-handling/src/factory.js:333:12)\n at /opt/app/src/models/transfer/facade.js:781:36\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\nError: ER_LOCK_DEADLOCK: Deadlock found when trying to get lock; try restarting transaction\n at Sequence._packetToError (/opt/app/node_modules/mysql/lib/protocol/sequences/Sequence.js:47:14)\n at Query.ErrorPacket (/opt/app/node_modules/mysql/lib/protocol/sequences/Query.js:79:18)\n at Protocol._parsePacket (/opt/app/node_modules/mysql/lib/protocol/Protocol.js:291:23)\n at Parser._parsePacket (/opt/app/node_modules/mysql/lib/protocol/Parser.js:433:10)\n at Parser.write (/opt/app/node_modules/mysql/lib/protocol/Parser.js:43:10)\n at Protocol.write (/opt/app/node_modules/mysql/lib/protocol/Protocol.js:38:16)\n at Socket.<anonymous> (/opt/app/node_modules/mysql/lib/Connection.js:88:28)\n at Socket.<anonymous> (/opt/app/node_modules/mysql/lib/Connection.js:526:10)\n at Socket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:324:12)\n --------------------\n at Protocol._enqueue (/opt/app/node_modules/mysql/lib/protocol/Protocol.js:144:48)\n at Connection.query (/opt/app/node_modules/mysql/lib/Connection.js:198:25)\n at /opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/dialects/mysql/index.js:132:18\n at new Promise (<anonymous>)\n at Client_MySQL._query (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/dialects/mysql/index.js:126:12)\n at executeQuery (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/execution/internal/query-executioner.js:37:17)\n at Client_MySQL.query (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/client.js:154:12)\n at /opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/execution/transaction.js:371:24\n at new Promise (<anonymous>)\n at trxClient.query (/opt/app/node_modules/@mojaloop/database-lib/node_modules/knex/lib/execution/transaction.js:366:12)"
Specifications
Notes:
The text was updated successfully, but these errors were encountered: