feat: support schedule only executes once in random one cluster when it's deployed in k8s clusters mode. #61

Open · wants to merge 9 commits into master
Conversation


@oneWalker oneWalker commented Nov 12, 2024

Checklist
  • npm test passes with github actions
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)
Description of change

Support scheduled tasks that execute only once across instances when the application is deployed in k8s cluster mode.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a locking mechanism for distributed scheduling in clustered environments using Redis.
    • Added new configuration options for cluster settings, including properties for enabling clustering and Redis configuration.
  • Bug Fixes

    • Enhanced locking mechanisms to prevent concurrent executions in clustered environments.
  • Documentation

    • Updated README.md with detailed instructions for configuring Redis clustering and new settings.
  • Chores

    • Added ioredis as a new dependency for Redis functionality.
    • Introduced mocha for improved testing capabilities.


coderabbitai bot commented Nov 12, 2024

Walkthrough

The changes in this pull request introduce new clustering capabilities to the egg-schedule project by integrating Redis for scheduling management. Modifications include updates to the README.md for configuration instructions, the addition of a Redis client in agent.js, and new configuration properties in config/config.default.js. The AllStrategy and WorkerStrategy classes have been updated to implement asynchronous handlers with locking mechanisms using Redis. Additionally, the package.json file has been updated to include the ioredis dependency and a testing framework.
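As a rough illustration of the flow described above, here is a minimal sketch of how a strategy handler might guard task dispatch with a Redis lock. It uses the ioredis API, but the key name, TTL, and helper names are illustrative, not the exact code in this PR.

```js
const Redis = require('ioredis');

// Illustrative only: dispatch a task at most once per tick across all
// agent instances, assuming `redis` points at a shared Redis server.
async function handleTick(redis, taskKey, sendTask) {
  const lockKey = `schedule-lock:${taskKey}`; // hypothetical key layout
  // SET ... EX ... NX is atomic: only one instance gets 'OK', the rest get null.
  const acquired = await redis.set(lockKey, process.pid, 'EX', 60, 'NX');
  if (!acquired) return; // another instance already holds the lock

  try {
    await sendTask(); // dispatch the schedule to workers
  } finally {
    await redis.del(lockKey); // release so the next tick can run elsewhere
  }
}
```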

Changes

| File | Change Summary |
| --- | --- |
| README.md | Added new configuration section for clustering, detailing cluster object and Redis settings. |
| config/config.default.js | Introduced new properties under config.schedule: cluster, enable, lockType, lockedTtl, redis, client, port, host, password, db, default, and prefix. |
| agent.js | Imported RedisLock, added lockManager, and modified agent.beforeStart to utilize Redis locking. |
| lib/strategy/all.js | Updated handler() method to be asynchronous and added logic for Redis locking mechanism. |
| lib/strategy/worker.js | Renamed class to AllStrategy, updated handler() method to be asynchronous and added logic for Redis locking mechanism. |
| lib/lock/base.js | Introduced LockManager class with methods for managing locks. |
| lib/lock/redis_lock.js | Added RedisLock class extending LockManager, implementing Redis-based lock management. |
| package.json | Added "ioredis": "^5.4.1" as a dependency and "mocha": "^10.8.2" as a development dependency. |

Poem

🐇 In the garden of code, changes bloom bright,
Redis now dances, bringing clusters to light.
With locks in the night, our tasks shall not clash,
Scheduling’s sweet rhythm, a well-timed flash!
So hop along, friends, let’s celebrate this feat,
In the world of egg-schedule, our work is complete! 🌼

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 eslint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

warning [email protected]: This version is no longer supported. Please see https://eslint.org/version-support for other options.
warning eslint > @humanwhocodes/[email protected]: Use @eslint/config-array instead
warning eslint > @humanwhocodes/config-array > @humanwhocodes/[email protected]: Use @eslint/object-schema instead
warning eslint > file-entry-cache > flat-cache > [email protected]: Rimraf versions prior to v4 are no longer supported
warning eslint > file-entry-cache > flat-cache > rimraf > [email protected]: Glob versions prior to v9 are no longer supported
warning eslint > file-entry-cache > flat-cache > rimraf > glob > [email protected]: This module is not supported, and leaks memory. Do not use it. Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.
error [email protected]: The engine "node" is incompatible with this module. Expected version "^14 || ^16 || ^17 || ^18 || ^19". Got "22.9.0"
error Found incompatible module.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 16

🧹 Outside diff range and nitpick comments (4)
agent.js (1)

15-20: Add Redis health check mechanism.

Implement a health check mechanism to ensure Redis connection stability.

Consider implementing a periodic health check:

    if (
      agent.config.schedule?.cluster?.enable &&
      agent.config.schedule?.cluster?.redis
    ) {
      agent.redisClient = new Redis(agent.config.schedule.cluster.redis);
+      
+      // Periodic health check
+      const healthCheck = async () => {
+        try {
+          await agent.redisClient.ping();
+        } catch (err) {
+          agent.logger.error('[egg-schedule] Redis health check failed:', err);
+        }
+      };
+      
+      const HEALTH_CHECK_INTERVAL = 30000; // 30 seconds
+      const timer = setInterval(healthCheck, HEALTH_CHECK_INTERVAL);
+      
+      // Clean up timer on agent close
+      agent.beforeClose(async () => {
+        clearInterval(timer);
+      });
    }
README.md (2)

277-279: Clarify prefix and default configuration usage

The documentation should explain:

  1. The purpose and format of the default setting
  2. How the prefix is used in Redis key generation

Add the following explanation:

- `default`: The unique identifier for your application (e.g., 'my-app-name'). Used to prevent key conflicts when multiple applications share the same Redis instance.
- `prefix`: The prefix used for Redis lock keys. The final key format will be `{prefix}:{default}:{schedule-name}`.
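For illustration, a small sketch of how such a key could be assembled, assuming the `{prefix}:{default}:{schedule-name}` layout suggested above (the helper name is hypothetical):

```js
// Hypothetical helper mirroring the suggested key layout.
function buildLockKey(config, scheduleName) {
  const { prefix = 'schedule', default: appId = 'default' } = config;
  return `${prefix}:${appId}:${scheduleName}`; // e.g. "schedule:my-app:clean_cache"
}
```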

264-279: Add troubleshooting section for cluster mode

The documentation should include common issues and solutions when using cluster mode.

Add the following section:

#### Troubleshooting Cluster Mode

Common issues and solutions:

1. **Multiple Task Executions**
   - Verify Redis connection settings
   - Check if lock TTL matches your task duration
   - Ensure clock synchronization across nodes

2. **Tasks Not Executing**
   - Confirm Redis connectivity from all nodes
   - Verify Redis credentials and permissions
   - Check for Redis connection timeouts

3. **Lock Not Released**
   - Implement proper error handling in tasks
   - Consider setting up lock timeouts
   - Monitor Redis memory usage
lib/strategy/worker.js (1)

12-20: Use the path module for cross-platform path handling

Using string manipulation for paths may cause issues on different operating systems (e.g., Windows uses backslashes). Use the Node.js path module to handle paths reliably across platforms.

Apply this diff:

+ const path = require('path');
  const projectName =
    curConfig.default === 'default'
-     ? this.agent.baseDir.split('/').pop()
+     ? path.basename(this.agent.baseDir)
      : curConfig.default;
  const prefix = curConfig.prefix;
- lockedKey = `${projectName}-${prefix}-${this.key.replace(
-   this.agent.baseDir,
-   ''
- )}`;
+ const relativeKey = path.relative(this.agent.baseDir, this.key);
+ lockedKey = `${projectName}-${prefix}-${relativeKey}`;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 78db046 and cf4792b.

📒 Files selected for processing (6)
  • README.md (1 hunks)
  • agent.js (2 hunks)
  • config/config.default.js (1 hunks)
  • lib/strategy/all.js (1 hunks)
  • lib/strategy/worker.js (1 hunks)
  • package.json (2 hunks)
🔇 Additional comments (4)
config/config.default.js (2)

16-28: Verify k8s deployment resilience

The cluster configuration needs additional consideration for k8s deployments:

  1. Ensure Redis connection is resilient to pod restarts
  2. Consider adding Redis sentinel support for HA deployments
  3. Document how the scheduler behaves during pod initialization and termination

Let's verify the k8s deployment considerations:

#!/bin/bash
# Check for k8s lifecycle handling
echo "Checking for pod lifecycle handlers:"
rg "beforeExit|beforeClose|beforeStop" -A 5

# Check for Redis connection error handling
echo "Checking Redis error handling:"
ast-grep --pattern 'catch ($_) {
  $$$
}'

Consider documenting k8s deployment best practices in README.md, including:

  1. Recommended pod anti-affinity rules
  2. Resource requests/limits for Redis
  3. Network policy recommendations

27-28: Document the purpose of new schedule properties

The new properties default and prefix lack documentation explaining their purpose and impact on the scheduling system.

Let's verify how these values are used:

Please add JSDoc comments explaining:

  1. The purpose of each property
  2. How they affect the Redis-based locking mechanism
  3. Whether they can be customized and any constraints on their values
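A sketch of what such comments could look like, assuming the semantics of `default` and `prefix` described elsewhere in this review (wording and values are illustrative):

```js
// config/config.default.js (sketch)
/**
 * @property {string} default - identifier used in lock keys; when left as
 *   'default', the project directory name is used instead.
 * @property {string} prefix - prefix prepended to every schedule lock key so
 *   that multiple applications can share one Redis instance without clashes.
 */
exports.schedule = {
  default: 'default',
  prefix: 'schedule',
};
```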
agent.js (1)

5-5: LGTM! Verify ioredis dependency.

The choice of ioredis is appropriate for Redis cluster support.

✅ Verification successful

ioredis dependency is properly declared in package.json.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify ioredis dependency is properly declared
grep -r "ioredis" package.json

Length of output: 55

package.json (1)

29-29: Verify ioredis compatibility

The addition of ioredis for cluster management is appropriate, but let's verify the version compatibility.

✅ Verification successful

ioredis@^5.4.1 is compatible with the project.

  • Supported Node.js version: >=14.17.0
  • No known security vulnerabilities.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check ioredis version compatibility and security advisories

# Check npm for latest versions and compatibility
npm view ioredis@^5.4.1 engines.node

# Check for known vulnerabilities
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: NPM, package: "ioredis") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}'

Length of output: 391


Script:

#!/bin/bash
# Description: Retrieve the project's Node.js version

# Check the Node.js version specified in package.json
if [ -f package.json ]; then
  jq -r '.engines.node // empty' package.json
fi

# If not found in package.json, check for a .nvmrc file
if [ -f .nvmrc ]; then
  cat .nvmrc
fi

Length of output: 101

Comment on lines 16 to 26
cluster: {
  enable: false,
  redis: {
    client: {
      port: 6379, // Redis port
      host: '127.0.0.1', // Redis host
      password: 'auth',
      db: 0,
    },
  },
},
@coderabbitai coderabbitai bot Nov 12, 2024

⚠️ Potential issue

Security and k8s deployment concerns in Redis configuration

The Redis configuration has several issues that need to be addressed:

  1. Hardcoded credentials in default configuration pose a security risk
  2. The default host 127.0.0.1 may not work in k8s clusters
  3. Missing important Redis connection settings (timeout, retry policy, TLS)

Consider this revised configuration:

     cluster: {
       enable: false,
       redis: {
         client: {
-          port: 6379, // Redis port
-          host: '127.0.0.1', // Redis host
-          password: 'auth',
-          db: 0,
+          port: process.env.REDIS_PORT || 6379,
+          host: process.env.REDIS_HOST || '127.0.0.1',
+          password: process.env.REDIS_PASSWORD,
+          db: process.env.REDIS_DB || 0,
+          connectTimeout: 5000,
+          retryStrategy: (times) => Math.min(times * 1000, 3000),
+          tls: process.env.REDIS_TLS === 'true' ? {} : undefined,
         },
       },
     },

Additionally:

  1. Document these environment variables in README.md
  2. Consider using k8s secrets for Redis credentials
  3. Add validation for these configuration values

oneWalker (Author)

egg-plugin is a special case. If you want to use process.env, you should set it in config.{default}.js in your own project.

coderabbitai bot

@oneWalker, thank you for the clarification. I understand that in egg-plugin development, it's not appropriate to use process.env in the plugin's config/config.default.js. Instead, environment-specific configurations should be set in config.{env}.js in the user's project.
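As a rough sketch of that pattern, an application could override the plugin defaults in its own environment config, for example config/config.prod.js (file name, values, and environment variable names are illustrative):

```js
// config/config.prod.js in the user's application (not the plugin)
module.exports = () => {
  const config = {};
  config.schedule = {
    cluster: {
      enable: true,
      redis: {
        client: {
          host: process.env.REDIS_HOST,
          port: Number(process.env.REDIS_PORT || 6379),
          password: process.env.REDIS_PASSWORD,
          db: 0,
        },
      },
    },
  };
  return config;
};
```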


✏️ Learnings added
Learnt from: oneWalker
PR: eggjs/egg-schedule#61
File: config/config.default.js:16-26
Timestamp: 2024-11-12T19:37:36.347Z
Learning: In egg-plugin development, avoid using `process.env` in the plugin's `config/config.default.js` file. Environment-specific configurations should be set in the user's project `config/config.{env}.js` files.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

agent.js Outdated
Comment on lines 15 to 20
if (
  agent.config.schedule?.cluster?.enable &&
  agent.config.schedule?.cluster?.redis
) {
  agent.redisClient = new Redis(agent.config.schedule.cluster.redis);
}

⚠️ Potential issue

Add Redis connection cleanup.

The Redis client connection should be properly closed when the agent stops to prevent resource leaks.

Add cleanup code in the agent's lifecycle:

+ agent.beforeClose(async () => {
+   if (agent.redisClient) {
+     await agent.redisClient.quit();
+     agent.redisClient = null;
+   }
+ });

⚠️ Potential issue

Add error handling for Redis connection.

The Redis client initialization lacks error handling which could cause silent failures in production.

Apply this diff to add proper error handling:

    if (
      agent.config.schedule?.cluster?.enable &&
      agent.config.schedule?.cluster?.redis
    ) {
-      agent.redisClient = new Redis(agent.config.schedule.cluster.redis);
+      try {
+        agent.redisClient = new Redis(agent.config.schedule.cluster.redis);
+        agent.redisClient.on('error', err => {
+          agent.logger.error('[egg-schedule] Redis client error:', err);
+        });
+      } catch (err) {
+        agent.logger.error('[egg-schedule] Failed to initialize Redis client:', err);
+        throw err;
+      }
    }

🛠️ Refactor suggestion

Add Redis configuration validation and connection options.

The Redis configuration should be validated and include important connection options.

Apply this diff to add validation and connection options:

    if (
      agent.config.schedule?.cluster?.enable &&
      agent.config.schedule?.cluster?.redis
    ) {
+      const redisConfig = agent.config.schedule.cluster.redis;
+      if (!redisConfig.host || !redisConfig.port) {
+        throw new Error('[egg-schedule] Redis configuration must include host and port');
+      }
+      
+      const redisOptions = {
+        retryStrategy: times => Math.min(times * 100, 3000),
+        maxRetriesPerRequest: 3,
+        enableReadyCheck: true,
+        ...redisConfig,
+      };
+      
-      agent.redisClient = new Redis(agent.config.schedule.cluster.redis);
+      agent.redisClient = new Redis(redisOptions);
    }

@@ -36,7 +37,8 @@
   "egg-mock": "^5.3.0",
   "egg-tracer": "^1.1.0",
   "eslint": "^8.29.0",
-  "eslint-config-egg": "^12.1.0"
+  "eslint-config-egg": "^12.1.0",
+  "mocha": "^10.8.2"
@coderabbitai coderabbitai bot Nov 12, 2024

⚠️ Potential issue

Fix incorrect mocha version number

The specified mocha version ^10.8.2 is invalid as the latest stable version of mocha is 10.2.0. This could be causing the npm test failures mentioned in the PR description.

Apply this diff to fix the version:

-    "mocha": "^10.8.2"
+    "mocha": "^10.2.0"

Additionally, since the project already uses egg-bin for testing, please verify if adding mocha directly is necessary.


oneWalker (Author)

CI won't pass in the Node 14 environment without the mocha module.

coderabbitai bot

@oneWalker, thank you for the clarification. I've noted that the CI in Node 14 environments requires the mocha module to pass.


✏️ Learnings added
Learnt from: oneWalker
PR: eggjs/egg-schedule#61
File: package.json:41-41
Timestamp: 2024-11-12T19:35:13.420Z
Learning: In the `egg-schedule` project, adding `mocha` as a devDependency is necessary for the CI to pass in Node.js 14 environments, even when `egg-bin` is used for testing.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

README.md Outdated
Comment on lines 264 to 279
// add it when making sure it only running in one cluster.
cluster: {
  enable: true,
  // add redis for redis lock
  redis: {
    client: {
      port: 6379, // Redis port
      host: '127.0.0.1', // Redis host
      password: 'auth',
      db: 0,
    },
  },
},
// the prefix for lockedKey for redis lock
default: 'default', // default schedule name, like project-name.
prefix: 'schedule', // default schedule prefix

🛠️ Refactor suggestion

Document cluster mode behavior and configuration

The current documentation lacks essential information about the cluster mode. Consider adding:

  1. Explanation of how the Redis-based locking mechanism works
  2. Impact on task execution in a Kubernetes environment
  3. Best practices for high availability setups

Add the following section before the configuration example:

### Cluster Mode

When running in a Kubernetes cluster or any distributed environment, you might want to ensure that scheduled tasks execute only once across all instances. The cluster mode uses Redis for distributed locking to prevent duplicate task execution.

#### How it works
- When a task is scheduled to run, the scheduler attempts to acquire a Redis lock
- Only one instance in the cluster will successfully acquire the lock and execute the task
- The lock is automatically released after task completion or on failure

#### Configuration
The cluster mode requires Redis for coordination. Here's a complete configuration example:
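A configuration sketch along those lines, based on the options shown in this PR (values are placeholders):

```js
// config/config.default.js of the application (a sketch)
exports.schedule = {
  cluster: {
    enable: true,
    lockType: 'redis',       // use the Redis-based lock
    redis: {
      client: {
        host: '127.0.0.1',
        port: 6379,
        password: 'auth',
        db: 0,
      },
    },
  },
  default: 'my-app',         // application identifier used in lock keys
  prefix: 'schedule',        // prefix for lock keys
};
```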

README.md Outdated
Comment on lines 264 to 276
// add it when making sure it only running in one cluster.
cluster: {
  enable: true,
  // add redis for redis lock
  redis: {
    client: {
      port: 6379, // Redis port
      host: '127.0.0.1', // Redis host
      password: 'auth',
      db: 0,
    },
  },
},

⚠️ Potential issue

Enhance security and flexibility of Redis configuration

The Redis configuration example shows hardcoded credentials, which is not a recommended practice. Consider:

  1. Using environment variables for sensitive data
  2. Adding SSL/TLS configuration options
  3. Documenting connection retry strategies

Here's an improved configuration example:

 cluster: {
   enable: true,
   redis: {
     client: {
-      port: 6379,
-      host: '127.0.0.1',
-      password: 'auth',
-      db: 0,
+      port: process.env.REDIS_PORT || 6379,
+      host: process.env.REDIS_HOST || '127.0.0.1',
+      password: process.env.REDIS_PASSWORD,
+      db: process.env.REDIS_DB || 0,
+      tls: process.env.REDIS_TLS === 'true' ? {} : undefined,
+      retryStrategy: (times) => Math.min(times * 50, 2000),
     },
   },
 },

Comment on lines 12 to 15
const projectName =
  curConfig.default === 'default'
    ? this.agent.baseDir.split('/').pop()
    : curConfig.default;

🛠️ Refactor suggestion

Use the path module for reliable path handling across platforms.

The current code uses string manipulation to extract the project name from this.agent.baseDir, which may cause issues on Windows systems due to different path separators. Using path.basename() from Node.js's path module ensures consistent behavior across all operating systems.

Import the path module and modify the code:

+ const path = require('path');

      const projectName =
        curConfig.default === 'default'
-         ? this.agent.baseDir.split('/').pop()
+         ? path.basename(this.agent.baseDir)
          : curConfig.default;

Committable suggestion skipped: line range outside the PR's diff.

if (await this.agent.redisClient.get(lockedKey)) {
  isLocked = true;
}
await this.agent.redisClient.set(lockedKey, true);

⚠️ Potential issue

Set an expiration on the Redis lock key to prevent stale locks

If the process crashes before deleting the lock key, the lock may remain indefinitely, blocking future executions. Consider setting an expiration time on the Redis key to avoid stale locks.

Apply this diff:

- await this.agent.redisClient.set(lockedKey, true, 'NX');
+ const lockExpirationTime = 60; // lock expires in 60 seconds
+ const lockSet = await this.agent.redisClient.set(lockedKey, true, 'NX', 'EX', lockExpirationTime);

Ensure that lockExpirationTime is set to an appropriate value based on the expected execution time of the task.

Also applies to: 31-33

Comment on lines 31 to 33
if (curConfig.cluster.enable) {
  await this.agent.redisClient.del(lockedKey);
}

🛠️ Refactor suggestion

Consistently use optional chaining for curConfig.cluster

To prevent potential runtime errors when curConfig or curConfig.cluster is undefined, use optional chaining when accessing curConfig.cluster.enable.

Apply this diff:

- if (curConfig.cluster.enable) {
+ if (curConfig?.cluster?.enable) {

Comment on lines 21 to 24
if (await this.agent.redisClient.get(lockedKey)) {
  isLocked = true;
}
await this.agent.redisClient.set(lockedKey, true);
@coderabbitai coderabbitai bot Nov 12, 2024

⚠️ Potential issue

Use atomic Redis operations to prevent race conditions

Currently, there is a potential race condition between the GET and SET operations. To ensure that only one process can acquire the lock at a time, use an atomic Redis SET operation with the NX (set if not exists) option.

Apply this diff to fix the issue:

- if (await this.agent.redisClient.get(lockedKey)) {
-   isLocked = true;
- }
- await this.agent.redisClient.set(lockedKey, true);
+ const lockSet = await this.agent.redisClient.set(lockedKey, true, 'NX');
+ if (!lockSet) {
+   isLocked = true;
+ }

oneWalker (Author)

It ensures the set operation across clusters only takes effect when lockedKey is not already occupied.

coderabbitai bot

@oneWalker, understood. Thank you for the clarification.

Comment on lines 27 to 33
if (!isLocked) {
  this.sendOne();
}

if (curConfig.cluster.enable) {
  await this.agent.redisClient.del(lockedKey);
}
@coderabbitai coderabbitai bot Nov 12, 2024

⚠️ Potential issue

Ensure Redis lock key is released even if an exception occurs

To prevent the lock from remaining if an exception occurs during task execution, wrap the critical section in a try...finally block. This ensures that the lock key is deleted even if an error is thrown.

Apply this diff:

if (!isLocked) {
-  this.sendOne();
+  try {
+    await this.sendOne();
+  } finally {
+    if (curConfig?.cluster?.enable) {
+      await this.agent.redisClient.del(lockedKey);
+    }
+  }
}

- if (curConfig?.cluster?.enable) {
-   await this.agent.redisClient.del(lockedKey);
- }

oneWalker (Author)

Only when cluster mode is enabled do we need to use Redis to acquire the lock.

coderabbitai bot

@oneWalker, understood. Thank you for the clarification that we should use Redis for locking only when cluster mode is enabled.


✏️ Learnings added
Learnt from: oneWalker
PR: eggjs/egg-schedule#61
File: lib/strategy/worker.js:27-33
Timestamp: 2024-11-12T19:21:57.336Z
Learning: In the `WorkerStrategy` class in `lib/strategy/worker.js`, ensure that Redis locks are used only when cluster mode is enabled.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@oneWalker oneWalker changed the title feat: support schedule only executes once when it's deployed in k8s clusters mode. feat: support schedule only executes once in random one cluster when it's deployed in k8s clusters mode. Nov 12, 2024
@fengmk2 (Member) commented Nov 15, 2024

This needs to be abstracted to see how to implement it. Right now it has a hard dependency on Redis, but not every company builds on Redis; there are other distributed lock implementations.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 15

🧹 Outside diff range and nitpick comments (10)
lib/strategy/all.js (1)

6-23: Consider a more flexible locking mechanism design.

As mentioned in the PR feedback, the current implementation might be too tightly coupled with Redis. Consider implementing a more abstract locking interface that could support different backends (Redis, etcd, Zookeeper, etc.).

Suggestions:

  1. Create an abstract LockProvider interface
  2. Allow configuration of different lock providers
  3. Implement provider-specific logic in separate classes
  4. Use dependency injection to inject the configured provider

This would make the implementation more flexible and maintainable for different deployment scenarios.
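A minimal sketch of that direction, with an abstract provider injected into the strategy; all names here (LockProvider, RedisLockProvider, createProvider) are illustrative rather than code from this PR:

```js
// Abstract contract every backend must fulfil.
class LockProvider {
  async tryAcquire(key, ttl) { throw new Error('not implemented'); }
  async release(key) { throw new Error('not implemented'); }
}

// One possible backend; others (etcd, ZooKeeper, ...) implement the same contract.
class RedisLockProvider extends LockProvider {
  constructor(redis) { super(); this.redis = redis; }
  async tryAcquire(key, ttl) {
    return (await this.redis.set(key, '1', 'EX', ttl, 'NX')) === 'OK';
  }
  async release(key) { await this.redis.del(key); }
}

// The strategy would only depend on the abstract interface.
function createProvider(config, redis) {
  if (config.lockType === 'redis') return new RedisLockProvider(redis);
  throw new Error(`Unsupported lock type: ${config.lockType}`);
}
```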

lib/strategy/worker.js (1)

5-6: Rename the file to match the class name

The file is named worker.js but contains AllStrategy class. Consider renaming the file to all.js to maintain consistency between file names and class names.

lib/lock/base.js (2)

1-2: Remove redundant 'use strict' directive

The 'use strict' directive is unnecessary in ES modules as they are strict by default.

-'use strict';
-
🧰 Tools
🪛 Biome

[error] 1-1: Redundant use strict directive.

The entire contents of JavaScript modules are automatically in strict mode, with no statement needed to initiate it.
Safe fix: Remove the redundant use strict directive.

(lint/suspicious/noRedundantUseStrict)


3-38: Consider implementing a provider pattern for lock implementations

Based on the PR feedback about Redis dependency, consider enhancing the architecture to support multiple lock providers:

  1. Different organizations might use different distributed lock solutions (Redis, etcd, Zookeeper, etc.)
  2. Some might prefer using k8s native solutions like leader election

Consider implementing a factory pattern:

// lib/lock/factory.js
class LockFactory {
  static createLock(type, agent) {
    switch (type) {
      case 'redis':
        return new RedisLock(agent);
      case 'etcd':
        return new EtcdLock(agent);
      case 'kubernetes':
        return new KubernetesLock(agent);
      default:
        throw new Error(`Unsupported lock type: ${type}`);
    }
  }
}

This would allow users to choose their preferred locking mechanism through configuration:

// config/config.default.js
exports.schedule = {
  lockStrategy: 'redis', // or 'etcd', 'kubernetes'
  // strategy specific config...
};
README.md (1)

264-281: Improve overall documentation structure for cluster mode

The cluster configuration would benefit from better integration into the overall documentation:

  1. Add a "Cluster Mode" section in the Overview
  2. Include a troubleshooting guide for common cluster issues
  3. Add examples of monitoring and debugging cluster deployments

Consider adding these sections before the Configuration section:

## Cluster Mode

When deploying the application in a clustered environment (e.g., Kubernetes), you may want to ensure that scheduled tasks execute only once across all instances. The cluster mode uses Redis for distributed locking to prevent duplicate task execution.

### How it works
- When a task is scheduled to run, the scheduler attempts to acquire a Redis lock
- Only one instance will successfully acquire the lock and execute the task
- The lock is automatically released after task completion or timeout

### Troubleshooting

Common issues and solutions:

1. **Duplicate task execution**
   - Verify Redis connection settings
   - Check `lockedTtl` against task duration
   - Ensure consistent clock synchronization

2. **Tasks not executing**
   - Verify Redis connectivity
   - Check Redis key patterns
   - Monitor lock acquisition attempts
lib/lock/redis_lock.js (5)

1-1: Remove redundant 'use strict' directive

The 'use strict' directive is redundant in ECMAScript modules, as modules are always in strict mode.

Apply this diff to remove the redundant directive:

- 'use strict';
🧰 Tools
🪛 Biome

[error] 1-1: Redundant use strict directive.

The entire contents of JavaScript modules are automatically in strict mode, with no statement needed to initiate it.
Safe fix: Remove the redundant use strict directive.

(lint/suspicious/noRedundantUseStrict)


6-9: Handle Redis client errors gracefully

When initializing the Redis client, consider adding error handling to manage potential connection issues with Redis. This ensures that the application can handle connection failures without crashing.

Example:

constructor(agent) {
  super(agent);
  this.client = new Redis(this.options);

  this.client.on('error', err => {
    this.logger.error(`[egg-schedule] Redis client error: ${err.message}`);
  });
}

16-31: Consider decoupling lock acquisition timeout from lock expiration time

Currently, the expiredTime parameter is used both for the lock's expiration time and the maximum duration to attempt acquiring the lock in the acquire method. This could limit flexibility if different durations are needed for each.

Consider introducing separate parameters for lock TTL and acquisition timeout:

async acquire(
  lockedKey,
  lockTtl = this.agent.config.schedule.cluster.lockedTtl,
  acquireTimeout = 5000 // Default acquisition timeout in milliseconds
) {
  const start = Date.now();
  while (Date.now() - start < acquireTimeout) {
    if (await this.tryAcquire(lockedKey, lockTtl)) {
      return true;
    }
    // Sleep as before
    const randomSleepTime = Math.random() * 900 + 100;
    await new Promise(resolve => setTimeout(resolve, randomSleepTime));
  }
  return false;
}

27-28: Reconsider random sleep time strategy

Using a random sleep time between retries can reduce lock contention, but it may introduce unnecessary delays. Consider implementing an exponential backoff strategy to optimize the retry intervals.

Example:

let retryCount = 0;
const maxRetryDelay = 1000; // Maximum delay of 1 second
while (Date.now() - start < acquireTimeout) {
  if (await this.tryAcquire(lockedKey, lockTtl)) {
    return true;
  }
  const retryDelay = Math.min(100 * 2 ** retryCount, maxRetryDelay);
  await new Promise(resolve => setTimeout(resolve, retryDelay));
  retryCount++;
}

5-70: Consider abstracting the lock mechanism to support multiple backends

The current implementation introduces a strong dependency on Redis for distributed locking. As noted in the PR comments, not all organizations use Redis. To increase flexibility, consider abstracting the lock mechanism to support multiple backends (e.g., Redis, ZooKeeper, etcd).

This can be achieved by defining a common interface or base class for lock managers and allowing users to configure the desired backend.

Example:

  • Define a generic LockManager interface.
  • Implement specific lock managers like RedisLockManager, EtcdLockManager, etc.
  • Allow configuration to select the lock manager type.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between cf4792b and 2f46c2c.

📒 Files selected for processing (7)
  • README.md (1 hunks)
  • agent.js (2 hunks)
  • config/config.default.js (1 hunks)
  • lib/lock/base.js (1 hunks)
  • lib/lock/redis_lock.js (1 hunks)
  • lib/strategy/all.js (1 hunks)
  • lib/strategy/worker.js (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • config/config.default.js
🧰 Additional context used
📓 Learnings (1)
lib/strategy/worker.js (1)
Learnt from: oneWalker
PR: eggjs/egg-schedule#61
File: lib/strategy/worker.js:27-33
Timestamp: 2024-11-12T19:21:57.482Z
Learning: In the `WorkerStrategy` class in `lib/strategy/worker.js`, ensure that Redis locks are used only when cluster mode is enabled.
🪛 Biome
lib/lock/base.js

[error] 1-1: Redundant use strict directive.

The entire contents of JavaScript modules are automatically in strict mode, with no statement needed to initiate it.
Safe fix: Remove the redundant use strict directive.

(lint/suspicious/noRedundantUseStrict)

lib/lock/redis_lock.js

[error] 1-1: Redundant use strict directive.

The entire contents of JavaScript modules are automatically in strict mode, with no statement needed to initiate it.
Safe fix: Remove the redundant use strict directive.

(lint/suspicious/noRedundantUseStrict)

🔇 Additional comments (3)
lib/strategy/all.js (1)

6-8: LGTM! Good use of async/await and optional chaining.

The method signature and initialization are well implemented with proper error handling through optional chaining.

agent.js (1)

15-22: ⚠️ Potential issue

Improve lock manager initialization with better error handling and cleanup.

The lock manager initialization needs proper error handling, configuration validation, and cleanup.

The previous review comments about Redis connection handling are still valid:

  1. Add Redis connection cleanup
  2. Add error handling for Redis connection
  3. Add Redis configuration validation

Additionally, consider this improved implementation:

    if (agent?.config?.schedule?.cluster?.enable) {
-      if (
-        agent.config.schedule.cluster.lockType === 'redis' &&
-        agent.config.schedule.cluster.redis
-      ) {
-        agent.lockManager = new RedisLock(agent);
-      }
+      const { lockType, redis: redisConfig } = agent.config.schedule.cluster;
+      
+      try {
+        agent.lockManager = createLockManager(lockType, agent);
+        
+        // Register cleanup
+        agent.beforeClose(async () => {
+          if (agent.lockManager) {
+            await agent.lockManager.destroy();
+            agent.lockManager = null;
+          }
+        });
+      } catch (err) {
+        agent.logger.error('[egg-schedule] Failed to initialize lock manager:', err);
+        throw err;
+      }
    }
README.md (1)

270-278: Add cluster-specific Redis configuration guidance

While security improvements were previously suggested, the Redis configuration also needs cluster-specific guidance:

  1. Document Redis connection pool settings for cluster deployments
  2. Add retry/reconnection strategies for network issues
  3. Include high availability configuration examples

Add the following configuration examples:

     redis: {
       client: {
+        // Connection pool settings for cluster deployments
+        maxRetriesPerRequest: 3,
+        enableReadyCheck: true,
+        // High availability configuration
+        sentinels: [
+          { host: 'sentinel-1', port: 26379 },
+          { host: 'sentinel-2', port: 26379 }
+        ],
+        name: 'mymaster',
       },
     },

const curConfig = this.agent?.config?.schedule;
let lockedKey = '';
if (curConfig?.cluster?.enable) {
  lockedKey = this.key.replace(this.agent.baseDir, '');

🛠️ Refactor suggestion

Use path manipulation for reliable lock key generation.

The current lock key generation using string replace is fragile and may not work correctly across platforms or when the base directory path appears multiple times in the key.

Consider using path manipulation:

-      lockedKey = this.key.replace(this.agent.baseDir, '');
+      const path = require('path');
+      lockedKey = path.relative(this.agent.baseDir, this.key);

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +17 to +19
if (canBeLocked) {
  this.sendAll();
}

⚠️ Potential issue

Add error handling and make execution asynchronous.

The execution block needs proper error handling and should handle potential async operations in sendAll().

Consider adding try-catch and making it async:

     if (canBeLocked) {
-      this.sendAll();
+      try {
+        await this.sendAll();
+      } catch (error) {
+        this.agent.logger.error('[egg-schedule] Failed to execute scheduled task:', error);
+        throw error; // Re-throw to ensure lock is released in the finally block
+      }
     }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +12 to +14
if (!(await this.agent.lockManager.tryAcquire(lockedKey))) {
  canBeLocked = false;
}

⚠️ Potential issue

Add error handling and timeout for lock acquisition.

The lock acquisition needs proper error handling and a timeout to prevent indefinite waiting.

Consider adding try-catch and timeout:

-      if (!(await this.agent.lockManager.tryAcquire(lockedKey))) {
-        canBeLocked = false;
-      }
+      try {
+        const timeout = 5000; // 5 seconds timeout
+        if (!(await this.agent.lockManager.tryAcquire(lockedKey, timeout))) {
+          canBeLocked = false;
+        }
+      } catch (error) {
+        this.agent.logger.error('[egg-schedule] Failed to acquire lock:', error);
+        canBeLocked = false;
+      }

Comment on lines +21 to +23
if (curConfig?.cluster?.enable) {
  await this.agent.lockManager.release(lockedKey);
}

⚠️ Potential issue

Ensure lock release with proper error handling.

The lock release should be in a finally block and include error handling to ensure proper cleanup.

Restructure the code:

-    if (curConfig?.cluster?.enable) {
-      await this.agent.lockManager.release(lockedKey);
-    }
+    try {
+      if (canBeLocked) {
+        await this.sendAll();
+      }
+    } finally {
+      if (curConfig?.cluster?.enable) {
+        try {
+          await this.agent.lockManager.release(lockedKey);
+        } catch (error) {
+          this.agent.logger.error('[egg-schedule] Failed to release lock:', error);
+        }
+      }
+    }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +17 to +19
if (canBeLocked) {
  this.sendOne();
}

⚠️ Potential issue

Add error handling and ensure proper async execution

The task execution block lacks error handling and might not properly handle async operations.

Apply this diff:

     if (canBeLocked) {
-      this.sendOne();
+      try {
+        await this.sendOne();
+      } catch (error) {
+        this.agent.logger.error('[egg-schedule] Task execution failed:', error);
+        throw error; // Re-throw to trigger lock release in the finally block
+      }
     }

Comment on lines +11 to +37
/**
 * Require a lock from lock manager
 *
 * @param {string} _lockedKey - The key to lock
 */
async acquire() {
  // Implementation here
}

/**
 * Release a lock from lock manager
 *
 * @param {string} _lockedKey - The key to unlock
 */
async release() {
  // Implementation here
}

/**
 * Try to acquire without waiting
 *
 * @param {string} _lockedKey - The key to try to lock
 */
async tryAcquire() {
  // Use _lockedKey in the implementation
  // Implementation here
}

🛠️ Refactor suggestion

Enhance abstract methods documentation and contract

The abstract methods need improvements:

  1. Methods should be truly abstract
  2. JSDoc is incomplete
  3. Parameters are defined but unused

Consider this improved implementation:

   /**
    * Require a lock from lock manager
    *
-   * @param {string} _lockedKey - The key to lock
+   * @param {string} lockedKey - The key to lock
+   * @returns {Promise<boolean>} True if lock was acquired, false otherwise
+   * @throws {Error} If lock acquisition fails
+   * @abstract
    */
-  async acquire() {
-    // Implementation here
-  }
+  async acquire(lockedKey) {
+    throw new Error('Method not implemented');
+  }

   /**
    * Release a lock from lock manager
    *
-   * @param {string} _lockedKey - The key to unlock
+   * @param {string} lockedKey - The key to unlock
+   * @returns {Promise<void>}
+   * @throws {Error} If lock release fails
+   * @abstract
    */
-  async release() {
-    // Implementation here
-  }
+  async release(lockedKey) {
+    throw new Error('Method not implemented');
+  }

   /**
    * Try to acquire without waiting
    *
-   * @param {string} _lockedKey - The key to try to lock
+   * @param {string} lockedKey - The key to try to lock
+   * @returns {Promise<boolean>} True if lock was acquired, false if already locked
+   * @throws {Error} If lock attempt fails
+   * @abstract
    */
-  async tryAcquire() {
-    // Use _lockedKey in the implementation
-    // Implementation here
-  }
+  async tryAcquire(lockedKey) {
+    throw new Error('Method not implemented');
+  }

Additionally, consider adding implementation guidance in the class JSDoc:

/**
 * Base class for lock management implementations.
 * 
 * Implementations must:
 * 1. Handle distributed locking mechanism
 * 2. Ensure atomic lock operations
 * 3. Handle lock expiration
 * 4. Implement proper error handling
 * 
 * @abstract
 */

Comment on lines +4 to +9
constructor(agent) {
this.agent = agent;
this.client = null;
this.options = null;
this.prefixKey = `${this.agent?.config?.schedule?.default === 'default' ? this.agent.name : this.agent?.config?.schedule?.default}-${this.agent?.config?.schedule?.prefix}`;
}

🛠️ Refactor suggestion

Improve constructor's type safety and validation

The constructor needs several improvements:

  1. The prefixKey construction is prone to null reference errors
  2. Missing parameter validation
  3. Lacks comprehensive type definitions

Consider this improved implementation:

+/**
+ * @typedef {Object} Agent
+ * @property {Object} config
+ * @property {Object} config.schedule
+ * @property {string} config.schedule.default
+ * @property {string} config.schedule.prefix
+ * @property {string} name
+ */

+/**
+ * Base class for lock management
+ */
 module.exports = class LockManager {
+  /**
+   * @param {Agent} agent - The agent instance
+   * @throws {Error} If agent or required configuration is missing
+   */
   constructor(agent) {
+    if (!agent) {
+      throw new Error('Agent is required');
+    }
     this.agent = agent;
     this.client = null;
     this.options = null;
-    this.prefixKey = `${this.agent?.config?.schedule?.default === 'default' ? this.agent.name : this.agent?.config?.schedule?.default}-${this.agent?.config?.schedule?.prefix}`;
+    const scheduleConfig = agent.config?.schedule;
+    if (!scheduleConfig?.prefix) {
+      throw new Error('Schedule prefix configuration is required');
+    }
+    const defaultValue = scheduleConfig.default || 'default';
+    const prefix = defaultValue === 'default' ? agent.name : defaultValue;
+    this.prefixKey = `${prefix}-${scheduleConfig.prefix}`;
   }
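
For illustration, with the README defaults shown further down (default: 'default', prefix: 'schedule') and a hypothetical agent named my-app, the constructor above would produce:

// hypothetical values, for illustration only
const LockManager = require('./base');
const agent = {
  name: 'my-app',
  config: { schedule: { default: 'default', prefix: 'schedule' } },
};
const manager = new LockManager(agent);
console.log(manager.prefixKey); // 'my-app-schedule'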

README.md Outdated
Comment on lines 279 to 281
//the prefix for lockedKey for redis lock
default: 'default', // default schedule name,like project-name.
prefix: 'schedule', // default schedule prefix

🛠️ Refactor suggestion

Document Redis key patterns and management

The default and prefix configuration properties need better documentation to help users understand:

  1. The resulting Redis key pattern
  2. Impact on monitoring and debugging
  3. Considerations for multi-application environments

Add detailed documentation:

-  //the prefix for lockedKey for redis lock
-  default: 'default', // default schedule name,like project-name.
-  prefix: 'schedule', // default schedule prefix
+  // Redis key generation configuration
+  // The final Redis key pattern will be: `${prefix}:${default}:${scheduleName}`
+  // Example: "schedule:my-app:clean_cache"
+  default: 'default', // Application identifier, recommended to use your project name
+  prefix: 'schedule', // Global prefix for all schedule locks
+
+  // Example resulting Redis keys:
+  // - schedule:my-app:clean_cache
+  // - schedule:my-app:sync_data

Comment on lines +37 to +45
try {
lockedKey = `${this.prefixKey}-${lockedKey}`;
await this.client.del(lockedKey);
} catch (err) {
this.logger.error(
`[egg-schedule] ${this.key} release lock error: ${err.message}`
);
}
}

⚠️ Potential issue

Ensure locks are released only by the owner

Currently, the release method deletes the lock key without checking whether the releasing instance still owns the lock. In a distributed setup the lock may already have expired and been re-acquired by another process, so an unconditional DEL can remove a lock that now belongs to someone else. Consider adding a check so that only the owner that acquired the lock can release it.

One approach is to store a unique identifier (e.g., a UUID) when setting the lock and check this identifier when releasing the lock.

Example:

// In tryAcquire
this.lockValue = `${this.key}-${Date.now()}`;
const result = await this.client.set(lockedKey, this.lockValue, 'PX', expiredTime, 'NX');

// In release
async release(lockedKey) {
  try {
    lockedKey = `${this.prefixKey}-${lockedKey}`;
    const lockValue = await this.client.get(lockedKey);
    if (lockValue === this.lockValue) {
      await this.client.del(lockedKey);
    }
  } catch (err) {
    this.logger.error(
      `[egg-schedule] ${this.key} release lock error: ${err.message}`
    );
  }
}

This ensures that a lock is only released by the process that acquired it.
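
Note that a plain GET followed by DEL still leaves a small window in which the lock can expire and be re-acquired between the two calls. The pattern recommended in the Redis documentation is to perform the check-and-delete in a single Lua script, which ioredis can run via eval. A sketch, assuming this.lockValue was stored at acquisition time as above:

// module-level constant: atomic "delete only if I still own it"
const RELEASE_SCRIPT = `
if redis.call("get", KEYS[1]) == ARGV[1] then
  return redis.call("del", KEYS[1])
else
  return 0
end`;

// inside the RedisLock class
async release(lockedKey) {
  lockedKey = `${this.prefixKey}-${lockedKey}`;
  try {
    // 1 key, then the owner token as the script argument
    await this.client.eval(RELEASE_SCRIPT, 1, lockedKey, this.lockValue);
  } catch (err) {
    this.logger.error(
      `[egg-schedule] ${this.key} release lock error: ${err.message}`
    );
  }
}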

@oneWalker
Author

This needs to be abstracted a bit to work out the implementation. Right now it depends hard on Redis, but not every company runs on Redis, and there are other distributed-lock implementations.
@fengmk2 I've put together an implementation, please take a look 👀. Hooks are reserved for the other lock types.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
README.md (2)

270-278: Add Redis connection pool and cluster configuration options

The Redis configuration should include additional options for production deployments:

  1. Connection pool settings for better performance
  2. Redis cluster/sentinel configuration

Enhance the Redis configuration:

     redis: {
       client: {
         port: 6379,
         host: '127.0.0.1',
         password: 'auth',
         db: 0,
+        // Connection Pool
+        maxRetriesPerRequest: 3,
+        enableReadyCheck: true,
+        connectionName: 'egg-schedule',
+        // Cluster/Sentinel Support
+        sentinels: process.env.REDIS_SENTINELS ? JSON.parse(process.env.REDIS_SENTINELS) : null,
+        sentinelPassword: process.env.REDIS_SENTINEL_PASSWORD,
+        name: process.env.REDIS_SENTINEL_NAME,
       },
     },

279-281: Document key collision prevention strategies

The key pattern configuration should address potential naming conflicts when multiple applications share the same Redis instance.

Add key collision prevention guidance:

-  // The final Redis key pattern will be: `${prefix}-${default}-${scheduleName}`
+  // The final Redis key pattern will be: `${prefix}:${default}:${scheduleName}`
+  // Key Collision Prevention:
+  // 1. Use unique application names in the 'default' field
+  // 2. Consider including environment (dev/staging/prod) in the prefix
+  // 3. Example: "schedule:myapp-prod:clean_cache"
   default: 'default', // default schedule name,like project-name.
   prefix: 'schedule', // default schedule prefix
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2f46c2c and 176ec2c.

📒 Files selected for processing (3)
  • README.md (1 hunks)
  • lib/strategy/all.js (1 hunks)
  • lib/strategy/worker.js (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/strategy/all.js
🧰 Additional context used
📓 Learnings (1)
lib/strategy/worker.js (1)
Learnt from: oneWalker
PR: eggjs/egg-schedule#61
File: lib/strategy/worker.js:27-33
Timestamp: 2024-11-12T19:21:57.482Z
Learning: In the `WorkerStrategy` class in `lib/strategy/worker.js`, ensure that Redis locks are used only when cluster mode is enabled.
🔇 Additional comments (4)
lib/strategy/worker.js (4)

7-10: LGTM! Good use of optional chaining and variable initialization

The initialization of the handler method is well-structured with:

  • Proper async/await usage
  • Safe configuration access using optional chaining
  • Clear variable scoping

18-20: ⚠️ Potential issue

Add proper async/await and error handling for task execution

The task execution needs to:

  1. Properly await the async operation
  2. Include error handling

Apply this diff:

 if (canBeLocked) {
-  this.sendOne();
+  try {
+    await this.sendOne();
+  } catch (error) {
+    this.agent.logger.error('[egg-schedule] Task execution failed:', error);
+    throw error;  // Ensure the error propagates for proper lock cleanup
+  }
 }

Likely invalid or redundant comment.


4-4: Verify the impact of class rename from WorkerStrategy to AllStrategy

The class has been renamed from WorkerStrategy to AllStrategy. This change might affect existing code that imports and uses this class.

Also applies to: 6-6


22-24: ⚠️ Potential issue

Ensure reliable lock cleanup with proper error handling

The lock release logic should:

  1. Be in a try-finally block to guarantee cleanup
  2. Include error handling
  3. Only attempt release if lock was acquired

Restructure the entire handler method with this diff:

 async handler() {
   let canBeLocked = true;
   const curConfig = this.agent?.config?.schedule;
   let lockedKey = '';
+  let lockAcquired = false;

   if (curConfig?.cluster?.enable) {
     try {
       lockedKey = path.relative(this.agent.baseDir, this.key)
         .replace(/[^a-zA-Z0-9]/g, '_');
       
       if (await this.agent.lockManager.tryAcquire(lockedKey, { ttl: 60000 })) {
+        lockAcquired = true;
+      } else {
         canBeLocked = false;
       }
     } catch (error) {
       this.agent.logger.error('[egg-schedule] Failed to acquire lock:', error);
       canBeLocked = false;
     }
   }

   try {
     if (canBeLocked) {
       await this.sendOne();
     }
   } finally {
     if (curConfig?.cluster?.enable && lockAcquired) {
       try {
         await this.agent.lockManager.release(lockedKey);
       } catch (error) {
         this.agent.logger.error('[egg-schedule] Failed to release lock:', error);
       }
     }
   }
 }

Likely invalid or redundant comment.

Comment on lines +11 to +16
if (curConfig?.cluster?.enable) {
lockedKey = path.relative(this.agent.baseDir, this.key);
if (!(await this.agent.lockManager.tryAcquire(lockedKey))) {
canBeLocked = false;
}
}

🛠️ Refactor suggestion

Enhance lock acquisition robustness and error handling

The lock acquisition logic needs improvements in several areas:

  1. Lock key construction should handle edge cases
  2. Lock acquisition should include error handling
  3. Consider adding a TTL to prevent stale locks

Apply this diff:

 if (curConfig?.cluster?.enable) {
-  lockedKey = path.relative(this.agent.baseDir, this.key);
-  if (!(await this.agent.lockManager.tryAcquire(lockedKey))) {
-    canBeLocked = false;
+  try {
+    // Ensure safe lock key construction
+    lockedKey = path.relative(this.agent.baseDir, this.key)
+      .replace(/[^a-zA-Z0-9]/g, '_');  // sanitize key
+    
+    // Add TTL to prevent stale locks
+    if (!(await this.agent.lockManager.tryAcquire(lockedKey, { ttl: 60000 }))) {
+      canBeLocked = false;
+    }
+  } catch (error) {
+    this.agent.logger.error('[egg-schedule] Failed to acquire lock:', error);
+    canBeLocked = false;
   }
 }
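
For context on the ttl option used above: the lock manager has to honor it when setting the key, which with ioredis is a single SET ... PX ... NX call. A sketch of a TTL-aware tryAcquire (the exact signature in this PR may differ):

// inside RedisLock — illustrative, not the PR's exact implementation
async tryAcquire(lockedKey, { ttl = 6000 } = {}) {
  const key = `${this.prefixKey}-${lockedKey}`;
  // unique owner token so release can verify ownership (see the earlier comment)
  this.lockValue = `${this.key}-${process.pid}-${Date.now()}`;
  // SET key value PX ttl NX — succeeds only if the key does not already exist
  const result = await this.client.set(key, this.lockValue, 'PX', ttl, 'NX');
  return result === 'OK';
}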

Comment on lines +264 to +269
// add it when making sure it only running in one cluster.
cluster: {
enable: false, // default is false, true is for running
lockType: 'redis', //use redis as lock. reserve for other lock type
lockedTtl: 6000, // the automatic expired time for lock, avoid lock leak
// add redis for redis lock

🛠️ Refactor suggestion

Document failure scenarios and high availability considerations

The cluster configuration documentation should include:

  1. How the system behaves during Redis failures
  2. Recovery mechanisms for lock acquisition failures
  3. Recommendations for high availability setups

Add the following section:

   cluster: {
     enable: false, // default is false, true is for running
     lockType: 'redis', //use redis as lock. reserve for other lock type
     lockedTtl: 6000, // the automatic expired time for lock, avoid lock leak
+    // Failure Handling:
+    // - If Redis is temporarily unavailable, tasks will execute on all nodes
+    // - If a node crashes while holding a lock, the lock expires after lockedTtl
+    // High Availability:
+    // - Recommended to use Redis sentinel or cluster for HA
+    // - Consider setting lockedTtl based on your task's typical execution time
   },
