High CPU Usage During Insertions in milvus-sdk-java #1061

LeePui · 2024-09-09T08:26:43Z

Hi everyone,

Thank you for providing such a convenient Java SDK; it has been very useful.

While using version 2.4.3 of milvus-sdk-java, I encountered some performance issues. Here are the metrics and analysis I have gathered.

When performing insertions in a single thread, I noticed unusually high CPU usage. After profiling with async-profiler, I pinpointed the most time-consuming operation at this line: AbstractMilvusGrpcClient.java#L1569.

public R<MutationResult> insert(@NonNull InsertParam requestParam) {
    ......
    // toString() is evaluated here, before logDebug() checks the log level
    logDebug(requestParam.toString());
    ......
}

protected void logDebug(String msg, Object... params) {
    if (logLevel.ordinal() <= LogLevel.Debug.ordinal()) {
        logger.debug(msg, params);
    }
}

The attached flame graph can attest to this issue.

The high CPU usage seems to be caused by premature calls to toString. In practice, when I set the log level to INFO, there is no need for the toString method to be called. I suggest checking the log level before calling toString.
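One way to sketch this suggestion is to pass the message as a Supplier so that it is only built once the level check has passed. The class below is a hypothetical stand-in, not SDK code: LazyLogDemo, BigRequest, logDebugEager, and logDebugLazy are names invented for illustration, and the level check simply mirrors the logDebug method quoted above.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the SDK's log-level handling; all names are illustrative.
class LazyLogDemo {
    enum LogLevel { Debug, Info, Warning, Error }

    static LogLevel logLevel = LogLevel.Info;
    static int toStringCalls = 0;

    // Stand-in for a request that carries a large payload.
    static class BigRequest {
        @Override
        public String toString() {
            toStringCalls++; // count how often the expensive path runs
            return "BigRequest(...)";
        }
    }

    // Eager variant: the caller has already built the string before the level check.
    static void logDebugEager(String msg) {
        if (logLevel.ordinal() <= LogLevel.Debug.ordinal()) {
            System.out.println(msg);
        }
    }

    // Lazy variant: the message is built only if the level check passes.
    static void logDebugLazy(Supplier<String> msg) {
        if (logLevel.ordinal() <= LogLevel.Debug.ordinal()) {
            System.out.println(msg.get());
        }
    }

    public static void main(String[] args) {
        BigRequest req = new BigRequest();
        logDebugEager(req.toString());       // toString() runs even though the level is Info
        logDebugLazy(() -> req.toString());  // toString() is skipped at Info
        System.out.println("toString calls: " + toStringCalls); // prints "toString calls: 1"
    }
}
```

With the level at Info, only the eager call pays for toString(). Guarding the call site with an explicit level check would have the same effect without the lambda.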

Thank you for considering this improvement.


yhmo · 2024-09-10T08:02:54Z

Thanks for pointing out this problem. I didn't realize it was a problem before.
InsertParam.toString() is generated by the Lombok annotation @ToString, which renders every vector as long text such as "[1.1234, 2.2234, ....]". It becomes a bottleneck when the inserted batch is large.

For requests that can carry large or complicated data, we should write the toString() method by hand. For InsertParam, we only want to print the target collection name and the number of vectors; there is no need to print all the vectors.
I will make this change, and it will take effect in the next minor version.
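As a rough illustration of that idea, here is a minimal sketch of a hand-written toString() that reports only the collection name and row count. InsertRequestSketch is a hypothetical, simplified class invented for this example; it is not the SDK's InsertParam, and the actual change in the SDK may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for an insert request; not the SDK's InsertParam.
class InsertRequestSketch {
    private final String collectionName;
    private final List<List<Float>> vectors;

    InsertRequestSketch(String collectionName, List<List<Float>> vectors) {
        this.collectionName = collectionName;
        this.vectors = new ArrayList<>(vectors);
    }

    // Summarize the payload instead of rendering every vector element,
    // so the cost does not grow with the batch size.
    @Override
    public String toString() {
        return "InsertRequestSketch{collectionName='" + collectionName
                + "', rowCount=" + vectors.size() + "}";
    }
}
```

For classes that keep the Lombok-generated method, annotating the large field with @ToString.Exclude is another way to leave it out of the output.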

xiaofan-luan · 2024-09-10T18:53:43Z

good catch!

yhmo · 2024-09-13T02:04:02Z

Fixed by this PR: #1064

yin-bp · 2024-11-01T15:46:30Z

A better pattern is:

if (logger.isDebugEnabled()) {
    String msg = segmentIDs.getDataCount() + " segments of " + collectionName + " has been flushed";
    logDebug(msg);
}
if (logger.isDebugEnabled()) {
    logDebug(requestParam.toString());
}

Call isDebugEnabled before building the debug log message; that is what gives good performance.
