Client and server interaction has a long latency. #2011

Daniel-blue · 2024-11-08T01:22:13Z

Describe your problem

During testing, we found that the interaction between the client and the server (vineyard instance) has a significant impact on latency, especially in scenarios involving multiple consecutive interactions. What is the purpose of breaking down the process between the client and the server into multiple interactions? Is there potential for improvement?

dashanji · 2024-11-16T09:13:52Z

Hi @Daniel-blue, thanks for the report.

How do you start vineyard? It would help to include the test code here so we can identify the source of the latency.

Daniel-blue · 2024-11-17T12:36:47Z

put: create_buffer_request --> seal_request --> create_data_request --> persist_request --> put_name_request
get: get_name_request --> get_data_request --> get_buffers_request
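
One rough way to read the two sequences above: if every request is a blocking round trip, the fixed overhead of an operation grows linearly with the number of requests it issues. The sketch below only illustrates that arithmetic. The uniform per-request cost and the 0.5 ms figure are made-up placeholders, not measurements from this issue.

```python
# Request sequences as listed in the comment above.
PUT_STEPS = [
    "create_buffer_request",
    "seal_request",
    "create_data_request",
    "persist_request",
    "put_name_request",
]
GET_STEPS = [
    "get_name_request",
    "get_data_request",
    "get_buffers_request",
]


def round_trip_overhead_ms(steps, per_request_ms):
    """Fixed overhead if each request is one blocking round trip.

    Assumes a uniform per-request cost, which is a simplification:
    metadata requests (persist, put_name) are likely slower than the rest.
    """
    return len(steps) * per_request_ms


if __name__ == "__main__":
    per_request_ms = 0.5  # hypothetical value, for illustration only
    print("put overhead (ms):", round_trip_overhead_ms(PUT_STEPS, per_request_ms))
    print("get overhead (ms):", round_trip_overhead_ms(GET_STEPS, per_request_ms))
```

Under this model, merging requests helps only as much as the per-request cost dominates the payload transfer, so it is worth measuring each step separately before drawing conclusions.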

[ms]

import datetime
import numpy as np
import vineyard

block_size_MB = 30

def calculate_time_difference(label, start_time, end_time):
    time_diff = (end_time - start_time).total_seconds() * 1000  # Convert to milliseconds
    print(f"{label}: {time_diff:.2f} ms")

def create_data(block_size_MB, num_blocks):
    bytes_per_mb = 1024**2
    num_elements = (block_size_MB * bytes_per_mb) // 8
    data = np.random.rand(num_elements * num_blocks)
    return data

t1 = datetime.datetime.now()
client = vineyard.connect("/var/run/vineyard.sock")
t2 = datetime.datetime.now()
calculate_time_difference("client connect", t1, t2)

# Generate the data before starting the timer so that only put() is measured.
data = create_data(block_size_MB, 32)
t3 = datetime.datetime.now()
object_id1 = client.put(data, persist=True, name="obj1")
t4 = datetime.datetime.now()
calculate_time_difference("put", t3, t4)
print(client.status)  # a bare `client.status` expression prints nothing in a script
client.clear()
client.close()

import datetime
import numpy as np
import vineyard

def calculate_time_difference(label, start_time, end_time):
    time_diff = (end_time - start_time).total_seconds() * 1000  # Convert to milliseconds
    print(f"{label}: {time_diff:.2f} ms")

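client = vineyard.connect("/var/run/vineyard.sock")
t9 = datetime.datetime.now()
# Look the object up by the name registered at put time ("obj1");
# the literal string "object_id1" is not a valid object ID.
object_ = client.get(name="obj1")
t10 = datetime.datetime.now()
calculate_time_difference("get", t9, t10)
print(client.status)
client.close()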
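
The two scripts above repeat the same start/stop bookkeeping around each call. A minimal sketch of a reusable timer follows, using only the standard library. The name `timed` and the optional `results` dictionary are illustrative choices, not part of vineyard.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Print the wall-clock time of the enclosed block in milliseconds.

    If `results` is a dict, the elapsed time is also stored under `label`.
    """
    start = time.perf_counter()  # monotonic clock, suited to short intervals
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        if results is not None:
            results[label] = elapsed_ms
        print(f"{label}: {elapsed_ms:.2f} ms")


if __name__ == "__main__":
    results = {}
    with timed("sleep", results):
        time.sleep(0.01)  # stand-in for client.put(...) or client.get(...)
```

Wrapping data generation and `client.put(...)` in separate `timed` blocks keeps the cost of `np.random.rand` out of the reported put latency.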

Daniel-blue · 2024-11-18T02:01:14Z

IPC test scenarios:
4K: 30 MB × 32 blocks
8K: 30 MB × 64 blocks

Pre-fill 0 data objects (get)

Pre-fill 10000 data objects (get)

import numpy as np
import hashlib
import vineyard
client = vineyard.connect('/var/run/vineyard.sock')

def create_block(size_MB):
    bytes_per_mb = 1024**2
    num_elements = (size_MB * bytes_per_mb) // 8  # float64 values are 8 bytes each

    block = np.random.rand(num_elements)
    return block

def hash_block(block):
    hasher = hashlib.sha256()
    hasher.update(block.tobytes())
    return hasher.hexdigest()

def process_blocks(num_blocks, block_size_MB):
    hash_list = []
    for _ in range(num_blocks):
        block = create_block(block_size_MB)
        block_hash = hash_block(block)
        client.put(block, name=block_hash, persist=True)
        hash_list.append(block_hash)
    return hash_list


block_size_MB = 30  
num_blocks_4k = 32  
num_blocks_8k = 64  

hash_list_4k = process_blocks(num_blocks_4k, block_size_MB)
print(f"4K scenario data hashes: {hash_list_4k}")

hash_list_8k = process_blocks(num_blocks_8k, block_size_MB)
print(f"8K scenario data hashes: {hash_list_8k}")

import time

def read_blocks(hash_list):
    start_time = time.time()
    for hash_value in hash_list:
        client.get(name=hash_value)
    end_time = time.time()
    elapsed_time = end_time - start_time
    return elapsed_time

time_4k = read_blocks(hash_list_4k)
print(f"Time to read 4K data: {time_4k} seconds")
time_8k = read_blocks(hash_list_8k)
print(f"Time to read 8K data: {time_8k} seconds")
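
The read benchmark above reports only the total time per scenario, which hides outliers. The sketch below records one latency per call and summarizes them. `time_each` and `summarize` are hypothetical helper names, and the callable passed in stands in for `client.get`.

```python
import statistics
import time


def time_each(fn, items):
    """Call fn(item) for every item and return the per-call latencies in ms."""
    latencies_ms = []
    for item in items:
        start = time.perf_counter()
        fn(item)
        latencies_ms.append((time.perf_counter() - start) * 1000)
    return latencies_ms


def summarize(latencies_ms):
    """Return count, mean, median and max of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Stand-in workload; with vineyard this could be
    # time_each(lambda h: client.get(name=h), hash_list_4k)
    print(summarize(time_each(lambda _: sum(range(1000)), range(10))))
```

Comparing the median against the maximum shows whether the latency is uniformly high or dominated by a few slow calls, for example the first get after a metadata sync.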

dashanji · 2024-11-18T02:02:18Z

Hi @Daniel-blue, how do you start vineyardd? Could you please provide some details about it?

dashanji · 2024-11-18T02:07:47Z

Have you started etcd?

Daniel-blue · 2024-11-18T02:29:31Z

We deployed the Vineyard server and client according to the guide at https://v6d.io/docs.html, and used kubectl exec to enter the client and run the tests with Python 3.0.
We are using redis, and it works fine. All other components are running normally.

dashanji · 2024-11-18T03:50:55Z

Hi @Daniel-blue, thanks for the details.
Basically, the latency comes from two parts: one is memory allocation in the vineyard server (put) or the vineyard client (get); the other is the metadata sync (persist/put name).

For the first part, you can reduce memory allocation in the vineyard server by adding --reserve_memory=True. As for the vineyard client, we currently have no way to pre-allocate memory for vineyard objects.

For the second part, persist and name are converted into calls to the metadata service, which causes high latency.
If your client and server are still running in one process, you can simply drop the persist and name options to reduce the latency.
If you can make sure the objects will be put in one vineyard instance, you can use a stream object to bypass the metadata sync, as in the following example.
If your client and server are distributed, it may be possible to reduce get latency by putting multiple get operations into a single batch, with one metadata sync per batch.

import vineyard
import numpy as np
import time
from threading import Thread

from vineyard.io.recordbatch import RecordBatchStream

chunk_size = 1000

def stream_producer(vineyard_client):
    data = np.random.rand(10, 10).astype(np.float32)
    
    stream = RecordBatchStream.new(vineyard_client)
    vineyard_client.persist(stream.id)
    vineyard_client.put_name(stream.id, "stream11")
    # Put all chunks first so that only the stream writes are timed below.
    chunk_list = []
    for _ in range(chunk_size):
        chunk_id = vineyard_client.put(data)
        chunk_list.append(chunk_id)
    start = time.time()
    writer = stream.open_writer(vineyard_client)
    # Append every stored chunk, not just the last chunk_id left over from the loop above.
    for chunk_id in chunk_list:
        writer.append(chunk_id)
    writer.finish()

    end = time.time()
    per_chunk = (end - start) / chunk_size
    print(f"Producer sent {chunk_size} chunks in {end - start:.5f} seconds, per chunk cost {per_chunk:.5f} seconds")

def stream_consumer(vineyard_client):
    start = time.time()
    
    stream_id = vineyard_client.get_name("stream11", wait=True)
    stream = vineyard_client.get(stream_id)
    reader = stream.open_reader(vineyard_client)
    
    count = 0
    while True:
        try:
            chunk_id = reader.next_chunk_id()
            # data = vineyard_client.get(chunk_id)
            count += 1
        except StopIteration:
            break
    
    end = time.time()
    per_chunk = (end - start) / chunk_size
    print(f"Consumer received {count} chunks in {end - start:.5f} seconds, per chunk cost {per_chunk:.5f} seconds")

if __name__ == "__main__":
    endpoint = "172.20.6.103:9600"
    rpc_client = vineyard.connect(endpoint=endpoint)

    producer_thread = Thread(target=stream_producer, args=(rpc_client,))
    producer_thread.start()
    producer_thread.join()

    consumer_thread = Thread(target=stream_consumer, args=(rpc_client,))
    consumer_thread.start()
    consumer_thread.join()

Daniel-blue · 2024-11-18T06:47:56Z

Would it be effective to merge the steps of the process and change the ordered map used for names to an unordered_map?
Do the Vineyard client and server support concurrent put and get operations?

Daniel-blue · 2024-11-18T07:35:58Z

> Hi @Daniel-blue. Thanks for the detail. Basically, the latency comes from two parts, one is memory alloc in vineyard server(put) or vineyard client(get), the other one is the metadata sync(persist/put name).
>
> In the first part, you can reduce the memory alloc in the vineyard server by adding the --reserve_memory=True. As for vineyard client, we don't have the part to pre-alloc memory for vineyard object at present.
>
> In the second part, the persist and name will be converted to call for the metadata service, which will cause high latency. If your client and server still running in a process, you can just delete the persist and option to reduce the latency. If you can make sure the objects will be put in one vineyard instance, you can use the stream object to bypass the metadata sync as the following example. If your client and server is distributed, it may be possible to optimize the latency of get by putting multiple get operations into a single batch, with one metadata sync per batch.

Our scenario best matches the third case, where the client and server are distributed. Does 'putting multiple get operations into a single batch' mean that the metadata does not include the data object? How can this be done?

dashanji · 2024-11-18T07:57:35Z

> Is it effective to merge the process and change the ordermap for name to an unordered_map?

It's hard to say whether that would reduce the latency much.

> Does the Vineyard client and server support concurrent put and get operations?

Yes, you could try it with multiple threads.
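
A minimal sketch of the multithreaded approach, using a standard-library thread pool. `concurrent_map` is a hypothetical helper, and the worker count of 8 is an arbitrary placeholder. Whether one client can be shared safely across threads, or each thread needs its own connection, is not stated in this thread, so verify that before relying on it.

```python
from concurrent.futures import ThreadPoolExecutor


def concurrent_map(fn, items, max_workers=8):
    """Run fn(item) for every item on a thread pool.

    Results are returned in the same order as `items`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))


if __name__ == "__main__":
    # Stand-in workload; with vineyard this could be
    # concurrent_map(lambda name: client.get(name=name), hash_list)
    print(concurrent_map(lambda x: x * 2, [1, 2, 3]))
```

Threads overlap the time spent waiting on each round trip, so the gain depends on how much of the latency is waiting rather than copying data.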

> Does 'putting multiple get operations into a single batch' mean that the metadata does not include the data object? How can this be done?

You can replace get_object with get_objects. Unfortunately, we haven't integrated it into get yet. It could be an enhancement for the future.

https://github.com/v6d-io/v6d/blob/main/python/vineyard/core/client.py#L600-L606
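
A sketch of what batching could look like on the caller's side. It assumes `client.get_objects` accepts a list of object IDs and returns the objects in the same order, which should be checked against the linked source. The helper names and the batch size of 16 are hypothetical, and `FakeClient` exists only so the example runs without a vineyard server. Since get_objects is not integrated into get, the returned values are probably raw vineyard objects rather than resolved Python values such as NumPy arrays.

```python
def chunked(seq, size):
    """Yield consecutive slices of `seq` with at most `size` elements."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]


def get_in_batches(client, object_ids, batch_size=16):
    """Fetch objects with one get_objects call per batch of IDs.

    Assumes client.get_objects(list_of_ids) returns a list in input order.
    """
    results = []
    for batch in chunked(list(object_ids), batch_size):
        results.extend(client.get_objects(batch))
    return results


class FakeClient:
    """Stand-in for a vineyard client; counts get_objects calls."""

    def __init__(self):
        self.calls = 0

    def get_objects(self, ids):
        self.calls += 1
        return [f"obj-{i}" for i in ids]


if __name__ == "__main__":
    client = FakeClient()
    objects = get_in_batches(client, [1, 2, 3, 4, 5], batch_size=2)
    print(objects, "fetched in", client.calls, "calls")
```

With names instead of IDs, each name would first have to be resolved (for example with get_name), which is itself a metadata request, so batching pays off most when the IDs are already known.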

github-actions · 2024-12-12T00:03:27Z

/cc @sighingnow, this issue/pr has had no activity for a long time, please help to review the status and assign people to work on it.

github-actions bot added the stale label Dec 12, 2024
