QueuedSink and ThreadPoolQueuedSink
metacret edited this page Sep 9, 2014
·
1 revision
QueuedSink is the abstract class for implementing a sink backed by a queue. Internally, the writeTo(MessageContainer) method puts the message in the queue, and a background thread polls messages in batches and calls the implementation of write(List<Message>). QueuedSink takes the following properties.
Property | Description | Type | Default |
---|---|---|---|
queue4Sink | Which type of queue should be used | MessageQueue4Sink | Memory based queue with capacity of 10,000 |
batchSize | Same as queue.buffering.max.messages of Kafka Producer config | int | 1000 |
batchTimeout | Same as queue.buffering.max.ms of Kafka Producer config | int | 1000 (ms) |
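
The batching behaviour controlled by batchSize and batchTimeout can be sketched as below. This is a minimal illustration, not Suro's actual code: the class QueuedSinkSketch and the helper pollBatch are hypothetical names, and a plain java.util.concurrent.BlockingQueue stands in for MessageQueue4Sink. A batch is handed off as soon as it reaches batchSize messages or batchTimeout milliseconds have passed, whichever comes first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueuedSinkSketch {

    // Hypothetical helper: drains up to batchSize messages, waiting at most
    // batchTimeoutMs in total. Returns whatever was collected (possibly empty).
    static <T> List<T> pollBatch(BlockingQueue<T> queue, int batchSize, long batchTimeoutMs)
            throws InterruptedException {
        List<T> batch = new ArrayList<>(batchSize);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(batchTimeoutMs);
        while (batch.size() < batchSize) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // batchTimeout elapsed
            }
            T message = queue.poll(remaining, TimeUnit.NANOSECONDS);
            if (message == null) {
                break; // queue stayed empty until the deadline
            }
            batch.add(message);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: an in-memory queue with the default capacity from the table.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);
        for (int i = 0; i < 2_500; i++) {
            queue.offer("message-" + i); // stands in for writeTo(MessageContainer)
        }
        List<String> batch;
        // Stands in for the background thread; stops once a poll comes back empty.
        while (!(batch = pollBatch(queue, 1_000, 1_000)).isEmpty()) {
            System.out.println("write() would receive " + batch.size() + " messages");
        }
    }
}
```

A partial batch is still delivered once the timeout elapses, so messages are not held back indefinitely when traffic is low.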
When the sink's IO implementation is thread-safe, as with the ElasticSearch TransportClient or the Kafka 0.8 Producer, the IO part can run on a thread pool. ThreadPoolQueuedSink extends QueuedSink for this purpose. LocalFileSink is not a ThreadPoolQueuedSink because it uses the Hadoop SequenceFile.Writer, which is not thread-safe.
Property | Description | Type | Default |
---|---|---|---|
jobQueueSize | Capacity of workQueue argument in ThreadPoolExecutor constructor | int | 1000 |
corePoolSize | corePoolSize argument in ThreadPoolExecutor constructor | int | 3 |
maxPoolSize | maximumPoolSize argument in ThreadPoolExecutor constructor | int | 10 |
jobTimeout | How long to wait, in milliseconds, when closing the sink before forcefully terminating running jobs | int | INFINITE |
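
To show how these four properties relate to java.util.concurrent.ThreadPoolExecutor, here is a hedged sketch. The class ThreadPoolSinkSketch and its methods newSenders and close are hypothetical, and the keep-alive time, the ArrayBlockingQueue work queue, and the CallerRunsPolicy rejection handler are assumptions made for the example; the page does not say what ThreadPoolQueuedSink uses for them.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPoolSinkSketch {

    // Maps jobQueueSize, corePoolSize and maxPoolSize onto the constructor arguments.
    static ThreadPoolExecutor newSenders(int jobQueueSize, int corePoolSize, int maxPoolSize) {
        return new ThreadPoolExecutor(
                corePoolSize,
                maxPoolSize,
                10, TimeUnit.SECONDS,                        // keep-alive: assumed value
                new ArrayBlockingQueue<Runnable>(jobQueueSize),
                new ThreadPoolExecutor.CallerRunsPolicy());  // rejection policy: assumed
    }

    // Waits up to jobTimeoutMs for queued jobs to finish, then forces termination.
    // Returns true if all jobs completed within the timeout.
    static boolean close(ThreadPoolExecutor senders, long jobTimeoutMs)
            throws InterruptedException {
        senders.shutdown(); // stop accepting new jobs, let queued ones run
        boolean finished = senders.awaitTermination(jobTimeoutMs, TimeUnit.MILLISECONDS);
        if (!finished) {
            senders.shutdownNow(); // interrupt jobs that are still running
        }
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor senders = newSenders(1_000, 3, 10); // defaults from the table
        AtomicInteger written = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            senders.execute(written::incrementAndGet); // stands in for a thread-safe write
        }
        // Long.MAX_VALUE approximates the INFINITE default of jobTimeout.
        boolean finished = close(senders, Long.MAX_VALUE);
        System.out.println("finished=" + finished + ", jobs run=" + written.get());
    }
}
```

Note that ThreadPoolExecutor only starts threads beyond corePoolSize once the work queue is full, so with a jobQueueSize of 1000 the pool stays at corePoolSize threads until 1000 jobs are waiting.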