perf(query) Option to disable Lucene caching #1709
@@ -133,7 +133,8 @@ class PartKeyLuceneIndex(ref: DatasetRef,
                         retentionMillis: Long, // only used to calculate fallback startTime
                         diskLocation: Option[File] = None,
                         val lifecycleManager: Option[IndexMetadataStore] = None,
-                        useMemoryMappedImpl: Boolean = true
+                        useMemoryMappedImpl: Boolean = true,
+                        disableIndexCaching: Boolean = false
                         ) extends StrictLogging {
 
   import PartKeyLuceneIndex._
@@ -240,7 +241,26 @@ class PartKeyLuceneIndex(ref: DatasetRef,
   private val utf8ToStrCache = concurrentCache[UTF8Str, String](PartKeyLuceneIndex.MAX_STR_INTERN_ENTRIES)
 
   //scalastyle:off
-  private val searcherManager = new SearcherManager(indexWriter, null)
+  private val searcherManager =
+    if (disableIndexCaching) {
+      new SearcherManager(indexWriter,
+        new SearcherFactory() {
+          override def newSearcher(reader: IndexReader, previousReader: IndexReader): IndexSearcher = {
+            val indexSearcher = super.newSearcher(reader, previousReader)
+            indexSearcher.setQueryCache(null)
+            indexSearcher.setQueryCachingPolicy(new QueryCachingPolicy() {
+              override def onUse(query: Query): Unit = {
+
+              }
+
+              override def shouldCache(query: Query): Boolean = false
+            })
+            indexSearcher
+          }
+        })
+    } else {
+      new SearcherManager(indexWriter, null)
+    }
   //scalastyle:on
 
   //start this thread to flush the segments and refresh the searcher every specific time period

Review comment on the `onUse` line:
quick question: what does onUse do?
Reply: Taken from Lucene documentation
@@ -284,6 +284,8 @@ class TimeSeriesShard(val ref: DatasetRef,
     filodbConfig.getBoolean("memstore.index-faceting-enabled-shard-key-labels")
   private val indexFacetingEnabledAllLabels = filodbConfig.getBoolean("memstore.index-faceting-enabled-for-all-labels")
   private val numParallelFlushes = filodbConfig.getInt("memstore.flush-task-parallelism")
+  private val disableIndexCaching = filodbConfig.getBoolean("memstore.disable-index-caching")
+
 
   /////// END CONFIGURATION FIELDS ///////////////////
 

Review comment on the `disableIndexCaching` line:
do we also need this in downsample time series shard?
Reply: Not sure it's needed yet, but not ruling out the possibility.

@@ -311,7 +313,7 @@ class TimeSeriesShard(val ref: DatasetRef,
    */
   private[memstore] final val partKeyIndex = new PartKeyLuceneIndex(ref, schemas.part,
     indexFacetingEnabledAllLabels, indexFacetingEnabledShardKeyLabels, shardNum,
-    storeConfig.diskTTLSeconds * 1000)
+    storeConfig.diskTTLSeconds * 1000, disableIndexCaching = disableIndexCaching)
 
   private val cardTracker: CardinalityTracker = initCardTracker()

Do we still need to set the cache to null if shouldCache returns false?
Describe why it is detrimental too
Good question @sandeep6189. Initially I set a caching policy that disables caching of anything, but later found that setting the cache to null also disables caching (at least according to the unit tests) and also eliminates the check for whether to cache or not. However, I left both changes in.
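To make the "null cache alone" variant concrete, here is a minimal sketch, not the code in this PR. It assumes that Lucene's `IndexSearcher` only consults the caching policy when a query cache is installed, which matches the unit-test observation above. The object name `NoQueryCacheSearcherFactory` is hypothetical.

```scala
import org.apache.lucene.index.IndexReader
import org.apache.lucene.search.{IndexSearcher, SearcherFactory}

// Hypothetical helper, not part of the PR: a SearcherFactory whose searchers have no query cache.
object NoQueryCacheSearcherFactory extends SearcherFactory {
  override def newSearcher(reader: IndexReader, previousReader: IndexReader): IndexSearcher = {
    val searcher = super.newSearcher(reader, previousReader)
    // Assumption: with no cache installed, the caching policy is never asked whether to cache.
    searcher.setQueryCache(null)
    searcher
  }
}
```

It would be passed in place of the anonymous factory, for example `new SearcherManager(indexWriter, NoQueryCacheSearcherFactory)`. Keeping the explicit `shouldCache = false` policy as well, as the PR does, is redundant under that assumption but harmless.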
Hello @whizkido, good to see your comments :)
When we profile a FiloDB instance that has a large index and sees a lot of repeating queries, we see the following flame graph (taken with async-profiler).
We see that almost 80% of stack traces are related to index searches, and about 50% of them are related to caching. Also, this archive mentions disabling caching, which may or may not work in our case (they also disable the cache in a different way than this PR does). The idea is to have that knob to let us disable caching and then profile again. It might be worse, but we can measure and check.
Let me add the above details to the PR description too.
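For illustration, the knob could be flipped and read roughly as follows. This is a sketch assuming the Typesafe Config library that `filodbConfig.getBoolean` appears to come from. Only the key `memstore.disable-index-caching` is taken from the diff; the inline config string and the object name are hypothetical.

```scala
import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical example object, not part of the PR.
object DisableIndexCachingExample {
  // Hypothetical inline override; in a deployment this would live in a config file.
  val filodbConfig: Config = ConfigFactory.parseString(
    """
      |memstore {
      |  disable-index-caching = true
      |}
      |""".stripMargin)

  // Same lookup as the line this PR adds to TimeSeriesShard.
  val disableIndexCaching: Boolean =
    filodbConfig.getBoolean("memstore.disable-index-caching")
}
```

With the flag set this way, a profiling run could be compared against a run with caching left on, which is the measurement the comment above proposes.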