Merge integration to Main #1676

Merged
merged 47 commits into from
Sep 27, 2023

Conversation

yu-shipit
Contributor

Pull Request checklist

  • The commit(s) message(s) follows the contribution guidelines ?
  • Tests for the changes have been added (for bug fixes / features) ?
  • Docs have been added / updated (for bug fixes / features) ?

Current behavior : (link existing issues here : https://help.github.com/articles/basic-writing-and-formatting-syntax/#referencing-issues-and-pull-requests)

New behavior :

BREAKING CHANGES

If this PR contains a breaking change, please describe the impact and migration
path for existing applications.
If not please remove this section.

Breaking changes may include:

  • Any schema changes to any Cassandra tables
  • The serialized format for Dataset and Column (see .toString methods)
  • Over the wire formats for Akka messages / case classes
  • Changes to the HTTP public API
  • Changes to query parsing / PromQL parsing

Other information:

alextheimer and others added 30 commits July 11, 2023 15:38
…db#1610)

Today, scaling FiloDB horizontally requires manually recalculating memory settings, which depends on tribal knowledge.
This PR aims to automate that and reduce it to simple math. It is backward compatible and behind a feature flag.
* Min-num-nodes moved from dataset into server config.
* New server config will drive automatic memory calculation
* Each dataset requires another configuration that determines what fraction of resources each dataset gets.
* filodb(core) add debugging info for empty histogram.
Some queries occasionally hit exceptions because of empty histogram.
However, the same exception could not be reproduced later.
The hunch is that the bug is caused by a race condition.
So this adds debug logging that prints the chunk id, the chunk info, and a memory dump.
---------

Co-authored-by: Yu Zhang <[email protected]>
Co-authored-by: alextheimer <[email protected]>
Co-authored-by: sandeep6189 <[email protected]>
…ata scans (filodb#1628)

Calling isNaN on a Scala Double involves conversion to a boxed java.lang.Double.
Local heap profiling showed that this is a significant source of allocation.
Switching to the static java.lang.Double.isNaN removes these allocations.
skip busting the problematic entry.
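
As a rough illustration of the isNaN change described above, the sketch below contrasts the two call styles in a scan loop. The object and method names are hypothetical and not taken from the FiloDB code base; whether the instance-style call actually allocates depends on the Scala version and on JIT optimizations, so the allocation claim is the commit message's, not something this snippet proves.

```scala
object NaNCheck {
  // Instance-style call. Per the commit message, on a Scala Double this can
  // go through an implicit conversion to a boxed java.lang.Double.
  def countValidBoxed(values: Array[Double]): Int = {
    var count = 0
    var i = 0
    while (i < values.length) {
      if (!values(i).isNaN) count += 1
      i += 1
    }
    count
  }

  // Static call that takes the primitive double directly, so no wrapper
  // object is needed for the check.
  def countValidStatic(values: Array[Double]): Int = {
    var count = 0
    var i = 0
    while (i < values.length) {
      if (!java.lang.Double.isNaN(values(i))) count += 1
      i += 1
    }
    count
  }
}
```

Both methods return the same count; only the way the NaN check is invoked differs.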

Co-authored-by: Yu Zhang <[email protected]>
Some queries occasionally hit exceptions because of empty histogram.
However, the same exception could not be reproduced later.
The hunch is that the bug is caused by a race condition.
So this adds debug logging that prints the chunk id, the chunk info, and a memory dump.

Co-authored-by: Yu Zhang <[email protected]>
There are two configs for num-nodes used by clustering-v2 and automatic memory alloc code.
Consolidating them.
Default alloc configs should sum to 100.
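
The rule that the allocation configs should sum to 100 could be checked along these lines. This is a minimal sketch under assumptions: the `MemoryAlloc` object, the `allocate` signature, and the idea of mapping dataset names to integer percentages are all hypothetical, and the real configuration keys and calculation in FiloDB may differ.

```scala
object MemoryAlloc {
  // Hypothetical helper: split a total byte budget across datasets according
  // to integer percentages, rejecting configs that do not sum to 100.
  def allocate(totalBytes: Long,
               percents: Map[String, Int]): Either[String, Map[String, Long]] = {
    val sum = percents.values.sum
    if (sum != 100) Left(s"allocation percents must sum to 100, got $sum")
    else Right(percents.map { case (dataset, pct) => dataset -> totalBytes * pct / 100 })
  }
}
```

Returning an `Either` keeps a bad configuration from silently producing an over- or under-committed budget.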
filodb#1629)

* fix(core) fix the unless operator for aggregators.

For regex shard key we need to aggregate across all nodes.
InProcessPlanDispatcher is needed.
---------

Co-authored-by: Yu Zhang <[email protected]>
feat(query): Cardinality V2 API Query Plan changes
convert unary expressions through binary expressions.

Co-authored-by: Yu Zhang <[email protected]>
)

There was a bug in calculating the size of an SRV.
Earlier, for efficiency, we calculated the size of the containers associated with the SRV.
But a container can hold multiple SRVs, so the size calculated for several SRVs at a time can end up wrong because the shared container sizes are added cumulatively.

The fix for now is to calculate the size by going through the records. This introduces a small inefficiency, but I am submitting this PR as is since the other ways to calculate the size were more invasive and risked regressions. We can optimize this later if it is really needed. I have also reduced the number of calls to this method from two to one.

The unit tests didn't catch this earlier since they exercised only one SRV.
I have now added a unit test that adds multiple SRVs. It failed with the earlier code.
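
To make the overcounting concrete, here is a small sketch of the two ways of sizing an SRV. The `Record` and `Container` types and both method names are hypothetical stand-ins, not FiloDB classes; the point is only that summing container sizes counts bytes that belong to other SRVs sharing the same container.

```scala
object SrvSize {
  // Hypothetical model: a container stores records that may belong to
  // several SRVs, each identified here by an integer id.
  final case class Record(srvId: Int, numBytes: Int)
  final case class Container(records: Seq[Record]) {
    def numBytes: Int = records.map(_.numBytes).sum
  }

  // Earlier approach as described: add up whole containers touched by the SRV.
  def sizeByContainers(srvId: Int, containers: Seq[Container]): Int =
    containers.filter(_.records.exists(_.srvId == srvId)).map(_.numBytes).sum

  // Fix as described: walk the records and count only those of this SRV.
  def sizeByRecords(srvId: Int, containers: Seq[Container]): Int =
    containers.flatMap(_.records).filter(_.srvId == srvId).map(_.numBytes).sum
}
```

With one container shared by two SRVs, the container-based size of each SRV includes the other SRV's records, which is why a single-SRV unit test could not expose the bug.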
… enum and prefix regex filters (filodb#1641)

Production profiling shows that Lucene regex automata are a hotspot in both method and allocation profiles.

This PR optimizes two kinds of queries:
* A regex with enumerated values is converted to a TermInSetQuery
* A regex with a prefix is converted to a PrefixQuery
It also wraps Lucene queries in ConstantScoreQuery to prevent any scoring that may be happening.

Observed
2.2x performance improvement in JMH benchmark for specific enum regex query
1.5x performance improvement in JMH benchmark for specific prefix regex query
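
The rewrite decision described above can be sketched without Lucene as a classification step. Everything here is an assumption for illustration: the `RegexFilterClassifier` object, the `Rewrite` types, and the detection rules (literal alternatives separated by `|`, or a literal followed by `.*`) are hypothetical and may not match the rules the PR actually uses. In the real code the first two cases would map to TermInSetQuery and PrefixQuery.

```scala
object RegexFilterClassifier {
  sealed trait Rewrite
  final case class TermSet(values: List[String]) extends Rewrite   // enum regex, e.g. a|b|c
  final case class Prefix(prefix: String) extends Rewrite          // prefix regex, e.g. abc.*
  final case class GeneralRegex(pattern: String) extends Rewrite   // anything else

  // Characters that give a pattern regex meaning beyond a plain literal.
  private val regexMeta: Set[Char] = "\\.[]{}()*+?^$|".toSet

  private def isLiteral(s: String): Boolean =
    s.nonEmpty && !s.exists(regexMeta.contains)

  def classify(pattern: String): Rewrite = {
    val alternatives = pattern.split("\\|", -1).toList
    if (alternatives.size > 1 && alternatives.forall(isLiteral)) TermSet(alternatives)
    else if (pattern.endsWith(".*") && isLiteral(pattern.dropRight(2))) Prefix(pattern.dropRight(2))
    else GeneralRegex(pattern)
  }
}
```

Patterns that match neither shape fall through to the general regex path, so the optimization stays conservative.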
…ix (filodb#1645)

* fix(core) make the error message more friendly to users. (filodb#1593)

Co-authored-by: Yu Zhang <[email protected]>
(cherry picked from commit 5b05779)

* fix nullpointer happened in cardinality busting job. (filodb#1631)

skip busting the problematic entry.

Co-authored-by: Yu Zhang <[email protected]>
(cherry picked from commit 6ac0255)
…ilodb#1649)

* filodb(core) add debugging info for empty histogram.
Some queries occasionally hit exceptions because of empty histogram.
However, the same exception could not be reproduced later.
The hunch is that the bug is caused by a race condition.
So this adds debug logging that prints the chunk id, the chunk info, and a memory dump.
---------
Co-authored-by: Yu Zhang <[email protected]>
(cherry picked from commit 90303aa)
feat(query): Cardinality V2 API Query Plan changes
feat(query): Cardinality V2 API Query Plan changes (filodb#1637)
…c calls (filodb#1651)

* Adding UserDatasets for remote calls

* Updating UT
…c calls (filodb#1651)

* Adding UserDatasets for remote calls

* Updating UT
sandeep6189 and others added 17 commits August 18, 2023 09:53
fix(query): Cardinality multi partition queries
Adding the CPU time in nanoseconds consumed by Lucene index lookups to query stats,
especially since we frequently see these lookups as a hotspot in CPU profiles.
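
One way to capture per-thread CPU nanoseconds around a lookup is the JVM's `ThreadMXBean`, sketched below. This is an assumption about mechanism, not a description of the PR: the `CpuTimed` object, the `timed` helper, and the `indexLookupCpuNanos` counter are hypothetical names, and the actual change may measure and record the time differently.

```scala
import java.lang.management.ManagementFactory
import java.util.concurrent.atomic.AtomicLong

object CpuTimed {
  private val threadMx = ManagementFactory.getThreadMXBean()

  // Hypothetical stat: total CPU nanos spent inside timed blocks.
  val indexLookupCpuNanos = new AtomicLong(0L)

  // Runs body and adds the calling thread's CPU time spent in it to the counter.
  // If the JVM does not support thread CPU timing, body runs unmeasured.
  def timed[A](body: => A): A = {
    if (!threadMx.isCurrentThreadCpuTimeSupported()) body
    else {
      val start = threadMx.getCurrentThreadCpuTime()
      try body
      finally {
        val end = threadMx.getCurrentThreadCpuTime()
        // getCurrentThreadCpuTime returns -1 when measurement is disabled.
        if (start >= 0 && end >= start) indexLookupCpuNanos.addAndGet(end - start)
      }
    }
  }
}
```

Measuring thread CPU time rather than wall-clock time keeps waits on locks or I/O out of the number, which matters when the goal is to explain a CPU profile hotspot.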
…s for cardinality calculation time (filodb#1666)

* Adding config support for DS Card flushCount and perf logs for cardinality calculation time
filodb#1629) (filodb#1668)

* fix(core) fix the unless operator for aggregators.

For regex shard key we need to aggregate across all nodes.
InProcessPlanDispatcher is needed.
---------

Co-authored-by: Yu Zhang <[email protected]>
(cherry picked from commit f5018ae)
Merge branch 'develop' into integration
@yu-shipit yu-shipit changed the title merge integration to main Merge integration to Main Sep 27, 2023
@yu-shipit yu-shipit merged commit b8f7dca into filodb:main Sep 27, 2023
1 check passed
7 participants