diff --git a/docs/configs/janusgraph-cfg.md b/docs/configs/janusgraph-cfg.md index 288e7bb1e1..1cb1097a9b 100644 --- a/docs/configs/janusgraph-cfg.md +++ b/docs/configs/janusgraph-cfg.md @@ -346,10 +346,7 @@ Configuration options for query processing | Name | Description | Datatype | Default Value | Mutability | | ---- | ---- | ---- | ---- | ---- | -| query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties. -This setting is applicable to direct vertex properties access (like `vertex.properties("foo")` but not to `vertex.properties("foo","bar")` because the latter case is not a singular property access). -This setting is not applicable to the next Gremlin steps: `valueMap`, `propertyMap`, `elementMap`, `properties`, `values` (configuration option `query.batch.properties-mode` should be used to configure their behavior). -When `true` this setting overwrites `query.batch.has-step-mode` to `all_properties` unless `none` mode is used. | Boolean | true | MASKABLE | +| query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties.
This setting is applicable to direct vertex properties access (like `vertex.properties("foo")` but not to `vertex.properties("foo","bar")` because the latter case is not a singular property access).
This setting is not applicable to the next Gremlin steps: `valueMap`, `propertyMap`, `elementMap`, `properties`, `values` (configuration option `query.batch.properties-mode` should be used to configure their behavior).
When `true` this setting overwrites `query.batch.has-step-mode` to `all_properties` unless `none` mode is used. | Boolean | true | MASKABLE | | query.force-index | Whether JanusGraph should throw an exception if a graph query cannot be answered using an index. Doing so limits the functionality of JanusGraph's graph queries but ensures that slow graph queries are avoided on large graphs. Recommended for production use of JanusGraph. | Boolean | false | MASKABLE | | query.hard-max-limit | If smart-limit is disabled and no limit is given in the query, query optimizer adds a limit in light of possibly large result sets. It works in the same way as smart-limit except that hard-max-limit is usually a large number. Default value is Integer.MAX_VALUE which effectively disables this behavior. This option does not take effect when smart-limit is enabled. | Integer | 2147483647 | MASKABLE | | query.ignore-unknown-index-key | Whether to ignore undefined types encountered in user-provided index queries | Boolean | false | MASKABLE | @@ -365,14 +362,7 @@ Configuration options to configure batch queries optimization behavior | Name | Description | Datatype | Default Value | Mutability | | ---- | ---- | ---- | ---- | ---- | | query.batch.enabled | Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend. If `false` then all other configuration options under `query.batch` namespace are ignored. | Boolean | true | MASKABLE | -| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when `query.batch.enabled` is `true`.
Supported modes:
- `all_properties` - Pre-fetch all vertex properties on any property access (fetches all vertex properties in a single slice query)
- `required_properties_only` - Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps (uses a separate slice query per each required property)
- `required_and_next_properties` - Prefetch the same properties as with `required_properties_only` mode, but also prefetch -properties which may be needed in the next properties access step like `values`, `properties,` `valueMap`, `elementMap`, or `propertyMap`. -In case the next step is not one of those properties access steps then this mode behaves same as `required_properties_only`. -In case the next step is one of the properties access steps with limited scope of properties, those properties will be -pre-fetched together in the same multi-query. -In case the next step is one of the properties access steps with unspecified scope of property keys then this mode -behaves same as `all_properties`.
- `required_and_next_properties_or_all` - Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not -`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.
- `none` - Skips `has` step batch properties pre-fetch optimization.
| String | required_and_next_properties | MASKABLE | +| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when `query.batch.enabled` is `true`.
Supported modes:
- `all_properties` - Pre-fetch all vertex properties on any property access (fetches all vertex properties in a single slice query)
- `required_properties_only` - Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps (uses a separate slice query per each required property)
- `required_and_next_properties` - Prefetch the same properties as with `required_properties_only` mode, but also prefetch
properties which may be needed in the next properties access step like `values`, `properties`, `valueMap`, `elementMap`, or `propertyMap`.
In case the next step is not one of those properties access steps then this mode behaves the same as `required_properties_only`.
In case the next step is one of the properties access steps with limited scope of properties, those properties will be
pre-fetched together in the same multi-query.
In case the next step is one of the properties access steps with unspecified scope of property keys then this mode
behaves the same as `all_properties`.
- `required_and_next_properties_or_all` - Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not
`values`, `properties`, `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.
- `none` - Skips `has` step batch properties pre-fetch optimization.
| String | required_and_next_properties | MASKABLE | | query.batch.label-step-mode | Labels pre-fetching mode for `label()` step. Used only when `query.batch.enabled` is `true`.
Supported modes:
- `all` - Pre-fetch labels for all vertices in a batch.
- `none` - Skips vertex labels pre-fetching optimization.
| String | all | MASKABLE | | query.batch.limited | Configure a maximum batch size for queries against the storage backend. This can be used to ensure responsiveness if batches tend to grow very large. The used batch size is equivalent to the barrier size of a preceding `barrier()` step. If a step has no preceding `barrier()`, the default barrier of TinkerPop will be inserted. This option only takes effect if `query.batch.enabled` is `true`. | Boolean | true | MASKABLE | | query.batch.limited-size | Default batch size (barrier() step size) for queries. This size is applied only for cases where `LazyBarrierStrategy` strategy didn't apply `barrier` step and where user didn't apply barrier step either. This option is used only when `query.batch.limited` is `true`. Notice, value `2147483647` is considered to be unlimited. | Integer | 2500 | MASKABLE | @@ -485,33 +475,12 @@ Configuration options for controlling CQL queries grouping | Name | Description | Datatype | Default Value | Mutability | | ---- | ---- | ---- | ---- | ---- | -| storage.cql.grouping.keys-allowed | If `true` this allows multiple partition keys to be grouped together into a single CQL query via `IN` operator based on the keys grouping strategy provided (usually grouping is done by same token-ranges or same replica sets, but may also involve shard ids for custom implementations). -Notice, that any CQL query grouped with more than 1 key will require to return a row key for any column fetched. -This option is useful when less amount of CQL queries is desired to be sent for read requests in expense of fetching more data (partition key per each fetched value). -Notice, different storage backends may have different way of executing multi-partition `IN` queries (including, but not limited to how the checksum queries are sent for different consistency levels, processing node CPU usage, disk access pattern, etc.). 
Thus, a proper benchmarking is needed to determine if keys grouping is useful or not per case by case scenario. -This option can be enabled only for storage backends which support `PER PARTITION LIMIT`. As such, this feature can't be used with Amazon Keyspaces because it doesn't support `PER PARTITION LIMIT`. -If this option is `false` then each partition key will be executed in a separate asynchronous CQL query even when multiple keys from the same token range are queried. -Notice, the default grouping strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend. ScyllaDB specific keys grouping strategy should be implemented after the resolution of the [ticket #232](https://github.com/scylladb/java-driver/issues/232). | Boolean | false | MASKABLE | -| storage.cql.grouping.keys-class | Full class path of the keys grouping execution strategy. The class should implement `org.janusgraph.diskstorage.cql.strategy.GroupedExecutionStrategy` interface and have a public constructor with two arguments `org.janusgraph.diskstorage.configuration.Configuration` and `org.janusgraph.diskstorage.cql.CQLStoreManager`. -Shortcuts available: -- `tokenRangeAware` - groups partition keys which belong to the same token range. Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend. -- `replicasAware` - groups partition keys which belong to the same replica sets (same nodes). Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend. - -Usually `tokenRangeAware` grouping strategy provides more smaller groups where each group contain keys which are stored close to each other on a disk and may cause less disk seeks in some cases. However `replicasAware` grouping strategy groups keys per replica set which usually means fewer bigger groups to be used (i.e. less CQL requests). 
-This option takes effect only when `storage.cql.grouping.keys-allowed` is `true`. | String | replicasAware | MASKABLE | -| storage.cql.grouping.keys-limit | Maximum amount of the keys which can be grouped together into a single CQL query. If more keys are queried, they are going to be grouped into separate CQL queries. -Notice, for ScyllaDB this option should not exceed the maximum number of distinct clustering key restrictions per query which can be changed by ScyllaDB configuration option `max-partition-key-restrictions-per-query` (https://enterprise.docs.scylladb.com/branch-2022.2/faq.html#how-can-i-change-the-maximum-number-of-in-restrictions). For AstraDB this limit is set to 20 and usually it's fixed. However, you can ask customer support for a possibility to change the default threshold to your desired configuration via `partition_keys_in_select_failure_threshold` and `in_select_cartesian_product_failure_threshold` threshold configurations (https://docs.datastax.com/en/astra-serverless/docs/plan/planning.html#_cassandra_yaml). -Ensure that your storage backend allows more IN selectors than the one set via this configuration. -This option takes effect only when `storage.cql.grouping.keys-allowed` is `true`. | Integer | 20 | MASKABLE | -| storage.cql.grouping.keys-min | Minimum amount of keys to consider for grouping. Grouping will be skipped for any multi-key query which has less than this amount of keys (i.e. a separate CQL query will be executed for each key in such case). -Usually this configuration should always be set to `2`. It is useful to increase the value only in cases when queries with more keys should not be grouped, but be performed separately to increase parallelism in expense of the network overhead. -This option takes effect only when `storage.cql.grouping.keys-allowed` is `true`. 
| Integer | 2 | MASKABLE | -| storage.cql.grouping.slice-allowed | If `true` this allows multiple Slice queries which are allowed to be performed as non-range queries (i.e. direct equality operation) to be grouped together into a single CQL query via `IN` operator. Notice, currently only operations to fetch properties with Cardinality.SINGLE are allowed to be performed as non-range queries (edges fetching or properties with Cardinality SET or LIST won't be grouped together). -If this option is `false` then each Slice query will be executed in a separate asynchronous CQL query even when grouping is allowed. | Boolean | true | MASKABLE | -| storage.cql.grouping.slice-limit | Maximum amount of grouped together slice queries into a single CQL query. -Notice, for ScyllaDB this option should not exceed the maximum number of distinct clustering key restrictions per query which can be changed by ScyllaDB configuration option `max-partition-key-restrictions-per-query` (https://enterprise.docs.scylladb.com/branch-2022.2/faq.html#how-can-i-change-the-maximum-number-of-in-restrictions). For AstraDB this limit is set to 20 and usually it's fixed. However, you can ask customer support for a possibility to change the default threshold to your desired configuration via `partition_keys_in_select_failure_threshold` and `in_select_cartesian_product_failure_threshold` threshold configurations (https://docs.datastax.com/en/astra-serverless/docs/plan/planning.html#_cassandra_yaml). -Ensure that your storage backend allows more IN selectors than the one set via this configuration. -This option is used only when `storage.cql.grouping.slice-allowed` is `true`. 
| Integer | 20 | MASKABLE | +| storage.cql.grouping.keys-allowed | If `true` this allows multiple partition keys to be grouped together into a single CQL query via `IN` operator based on the keys grouping strategy provided (usually grouping is done by same token-ranges or same replica sets, but may also involve shard ids for custom implementations).
Notice that any CQL query grouped with more than 1 key will need to return a row key for every column fetched.
This option is useful when fewer CQL queries are desired for read requests at the expense of fetching more data (partition key per each fetched value).
Notice, different storage backends may have different ways of executing multi-partition `IN` queries (including, but not limited to, how the checksum queries are sent for different consistency levels, processing node CPU usage, disk access pattern, etc.). Thus, proper benchmarking is needed to determine whether keys grouping is useful on a case-by-case basis.
This option can be enabled only for storage backends which support `PER PARTITION LIMIT`. As such, this feature can't be used with Amazon Keyspaces because it doesn't support `PER PARTITION LIMIT`.
If this option is `false` then each partition key will be executed in a separate asynchronous CQL query even when multiple keys from the same token range are queried.
Notice, the default grouping strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend. ScyllaDB specific keys grouping strategy should be implemented after the resolution of the [ticket #232](https://github.com/scylladb/java-driver/issues/232). | Boolean | false | MASKABLE | +| storage.cql.grouping.keys-class | Full class path of the keys grouping execution strategy. The class should implement `org.janusgraph.diskstorage.cql.strategy.GroupedExecutionStrategy` interface and have a public constructor with two arguments `org.janusgraph.diskstorage.configuration.Configuration` and `org.janusgraph.diskstorage.cql.CQLStoreManager`.
Shortcuts available:
- `tokenRangeAware` - groups partition keys which belong to the same token range. Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend.
- `replicasAware` - groups partition keys which belong to the same replica sets (same nodes). Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend.

Usually the `tokenRangeAware` grouping strategy produces more, smaller groups where each group contains keys which are stored close to each other on disk, which may cause fewer disk seeks in some cases. However, the `replicasAware` grouping strategy groups keys per replica set, which usually means fewer, bigger groups (i.e. fewer CQL requests).
This option takes effect only when `storage.cql.grouping.keys-allowed` is `true`. | String | replicasAware | MASKABLE | +| storage.cql.grouping.keys-limit | Maximum amount of the keys which can be grouped together into a single CQL query. If more keys are queried, they are going to be grouped into separate CQL queries.
Notice, for ScyllaDB this option should not exceed the maximum number of distinct clustering key restrictions per query which can be changed by ScyllaDB configuration option `max-partition-key-restrictions-per-query` (https://enterprise.docs.scylladb.com/branch-2022.2/faq.html#how-can-i-change-the-maximum-number-of-in-restrictions). For AstraDB this limit is set to 20 and usually it's fixed. However, you can ask customer support for a possibility to change the default threshold to your desired configuration via `partition_keys_in_select_failure_threshold` and `in_select_cartesian_product_failure_threshold` threshold configurations (https://docs.datastax.com/en/astra-serverless/docs/plan/planning.html#_cassandra_yaml).
Ensure that your storage backend allows more IN selectors than the one set via this configuration.
This option takes effect only when `storage.cql.grouping.keys-allowed` is `true`. | Integer | 20 | MASKABLE | +| storage.cql.grouping.keys-min | Minimum amount of keys to consider for grouping. Grouping will be skipped for any multi-key query which has less than this amount of keys (i.e. a separate CQL query will be executed for each key in such case).
Usually this configuration should always be set to `2`. It is useful to increase the value only in cases when queries with more keys should not be grouped, but be performed separately to increase parallelism in expense of the network overhead.
This option takes effect only when `storage.cql.grouping.keys-allowed` is `true`. | Integer | 2 | MASKABLE | +| storage.cql.grouping.slice-allowed | If `true` this allows multiple Slice queries which are allowed to be performed as non-range queries (i.e. direct equality operation) to be grouped together into a single CQL query via `IN` operator. Notice, currently only operations to fetch properties with Cardinality.SINGLE are allowed to be performed as non-range queries (edges fetching or properties with Cardinality SET or LIST won't be grouped together).
If this option is `false` then each Slice query will be executed in a separate asynchronous CQL query even when grouping is allowed. | Boolean | true | MASKABLE | +| storage.cql.grouping.slice-limit | Maximum amount of grouped together slice queries into a single CQL query.
Notice, for ScyllaDB this option should not exceed the maximum number of distinct clustering key restrictions per query which can be changed by ScyllaDB configuration option `max-partition-key-restrictions-per-query` (https://enterprise.docs.scylladb.com/branch-2022.2/faq.html#how-can-i-change-the-maximum-number-of-in-restrictions). For AstraDB this limit is set to 20 and usually it's fixed. However, you can ask customer support for a possibility to change the default threshold to your desired configuration via `partition_keys_in_select_failure_threshold` and `in_select_cartesian_product_failure_threshold` threshold configurations (https://docs.datastax.com/en/astra-serverless/docs/plan/planning.html#_cassandra_yaml).
Ensure that your storage backend allows more IN selectors than the one set via this configuration.
This option is used only when `storage.cql.grouping.slice-allowed` is `true`. | Integer | 20 | MASKABLE | ### storage.cql.internal Advanced configuration of internal DataStax driver. Notice, all available configurations will be composed in the order. Non specified configurations will be skipped. By default only base configuration is enabled (which has the smallest priority. It means that you can overwrite any configuration used in base programmatic configuration by using any other configuration type). The configurations are composed in the next order (sorted by priority in descending order): `file-configuration`, `resource-configuration`, `string-configuration`, `url-configuration`, `base-programmatic-configuration` (which is controlled by `base-programmatic-configuration-enabled` property). Configurations with higher priority always overwrite configurations with lower priority. I.e. if the same configuration parameter is used in both `file-configuration` and `string-configuration` the configuration parameter from `file-configuration` will be used and configuration parameter from `string-configuration` will be ignored. 
See available configuration options and configurations structure here: https://docs.datastax.com/en/developer/java-driver/4.13/manual/core/configuration/reference/ diff --git a/janusgraph-core/src/main/java/org/janusgraph/graphdb/configuration/GraphDatabaseConfiguration.java b/janusgraph-core/src/main/java/org/janusgraph/graphdb/configuration/GraphDatabaseConfiguration.java index 0d8c408a3e..e2dde93a50 100644 --- a/janusgraph-core/src/main/java/org/janusgraph/graphdb/configuration/GraphDatabaseConfiguration.java +++ b/janusgraph-core/src/main/java/org/janusgraph/graphdb/configuration/GraphDatabaseConfiguration.java @@ -262,12 +262,12 @@ public class GraphDatabaseConfiguration { public static final ConfigOption PROPERTY_PREFETCHING = new ConfigOption<>(QUERY_NS,"fast-property", "Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent " + "property access for the same vertex at the expense of retrieving all properties at once. This can be " + - "expensive for vertices with many properties. \n" + + "expensive for vertices with many properties.
" + "This setting is applicable to direct vertex properties access " + "(like `vertex.properties(\"foo\")` but not to `vertex.properties(\"foo\",\"bar\")` because the latter case " + - "is not a singular property access). \n" + + "is not a singular property access).
" + "This setting is not applicable to the next Gremlin steps: `valueMap`, `propertyMap`, `elementMap`, `properties`, `values` " + - "(configuration option `query.batch.properties-mode` should be used to configure their behavior).\n" + + "(configuration option `query.batch.properties-mode` should be used to configure their behavior).
" + "When `true` this setting overwrites `query.batch.has-step-mode` to `"+MultiQueryHasStepStrategyMode.ALL_PROPERTIES.getConfigName()+"` unless `"+ MultiQueryHasStepStrategyMode.NONE.getConfigName()+"` mode is used.", ConfigOption.Type.MASKABLE, true); @@ -342,14 +342,14 @@ public class GraphDatabaseConfiguration { "Supported modes:
" + "- `%s` - Pre-fetch all vertex properties on any property access (fetches all vertex properties in a single slice query)
" + "- `%s` - Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps (uses a separate slice query per each required property)
" + - "- `%s` - Prefetch the same properties as with `%s` mode, but also prefetch\n" + - "properties which may be needed in the next properties access step like `values`, `properties,` `valueMap`, `elementMap`, or `propertyMap`.\n" + - "In case the next step is not one of those properties access steps then this mode behaves same as `%s`.\n" + - "In case the next step is one of the properties access steps with limited scope of properties, those properties will be\n" + - "pre-fetched together in the same multi-query.\n" + - "In case the next step is one of the properties access steps with unspecified scope of property keys then this mode\n" + + "- `%s` - Prefetch the same properties as with `%s` mode, but also prefetch
" + + "properties which may be needed in the next properties access step like `values`, `properties,` `valueMap`, `elementMap`, or `propertyMap`.
" + + "In case the next step is not one of those properties access steps then this mode behaves same as `%s`.
" + + "In case the next step is one of the properties access steps with limited scope of properties, those properties will be
" + + "pre-fetched together in the same multi-query.
" + + "In case the next step is one of the properties access steps with unspecified scope of property keys then this mode
" + "behaves same as `%s`.
"+ - "- `%s` - Prefetch the same properties as with `%s`, but in case the next step is not\n" + + "- `%s` - Prefetch the same properties as with `%s`, but in case the next step is not
" + "`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `%s`.
"+ "- `%s` - Skips `has` step batch properties pre-fetch optimization.
", MultiQueryHasStepStrategyMode.ALL_PROPERTIES.getConfigName(), diff --git a/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/CQLConfigOptions.java b/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/CQLConfigOptions.java index 1a08f1725a..603f55d783 100644 --- a/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/CQLConfigOptions.java +++ b/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/CQLConfigOptions.java @@ -764,7 +764,7 @@ public interface CQLConfigOptions { "(https://enterprise.docs.scylladb.com/branch-2022.2/faq.html#how-can-i-change-the-maximum-number-of-in-restrictions). " + "For AstraDB this limit is set to 20 and usually it's fixed. However, you can ask customer support for a possibility to change " + "the default threshold to your desired configuration via `partition_keys_in_select_failure_threshold` " + - "and `in_select_cartesian_product_failure_threshold` threshold configurations (https://docs.datastax.com/en/astra-serverless/docs/plan/planning.html#_cassandra_yaml).\n" + + "and `in_select_cartesian_product_failure_threshold` threshold configurations (https://docs.datastax.com/en/astra-serverless/docs/plan/planning.html#_cassandra_yaml).
" + "Ensure that your storage backend allows more IN selectors than the one set via this configuration."; ConfigOption SLICE_GROUPING_ALLOWED = new ConfigOption<>( @@ -773,7 +773,7 @@ public interface CQLConfigOptions { "If `true` this allows multiple Slice queries which are allowed to be performed as non-range " + "queries (i.e. direct equality operation) to be grouped together into a single CQL query via `IN` operator. " + "Notice, currently only operations to fetch properties with Cardinality.SINGLE are allowed to be performed as non-range queries " + - "(edges fetching or properties with Cardinality SET or LIST won't be grouped together).\n" + + "(edges fetching or properties with Cardinality SET or LIST won't be grouped together).
" + "If this option is `false` then each Slice query will be executed in a separate asynchronous CQL query even when " + "grouping is allowed.", ConfigOption.Type.MASKABLE, @@ -782,8 +782,8 @@ public interface CQLConfigOptions { ConfigOption SLICE_GROUPING_LIMIT = new ConfigOption<>( CQL_GROUPING_NS, "slice-limit", - "Maximum amount of grouped together slice queries into a single CQL query.\n" + MAX_IN_CONFIG_MSG + - "\nThis option is used only when `"+SLICE_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", + "Maximum amount of grouped together slice queries into a single CQL query.
" + MAX_IN_CONFIG_MSG + + "
This option is used only when `"+SLICE_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", ConfigOption.Type.MASKABLE, 20); @@ -791,14 +791,14 @@ public interface CQLConfigOptions { CQL_GROUPING_NS, "keys-allowed", "If `true` this allows multiple partition keys to be grouped together into a single CQL query via `IN` operator based on the keys grouping strategy provided " + - "(usually grouping is done by same token-ranges or same replica sets, but may also involve shard ids for custom implementations).\n" + - "Notice, that any CQL query grouped with more than 1 key will require to return a row key for any column fetched.\n" + - "This option is useful when less amount of CQL queries is desired to be sent for read requests in expense of fetching more data (partition key per each fetched value).\n" + + "(usually grouping is done by same token-ranges or same replica sets, but may also involve shard ids for custom implementations).
" + + "Notice, that any CQL query grouped with more than 1 key will require to return a row key for any column fetched.
" + + "This option is useful when less amount of CQL queries is desired to be sent for read requests in expense of fetching more data (partition key per each fetched value).
" + "Notice, different storage backends may have different way of executing multi-partition `IN` queries " + "(including, but not limited to how the checksum queries are sent for different consistency levels, processing node CPU usage, disk access pattern, etc.). Thus, a proper " + - "benchmarking is needed to determine if keys grouping is useful or not per case by case scenario.\n" + - "This option can be enabled only for storage backends which support `PER PARTITION LIMIT`. As such, this feature can't be used with Amazon Keyspaces because it doesn't support `PER PARTITION LIMIT`.\n" + - "If this option is `false` then each partition key will be executed in a separate asynchronous CQL query even when multiple keys from the same token range are queried.\n" + + "benchmarking is needed to determine if keys grouping is useful or not per case by case scenario.
" + + "This option can be enabled only for storage backends which support `PER PARTITION LIMIT`. As such, this feature can't be used with Amazon Keyspaces because it doesn't support `PER PARTITION LIMIT`.
" + + "If this option is `false` then each partition key will be executed in a separate asynchronous CQL query even when multiple keys from the same token range are queried.
" + "Notice, the default grouping strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend. " + "ScyllaDB specific keys grouping strategy should be implemented after the resolution of the [ticket #232](https://github.com/scylladb/java-driver/issues/232).", ConfigOption.Type.MASKABLE, @@ -809,13 +809,13 @@ public interface CQLConfigOptions { "keys-class", "Full class path of the keys grouping execution strategy. The class should implement " + "`org.janusgraph.diskstorage.cql.strategy.GroupedExecutionStrategy` interface and have a public constructor " + - "with two arguments `org.janusgraph.diskstorage.configuration.Configuration` and `org.janusgraph.diskstorage.cql.CQLStoreManager`.\n" + - "Shortcuts available:\n" + - "- `"+GroupedExecutionStrategyBuilder.TOKEN_RANGE_AWARE +"` - groups partition keys which belong to the same token range. Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend.\n" + - "- `"+GroupedExecutionStrategyBuilder.REPLICAS_AWARE +"` - groups partition keys which belong to the same replica sets (same nodes). Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend.\n" + - "\nUsually `"+GroupedExecutionStrategyBuilder.TOKEN_RANGE_AWARE+"` grouping strategy provides more smaller groups where each group contain keys which are stored close to each other on a disk and may cause less disk seeks in some cases. " + + "with two arguments `org.janusgraph.diskstorage.configuration.Configuration` and `org.janusgraph.diskstorage.cql.CQLStoreManager`.
" + + "Shortcuts available:
" + + "- `"+GroupedExecutionStrategyBuilder.TOKEN_RANGE_AWARE +"` - groups partition keys which belong to the same token range. Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend.
" + + "- `"+GroupedExecutionStrategyBuilder.REPLICAS_AWARE +"` - groups partition keys which belong to the same replica sets (same nodes). Notice, this strategy does not take shards into account. Thus, this might be inefficient with ScyllaDB storage backend.
" + + "
Usually `"+GroupedExecutionStrategyBuilder.TOKEN_RANGE_AWARE+"` grouping strategy provides more smaller groups where each group contain keys which are stored close to each other on a disk and may cause less disk seeks in some cases. " + "However `"+GroupedExecutionStrategyBuilder.REPLICAS_AWARE+"` grouping strategy groups keys per replica set which usually means fewer bigger groups to be used (i.e. less CQL requests)." + - "\nThis option takes effect only when `"+KEYS_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", + "
This option takes effect only when `"+KEYS_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", ConfigOption.Type.MASKABLE, GroupedExecutionStrategyBuilder.REPLICAS_AWARE); @@ -823,9 +823,9 @@ public interface CQLConfigOptions { CQL_GROUPING_NS, "keys-limit", "Maximum amount of the keys which can be grouped together into a single CQL query. " + - "If more keys are queried, they are going to be grouped into separate CQL queries.\n" + "If more keys are queried, they are going to be grouped into separate CQL queries.
" + MAX_IN_CONFIG_MSG + - "\nThis option takes effect only when `"+KEYS_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", + "
This option takes effect only when `"+KEYS_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", ConfigOption.Type.MASKABLE, 20); @@ -833,10 +833,10 @@ public interface CQLConfigOptions { CQL_GROUPING_NS, "keys-min", "Minimum amount of keys to consider for grouping. Grouping will be skipped for any multi-key query " + - "which has less than this amount of keys (i.e. a separate CQL query will be executed for each key in such case).\n" + + "which has less than this amount of keys (i.e. a separate CQL query will be executed for each key in such case).
" + "Usually this configuration should always be set to `2`. It is useful to increase the value only in cases when queries " + "with more keys should not be grouped, but be performed separately to increase parallelism in expense of the network overhead." + - "\nThis option takes effect only when `"+KEYS_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", + "
This option takes effect only when `"+KEYS_GROUPING_ALLOWED.toStringWithoutRoot()+"` is `true`.", ConfigOption.Type.MASKABLE, 2); }
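
For reviewers, here is a minimal sketch of how the options whose descriptions are touched by this diff might appear together in a JanusGraph properties file. The values are the defaults listed in the tables above, except `storage.cql.grouping.keys-allowed`, which is set to `true` purely for illustration. The file name is hypothetical, and whether key grouping helps should be benchmarked per backend.

```properties
# janusgraph-example.properties (hypothetical file name)

# Pre-fetch all properties on first singular vertex property access.
# Per the description above, when this is true it overwrites
# query.batch.has-step-mode to all_properties unless that mode is none.
query.fast-property=true

query.batch.enabled=true
query.batch.has-step-mode=required_and_next_properties

# Illustration only: the default is false. Requires a storage backend
# that supports PER PARTITION LIMIT (so not Amazon Keyspaces).
storage.cql.grouping.keys-allowed=true
storage.cql.grouping.keys-class=replicasAware
storage.cql.grouping.keys-limit=20
storage.cql.grouping.keys-min=2

storage.cql.grouping.slice-allowed=true
storage.cql.grouping.slice-limit=20
```

The three `keys-*` settings take effect only because `keys-allowed` is `true` in this sketch; with the default `false`, each partition key is queried in a separate asynchronous CQL query.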