RunParameters - Query/Run Configuration

The RunParameters class holds the parameters that configure the query and the run of the Correlation Detective algorithm, so you can tailor the algorithm's behavior to your needs. See our paper for details on the parameters and their effects on the algorithm.

Parameters Overview

Below is an overview of the parameters in the RunParameters class:

Name	Domain	Default Value	Accessibility	Description
inputPath	String (File path)	N/A	Read and Write	Path to the input dataset.
simMetricName	PEARSON_CORRELATION, SPEARMAN_CORRELATION, MULTIPOLE, EUCLIDEAN_SIMILARITY, MANHATTAN_SIMILARITY, TOTAL_CORRELATION	N/A	Read and Write	Similarity metric to use.
maxPLeft	Integer (Between 1 and 10)	N/A	Read and Write	Maximum set size for the left side of the correlation pattern.
maxPRight	Integer (Between 0 and 10)	N/A	Read and Write	Maximum set size for the right side of the correlation pattern.
logLevel	Level (Enumeration)	INFO	Read and Write	Logging level.
dateTime	String	Current date	Read-only	Date and time string.
monitorStats	boolean	true	Read and Write	Flag to enable monitoring of statistics.
threads	int (Between 1 and 80)	Min(80, CPU cores * 4)	Read-only	Number of threads to use.
parallel	boolean	true	Read and Write	Flag to enable parallel execution.
random	boolean	true	Read and Write	Flag to enable randomized execution (non-seeded).
seed	int	0	Read and Write	Random seed value.
queryType	TOPK, THRESHOLD	TOPK	Read and Write	Type of query to run.
tau	double	inferred from simMetric	Read and Write	Correlation threshold value (only used if queryType == THRESHOLD).
runningThreshold	RunningThreshold (Enumeration)	tau	Read and Write	Running correlation threshold (only used if queryType == TOPK).
minJump	double (Between 0 and Double.MAX_VALUE)	0	Read and Write	Minimum jump value.
irreducibility	boolean	false	Read and Write	Flag to enable irreducibility constraint.
topK	int (Between 0 and 100000)	100	Read and Write	The maximum number of top results to retrieve.
allowVectorOverlap	boolean	false	Read and Write	Flag to allow vector overlap in the correlation pattern.
nVectors	int (Between 1 and Integer.MAX_VALUE)	all	Read and Write	Number of vectors to read from the dataset.
nDimensions	int (Between 1 and Integer.MAX_VALUE)	all	Read and Write	Number of dimensions to read per vector.
partition	int (Between 0 and Integer.MAX_VALUE)	0	Read and Write	Dataset partition identifier.
dimensionalityReduction	boolean	false	Read and Write	Flag to enable dimensionality reduction.
dimredEpsilon	double (Between 0 and 1)	0.1	Read and Write	Epsilon value for dimensionality reduction.
dimredDelta	double (Between 0 and 1)	0.8	Read and Write	Delta value for dimensionality reduction.
dimredCorrect	boolean	true	Read and Write	Flag to enable dimensionality reduction correction.
dimredComponents	Integer (Between 1 and Integer.MAX_VALUE)	0.1 * nDimensions	Read and Write	Number of dimensionality reduction components.
discounting	boolean	false	Read and Write	Flag to enable bound discounting.
discountThreshold	double (Between 0 and 2)	0.7	Read and Write	Discount threshold value.
discountTopK	int (Between 1 and Integer.MAX_VALUE)	10	Read and Write	Number of extrema distances to store for each cluster combination (CC).
discountStep	int (Between 1 and Integer.MAX_VALUE)	1	Read and Write	Discounting step value.
empiricalBounding	boolean	true	Read and Write	Flag to enable empirical bounding (only if the similarity metric supports it).
kMeans	Integer (Between 1 and Integer.MAX_VALUE)	inferred from simMetric	Read and Write	K-means parameter for the hierarchical clustering algorithm.
geoCentroid	boolean	false	Read and Write	Flag to enable usage of geometric centroid in clusters.
startEpsilon	double (Between 0 and Double.MAX_VALUE)	inferred from simMetric	Read and Write	Starting epsilon value for clustering.
epsilonMultiplier	double (Between 0 and 1)	0.8	Read and Write	Epsilon multiplier for clustering.
maxLevels	int (Between 1 and Integer.MAX_VALUE)	20	Read and Write	Maximum levels in the cluster hierarchy.
clusteringAlgorithm	KMEANS	KMEANS	Read and Write	Clustering algorithm to use.
breakFirstKLevelsToMoreClusters	int (Between 0 and Integer.MAX_VALUE)	0	Read and Write	Number of first levels of the cluster hierarchy to break into more clusters.
clusteringRetries	int (Between 1 and Integer.MAX_VALUE)	20	Read and Write	Number of clustering tries per cluster level.
hashSize	int (Between 1 and Integer.MAX_VALUE)	inferred from query	Read and Write	Hash size for caches (centroids and cluster combinations).
BFSRatio	double (Between 0 and 1)	0.5	Read and Write	BFS ratio for traversal of the comparison tree.
BFSFactor	double	inferred from BFSRatio	Read and Write	BFS factor for traversal of the comparison tree (based on ratio).
shrinkFactor	double	0	Read and Write	Shrink factor $\gamma$ for top-k queries.
statBag	StatBag (Object)	constructed after init()	Read-only	Statistics bag.
randomGenerator	Random (Object)	constructed after init()	Read-only	Random number generator.
pairwiseDistances	double[][] (2D Array)	constructed after init()	Read-only	Pairwise distances cache.
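
Two defaults in the table are derived rather than fixed: threads (Min(80, CPU cores * 4)) and dimredComponents (0.1 * nDimensions). The sketch below shows one way these values and the documented ranges could be computed and checked. The class ParameterDefaultsSketch and its methods are hypothetical and not part of the Correlation Detective code base; the integer rounding, the lower bound of 1 for dimredComponents, and the choice to throw on out-of-range values are assumptions.

```java
// Hypothetical helper for illustration only; not the RunParameters implementation.
class ParameterDefaultsSketch {

    // Default for "threads" as listed in the table: Min(80, CPU cores * 4).
    static int defaultThreads(int cpuCores) {
        return Math.min(80, cpuCores * 4);
    }

    // Default for "dimredComponents" as listed in the table: 0.1 * nDimensions.
    // Assumption: round down, but never below the documented minimum of 1.
    static int defaultDimredComponents(int nDimensions) {
        return Math.max(1, nDimensions / 10);
    }

    // Checks a value against a "Between min and max" domain from the table.
    // Assumption: out-of-range values are rejected with an exception.
    static int requireInRange(String name, int value, int min, int max) {
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                String.format("%s must be between %d and %d, got %d", name, min, max, value));
        }
        return value;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("threads default on this machine: " + defaultThreads(cores));
        System.out.println("dimredComponents default for 1000 dimensions: "
            + defaultDimredComponents(1000));

        // Documented domains: maxPLeft in [1, 10], topK in [0, 100000].
        requireInRange("maxPLeft", 2, 1, 10);
        requireInRange("topK", 100, 0, 100000);
    }
}
```

For the actual way to set these parameters and start a run, the README.md remains the reference.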

How to Use

To use these parameters and run a query with the Correlation Detective algorithm, refer to the README.md file for usage instructions and examples.
