This module contains resource files and example variable definition files for deploying a Cloudera Data Platform (CDP) Public Cloud environment and creating a Datalake on AWS, Azure or GCP.
The examples directory has example CDP deployments:
- ex01-aws-basic creates a basic CDP deployment on AWS. This example makes use of the terraform-cdp-aws-pre-reqs module to create the required cloud resources.
- ex02-azure-basic creates a basic CDP deployment on Azure. This example makes use of the terraform-cdp-azure-pre-reqs module to create the required cloud resources.
- ex02-gcp-basic creates a basic CDP deployment on GCP. This example makes use of the terraform-cdp-gcp-pre-reqs module to create the required cloud resources.
Each example directory includes a terraform.tfvars.sample values file that shows input variable values.
Name | Version |
---|---|
terraform | >= 1.3.0 |
cdp | >= 0.6.1 |
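The version constraints above could be pinned in a root configuration along the following lines. This is a minimal sketch: the registry address `cloudera/cdp` is an assumption and should be confirmed against the provider's registry page before use.

```hcl
terraform {
  # Matches the Terraform requirement listed in the table above.
  required_version = ">= 1.3.0"

  required_providers {
    cdp = {
      # Assumed registry address for the CDP provider; confirm before use.
      source  = "cloudera/cdp"
      version = ">= 0.6.1"
    }
  }
}
```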
No providers.
Name | Source | Version |
---|---|---|
cdp_on_aws | ./modules/aws | n/a |
cdp_on_azure | ./modules/azure | n/a |
cdp_on_gcp | ./modules/gcp | n/a |
No resources.
Name | Description | Type | Default | Required |
---|---|---|---|---|
backup_storage_location | Backup storage location. The location has to be in URI format for the cloud provider - i.e. s3a:// for AWS, abfs:// for Azure, gs:// for GCP. | string | n/a | yes |
data_storage_location | Data storage location. The location has to be in URI format for the cloud provider - i.e. s3a:// for AWS, abfs:// for Azure, gs:// for GCP. | string | n/a | yes |
deployment_template | Deployment pattern to use for cloud resources and CDP. | string | n/a | yes |
env_prefix | Shorthand name for the environment. Used in CDP resource descriptions. Also used to construct the value of any CDP resource variable (e.g. environment_name, cdp_iam_admin_group_name) that is not defined. | string | n/a | yes |
environment_cascading_delete | Flag to enable cascading delete of the environment and associated resources. | bool | n/a | yes |
environment_description | Description of the CDP environment. | string | n/a | yes |
infra_type | Cloud provider on which to deploy CDP. | string | n/a | yes |
log_storage_location | Log storage location. The location has to be in URI format for the cloud provider - i.e. s3a:// for AWS, abfs:// for Azure, gs:// for GCP. | string | n/a | yes |
region | Region in which cloud resources will be created. | string | n/a | yes |
agent_source_tag | Tag to identify deployment source | map(any) | { | no |
aws_datalake_admin_role_arn | Datalake Admin Role ARN. Required for CDP deployment on AWS. | string | null | no |
aws_idbroker_instance_profile_arn | IDBroker Instance Profile ARN. Required for CDP deployment on AWS. | string | null | no |
aws_log_instance_profile_arn | Log Instance Profile ARN. Required for CDP deployment on AWS. | string | null | no |
aws_private_subnet_ids | List of private subnet IDs. Required for CDP deployment on AWS. | list(string) | null | no |
aws_public_subnet_ids | List of public subnet IDs. Required for CDP deployment on AWS. | list(string) | null | no |
aws_ranger_audit_role_arn | Ranger Audit Role ARN. Required for CDP deployment on AWS. | string | null | no |
aws_raz_role_arn | ARN for Ranger Authorization Service (RAZ) role. Only applicable for CDP deployment on AWS. | string | null | no |
aws_security_access_cidr | CIDR range for inbound traffic. With this option security groups will be automatically created. Only used for CDP deployment on AWS. Note that specifying pre-existing security groups is recommended instead of this option. | string | null | no |
aws_security_group_default_id | ID of the Default Security Group for CDP environment. Required for CDP deployment on AWS. | string | null | no |
aws_security_group_knox_id | ID of the Knox Security Group for CDP environment. Required for CDP deployment on AWS. | string | null | no |
aws_vpc_id | AWS Virtual Private Cloud (VPC) ID. Required for CDP deployment on AWS. | string | null | no |
aws_xaccount_role_arn | Cross Account Role ARN. Required for CDP deployment on AWS. | string | null | no |
azure_accept_image_terms | Flag to automatically accept Azure Marketplace image terms during CDP cluster deployment. | bool | true | no |
azure_aks_private_dns_zone_id | The ID of an existing private DNS zone used for the AKS. | string | null | no |
azure_cdp_gateway_subnet_names | List of Azure Subnet Names for the CDP Endpoint Access Gateway. Required for CDP deployment on Azure. | list(any) | null | no |
azure_cdp_subnet_names | List of Azure Subnet Names for CDP Resources. Required for CDP deployment on Azure. | list(any) | null | no |
azure_create_private_endpoints | Flag to specify that Azure Postgres will be configured with Private Endpoint and a Private DNS Zone. | bool | null | no |
azure_database_private_dns_zone_id | The ID of an existing private DNS zone used for the database. | string | null | no |
azure_datalake_flexible_server_delegated_subnet_name | The subnet ID for the subnet within which you want to configure your Azure Flexible Server for the CDP datalake | string | null | no |
azure_datalakeadmin_identity_id | Datalake Admin Managed Identity ID. Required for CDP deployment on Azure. | string | null | no |
azure_environment_flexible_server_delegated_subnet_names | List of Azure Subnet Names delegated for Private Flexible servers. Required for CDP deployment on Azure. | list(any) | null | no |
azure_idbroker_identity_id | IDBroker Managed Identity ID. Required for CDP deployment on Azure. | string | null | no |
azure_load_balancer_sku | The Azure load balancer SKU type. Possible values are BASIC, STANDARD or NONE. The current default is BASIC. To disable the load balancer, use type NONE. | string | null | no |
azure_log_identity_id | Log Data Access Managed Identity ID. Required for CDP deployment on Azure. | string | null | no |
azure_ranger_audit_identity_id | Ranger Audit Managed Identity ID. Required for CDP deployment on Azure. | string | null | no |
azure_raz_identity_id | RAZ Managed Identity ID. Required for CDP deployment on Azure. | string | null | no |
azure_resource_group_name | Azure Resource Group name. Required for CDP deployment on Azure. | string | null | no |
azure_security_access_cidr | CIDR range for inbound traffic. With this option security groups will be automatically created. Only used for CDP deployment on Azure. Note that specifying pre-existing security groups is recommended instead of this option. | string | null | no |
azure_security_group_default_uri | Azure Default Security Group URI. Required for CDP deployment on Azure. | string | null | no |
azure_security_group_knox_uri | Azure Knox Security Group URI. Required for CDP deployment on Azure. | string | null | no |
azure_subscription_id | Subscription ID where the Azure pre-reqs are created. Required for CDP deployment on Azure. | string | null | no |
azure_tenant_id | Tenant ID where the Azure pre-reqs are created. Required for CDP deployment on Azure. | string | null | no |
azure_vnet_name | Azure Virtual Network ID. Required for CDP deployment on Azure. | string | null | no |
azure_xaccount_app_pword | Password for the Azure AD Cross Account Application. Required for CDP deployment on Azure. | string | null | no |
azure_xaccount_app_uuid | UUID for the Azure AD Cross Account Application. Required for CDP deployment on Azure. | string | null | no |
cdp_admin_group_name | Name of the CDP IAM Admin Group associated with the environment. Defaults to '<env_prefix>-cdp-admin-group' if not specified. | string | null | no |
cdp_user_group_name | Name of the CDP IAM User Group associated with the environment. Defaults to '<env_prefix>-cdp-user-group' if not specified. | string | null | no |
cdp_xacccount_credential_name | Name of the CDP Cross Account Credential. Defaults to '<env_prefix>-xaccount-cred' if not specified. If create_cdp_credential is set to false then this should be a valid pre-existing credential. | string | null | no |
create_cdp_credential | Flag to specify if the CDP Cross Account Credential should be created. If set to false then cdp_xacccount_credential_name should be a valid pre-existing credential. | bool | true | no |
datalake_async_creation | Flag to specify if Terraform should wait for CDP datalake resource creation/deletion | bool | false | no |
datalake_call_failure_threshold | Number of times a single CDP Datalake API call may fail before polling is abandoned. | number | 3 | no |
datalake_image | The image to use for the datalake. Can only be used when the 'datalake_version' parameter is set to null. You can use 'catalog' name and/or 'id' for selecting an image. | object({ | null | no |
datalake_java_version | The Java major version to use on the datalake cluster. | number | null | no |
datalake_name | Name of the CDP datalake. Defaults to '<env_prefix>-<aw\|az\|gc\|>-dl' if not specified. | string | null | no |
datalake_polling_timeout | Timeout value in minutes for how long to poll for CDP datalake resource creation/deletion | number | 90 | no |
datalake_recipes | Additional recipes that will be attached to the datalake instances | set( | null | no |
datalake_scale | The scale of the datalake. Valid values are LIGHT_DUTY, ENTERPRISE. | string | null | no |
datalake_version | The Datalake Runtime version. Valid values are latest or a semantic version, e.g. 7.2.17 | string | "latest" | no |
enable_ccm_tunnel | Flag to enable Cluster Connectivity Manager tunnel. If false then access from Cloud to CDP Control Plane CIDRs is required via SG ingress. | bool | true | no |
enable_outbound_load_balancer | Create outbound load balancers for Azure environments. Only applicable for CDP deployment on Azure. | bool | null | no |
enable_raz | Flag to enable Ranger Authorization Service (RAZ) | bool | true | no |
encryption_at_host | Provision resources with host encryption enabled. Only applicable for CDP deployment on Azure. | bool | null | no |
encryption_key_arn | ARN of the AWS KMS CMK to use for the server-side encryption of AWS storage resources. Only applicable for CDP deployment on AWS. | string | null | no |
encryption_key_resource_group_name | Name of the existing Azure resource group hosting the Azure Key Vault containing the customer managed key which will be used to encrypt the Azure Managed Disk. Only applicable for CDP deployment on Azure. | string | null | no |
encryption_key_url | URL of the key which will be used to encrypt the Azure Managed Disks. Only applicable for CDP deployment on Azure. | string | null | no |
encryption_user_managed_identity | Managed Identity ID for encryption | string | "" | no |
endpoint_access_scheme | The scheme for the workload endpoint gateway. PUBLIC creates an external endpoint that can be accessed over the Internet. PRIVATE restricts the traffic to be internal to the VPC / Vnet. Relevant in Private Networks. | string | null | no |
env_tags | Tags applied to provisioned resources | map(any) | null | no |
environment_async_creation | Flag to specify if Terraform should wait for CDP environment resource creation/deletion | bool | false | no |
environment_call_failure_threshold | Number of times a single CDP Environment API call may fail before polling is abandoned. | number | 3 | no |
environment_name | Name of the CDP environment. Defaults to '<env_prefix>-cdp-env' if not specified. | string | null | no |
environment_polling_timeout | Timeout value in minutes for how long to poll for CDP Environment resource creation/deletion | number | 60 | no |
freeipa_catalog | Image catalog to use for FreeIPA image selection | string | null | no |
freeipa_image_id | Image ID to use for creating FreeIPA instances | string | null | no |
freeipa_instance_type | Instance Type to use for creating FreeIPA instances | string | null | no |
freeipa_instances | The number of FreeIPA instances to create in the environment | number | 3 | no |
freeipa_os | The Operating System to be used for the FreeIPA instances | string | null | no |
freeipa_recipes | The recipes for the FreeIPA cluster | set(string) | null | no |
gcp_availability_zones | The zones of the environment in the given region. Multi-zone selection is not supported in GCP yet. It accepts only one zone until support is added. | list(string) | null | no |
gcp_cdp_subnet_names | List of GCP Subnet Names for CDP Resources. Required for CDP deployment on GCP. | list(any) | null | no |
gcp_datalake_admin_service_account_email | Email ID of the service account for Datalake Admin. Required for CDP deployment on GCP. | string | null | no |
gcp_encryption_key | Key Resource ID of the customer managed encryption key to encrypt GCP resources. Only applicable for CDP deployment on GCP. | string | null | no |
gcp_firewall_default_id | Default Firewall for CDP environment. Required for CDP deployment on GCP. | string | null | no |
gcp_firewall_knox_id | Knox Firewall for CDP environment. Required for CDP deployment on GCP. | string | null | no |
gcp_idbroker_service_account_email | Email ID of the service account for IDBroker. Required for CDP deployment on GCP. | string | null | no |
gcp_log_service_account_email | Email ID of the service account for Log Storage. Required for CDP deployment on GCP. | string | null | no |
gcp_network_name | GCP Network VPC name. Required for CDP deployment on GCP. | string | null | no |
gcp_project_id | GCP project in which to deploy the CDP environment. Required for CDP deployment on GCP. | string | null | no |
gcp_ranger_audit_service_account_email | Email ID of the service account for Ranger Audit. Required for CDP deployment on GCP. | string | null | no |
gcp_raz_service_account_email | Email ID of the service account for Ranger Authorization Service (RAZ). Only applicable for CDP deployment on GCP. | string | null | no |
gcp_xaccount_service_account_private_key | Base64 encoded private key of the GCP Cross Account Service Account Key. Required for CDP deployment on GCP. | string | null | no |
keypair_name | SSH Keypair name in Cloud Service Provider. For CDP deployment on AWS, either 'keypair_name' or 'public_key_text' needs to be set. | string | null | no |
multiaz | Flag to specify that the FreeIPA and DataLake instances will be deployed across multiple availability zones. | bool | true | no |
proxy_config_name | Name of the proxy config to use for the environment. | string | null | no |
public_key_text | SSH Public key string for the nodes of the CDP environment. Required for CDP deployment on Azure. For CDP deployment on AWS, either 'keypair_name' or 'public_key_text' needs to be set. | string | null | no |
s3_guard_table_name | Name for the DynamoDB table backing S3Guard. Only applicable for CDP deployment on AWS. | string | null | no |
use_public_ips | Use public IPs for the CDP resources created within the Cloud network. Required for CDP deployment on Azure and GCP. | bool | null | no |
use_single_resource_group | Use a single resource group for all provisioned CDP resources. Required for CDP deployment on Azure. | bool | true | no |
workload_analytics | Flag to specify if workload analytics should be enabled for the CDP environment | bool | true | no |
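For an AWS deployment, the inputs above might be wired together roughly as follows. This is a minimal, untested sketch under stated assumptions: the module label `cdp_deploy`, the `source` path, and every literal value (bucket names, ARNs, IDs) are hypothetical placeholders, and `"public"` is an assumed value for `deployment_template`. In practice the AWS values would normally come from the outputs of a pre-requisites module rather than being hard-coded.

```hcl
module "cdp_deploy" {
  # Hypothetical path; point this at wherever the module is checked out.
  source = "../.."

  # Inputs marked "yes" in the Required column.
  env_prefix                   = "demo"
  infra_type                   = "aws"
  region                       = "eu-west-1"
  deployment_template          = "public" # assumed value; check the module's validation rules
  environment_description      = "Demo CDP environment"
  environment_cascading_delete = true

  # Storage locations use the s3a:// URI scheme on AWS.
  data_storage_location   = "s3a://demo-data-bucket/data"
  log_storage_location    = "s3a://demo-logs-bucket/logs"
  backup_storage_location = "s3a://demo-backup-bucket/backups"

  # Inputs described as "Required for CDP deployment on AWS" (placeholder values).
  aws_vpc_id                        = "vpc-00000000000000000"
  aws_public_subnet_ids             = ["subnet-00000000000000001"]
  aws_private_subnet_ids            = ["subnet-00000000000000002"]
  aws_security_group_default_id     = "sg-00000000000000001"
  aws_security_group_knox_id        = "sg-00000000000000002"
  aws_xaccount_role_arn             = "arn:aws:iam::111111111111:role/demo-xaccount-role"
  aws_datalake_admin_role_arn       = "arn:aws:iam::111111111111:role/demo-dladmin-role"
  aws_ranger_audit_role_arn         = "arn:aws:iam::111111111111:role/demo-audit-role"
  aws_idbroker_instance_profile_arn = "arn:aws:iam::111111111111:instance-profile/demo-idbroker"
  aws_log_instance_profile_arn      = "arn:aws:iam::111111111111:instance-profile/demo-log"

  # On AWS, either keypair_name or public_key_text needs to be set.
  keypair_name = "demo-keypair"
}
```

Optional inputs that are left out fall back to the defaults listed in the table.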
Name | Description |
---|---|
cdp_environment_crn | CDP Environment CRN |
cdp_environment_name | CDP Environment Name |
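A calling configuration could re-export these outputs as sketched below. The module label `cdp_deploy` is a hypothetical name for an instance of this module declared elsewhere in the same configuration; only the two output names come from the table above.

```hcl
# Assumes a `module "cdp_deploy"` block exists in the same configuration.
output "cdp_environment_name" {
  description = "Name of the CDP environment created by the module"
  value       = module.cdp_deploy.cdp_environment_name
}

output "cdp_environment_crn" {
  description = "CRN of the CDP environment created by the module"
  value       = module.cdp_deploy.cdp_environment_crn
}
```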