This document is the configuration documentation for ZettaStor Distributed Block Storage product, for the branch v1.0_OpenSource
. The configuration of other branches is only for reference. This document is used to guide engineers to understand the parameters and functions of each module in detail when using the product, and to make reasonable configurations according to user needs, on-site hardware environment, and demand planning.
2022-08-30 Explain in combination with the latest configuration of the product
2022-09-30 release after review
This chapter introduces some configurable parameters. Users can change these parameters according to their own needs and environmental specifications, so that the product can meet the needs of users and achieve optimal performance.
Under normal circumstances, the parameters not mentioned in the document do not affect the functionality of the product after deployment, there is no need to add them explicitly to the configuration file.
The product environment configuration refers to the configuration files and contents under /opt/pengyun-deploy/config
directory. Modification of these configuration data should be done before the deployment or upgrade operation is performed. If the xml configuration file is modified, you should execute perl bin/update_system_config.pl
to update the local configuration, before the deployment or upgrade operation is performed. The new configuration will be applied and transferred to the corresponding nodes and services.
The environment configuration consists of the following files in the /opt/pengyun-deploy/config
directory:
Configuration File | Description |
---|---|
zookeeper.properties | ZooKeeper service deployment |
deploy.properties | Service name, version, node information, service port and service deployment time |
module_settings.xml | Modify property values in the corresponding*.properties, so that the service runs according to the specified properties |
This chapter explains the fields that need to be updated during the configuration process of deploy.properties
.
Attribute Name | Default Value | Description |
---|---|---|
remote.network | Net mask | The network used to access cluster nodes for deployment and other related operations from the deployment node, usually it is the Control Network. |
remote.user | root | The name of the user who logs in to the cluster node for deployment and other privileged operations, usually root or other users with root privileges. |
remote.password | The password of the corresponding user. Do not include special characters such as single quotes, double quotes, backquote, slashes, backslashes, etc. that may cause characters to escape. | |
remote.platform | x86_64 | The platform architecture of the cluster node, this storage product supports mainstream platforms. |
platform.update | false | Whether to update the platform architecture according to the actual hardware information, for the mainstream platform, it can be set to true to automatically update the platform architecture field. |
jdbc.type | postgresql | The database type used by the cluster needs to be consistent with the content in the xml file configuration, usually postgresql. |
jdbc.user | py | The user name for database access. |
jdbc.password | 312 | Password for database access. |
XXX.dir.name | The running directory of the service, usually in the /var/testing/packages directory of the node |
|
XXX.version | Usually no changes needed. | |
XXX.deploy.host.list | The node information of the service, a colon indicates continuous IP addresses, and a comma indicates discrete IP addresses | |
XXX.deploy.port | Usually no changes needed. | |
XXX. deploy.agent.jmx.port | Usually no changes needed. | |
DIH.center.host.list | DIH service-specific fields, need to be consistent with the IP address of the center.dih.endpoint . |
|
XXX.remote.timeout | The timeout period for related operations such as service deployment, if the service has not reached the desired state after set time, it will be considered as timeout. This field is used to prevent the operation from waiting indefinitely. When the timeout expires, no additional processing will be performed on the executed operations. Usually no changes needed. | |
DataNode.deployment.host.group.enabled | false | DataNode service-specific field, whether to specify the Group ID of the DataNode service of the storage node. If it is set to false , the product will automatically group the storage nodes; If it is set to true , it will be grouped according to the specified IP addresses. |
DataNode.deployment.host.group.x=****** | This configuration takes effect when DataNode.deployment.host.group.enabled is set to true , and is used to specify the Group ID of the storage node. It is necessary to configure 3 or more different Groups and corresponding node IP addresses. |
For the service name XXX in the above table, please refer to the following information for planning.
Service Name | Distribution Planning |
---|---|
DIH | All nodes in the cluster |
InfoCenter | Usually 2-3 nodes in the cluster, it is recommended that the selected nodes should be in different groups |
DriverContainer | Configure as required, usually all nodes in the cluster |
DataNode | The node that provides storage resources, usually all nodes in the cluster |
Console | Usually one or more nodes. If multiple nodes are configured, it is recommended that the selected nodes should be in different groups |
deployment_daemon | All nodes of the cluster |
Coordinator | Consistent with configuration in the DriverContainer |
This chapter explains the fields that need to be updated during the configuration process of the zookeeper.properties
.
Attribute Name | Default Value | Description |
---|---|---|
remote.user | root | The name of the user who logs in to the cluster node for Zookeeper component deployment and other related operations, usually the root user or other users with root privileges |
remote.password | Deployment and other privileged user's password. Do not include special characters such as single quotes, double quotes, backquote, slashes, backslashes, etc. that may cause characters to escape. | |
server.X | The IP address of the node where the ZooKeeper service is deployed, usually the same as the IP address of the Control Network, X starts from 1, the number must be odd, usually 3-7 nodes, and the number of nodes accounts for more than half of the total number of nodes in the clusters, no more than 7 is recommended. ZooKeeper nodes are evenly distributed in different groups, and the node information is consistent with the ZooKeeper-related configuration in the xml file |
This chapter introduces the function and configuration of each parameter in the xml file under the config
directory.
The format of the attribute value is as follows:
<property name="name_value" value="value_string"/>
The content in the first double quotation mark is the attribute name, and the content in the second double quotation mark is the attribute value.
Usually, in a physical machine environment, there is no need to make many changes to the configuration in the xml file except for the parameters related to the IP address, the number of groups, and the number of disks.
If some configuration items are missing in the xml file, please add manually.
Some configuration parameters are mandatory in every module, it is not necessary to set them separately in each module. In this case, set <project name="*">
means it will take effect globally for all modules. If a parameter is set separately in a module, the values set in the module override global values. If you want to override values in a module for certain parameters, usually jvm.properties
, you may set it separately.
Attribute Name | Default Value | Description |
---|---|---|
log4j.rootLogger | WARN, FILE | The log level recorded in the log file. By default, only logs higher than this level are printed. The valid levels are ERROR, FILE, DEBUG, FILE, INFO, FILE |
log4j.appender.FILE.MaxFileSize | 200MB | The maximum size of each log file is adjusted according to the actual environment of the maximum space available for the storage node /var/testing directory. The valid units are GB, MB |
log4j.appender.FILE.MaxBackupIndex | 15 | Set the number of log files. The name of the current log file is *.log, and the name of the historical log file is *.log.x. The maximum value of x is the value of this parameter |
Attribute Name | Default Value | Description |
---|---|---|
control.flow.subnet | Control Network configuration. | |
monitor.flow.subnet | Monitor Network configuration, usually consistent with the value of control.flow.subnet. | |
enable.data.depart.from.control | true | Whether the Control, Monitoring, Data and Business Networks are separated. When using 2 or more network segments, set it to true |
data.flow.subnet | Storage Private Network configuration, used for data communication between storage nodes, usually 10 Gigabit network or IB network | |
outward.flow.subnet | Business Network configuration, the network used by computing nodes and storage for I/O transmission |
Attribute Name | Default Value | Description |
---|---|---|
page.size.byte | 4096 | page size, if the physical size of the disk used in the actual environment is 4096, the configuration should be 4096; if the physical size of the disk used is 512, and the configuration should be 8192. This configuration is related to page.metadata.need.flush.to.disk configuration |
segment.size.byte | 1073741824 | Segment size in bytes, the default size is 1G, it is the smallest unit of volume capacity, the ideal value of the number of segments of a single node is recommended not to exceed about 5000, and the theoretical maximum limit is 15000. In order to make full use of the space on the disk, use the total disk capacity of a single node, divide it by the number of segments (usually 5000), and select the closest value to the nth power of 2, such as 2G, 4G, 8G etc. The total number of segment units in the system should not exceed 100,000. When configuring this parameter, it is also necessary to consider the possibility of storage expansion in the future . |
io.timeout.ms | 60000 | Normally, there is no need to adjust the I/O timeout, but it can be adjusted appropriately for the virtual machine environment or when the network delay is relatively large. |
The main function of jvm.properties
is to limit (or configure) the memory size used by each service (each service is composed of one or more JAVA processes). jvm.properties
needs to be configured in all modules except Console. The values that need to be set for each module are different, and have to be configured individually for each service according to the environment hardware specifications. For the DriverContainer module, an additional file coordinator-jvm.properties
exists. The configuration parameters are similar to those in jvm.properties
. For the DataNode module, additional memory resources need to be allocated when the service is running. The configuration is relatively complicated. For details, refer to DataNode module in detail.
Attribute Name | Default Value | Description |
---|---|---|
initial.mem.pool.size | xxm/xxg | The initial memory allocated at runtime |
max.mem.pool.size | yym/yyg | The maximum memory allocated at runtime |
Here we focus on the configurations in multiple modules, such as database and ZooKeeper configurations. These configurations may be used in InfoCenter, Coordinator and other modules, the configuration content shall be consistent.
In order to ensure the stability of the system, the InfoCenter service is usually deployed on two or more nodes, one of which is the Master node, and the rest are Slave nodes. When the Master node fails, a Slave node turns into a Master node, and the switch between the active and standby states is implemented by ZooKeeper.
Attribute Name | Default Value | Description |
---|---|---|
zookeeper.connection.string | ZK_IP1:2181, ZK_IP2:2181, ZK_IP3:2181 | ZooKeeper cluster configuration, ZK_IP is the IP configured in the zookeeper.properties file, the number of nodes must be an odd number, usually using the IP address of the Control Network. |
zookeeper.election.switch | true | Enable/disable ZooKeeper, must be enabled when deploying active and standby services. |
Currently the database uses PostgreSQL (whether it is a single node or a cluster method). This configuration is for the InfoCenter module.
Configuration in PostgreSQL mode:
Property Name | Default | Description |
jdbc.driver.class | org.postgresql.Driver | Database driver |
jdbc.url | InfoCenter module configuration: jdbc: postgresql://IP_address:5432/controlandinfodb |
PostgreSQL database IP address and name |
hibernate.dialect | py.db.sqlite.dialect.PostgresCustomDialect | PostgreSQL database dialect |
package.hbm | hibernate-config | Configuration in PostgreSQL mode |
Attribute Name | Default Value | Description |
io.depth | 128 | Queue depth, usually no need to modify |
enable.logger.tracer | false | Log tracing, used only for debugging, in production environment must be disabled |
trace.all.logs | false | Trace all logs, used only for debugging, in production environment must be disabled |
debug.io.timeout.threshold.ms | 1500 | I/O timeout threshold |
network.checksum.algorithm | DUMMY | Network checksum algorithm, consistent with page.checksum.algorithm and network.checksum.algorithm values in datanode.properties , valid values are DUMMY, ALDER32, DIGEST, CRC32, CRC32C |
ping.host.timeout.ms | 500 | Coordinator to DataNode network detection timeout, in a POC environment it is recommended to be 100, in a production environment it is recommended to be 500 |
network.connection.detect.retry.maxtimes | 3 | Network detection retry times |
network.healthy.check.time | 3 | Network check frequency |
Attribute Name | Default Value | Description |
center.dih.endpoint | IP_Address:10000 | Center DIH node configuration, IP_Address is consistent with the IP configured in DIH.center.host.list in the deploy.properties file |
Attribute Name | Default Value | Description |
threshold.to.request.for.new.member.ms | 1800000 | The waiting time for a new Secondary to join a segment, in milliseconds |
wait.time.ms.to.move.segment.to.deleted | 300000 | The duration of time that the volume can be recovered after being deleted. After this time, the volume cannot be recovered. The unit is in milliseconds |
memory.size.for.data.logs.mb.per.archive | 300 | The fastbuffer space size required by each disk, the total fastbuffer space size on the node is (the number of disks + 1) multiplied by the value of this parameter, usually no need modify, the unit is in MB |
page.system.memory.cache.size | 2G | The memory space required by the system page, 0 means to adapt according to the actual physical memory size, it is not recommended to configure it as 0, if there is not enough memory, it can be reduced according to the actual environment |
page.metadata.need.flush.to.disk | false | Flushing metadata pages to disk, for 0.5k sector set to true , for 4k sector set to false , this configuration is related to the page.size.byte |
delay.record.storage.exception.ms | 2000 | |
page.checksum.algorithm | DUMMY | Paging verification algorithm, consistent with the value of network.checksum.algorithm in coordinator.properties and network.checksum.algorithm in datanode.properties , valid values are DUMMY, ALDER32, DIGEST, CRC32 , CRC32C |
network.checksum.algorithm | DUMMY | Network checksum algorithm, consistent with the value of page.checksum.algorithm in network.checksum.algorithm in coordinator.properties and page.checksum.algorithm in datanode.properties , valid values are DUMMY, ALDER32, DIGEST, CRC32 , CRC32C |
archive.init.mode | overwrite | DataNode initialization mode, optional values are append , overwrite , usually no need to change |
max.io.pending.requests | 5000 | Pending queue, usually no need to modify |
max.io.depth.per.hdd.storage | 64 | The maximum iodepth of the disk, usually no need to modify |
The JVM configuration of DataNode is explained separately. The following are suggested values for the production environment, which need to be slightly adjusted.
Attribute Name | Default Value | Description |
initial.mem.pool.size | 40G | The initial memory allocated at runtime |
max.mem.pool.size | 40G | The maximum memory allocated during runtime |
parallel.gc.threads | 15 | The number of parallel threads, it is configured as about 5/8 of the number of CPU threads on a single node |
conc.gc.threads | 4 | This parameter is about 1/4 of the value of parallel.gc.threads |
This attribute is used to configure the memory settings of the Coordinator, corresponding to the driver listed on Web Console, and each driver needs to consume a certain amount of memory resources on the corresponding container.
Attribute Name | Default Value | Description |
initial.mem.pool.size | 1024m | The initial memory allocated for each driver |
min.mem.pool.size | 1024m | Minimum memory size for each driver |
max.mem.pool.size | 1024m | Maximum memory size for each driver |
max.direct.memory.size | 512m | |
netty.allocator.maxOrder | 7 |
Attribute Name | Default Value | Description |
system.memory.force.reserved | 512M | The system-reserved memory, driver mounting and unmounting operations are performed by DriverContainer. Before mounting a driver, DriverContainer will calculate whether the remaining system memory is still greater than this value after the driver is mounted. If the remaining system content is less than the value set, it will prompt that the memory resources are insufficient, and the mounting operation will not be performed, so as to prevent insufficient system resources from affecting other services after the mounting is completed. In the production environment, this value can be adjusted appropriately, such as 2048M |
iet.target.flag | false | iSCSI mode, this field is set to true for IET mode, false for cli mode |
iscsi.portal.type | IPV6 | Coordinator listening address type, if this field is set to IPV4 , it only monitors IPv4 addresses, and IPV6 means it monitors both IPv4 and IPv6 addresses |
Attribute Name | Default Value | Description |
default.saveConfig.file.path | /etc/target/saveconfig.json | The path of this file is not the same for RPM and DEB environments, please pay attention. |
Attribute Name | Default Value | Description |
saveconfig.path | /etc/target/saveconfig.json | saveconfig path, the path is different under CentOS and Ubuntu |
restore.command | /usr/bin/targetcli restore /etc/target/saveconfig.json | restore executable path, the path is different under the CentOS and Ubuntu |
Attribute Name | Default Value | Description |
is.arbiter.group.set | false | Whether to use a special group for the arbiter function, when this field is set to true , an arbiter group number should be specified in arbiter.group.id |
arbiter.group.id | 0 | Arbiter group number, effective when is.arbiter.group.set is true |
page.wrapp.count | 128 | Stripe width, when reading and writing sequentially, the data size of each segment is page.wrapp.count * segment.size , then the system switches to the next segment. |
segment.wrapp.count | 8 | The number of data disks on a single node usually needs to be adjusted according to the actual number. If there are multiple nodes in a group in the environment, this value needs to be modified to the number of data disks in each group. |
group.count | 4 | The number of storage node groups, usually between 3 and 5. For 2-copy or 3-copy volumes, each segment should be distributed in 3 different groups. This configuration cannot be modified after deployment. |
jdbc.url | Please refer to the database configuration description. | |
max.rebalance.task.count.volume.datanode | 50 | The number of rebalance concurrency is calculated according to the number of segment units. It is recommended to set this value to 50 in a production environment. |
This is a separate description for the JVM configuration of InfoCenter, which is related to the segment unit.
Attribute Name | Default Value | Description |
initial.mem.pool.size | 1024m | Initial memory allocation size at runtime |
max.mem.pool.size | 2048m | The maximum memory allocatable during runtime. This value is related to the number of segment units. If the number of segment units is less than 20,000, it is recommended to configure this value to 2G. If the number of segment units exceeds 20,000, increase 1G on every 10,000 units, and the total number of segment units should not exceed 100,000 |
Parameters may be removed as the requirement changes. Due to the limitations, this document only lists certain parameters that may need to be modified (set explicitly), other parameters are not specified in the file (set implicitly).
In the case of parameter removal, redundant parameters will not affect the functionality of the product.
In the case of new parameter addition, you need to add the corresponding parameter to the xml file and set it explicitly.
Parameters are explicitly set as follows:
- Locate in the xml file which need to add or modify parameters
- Set the attribute name and corresponding value in the xml file
For your convenience, it is recommended that when adding attributes, add them in alphabetical order in the xml file. After setting, make sure to execute perl bin/update_system_config.pl
to update the configuration in the local software package before deploying or upgrading.