Fix the issue that ShardingSphere cannot connect to HiveServer2 using remote Hive Metastore Server #33837

Merged 1 commit on Nov 30, 2024
1 change: 1 addition & 0 deletions RELEASE-NOTES.md
@@ -46,6 +46,7 @@
1. Encrypt: Fix merge exception without encrypt rule in database - [#33708](https://github.com/apache/shardingsphere/pull/33708)
1. SQL Parser: Fix mysql parse zone unreserved keyword error - [#33720](https://github.com/apache/shardingsphere/pull/33720)
1. Proxy: Fix BatchUpdateException when execute INSERT INTO ON DUPLICATE KEY UPDATE in proxy adapter - [#33796](https://github.com/apache/shardingsphere/pull/33796)
1. Infra: Fix the issue that ShardingSphere cannot connect to HiveServer2 using remote Hive Metastore Server - [#33837](https://github.com/apache/shardingsphere/pull/33837)

### Change Logs

@@ -40,6 +40,17 @@ ShardingSphere's support for the HiveServer2 JDBC Driver is located in an optional module.
        <artifactId>hive-service</artifactId>
        <version>4.0.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -81,6 +92,17 @@ ShardingSphere's support for the HiveServer2 JDBC Driver is located in an optional module.
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -427,8 +449,31 @@ ShardingSphere only runs integration tests against HiveServer2 `4.0.1`.
### Hadoop Limitations

Users can only use Hadoop `3.3.6` as the underlying Hadoop dependency of HiveServer2 JDBC Driver `4.0.1`.
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`,
see https://github.com/apache/hive/pull/5500 .
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`, see https://github.com/apache/hive/pull/5500 .

Neither `org.apache.hive:hive-jdbc:4.0.1` nor the same artifact with `classifier` set to `standalone`
actually depends on `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

However, `org.apache.shardingsphere.infra.database.hive.metadata.data.loader.HiveMetaDataLoader` in
`org.apache.shardingsphere:shardingsphere-infra-database-hive` uses `org.apache.hadoop.hive.conf.HiveConf`,
which in turn uses the `org.apache.hadoop.mapred.JobConf` class from `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

ShardingSphere only needs the `org.apache.hadoop.mapred.JobConf` class,
so it is reasonable to exclude all transitive dependencies of `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>3.3.6</version>
    <exclusions>
        <exclusion>
            <groupId>*</groupId>
            <artifactId>*</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

### SQL Limitations

@@ -41,6 +41,17 @@ The possible Maven dependencies are as follows.
        <artifactId>hive-service</artifactId>
        <version>4.0.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -83,6 +94,17 @@ The following is an example of a possible configuration,
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -433,8 +455,31 @@ Reference https://issues.apache.org/jira/browse/HIVE-28418.
### Hadoop Limitations

Users can only use Hadoop `3.3.6` as the underlying Hadoop dependency of HiveServer2 JDBC Driver `4.0.1`.
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`,
Reference https://github.com/apache/hive/pull/5500.
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`. Reference https://github.com/apache/hive/pull/5500 .

Neither `org.apache.hive:hive-jdbc:4.0.1` nor the same artifact with `classifier` set to `standalone`
actually depends on `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

However, `org.apache.shardingsphere.infra.database.hive.metadata.data.loader.HiveMetaDataLoader` in
`org.apache.shardingsphere:shardingsphere-infra-database-hive` uses `org.apache.hadoop.hive.conf.HiveConf`,
which in turn uses the `org.apache.hadoop.mapred.JobConf` class from `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

ShardingSphere only needs the `org.apache.hadoop.mapred.JobConf` class,
so it is reasonable to exclude all transitive dependencies of `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>3.3.6</version>
    <exclusions>
        <exclusion>
            <groupId>*</groupId>
            <artifactId>*</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```
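
To confirm that the trimmed dependency above still puts `org.apache.hadoop.mapred.JobConf` on the classpath, a small check along the following lines may help. This is a minimal sketch, not part of ShardingSphere: the class `JobConfClasspathCheck` and its `isPresent` method are hypothetical names introduced here for illustration, and the sketch only uses the standard `Class.forName` API.

```java
// Hypothetical helper, not part of ShardingSphere or Hadoop.
// Reports whether a class can be found and loaded from the current classpath.
final class JobConfClasspathCheck {

    // The class that the documentation above says HiveConf relies on.
    static final String JOB_CONF = "org.apache.hadoop.mapred.JobConf";

    static boolean isPresent(String className) {
        try {
            // initialize=false: locate and load the class without running its static initializers.
            Class.forName(className, false, JobConfClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or a class it links against is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(JOB_CONF + " present: " + isPresent(JOB_CONF));
    }
}
```

Run it with the same classpath as the application. If it reports that the class is absent, the `hadoop-mapreduce-client-core` dependency is probably missing from the build.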

### SQL Limitations

@@ -0,0 +1,8 @@
[
{
"condition":{"typeReachable":"org.apache.hadoop.security.UserGroupInformation"},
"name":"org.apache.hadoop.security.UserGroupInformation$UgiMetrics",
"allDeclaredFields": true,
"allDeclaredMethods": true
}
]
@@ -3,12 +3,6 @@
"includes":[{
"condition":{"typeReachable":"org.apache.hadoop.conf.Configuration"},
"pattern":"\\Qhadoop-site.xml\\E"
}, {
"condition":{"typeReachable":"org.apache.hadoop.conf.Configuration"},
"pattern":"\\Qcore-default.xml\\E"
}, {
"condition":{"typeReachable":"org.apache.hadoop.conf.Configuration"},
"pattern":"\\Qcore-site.xml\\E"
}]},
"bundles":[]
}
@@ -0,0 +1,11 @@
[
{
"condition":{"typeReachable":"org.apache.hadoop.hive.metastore.HiveMetaStoreClient"},
"name":"org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl",
"methods":[{"name":"<init>","parameterTypes":["org.apache.hadoop.conf.Configuration"] }]
},
{
"condition":{"typeReachable":"org.apache.hadoop.hive.metastore.conf.MetastoreConf"},
"name":"org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl"
}
]
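
The reflection metadata above registers a constructor (`<init>`) that takes a single `org.apache.hadoop.conf.Configuration` argument. An entry like this is typically needed when a class is instantiated by name at run time, because a GraalVM native image cannot discover such a lookup statically. The sketch below illustrates that kind of lookup under stated assumptions: `FakeConfiguration` and `FakeFilterHook` are hypothetical stand-ins so that the example runs without Hadoop or Hive on the classpath, and it is not a copy of how Hive loads its filter hook.

```java
import java.lang.reflect.Constructor;

// Illustration only: every type here is a hypothetical stand-in, not a Hadoop or Hive class.
final class ReflectiveHookLoading {

    // Stand-in for org.apache.hadoop.conf.Configuration.
    static final class FakeConfiguration { }

    // Stand-in for a hook class that exposes a one-argument constructor.
    static final class FakeFilterHook {
        final FakeConfiguration conf;

        FakeFilterHook(FakeConfiguration conf) {
            this.conf = conf;
        }
    }

    // Instantiates a class by name through its (FakeConfiguration) constructor.
    // In a native image, this lookup only succeeds if the constructor is registered for reflection.
    static Object newHook(String className, FakeConfiguration conf) {
        try {
            Class<?> hookClass = Class.forName(className);
            Constructor<?> ctor = hookClass.getDeclaredConstructor(FakeConfiguration.class);
            ctor.setAccessible(true);
            return ctor.newInstance(conf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Object hook = newHook(FakeFilterHook.class.getName(), new FakeConfiguration());
        System.out.println("Created " + hook.getClass().getName());
    }
}
```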
@@ -3,6 +3,10 @@
"condition":{"typeReachable":"org.apache.shardingsphere.proxy.initializer.BootstrapInitializer"},
"interfaces":["java.sql.Connection"]
},
{
"condition":{"typeReachable":"org.apache.shardingsphere.infra.database.hive.metadata.data.loader.HiveMetaDataLoader"},
"interfaces":["org.apache.hadoop.metrics2.MetricsSystem$Callback"]
},
{
"condition":{"typeReachable":"org.apache.shardingsphere.driver.jdbc.core.datasource.ShardingSphereDataSource"},
"interfaces":["org.apache.hive.service.rpc.thrift.TCLIService$Iface"]