
[Bug]: Writing Hive data to HDFS does not support multiple partitions; please support dynamic partitions when writing to Hive #1127

Answered by wgzhao
laixueyong asked this question in Q&A
  1. If you mean using hdfswriter to save data, then writing to HDFS has nothing to do with Hive. In other words, hdfswriter only checks whether the target directory exists; it does not care how that directory was created.
  2. The dynamic insertion you mention is actually a Hive feature, not an HDFS feature: it is Hive that automatically creates the partition before inserting data. This is not a problem the hdfswriter plugin needs to solve.
  3. "The data is all read from relational databases and stored into Hive; some tables serve as the ODS layer" — this is indeed a very common scenario. My approach in production is to split ingestion into two steps: the first step calls Hive to create the partition, and the second step dynamically passes a parameter that tells hdfswriter which HDFS directory to write to. For example:
```json
"writer": {
  "name": "hdfswriter",
  "parameter": {
    "defaultFS": "hdfs://cluster",
    "fileType": "orc",
    "path": "/ods/odstl/account_info/logdate=${logdate}",
    "fileName": "addax",
    "column": [
      {
        "name": "id",
        "type": "bigint"
      },
      {
        "name":
```
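The two-step flow described above can be sketched as follows. This is a minimal illustration, not Addax's actual API: the helper function names are hypothetical, and the table name and base path are taken from the example config above.

```python
# Minimal sketch of the two-step approach: pre-create the Hive
# partition, then resolve the concrete HDFS directory for hdfswriter.
# Function names are illustrative assumptions, not part of Addax.

def partition_ddl(table: str, logdate: str) -> str:
    # Step 1: the statement handed to Hive so the partition
    # exists before any data lands in its directory.
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (logdate='{logdate}')"
    )

def hdfs_path(base: str, logdate: str) -> str:
    # Step 2: the concrete directory that replaces ${logdate}
    # in the hdfswriter "path" setting for this run.
    return f"{base}/logdate={logdate}"

if __name__ == "__main__":
    print(partition_ddl("odstl.account_info", "2024-09-18"))
    print(hdfs_path("/ods/odstl/account_info", "2024-09-18"))
```

Running step 1 first guarantees the Hive metastore already knows about the partition, so the files hdfswriter drops into that directory are immediately queryable.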

Replies: 3 comments

Answer selected by wgzhao
Category: Q&A
Labels: enhancement (New feature or request)
2 participants
This discussion was converted from issue #1126 on September 18, 2024 07:59.