Solr Index on Hive - Loading data into External table fails #42

gseshwar · 2018-03-15T15:48:09Z

Hello,
I created a Solr index on a Hive table using the steps below. When I try to load rows from the Hive internal table into the Hive external table, the job fails. Please help.

CREATE TABLE ER_ENTITY1000(entityid INT,claimid_s INT,firstname_s STRING,lastname_s STRING,addrline1_s STRING, addrline2_s STRING, city_s STRING, state_S STRING, country_s STRING, zipcode_s STRING, dob_s STRING, ssn_s STRING, dl_num_s STRING, proflic_s STRING, policynum_s STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA LOCAL INPATH '/home/Solr1.csv' OVERWRITE INTO TABLE ER_ENTITY1;
add jar /home/solr-hive-serde-3.0.0.jar;

CREATE EXTERNAL TABLE SOLR_ENTITY999(entityid INT,claimid_s INT,firstname_s STRING,lastname_s STRING,ssn_s STRING,dl_num_s STRING,city_s STRING,state_s STRING,country_s STRING,zipcode_s STRING)
STORED BY 'com.lucidworks.hadoop.hive.LWStorageHandler'
LOCATION '/user/i98779/SOLR_ENTITY1'
TBLPROPERTIES('solr.server.url' = 'http://10.52.192.108:8983/solr','solr.collection' = 'er_entity','solr.query' = '*:*');

********** All above steps work fine **********

********** This step fails **********
INSERT OVERWRITE TABLE SOLR_ENTITY999 SELECT * FROM ER_ENTITY1000;

... With error:
hive> INSERT OVERWRITE TABLE SOLR_ENTITY999 SELECT * FROM ER_ENTITY1000;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = i98779_20180308085142_3918b9ea-2158-4b0e-865f-2fcdefc17e4b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2018-03-08 08:51:45,993 Stage-1 map = 0%, reduce = 0%
Ended Job = job_local1283927429_0001 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: MAPRFS Read: 0 MAPRFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

********** ERROR FROM HIVE JOB LOG is as below **********
java.lang.Exception: Unknown container. Container either has not started or has already completed or doesn't belong to this node at all.

acesar · 2018-03-15T17:18:40Z

@gseshwar Not sure if it is a typo, but in the second step you load the data into a different table:

LOAD DATA LOCAL INPATH '/home/Solr1.csv' OVERWRITE INTO TABLE ER_ENTITY1;

ER_ENTITY1 -> ER_ENTITY1000
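
If the table name in the LOAD step was indeed a typo, the corrected sequence might look like the sketch below. It reuses only the table names, columns, and CSV path from the original post. The explicit column list is an assumption on my part: as posted, ER_ENTITY1000 has 15 columns and SOLR_ENTITY999 has 10 in a different order, so SELECT * may not line up. This is not confirmed as the cause of the failure.

```sql
-- Load the CSV into the table that the later INSERT reads from
-- (ER_ENTITY1000, not ER_ENTITY1).
LOAD DATA LOCAL INPATH '/home/Solr1.csv' OVERWRITE INTO TABLE ER_ENTITY1000;

-- Sanity check: confirm that rows actually landed in the source table.
SELECT COUNT(*) FROM ER_ENTITY1000;

-- Assumption: select only the columns SOLR_ENTITY999 declares, in its
-- declared order, instead of SELECT *.
INSERT OVERWRITE TABLE SOLR_ENTITY999
SELECT entityid, claimid_s, firstname_s, lastname_s, ssn_s, dl_num_s,
       city_s, state_s, country_s, zipcode_s
FROM ER_ENTITY1000;
```

If the count comes back as 0, the source table is empty and the INSERT has nothing to index, whatever else may be wrong.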

As for the error "FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask", it does not show the underlying cause; you need to check the YARN logs.

ctargett added the "information needed" label (More information is needed to answer) on May 3, 2018
