Tuesday, October 15, 2019

Tez job fails with a 'vertex failure' error

The following error appears when a Hive query runs in Tez execution mode:

Logs:


Vertex failed, vertexName=Reducer 34, vertexId=vertex_1424999265634_0222_1_23, diagnostics=[Task failed, taskId=task_1424999265634_0222_1_23_000007, diagnostics=[AttemptID:attempt_1424999265634_0222_1_23_000007_0 Info:Error: java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.se
... 6 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: : init not supported
at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStreamingEvaluator.init(GenericUDAFStreamingEvaluator.java:70)

at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:160)
... 7 more

CAUSE:

The Tez containers are not allocated enough memory to run the query.

SOLUTION:

Set the following configuration values in the Hive query to increase the memory:

tez.am.resource.memory.mb=4096
tez.am.java.opts=-server -Xmx3276m -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC

hive.tez.container.size=4096
hive.tez.java.opts=-server -Xmx3276m -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC

Note:
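- Ensure that the heap sizes (-Xmx) in the *.opts values are about 80% of the corresponding memory sizes (tez.am.resource.memory.mb and hive.tez.container.size), and that the NodeManagers can allocate containers of those sizes.
- If the issue happens again, increase the above memory sizes by the minimum container size, adjust the -Xmx values to match, and rerun the query.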
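
The arithmetic behind the two notes above can be sketched in a few lines of Python. This is only an illustration: the helper names (heap_mb, tez_memory_settings, next_container_mb, as_set_statements) are hypothetical and not part of Hive or Tez, and the 1024 MB minimum container size is an assumed value (it is commonly taken from yarn.scheduler.minimum-allocation-mb, so check your cluster before relying on it).

```python
# Hypothetical helpers for illustration only; not part of Hive or Tez.

# JVM flags copied from the settings in this post (the -Xmx value is computed).
JVM_FLAGS_PREFIX = "-server"
JVM_FLAGS_SUFFIX = "-Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC"


def heap_mb(container_mb):
    """Return 80% of the container size, rounded down, for use as -Xmx."""
    return container_mb * 8 // 10


def next_container_mb(current_mb, min_container_mb):
    """Grow the container size by one minimum-container step."""
    return current_mb + min_container_mb


def tez_memory_settings(container_mb):
    """Build the four properties from this post for a given container size."""
    opts = f"{JVM_FLAGS_PREFIX} -Xmx{heap_mb(container_mb)}m {JVM_FLAGS_SUFFIX}"
    return {
        "tez.am.resource.memory.mb": str(container_mb),
        "tez.am.java.opts": opts,
        "hive.tez.container.size": str(container_mb),
        "hive.tez.java.opts": opts,
    }


def as_set_statements(settings):
    """Render the properties as Hive 'set' statements for a query script."""
    return [f"set {key}={value};" for key, value in settings.items()]


if __name__ == "__main__":
    # Assumed minimum container size in MB; verify it on your cluster.
    assumed_min_container_mb = 1024

    # First attempt: the values used in this post.
    for line in as_set_statements(tez_memory_settings(4096)):
        print(line)

    # If the query fails again, step up by one minimum container size.
    bigger = next_container_mb(4096, assumed_min_container_mb)
    for line in as_set_statements(tez_memory_settings(bigger)):
        print(line)
```

With a 4096 MB container the computed heap is 3276 MB, which matches the -Xmx3276m used above. One step up gives a 5120 MB container with a 4096 MB heap.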
