HiveServer2: Common Exceptions and How to Handle Them

1. Connection timed out

java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000/default: java.net.ConnectException: Connection timed out (Connection timed out)
	at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:256)
	at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:247)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.singleConnectionForAllStatementsExecute(MultiThreadStatementTest.java:88)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.execute(MultiThreadStatementTest.java:73)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.lambda$parallelExecute$0(MultiThreadStatementTest.java:55)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Possible causes:

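  1. HiveServer2 has reached its connection limit. It needs one Thrift worker thread per JDBC connection, and the number of worker threads is capped by this property:

     <property>
       <name>hive.server2.thrift.max.worker.threads</name>
       <value>500</value>
       <description>Maximum number of Thrift worker threads</description>
     </property>

  2. HiveServer2 has run out of memory (OOM) and cannot handle client requests in time.
  3. The Hive Metastore is not responding, or its backing database is stuck.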
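For the first cause, one client-side mitigation is to cap how many connections a single application opens at once, so that it stays below the worker-thread limit. The following is a minimal sketch under that assumption: `ConnectionGate`, `withPermit` and `peakConcurrency` are hypothetical names, not part of Hive or its JDBC driver, and the simulated query merely stands in for a real `DriverManager.getConnection(...)` call followed by a statement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical client-side gate; not part of Hive or its JDBC driver. */
final class ConnectionGate {
    private final Semaphore permits;

    ConnectionGate(int maxOpenConnections) {
        // Fair semaphore: waiting callers get permits in arrival order.
        this.permits = new Semaphore(maxOpenConnections, true);
    }

    /** Runs the work while holding one permit; fails fast if none frees up in time. */
    <T> T withPermit(long timeoutMillis, Callable<T> work) throws Exception {
        if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("no free connection permit after " + timeoutMillis + " ms");
        }
        try {
            return work.call();
        } finally {
            permits.release();
        }
    }

    /** Runs simulated queries through a gate of the given size and returns the peak concurrency seen. */
    static int peakConcurrency(int limit, int tasks) {
        ConnectionGate gate = new ConnectionGate(limit);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // Stand-in for opening a JDBC connection and running a query (assumption).
        Callable<Integer> query = () -> {
            int now = active.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            Thread.sleep(20);
            active.decrementAndGet();
            return now;
        };
        Callable<Integer> gated = () -> gate.withPermit(5_000L, query);
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(gated));
            }
            for (Future<Integer> r : results) {
                r.get(); // propagate any failure from the worker
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
        return peak.get();
    }
}
```

Calling `ConnectionGate.peakConcurrency(3, 12)` starts twelve client threads but never lets more than three simulated queries run at the same time. The gate only bounds one client process; it does not help when many independent clients share the same HiveServer2 instance.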

2. Connection reset by peer

Caused by: java.net.SocketException: Connection reset by peer (connect failed)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
	... 14 more

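HiveServer2 creates its server socket with a default backlog of 0, yet on CentOS the backlog observed for HiveServer2 is 50. This is likely because Java's ServerSocket falls back to its own default when the requested backlog is 0 or less. When the server accepts connections too slowly and the operating system's queue of pending connections fills up, new connection requests are dropped and the client reports this error.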

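To check the current backlog of HiveServer2, dump the TCP socket table to a file with the command below. For a socket in the LISTEN state, the Send-Q column shows the backlog and the Recv-Q column shows how many connections are currently waiting to be accepted: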

ss -antp > antp
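
To pick the HiveServer2 listener out of the dumped file, the relevant line can be parsed programmatically. The sketch below assumes the usual column order of `ss -antp` output (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port, Process). `SsListenLine` is a hypothetical helper, and the sample line in the usage note is illustrative rather than captured from a real server.

```java
import java.util.Optional;

/** Hypothetical helper: reads the queue columns from one line of `ss -antp` output. */
final class SsListenLine {
    final int recvQ;           // connections currently waiting to be accepted
    final int sendQ;           // backlog limit of the listening socket
    final String localAddress; // e.g. "*:10000"

    private SsListenLine(int recvQ, int sendQ, String localAddress) {
        this.recvQ = recvQ;
        this.sendQ = sendQ;
        this.localAddress = localAddress;
    }

    /** Returns the parsed line if it is a LISTEN socket on the given port, otherwise empty. */
    static Optional<SsListenLine> parse(String line, int port) {
        String[] fields = line.trim().split("\\s+");
        // Assumed layout: State Recv-Q Send-Q Local Peer [Process]
        if (fields.length < 5
                || !fields[0].equals("LISTEN")
                || !fields[3].endsWith(":" + port)) {
            return Optional.empty();
        }
        try {
            return Optional.of(new SsListenLine(
                    Integer.parseInt(fields[1]),
                    Integer.parseInt(fields[2]),
                    fields[3]));
        } catch (NumberFormatException e) {
            return Optional.empty(); // header line or unexpected format
        }
    }

    /** True when the accept queue has reached the backlog limit. */
    boolean isFull() {
        return recvQ >= sendQ;
    }
}
```

For an illustrative line such as `LISTEN 0 50 *:10000 *:* users:(("java",pid=12345,fd=512))`, `SsListenLine.parse(line, 10000)` yields a backlog (`sendQ`) of 50 and an empty accept queue (`recvQ` of 0). A `recvQ` that stays close to `sendQ` would suggest that HiveServer2 is accepting connections too slowly.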

3. Running, pool size = 100, active threads = 100, queued tasks = 100, completed tasks = xxx

The exception looks like this:

org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@6ef9d564 rejected from java.util.concurrent.ThreadPoolExecutor@6e2c02d2[Running, pool size = 100, active threads = 100, queued tasks = 100, completed tasks = 234]
	at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)
	at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)
	at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:324)
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:265)
	at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:497)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.execute(MultiThreadStatementTest.java:82)
	at com.baidu.hive.jdbc.MultiThreadStatementTest.lambda$parallelExecute$0(MultiThreadStatementTest.java:56)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

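The cause is that the wait queue of the async execution thread pool is full. Increasing the following two parameters resolves the problem.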

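   <property>
     <name>hive.server2.async.exec.threads</name>
     <value>100</value>
     <description>Number of threads in the async thread pool for HiveServer2</description>
   </property>

   <property>
     <name>hive.server2.async.exec.wait.queue.size</name>
     <value>100</value>
     <description>
       Size of the wait queue for async thread pool in HiveServer2.
       After hitting this limit, the async thread pool will reject new requests.
     </description>
   </property>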
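
The rejection in the stack trace comes from a `java.util.concurrent.ThreadPoolExecutor` whose worker threads are all busy and whose bounded queue is full. The sketch below reproduces that behaviour in isolation with a tiny pool. It is not HiveServer2's actual code: `AsyncPoolRejectionDemo` and `countRejected` are hypothetical names, and the pool construction is only assumed to resemble what the two properties configure.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustration only: a fixed-size pool with a bounded wait queue. */
final class AsyncPoolRejectionDemo {

    /**
     * Submits blocking tasks to a pool with the given thread count and queue size,
     * and returns how many submissions were rejected.
     */
    static int countRejected(int threads, int queueSize, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,                        // analogous to hive.server2.async.exec.threads
                10, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueSize));   // analogous to ...async.exec.wait.queue.size
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task blocks until released, like a long-running query.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Default AbortPolicy: all threads busy and the queue is full.
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return rejected;
    }
}
```

With 2 threads and a queue of 2, `AsyncPoolRejectionDemo.countRejected(2, 2, 7)` accepts four tasks (two running, two queued) and rejects the remaining three. The same arithmetic applies to the values above: with 100 threads and a queue of 100, the 201st concurrent long-running statement is rejected, so raising either value only moves the limit and the server still needs enough memory and CPU for the extra threads.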
   
 
