1、ERROR: org.apache.hadoop.hbase.MasterNotRunningException
Problem:
Running hbase produced this error:
- ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times
Cause:
The log contained a large number of lines like this:
2012-04-26 08:13:39,600 INFO org.apache.hadoop.hbase.util.FSUtils: Waiting for dfs to exit safe mode...
So HDFS was still in safe mode. Checking the filesystem with fsck:
- ./hadoop fsck /
- /hbase/.logs/slave1,60020,1333159627316/slave1%2C60020%2C1333159627316.1333159637444: Under replicated blk_-4160280099734447327_1626. Target Replicas is 3 but found 2 replica(s).
- ....
- /home/hadoop/tmp/mapred/staging/hadoop/.staging/job_201203211238_0002/job.jar: Under replicated blk_-7807519084475423360_1012. Target Replicas is 10 but found 2 replica(s).
- ......................................................................Status: HEALTHY
- Corrupt blocks: 0
- Missing replicas: 9 (3.0612245 %)
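- Number of data-nodes: 2
There are no corrupt blocks and the status is HEALTHY; only 9 replicas are missing, which is expected with just 2 datanodes and a target of 3 or more replicas.
It is therefore reasonable to force HDFS out of safe mode.
Solution: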
- hadoop dfsadmin -safemode leave
- Safe mode is OFF
After that, hbase commands ran successfully.
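The by-eye check above (HEALTHY status, zero corrupt blocks) can be sketched as a small Python helper. This is only an illustration: the function name `fsck_allows_leaving_safemode` is hypothetical, and it assumes the fsck summary contains the `Status:` and `Corrupt blocks:` lines in the form quoted above.

```python
import re


def fsck_allows_leaving_safemode(fsck_output: str) -> bool:
    """Return True if the fsck summary reports HEALTHY and zero corrupt blocks.

    Hypothetical helper: it only looks at the two summary lines quoted in
    this article and ignores everything else in the report.
    """
    healthy = re.search(r"Status:\s*HEALTHY\b", fsck_output) is not None
    corrupt = re.search(r"Corrupt blocks:\s*(\d+)", fsck_output)
    return healthy and corrupt is not None and int(corrupt.group(1)) == 0


# Summary lines taken from the fsck output shown above.
sample = """\
......Status: HEALTHY
 Corrupt blocks: 0
 Missing replicas: 9 (3.0612245 %)
 Number of data-nodes: 2
"""
print(fsck_allows_leaving_safemode(sample))
```

Under-replicated blocks are deliberately not treated as a blocker here, matching the reasoning in this section; a stricter check may be appropriate on a cluster with enough datanodes to meet the replication target.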
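2、HMaster shuts down automatically after starting
Problem:
Something went wrong right at startup: a distributed platform that had previously been set up and working reported this error: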
- Zookeeper available but no active master location found
Cause:
The problem was with HMaster. jps showed that the HMaster process was gone, and the hbase-master log contained the following error:
- Could not obtain block: blk_number... ...file=/hbase/hbase.version
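There are essentially two reasons a block cannot be accessed: the block does not exist, or permissions deny access to it. Looking in HDFS, I found that the /hbase directory was there, and so was the hbase.version file, although its size was 0 KB. My first thought was a permissions problem, so I started by changing the permissions on /hbase:
- %hadoop fs -chmod 777 /hbase
- %hadoop fs -chmod -R 777 /hbase (change permissions recursively)
The result was the same, so HMaster shutting down was not caused by access to the directory being denied. What was it, then? HMaster had shut down right after starting once before, and at that time formatting the namenode (which wipes the HDFS metadata) was enough to fix it: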
- %hadoop namenode -format
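This time that did not work either, so I wondered whether the HDFS data on the different nodes of the cluster had become inconsistent after repeated runs, and whether that inconsistency was preventing HMaster from starting.
Solution:
With that in mind I deleted the HDFS data on the master and on every node, then restarted hbase on the master. It started successfully and HMaster no longer shut down.
The now-empty HDFS then has to be repopulated: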
- %rm -Rf reports/data
- %hadoop fs -copyFromLocal reports /texaspete/templates/reports
- %hadoop fs -put match/src/main/resources/regexes /texaspete/regexes
3、Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: te
Problem:
- Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: te
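Cause:
When HBase runs on Hadoop 2, the HDFS path it is configured with is the HA logical (nameservice) path, which is not in ip:port form. HBase has no way of knowing this, so when it resolves the host it treats the name in the path as a hostname and cannot find it.
Solution:
Copy Hadoop's hdfs-site.xml and core-site.xml into hbase/conf so that HBase can resolve the HA name.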
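As a rough way to spot this situation before starting HBase, the sketch below flags an HDFS root directory whose authority has no port. It is a heuristic only: a plain NameNode host that relies on the default port would be flagged too. The function name and the example URIs (`mycluster`, `namenode1`) are hypothetical and do not come from the cluster described here.

```python
from urllib.parse import urlparse


def rootdir_uses_nameservice(rootdir: str) -> bool:
    """Heuristic: an hdfs:// URI without a port is assumed to name an HA
    nameservice rather than a single NameNode in host:port form."""
    parsed = urlparse(rootdir)
    return parsed.scheme == "hdfs" and bool(parsed.hostname) and parsed.port is None


# Hypothetical values for illustration only.
print(rootdir_uses_nameservice("hdfs://mycluster/hbase"))
print(rootdir_uses_nameservice("hdfs://namenode1:8020/hbase"))
```

When the check is positive, HBase needs the HDFS client configuration on its classpath, which is what copying hdfs-site.xml and core-site.xml into hbase/conf provides.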
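4、Hive+HBase returns no results
Problem:
With a Hive external table mapped to an HBase table, complex queries returned no results. Everything was on Hive 0.9.
Error on the remote client: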
- Caused by: java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
- at org.apache.hadoop.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:177)
- at org.apache.hadoop.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:140)
- at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
- at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
- at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
- at java.lang.reflect.Method.invoke(Unknown Source)
- at org.hibernate.engine.jdbc.internal.proxy.AbstractStatementProxyHandler.continueInvocation(AbstractStatementProxyHandler.java:122)
- ... 88 more
- Caused by: HiveServerException(message:Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask, errorCode:9, SQLState:08S01)
- at org.apache.hadoop.hive.service.ThriftHive$execute_result.read(ThriftHive.java:1318)
- at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
- at org.apache.hadoop.hive.service.ThriftHive$Client.recv_execute(ThriftHive.java:105)
- at org.apache.hadoop.hive.service.ThriftHive$Client.execute(ThriftHive.java:92)
- at org.apache.hadoop.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:175)
- ... 94 more
Cause:
The Hadoop log shows that the HBase jars are missing:
- java.io.IOException: Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
- at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:145)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
- at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:348)
- at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:364)
- at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
- at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
- at java.security.AccessController.doPrivileged(Native Method)
- at javax.security.auth.Subject.doAs(Subject.java:415)
- at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
- at org.apache.hadoop.mapred.Child.main(Child.java:262)
Solution:
Make the required jars available to Hive.
Edit hive-site.xml
and add:
- <property>
- <name>hive.aux.jars.path</name>
- <value>file:///opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hbase/hbase.jar,file:///opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.4.0.jar,file:///opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/zookeeper/zookeeper.jar</value>
- </property>
- <property>
- <name>hbase.zookeeper.quorum</name>
- <value>ZooKeeper hostname</value>
- </property>
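A typo in one of these long comma-separated paths is easy to miss. As a minimal sketch, the Python helper below lists the entries of a hive.aux.jars.path value that do not exist on the local disk. The function name `missing_aux_jars` is hypothetical, and the sketch assumes every entry is either a file:// URI or a bare local path.

```python
import os
from urllib.parse import unquote, urlparse


def missing_aux_jars(aux_jars_value: str) -> list:
    """Return the entries of a hive.aux.jars.path value with no local file.

    Assumption of this sketch: entries are file:// URIs or bare local paths.
    """
    missing = []
    for entry in aux_jars_value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        parsed = urlparse(entry)
        path = unquote(parsed.path) if parsed.scheme == "file" else entry
        if not os.path.isfile(path):
            missing.append(entry)
    return missing
```

Running it on the machine that hosts the Hive server, with the exact string from the `<value>` element, returns an empty list when every jar is present.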
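5、"HBaseSplit not found" when running aggregate queries against Hive+HBase over JDBC
Problem: Running aggregate queries against Hive+HBase over JDBC fails with "HBaseSplit not found".
Cause: With Hive integrated with HBase, an aggregate query issued over JDBC against a Hive table mapped to HBase fails with the error below: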
- java.io.IOException: Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
- at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:146)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
- at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:396)
- at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
- at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
- at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
- at java.security.AccessController.doPrivileged(Native Method)
- at javax.security.auth.Subject.doAs(Subject.java:396)
- at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
- at org.apache.hadoop.mapred.Child.main(Child.java:249)
- Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HBaseSplit
- at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
- at java.security.AccessController.doPrivileged(Native Method)
- at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
- at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
- at java.lang.Class.forName0(Native Method)
- at java.lang.Class.forName(Class.java:249)
- at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
- at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:143)
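The error says the HBaseSplit class cannot be found, even though the class is on the classpath. The jars also have to be supplied through the aux path (hive.aux.jars.path).
Solution:
Add the following to hive-site.xml, which resolves the problem.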
- <property>
- <name>hive.aux.jars.path</name>
- <value>file:///home/YOUR_USER_DIR/hive-0.10.0/lib/hive-hbase-handler-0.10.0.jar,file:///home/YOUR_USER_DIR/hive-0.10.0/lib/hbase-0.92.0.jar,file:///home/YOUR_USER_DIR/hive-0.10.0/lib/zookeeper-3.4.3.jar</value>
- </property>
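To confirm which jar actually provides the missing class, a jar can be opened as a zip archive and searched for the class entry. The sketch below does that in Python; `jar_contains_class` is a hypothetical helper name, and the expectation that the class lives in the hive-hbase-handler jar is an assumption based on its package name.

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) has an entry for class_name."""
    # A class a.b.C is stored in the archive as a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, calling `jar_contains_class(path_to_handler_jar, "org.apache.hadoop.hive.hbase.HBaseSplit")` on each jar listed in hive.aux.jars.path shows whether the class named in the stack trace is covered by that list.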
6、org.apache.hadoop.hbase.ClockOutOfSyncException: Reported time is too far out of sync with master
While starting HBase, some nodes reported the following error:
- org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hadoop2,16020,1470107202883 has been rejected; Reported time is too far out of sync with master. Time difference of 53999521ms > max allowed of 30000ms
- at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:407)
- at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:273)
- at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:360)
- at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
- at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2180)
- at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
- at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
- at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
- at java.lang.Thread.run(Thread.java:744)
- at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
- at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
- at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
- at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
- at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
- at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
- at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:329)
- at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2288)
- at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:907)
- at java.lang.Thread.run(Thread.java:744)
- Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ClockOutOfSyncException): org.apache.hadoop.hbase.ClockOutOfSyncException: Server hadoop2,16020,1470107202883 has been rejected; Reported time is too far out of sync with master. Time difference of 53999521ms > max allowed of 30000ms
- at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:407)
- at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:273)
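Explanation:
When a RegionServer's clock is too far out of sync with the master's, the master throws this exception.
Solutions:
1、Check that the time zone and time are the same on all machines, and synchronize them if they are not
2、If appropriate, increase hbase.master.maxclockskew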
- <property>
- <name>hbase.master.maxclockskew</name>
- <value>180000</value>
- </property>
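To decide between the two options it helps to look at how large the reported difference really is. The Python sketch below pulls both numbers out of the exception message and converts the difference to hours. The function name `parse_clock_skew` is hypothetical, and the regular expression assumes the message format shown in the stack trace above.

```python
import re


def parse_clock_skew(message: str):
    """Extract (difference_ms, max_allowed_ms) from a ClockOutOfSyncException
    message, or return None if the message does not match the quoted format."""
    m = re.search(r"Time difference of (\d+)ms > max allowed of (\d+)ms", message)
    return (int(m.group(1)), int(m.group(2))) if m else None


# Message text taken from the stack trace above.
msg = ("Reported time is too far out of sync with master. "
       "Time difference of 53999521ms > max allowed of 30000ms")
diff_ms, max_ms = parse_clock_skew(msg)
print(round(diff_ms / 3_600_000, 2), "hours; limit is", max_ms / 1000, "seconds")
```

For the message quoted in this section the difference is roughly 15 hours, which looks like a mis-set clock rather than ordinary drift; in that case synchronizing the machines (option 1) is the appropriate fix, and raising hbase.master.maxclockskew to 180000 ms would not be enough.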
[This article is an original contribution by 51CTO column author "王森丰". Please credit the source when reposting.]