leongfans

HBase Operations: Node Failure with "Server REPORT rejected"


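During an HBase performance test I loaded data overnight. In the morning I found that one node had died, while everything else was running normally.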

The region server log showed the following problem:

12/01/04 09:45:39 FATAL regionserver.HRegionServer: ABORTING region server serverName=hadoop5.site,60020,1325663355680, load=(requests=983, regions=252, usedHeap=3085, maxHeap=4983): Unhandled exception: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing hadoop5.site,60020,1325663355680 as dead server
org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing hadoop5.site,60020,1325663355680 as dead server
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:735)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:596)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing hadoop5.site,60020,1325663355680 as dead server
    at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:204)
    at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:262)
    at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:669)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)

    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
    at $Proxy6.regionServerReport(Unknown Source)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:729)
    ... 2 more

 

Scrolling further up in the logs shows:

2012-01-04T09:42:27.829-0500: 24795.829: [GC 24795.829: [ParNew: 151317K->10586K(153344K), 0.5282750 secs] 4970251K->4832124K(5102976K) icms_dc=0 , 0.5284260 secs] [Times: user=3.29 sys=0.01, real=0.53 secs]
2012-01-04T09:42:28.721-0500: 24796.721: [GC 24796.721: [ParNew (promotion failed): 146906K->140702K(153344K), 0.5622020 secs]24797.283: [CMS: 4824062K->3150755K(4949632K), 189.5658760 secs] 4968444K->3150755K(5102976K), [CMS Perm : 20156K->20153K(33704K)] icms_dc=0 , 190.1283170 secs] [Times: user=7.43 sys=0.96, real=190.14 secs]
2012-01-04T09:45:38.852-0500: 24986.852: [GC [1 CMS-initial-mark: 3150755K(4949632K)] 3152726K(5102976K), 0.0015480 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2012-01-04T09:45:38.853-0500: 24986.854: [CMS-concurrent-mark-start]

12/01/04 09:45:38 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 237682ms for sessionid 0x34a7b17bf80004, closing socket connection and attempting reconnect
12/01/04 09:45:38 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 237682ms for sessionid 0x34a7b17bf80003, closing socket connection and attempting reconnect

12/01/04 09:45:38 WARN ipc.HBaseServer: IPC Server Responder, call getClosestRowBefore([B@166cb249, [B@3a2ce21f, [B@58b17f0f) from 192.9.200.164:34106: output error
12/01/04 09:45:38 WARN ipc.HBaseServer: IPC Server Responder, call getClosestRowBefore([B@6816a498, [B@26902c8b, [B@435c6d74) from 192.9.200.238:38457: output error
12/01/04 09:45:38 WARN ipc.HBaseServer: IPC Server Responder, call getClosestRowBefore([B@23b4b286, [B@2c348dba, [B@2e44c502) from 192.9.200.164:34106: output error
12/01/04 09:45:38 WARN ipc.HBaseServer: PRI IPC Server handler 6 on 60020 caught: java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
    at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)

 

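Just before it died, the region server went through a GC pause of more than 190 s. During that pause it could not communicate with ZooKeeper, so its session timed out and the cluster treated the node as dead.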
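A pause like this is easy to miss in a large GC log, so a small scanner can help. The following is a minimal sketch, not the tooling used in this incident: it reads the `real=... secs` field that HotSpot prints at the end of each GC line and reports pauses above a threshold. The function name `long_pauses` and the 180-second threshold are assumptions for illustration; in practice the threshold would be whatever ZooKeeper session timeout the cluster is configured with.

```python
import re

# Matches the wall-clock pause at the end of a HotSpot GC log line,
# e.g. "[Times: user=7.43 sys=0.96, real=190.14 secs]".
REAL_TIME = re.compile(r"real=(\d+(?:\.\d+)?) secs")


def long_pauses(lines, threshold_secs):
    """Return (pause_secs, line) for every GC line whose pause exceeds the threshold."""
    hits = []
    for line in lines:
        match = REAL_TIME.search(line)
        if match is None:
            continue  # not a GC line with a [Times: ...] block
        pause = float(match.group(1))
        if pause > threshold_secs:
            hits.append((pause, line))
    return hits


# Two GC lines from this incident, shortened for readability.
sample = [
    "[GC 24795.829: [ParNew: 151317K->10586K(153344K), 0.5282750 secs] "
    "[Times: user=3.29 sys=0.01, real=0.53 secs]",
    "[GC 24796.721: [ParNew (promotion failed): 146906K->140702K(153344K), "
    "0.5622020 secs] [Times: user=7.43 sys=0.96, real=190.14 secs]",
]

# ASSUMPTION: 180 s stands in for the configured ZooKeeper session timeout.
for pause, line in long_pauses(sample, threshold_secs=180.0):
    print(f"{pause:.2f}s pause: {line[:45]}")
```

On a real log, the same function can be fed an open file object, since it only iterates over lines.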

 

Next, the cause of this full GC: ParNew (promotion failed).

 

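This failure occurs when the Survivor space is too small to hold the objects that survive a young collection, so they have to be promoted to the old generation, and the old generation does not have enough room for them either. The result is a full GC.

There are two opposite ways to address it: enlarge the Survivor space (or the old generation), or do away with the Survivor space altogether.

Enlarging the Survivor space means adjusting -XX:SurvivorRatio, the ratio of the Eden size to the size of one Survivor space. The default is cited here as 32; the actual default depends on the JVM version and platform, and many HotSpot builds use 8. With a value of 32, Eden is 32 times the size of one Survivor space. Note that there are two Survivor spaces, so each one occupies 1/34 of the young generation. Lowering the value enlarges the Survivor spaces, lets objects stay in them longer, and reduces the number of objects promoted to the old generation.

Eliminating the Survivor space takes the opposite view: move data that cannot be reclaimed immediately into the old generation as early as possible, so that the old generation is collected more often and is less likely to balloon suddenly. This is done by setting -XX:SurvivorRatio to a very large value (for example 65536).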
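The effect of -XX:SurvivorRatio on the young generation layout is simple arithmetic, sketched below. The 170 MB young generation is a hypothetical figure chosen to make the numbers round, not a value from this cluster, and the helper `young_gen_layout` is illustrative only. A real JVM also aligns region sizes, so actual values will differ slightly.

```python
def young_gen_layout(young_kb, survivor_ratio):
    """Split a young generation into Eden and two equal Survivor spaces.

    survivor_ratio is Eden size divided by the size of ONE Survivor space,
    so the young generation is (survivor_ratio + 2) Survivor-sized units.
    Alignment done by a real JVM is ignored here.
    """
    survivor_kb = young_kb // (survivor_ratio + 2)
    eden_kb = young_kb - 2 * survivor_kb
    return eden_kb, survivor_kb


# ASSUMPTION: a hypothetical 170 MB young generation.
YOUNG_KB = 170 * 1024

for ratio in (32, 8, 65536):
    eden, survivor = young_gen_layout(YOUNG_KB, ratio)
    print(f"SurvivorRatio={ratio}: Eden={eden}K, each Survivor={survivor}K")
```

With a ratio of 32 each Survivor space is 1/34 of the young generation; with 8 it is 1/10; with 65536 the Survivor spaces shrink to almost nothing, which is the "eliminate the Survivor space" approach.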

 

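There was also a system-level factor: unlike the other nodes, this machine was running an extra application that used about 2 GB of memory. That is why this machine died while the others had no problems.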
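One way to look for memory pressure in the GC log itself is to compare CPU time with wall-clock time in the `[Times: ...]` block. The sketch below (the helper name `off_cpu_seconds` is an assumption) computes how much of a pause is not accounted for by user plus system CPU time. A large gap means the collector spent most of the pause waiting rather than working; competition for physical memory with another process is one possible explanation, though the log alone does not prove it.

```python
import re

# Captures user, sys and real times from a HotSpot "[Times: ...]" block.
TIMES = re.compile(
    r"user=(\d+(?:\.\d+)?) sys=(\d+(?:\.\d+)?), real=(\d+(?:\.\d+)?) secs"
)


def off_cpu_seconds(gc_line):
    """Wall-clock seconds of a GC pause not covered by CPU time, or None.

    Parallel collectors can report user time above real time, so the
    result is clamped at zero.
    """
    match = TIMES.search(gc_line)
    if match is None:
        return None
    user, system, real = (float(value) for value in match.groups())
    return max(0.0, real - (user + system))


# The [Times: ...] blocks of the two collections shown in the log above.
young_gc = "[Times: user=3.29 sys=0.01, real=0.53 secs]"
full_gc = "[Times: user=7.43 sys=0.96, real=190.14 secs]"

for label, line in (("young GC", young_gc), ("full GC", full_gc)):
    print(f"{label}: {off_cpu_seconds(line):.2f}s off CPU")
```

For the full GC, only about 8.4 s of the 190 s pause was CPU time, whereas the preceding young collection shows no such gap.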

 

 

 

 
