Posts tagged 'hbase'

Setup Multi Hbase master on Hadoop Cluster

July 24, 2012


Set up multiple HBase masters on a Hadoop cluster to avoid a single point of failure. If the active master fails or goes down and stays unreachable for longer than the configured timeout, the backup master becomes active and takes over the master role. The timeout is the value of zookeeper.session.timeout.

$ cat /usr/lib/hbase/conf/hbase-site.xml
<property>
<name>zookeeper.session.timeout</name>
<value>180000</value>
<description>ZooKeeper session timeout. HBase passes this to the zk quorum as suggested maximum time for a session. In milliseconds.</description>
</property>
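
To check which timeout a node is actually configured with, the value can be pulled out of hbase-site.xml and converted to seconds. This is a minimal sketch, assuming the `<name>` and `<value>` elements sit on adjacent lines as in the excerpt above. `get_hbase_property` and `ms_to_seconds` are hypothetical helper names, not part of HBase.

```shell
# Hypothetical helper: print the value of one property from an HBase-style XML file.
# Assumes <value> is on the line directly after the matching <name> line.
get_hbase_property() {
    local name="$1" file="$2"
    grep -A1 "<name>${name}</name>" "$file" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Convert milliseconds to whole seconds using integer arithmetic.
ms_to_seconds() {
    echo $(( $1 / 1000 ))
}

# Example call (path taken from the post):
#   ms_to_seconds "$(get_hbase_property zookeeper.session.timeout /usr/lib/hbase/conf/hbase-site.xml)"
```

With the value shown above (180000 ms), the backup master would wait roughly 180 seconds, that is 3 minutes, before the active master's session expires and it can take over.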

  • Test scenarios:

hadoop01.cluster.2hei.net (HBase master)
hadoop02.cluster.2hei.net (HBase backup master)



hbase Daughter regiondir does not exist

May 14, 2012
Hbase master logs:
12/05/14 13:33:44 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26
12/05/14 13:33:44 WARN master.CatalogJanitor: Daughter regiondir does not exist: hdfs://2hei.net:8020/hbase/RecSys_Catalog/7d100af9ac714de605efc9da89a817b3
12/05/14 13:33:44 WARN master.CatalogJanitor: Daughter regiondir does not exist: hdfs://2hei.net:8020/hbase/Track/d87d503a2996200cfa3aae8906767f81
12/05/14 13:33:44 WARN master.CatalogJanitor: Daughter regiondir does not exist: hdfs://2hei.net:8020/hbase/type_subgenre_uniqueId_CatalogIndex/4602f165ca87f345a1f62a48c5677e55
Reasons:
This may have been caused by restarting ZooKeeper first and only then stopping the master and region servers. When the HBase master was restarted, it failed to initialize the region servers.
Other possible reasons:
– region server crashed
– lease timed out
– master starts recovery (can take quite a while to complete)
– region server restarts
– region server sends region server startup message to master
– master waits in rpc handler for old server cleanup (because it
cannot differentiate the new instance from the old).
– ipc from region server to master times out
– region server sends a new startup message. The master thread starts
waiting in the rpc handler for old server cleanup.
– ipc from region server to master times out
The following directories cannot be found in HDFS:
hadoop fs -ls /hbase/RecSys_Catalog/7d100af9ac714de605efc9da89a817b3
hadoop fs -ls /hbase/type_subgenre_uniqueId_CatalogIndex/4602f165ca87f345a1f62a48c5677e55
hadoop fs -ls /hbase/Track/d87d503a2996200cfa3aae8906767f81
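
Rather than copying each path by hand, the missing directories can be extracted from the master log. This is a rough sketch that assumes the WARN lines have exactly the format shown in the log excerpt above. `missing_daughter_dirs` is a hypothetical helper name, and `master.log` in the usage comment stands in for wherever the master log actually lives.

```shell
# Read HBase master log lines on stdin and print, once each, the HDFS path
# (scheme and host:port stripped) of every daughter region directory that
# CatalogJanitor reported as missing.
missing_daughter_dirs() {
    sed -n 's|.*Daughter regiondir does not exist: hdfs://[^/]*\(/.*\)$|\1|p' | sort -u
}

# Example use (log file name is an assumption):
#   missing_daughter_dirs < master.log | while read -r dir; do
#       hadoop fs -test -d "$dir" || echo "missing: $dir"
#   done
```

The last path component of each line is the encoded region name, which is the string to grep for in the .META. scan.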
Resolution:
Find the matching keys in .META. and delete them:
echo "scan '.META.'" | hbase shell | grep 7d100af9ac714de605efc9da89a817b3
echo "scan '.META.'" | hbase shell | grep 4602f165ca87f345a1f62a48c5677e55
echo "scan '.META.'" | hbase shell | grep d87d503a2996200cfa3aae8906767f81
Then, inside hbase shell:
scan '.META.', {COLUMNS => 'info:splitA', TIMESTAMP => 1335245847237}
delete '.META.', 'Track,,1335245609463.1e40dbd0fb394c05fdaf30ca5f933ea8.', 'info:splitA'
delete '.META.', 'RecSys_Catalog,,1336785348151.1250fbc6334c578629f95113b2a3ba7b.', 'info:splitA'
delete '.META.', 'type_subgenre_uniqueId_CatalogIndex,,1336785356835.d4af2bae84156905367579bc44fcdd97.', 'info:splitA'
The HBase master logs are back to normal:
12/05/14 14:18:44 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26
12/05/14 14:23:44 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26
12/05/14 14:28:45 INFO master.LoadBalancer: Skipping load balancing.  servers=10 regions=261 average=26.1 mostloaded=27 leastloaded=26