Sunday, 15 January 2012

hadoop - HBase master keeps dying, claims hbase:namespace already exists




In today's episode of HBase bringing me to my wits' end, I have an issue where the HBase master starts and then dies. The master log reads like so:

2014-06-20 12:52:40,469 FATAL [master:hdev01:60000] master.HMaster: Master server abort: loaded coprocessors are: []
2014-06-20 12:52:40,470 FATAL [master:hdev01:60000] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
    at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:120)
    at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:232)
    at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
    at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1062)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:926)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
    at java.lang.Thread.run(Thread.java:662)
2014-06-20 12:52:40,473 INFO [master:hdev01:60000] master.HMaster: Aborting
2014-06-20 12:52:40,473 DEBUG [master:hdev01:60000] master.HMaster: Stopping service threads
2014-06-20 12:52:40,473 INFO [master:hdev01:60000] ipc.RpcServer: Stopping server on 60000
2014-06-20 12:52:40,473 INFO [CatalogJanitor-hdev01:60000] master.CatalogJanitor: CatalogJanitor-hdev01:60000 exiting
2014-06-20 12:52:40,473 INFO [hdev01,60000,1403283149823-BalancerChore] balancer.BalancerChore: hdev01,60000,1403283149823-BalancerChore exiting
2014-06-20 12:52:40,474 INFO [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2014-06-20 12:52:40,474 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2014-06-20 12:52:40,474 INFO [master:hdev01:60000] master.HMaster: Stopping infoServer
2014-06-20 12:52:40,474 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2014-06-20 12:52:40,474 INFO [master:hdev01:60000.oldLogCleaner] cleaner.LogCleaner: master:hdev01:60000.oldLogCleaner exiting
2014-06-20 12:52:40,475 INFO [hdev01,60000,1403283149823-ClusterStatusChore] balancer.ClusterStatusChore: hdev01,60000,1403283149823-ClusterStatusChore exiting
2014-06-20 12:52:40,476 INFO [master:hdev01:60000.oldLogCleaner] master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x246ba2ab1e4001c, quorum=hdev02:5181,hdev01:5181,hdev03:5181, baseZNode=/hbase
2014-06-20 12:52:40,479 INFO [master:hdev01:60000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16010
2014-06-20 12:52:40,478 INFO [master:hdev01:60000.archivedHFileCleaner] cleaner.HFileCleaner: master:hdev01:60000.archivedHFileCleaner exiting
2014-06-20 12:52:40,483 INFO [master:hdev01:60000.oldLogCleaner] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001c closed
2014-06-20 12:52:40,484 INFO [master:hdev01:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,589 DEBUG [master:hdev01:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@f3f348b
2014-06-20 12:52:40,591 INFO [master:hdev01:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x246ba2ab1e4001b
2014-06-20 12:52:40,592 INFO [master:hdev01:60000] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001b closed
2014-06-20 12:52:40,592 INFO [master:hdev01:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,695 INFO [hdev01,60000,1403283149823.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: hdev01,60000,1403283149823.splitLogManagerTimeoutMonitor exiting
2014-06-20 12:52:40,696 INFO [master:hdev01:60000] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001a closed
2014-06-20 12:52:40,696 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,696 INFO [master:hdev01:60000] master.HMaster: HMaster main thread exiting
2014-06-20 12:52:40,697 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:194)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)
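Most of that log is shutdown noise; the useful part is the exception attached to the first FATAL entry. As a rough sketch, and assuming the log file has one entry per line with the usual timestamp, level, and bracketed thread name, something like the following could pull that exception out. The function name and the sample excerpt are made up for illustration and are not part of HBase.

```python
import re

# Assumed log4j layout: "<date> <time>,<ms> <LEVEL> [<thread>] <message>"
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (?P<level>[A-Z]+)\s+\[[^\]]*\] .*$"
)


def first_fatal_exception(lines):
    """Return (exception_line, stack_frames) for the first FATAL entry that carries one."""
    in_fatal = False
    exception = None
    frames = []
    for line in lines:
        line = line.rstrip("\n")
        match = LOG_LINE.match(line)
        if match:
            if exception is not None:
                break  # the next log entry ends the stack trace
            in_fatal = match.group("level") == "FATAL"
            continue
        if not in_fatal:
            continue
        stripped = line.strip()
        if stripped.startswith("at "):
            if exception is not None:
                frames.append(stripped[3:])
        elif stripped and exception is None:
            exception = stripped
    return exception, frames


# Shortened excerpt of the master log above (hypothetical test input).
SAMPLE_LOG = """\
2014-06-20 12:52:40,469 FATAL [master:hdev01:60000] master.HMaster: Master server abort: loaded coprocessors are: []
2014-06-20 12:52:40,470 FATAL [master:hdev01:60000] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
    at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:120)
    at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:232)
    at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
2014-06-20 12:52:40,473 INFO  [master:hdev01:60000] master.HMaster: Aborting
"""

if __name__ == "__main__":
    exc, frames = first_fatal_exception(SAMPLE_LOG.splitlines())
    print(exc)
    for frame in frames:
        print("  at", frame)
```

Read this way, the trace says the master tried to create the hbase:namespace table during startup (TableNamespaceManager.start) and aborted because that table was already registered as existing.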

I thought this might be a remnant of an old run, so I deleted the files in HBase's data directory, ZooKeeper's data directory, and HDFS. I still got the same error. Strangely, the HMaster popped up once again temporarily when I ran stop-hbase.sh, although there wasn't much to it.

My HBase version is 0.98.3 and Hadoop is 2.2.0. My hbase-site.xml is:

<configuration>
  <property>
    <name>hbase.master</name>
    <value>hdev01:60000</value>
    <description>The host and port that the HBase master runs at.
    A value of 'local' runs the master and a regionserver in a single process.
    </description>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hdev01:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
    false: standalone and pseudo-distributed setups with managed ZooKeeper
    true: fully-distributed with unmanaged ZooKeeper quorum (see hbase-env.sh)
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>5181</value>
    <description>Property from ZooKeeper's config zoo.cfg.
    The port at which the clients will connect.
    </description>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>10000</value>
    <description></description>
  </property>
  <property>
    <name>hbase.client.retries.number</name>
    <value>10</value>
    <description></description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hdev01,hdev02,hdev03</value>
    <description>Comma separated list of servers in the ZooKeeper quorum.
    For example, "host1.mydomain.com,host2.mydomain.com".
    By default this is set to localhost for local and pseudo-distributed modes
    of operation. For a fully-distributed setup, this should be set to a full
    list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
    this is the list of servers which we will start/stop ZooKeeper on.
    </description>
  </property>
</configuration>
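When cleaning out old state, it helps to be sure which HDFS directory and which ZooKeeper ensemble the master actually uses. The following is a minimal sketch, not an HBase tool, that reads those settings out of an hbase-site.xml document with Python's standard library. The helper names are hypothetical, and the fallback values (port 2181, quorum localhost) are assumed defaults for when a property is absent.

```python
import xml.etree.ElementTree as ET


def load_hbase_site(xml_text):
    """Return {name: value} for every <property> in an hbase-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def zookeeper_endpoints(props):
    """Build host:port strings from the quorum and client port settings."""
    # Fallbacks are assumed defaults, used only when the property is missing.
    port = props.get("hbase.zookeeper.property.clientPort", "2181")
    quorum = props.get("hbase.zookeeper.quorum", "localhost")
    return [f"{host.strip()}:{port}" for host in quorum.split(",") if host.strip()]


# Trimmed copy of the configuration above, descriptions omitted.
SAMPLE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hdev01:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>5181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hdev01,hdev02,hdev03</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    props = load_hbase_site(SAMPLE)
    print("rootdir:  ", props["hbase.rootdir"])
    print("zookeeper:", ", ".join(zookeeper_endpoints(props)))
```

The resulting endpoints should match the quorum= value the master prints in its log (hdev02:5181, hdev01:5181, hdev03:5181 here); if they do not, the data that was deleted may not belong to the ensemble HBase is talking to.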

Edit: I attempted hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair and got the error "HBase file layout needs to be upgraded. You have version null and I want version 8. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'." That is unhelpful, since hbck cannot run without a master.

Edited edit: I nuked and restarted DFS, tried repairing and starting things again, and it started.

The hbase namespace is an internal namespace that HBase uses for its own management tables. Try running the offline repair tool from the $HBASE_HOME directory:

./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair

hadoop hbase
