ZooKeeper-based HDFS HA configuration mainly involves two files: core-site.xml and hdfs-site.xml.
The test environment consists of three machines:
hadoop.master
hadoop.slave1
hadoop.slave2
hadoop.master runs: NameNode, JournalNode, ZooKeeper, DFSZKFailoverController
hadoop.slave1 runs: Standby NameNode, DataNode, JournalNode, DFSZKFailoverController
hadoop.slave2 runs: DataNode, JournalNode
1. core-site.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdfsHA</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131702</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/data/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop.master:2181</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value></value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value></value>
  </property>
  <property>
    <name>hadoop.native.lib</name>
    <value>true</value>
    <description>Should native hadoop libraries, if present, be used.</description>
  </property>
</configuration>
2. hdfs-site.xml configuration
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hdfsHA</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hdfsHA</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hdfsHA.nn1</name>
    <value>hadoop.master:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hdfsHA.nn2</name>
    <value>hadoop.slave1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hdfsHA.nn1</name>
    <value>hadoop.master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hdfsHA.nn2</name>
    <value>hadoop.slave1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop.master:8485;hadoop.slave1:8485;hadoop.slave2:8485/hdfsHA</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled.hdfsHA</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hdfsHA</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/data/dfs/journal</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/data/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/data/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop.master:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
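One optional refinement to the fencing setup: if the old active machine is down entirely, sshfence cannot connect and automatic failover stalls. The Hadoop HA documentation allows listing multiple fencing methods in dfs.ha.fencing.methods, one per line, so a no-op shell fence can serve as a last resort. A hedged example (an alternative to the single sshfence value above, not part of the original setup):

```xml
<!-- Optional: try SSH fencing first, then fall back to a no-op fence so
     automatic failover can still proceed when the old active host is
     completely unreachable. -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence
shell(/bin/true)</value>
</property>
```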
3. Startup procedure
3.1 Distribute the two configuration files to the hadoop.slave1 and hadoop.slave2 nodes
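A minimal dry-run sketch of this step: it builds and prints the scp commands that would push the two edited files to the other nodes. The configuration path is a placeholder assumption; on a real cluster, run the printed commands (requires passwordless SSH and the same directory layout on every host).

```shell
# Dry run: print, rather than execute, one scp command per file per slave.
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/home/hadoop/hadoop/etc/hadoop}
cmds=""
for host in hadoop.slave1 hadoop.slave2; do
  for f in core-site.xml hdfs-site.xml; do
    cmds="${cmds}scp $HADOOP_CONF_DIR/$f $host:$HADOOP_CONF_DIR/$f
"
  done
done
printf '%s' "$cmds"
```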
3.2 Start the JournalNode on all three machines
sbin/hadoop-daemon.sh start journalnode
The started process shows up in jps as: 6725 org.apache.hadoop.hdfs.qjournal.server.JournalNode
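A hedged verification sketch: jps lists running JVM processes, so a per-host grep for JournalNode confirms the daemon is up. Printed as a dry run so the snippet executes anywhere; on the real cluster, run the printed commands (assuming passwordless SSH between the hosts).

```shell
# Dry run: print a jps-based JournalNode check for each of the three hosts.
checks=""
for host in hadoop.master hadoop.slave1 hadoop.slave2; do
  checks="${checks}ssh $host \"jps | grep JournalNode\"
"
done
printf '%s' "$checks"
```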
3.3 Format the ZooKeeper failover state on hadoop.master (in fact, any of the three machines will do)
bin/hdfs zkfc -formatZK
On success it logs: ha.ActiveStandbyElector: Successfully created /hadoop-ha/hdfsHA in ZK
3.4 Format the NameNode on hadoop.master and start it
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
3.5 Initialize the NameNode on hadoop.slave1 and start it
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
Note: running hdfs namenode -format a second time on the standby risks mismatched cluster IDs between the two NameNodes; -bootstrapStandby instead copies the metadata already formatted on hadoop.master.
At this point, both NameNodes are in standby state.
3.6 Start the ZKFC on hadoop.master and hadoop.slave1
sbin/hadoop-daemon.sh start zkfc
The started process is DFSZKFailoverController.
Now one NameNode is in active state and the other in standby.
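A hedged verification sketch for this step: hdfs haadmin -getServiceState prints "active" or "standby" for a given NameNode id. Shown as a dry run so the snippet executes without a cluster; on the real cluster, run the printed commands instead.

```shell
# Dry run: print the haadmin state check for each NameNode id from hdfs-site.xml.
checks=""
for nn in nn1 nn2; do
  checks="${checks}bin/hdfs haadmin -getServiceState $nn
"
done
printf '%s' "$checks"
```

On the test cluster above, one of nn1/nn2 should report active and the other standby once the ZKFCs are running.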
3.7 Start the DataNodes from hadoop.master; the DataNodes on hadoop.slave1 and hadoop.slave2 come up
sbin/hadoop-daemons.sh start datanode
(hadoop-daemons.sh, with the trailing s, starts the given daemon on every host listed in the slaves file.)