Setting Up a Hadoop HA Environment
Published: 2019-06-29

I. Cluster Planning

Zookeeper cluster:

192.168.176.131 (bigdata112)
192.168.176.132 (bigdata113)
192.168.176.135 (bigdata114)

Hadoop cluster:

192.168.176.131 (bigdata112) NameNode1 ResourceManager1 Journalnode
192.168.176.132 (bigdata113) NameNode2 ResourceManager2 Journalnode
192.168.176.135 (bigdata114) DataNode1 NodeManager1
192.168.176.136 (bigdata115) DataNode2 NodeManager2

II. Preparation

1. Install the JDK

2. Configure environment variables
3. Configure passwordless SSH login (a sketch follows this list)
4. Configure hostnames
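
For step 3, a minimal sketch of setting up passwordless login from bigdata112 to every node, assuming the root account and the default key path used later in hdfs-site.xml (/root/.ssh/id_rsa); adjust to your environment:

ssh-keygen -t rsa
ssh-copy-id root@bigdata112
ssh-copy-id root@bigdata113
ssh-copy-id root@bigdata114
ssh-copy-id root@bigdata115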

III. Configure Zookeeper (install on 192.168.176.131)

Configure ZooKeeper on the master node (bigdata112)

(*) Edit the /training/zookeeper-3.4.6/conf/zoo.cfg file

dataDir=/training/zookeeper-3.4.6/tmp
server.1=bigdata112:2888:3888
server.2=bigdata113:2888:3888
server.3=bigdata114:2888:3888

(*) Create a myid file in the /training/zookeeper-3.4.6/tmp directory and write this node's id into it

echo 1 > /training/zookeeper-3.4.6/tmp/myid

(*) Copy the configured zookeeper directory to the other nodes, then adjust each node's myid file

scp -r /training/zookeeper-3.4.6/ bigdata113:/training
scp -r /training/zookeeper-3.4.6/ bigdata114:/training

(*) Set /training/zookeeper-3.4.6/tmp/myid to 2 on bigdata113 and to 3 on bigdata114
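
This can be done from bigdata112 in one step, assuming passwordless SSH is already configured:

ssh bigdata113 "echo 2 > /training/zookeeper-3.4.6/tmp/myid"
ssh bigdata114 "echo 3 > /training/zookeeper-3.4.6/tmp/myid"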

IV. Install the Hadoop Cluster (on bigdata112)

1. Modify hadoop-env.sh

export JAVA_HOME=/training/jdk1.8.0_144

2. Modify core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/training/hadoop-2.7.3/tmp</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>bigdata112:2181,bigdata113:2181,bigdata114:2181</value>
</property>

3. Modify hdfs-site.xml (this configures which namenodes make up the nameservice)

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
</property>
<property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>bigdata112:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>bigdata112:50070</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>bigdata113:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>bigdata113:50070</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://bigdata112:8485;bigdata113:8485/ns1</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/training/hadoop-2.7.3/journal</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
</property>

4. Modify mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

5. Modify yarn-site.xml

<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>bigdata112</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>bigdata113</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>bigdata112:2181,bigdata113:2181,bigdata114:2181</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

6. Modify slaves

bigdata114
bigdata115

7. Copy the configured hadoop directory to the other nodes

scp -r /training/hadoop-2.7.3/ root@bigdata113:/training/
scp -r /training/hadoop-2.7.3/ root@bigdata114:/training/
scp -r /training/hadoop-2.7.3/ root@bigdata115:/training/

V. Start the Zookeeper Cluster

On each of the three nodes, go to the bin directory under the zk installation directory:

Start:        ./zkServer.sh start
Check status: ./zkServer.sh status
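
If passwordless SSH is in place, the whole ensemble can also be started from bigdata112 in one loop (a sketch; zkServer.sh may need JAVA_HOME exported for non-interactive shells):

for h in bigdata112 bigdata113 bigdata114; do
    ssh $h "/training/zookeeper-3.4.6/bin/zkServer.sh start"
done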

VI. Start the JournalNodes on bigdata112 and bigdata113

hadoop-daemon.sh start journalnode
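
To confirm they are up, jps (shipped with the JDK) should list a JournalNode process on each of the two nodes:

jps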

VII. Format HDFS (run on bigdata112)

  1. Format HDFS
    hdfs namenode -format

2. Copy /training/hadoop-2.7.3/tmp/dfs on bigdata112 into /training/hadoop-2.7.3/tmp/dfs on bigdata113

scp -r /training/hadoop-2.7.3/tmp/dfs/* root@bigdata113:/training/hadoop-2.7.3/tmp/dfs/
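
As an aside (not part of the original procedure), Hadoop also ships a built-in way to initialize the standby: once nn1 has been formatted and started, running the following on bigdata113 fetches the same metadata over HTTP. The manual scp above achieves the same result without starting nn1 first.

hdfs namenode -bootstrapStandby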

3. Format zookeeper

hdfs zkfc -formatZK

Log:

17/07/13 00:34:33 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.
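
To double-check, the znode can be listed with the ZooKeeper client (run from the ZooKeeper bin directory; ls should show ns1 under /hadoop-ha):

./zkCli.sh -server bigdata112:2181
ls /hadoop-ha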

VIII. Start the Hadoop Cluster on bigdata112

start-all.sh

Log (captured on a cluster whose nodes were named bigdata12-15, corresponding to bigdata112-115 here):

    Starting namenodes on [bigdata12 bigdata13]
    bigdata12: starting namenode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop113.out
    bigdata13: starting namenode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop112.out
    bigdata14: starting datanode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop115.out
    bigdata15: starting datanode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop114.out
    bigdata13: starting zkfc, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata13.out
    bigdata12: starting zkfc, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata12.out

The ResourceManager on bigdata113 must be started separately:

yarn-daemon.sh start resourcemanager
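
Once everything is up, the HA state of both the NameNode and ResourceManager pairs can be verified (nn1/nn2 and rm1/rm2 are the ids configured above; one of each pair should report active, the other standby):

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2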

IX. Further Discussion

1. Fencing mechanisms and isolation levels
    (*) Relational databases: without transaction isolation levels you get dirty reads, non-repeatable reads, and phantom reads.
    (*) HDFS HA: without a fencing mechanism (the HA analogue of an isolation level) you get the split-brain problem.
2. What is the split-brain problem?
    Split-brain hits the data nodes (DataNodes): for some reason, multiple active NameNodes exist in the same HDFS cluster at once, and the DataNodes no longer know which one is the real NameNode.
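
A rough way to see fencing and automatic failover in action (a hypothetical test, with <pid> standing for the NameNode process id reported by jps):

jps                                  # on the active NameNode, e.g. bigdata112
kill -9 <pid>                        # simulate a crash of the active NameNode
hdfs haadmin -getServiceState nn2    # should report active shortly afterwards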

Reposted from: https://blog.51cto.com/12824426/2177663
