Hadoop Cluster Environment Setup

1. Preparing the Environment

1.1 Packages
1) Prepare 4 PCs.
2) Operating system: CentOS-7.0-1406-x86_64-DVD.iso
3) Java: jdk-8u121-linux-x64.gz
4) Hadoop: hadoop-2.7.4-x64.tar.gz
5) HBase: hbase-1.2.1-bin.tar.gz

1.2 Network layout
Hostname  IP
master    172.16.18.102
slave1    172.16.18.103
slave2    172.16.18.104
slave3    172.16.18.105

1.3 Common commands
# systemctl start foo.service       # start a service
# systemctl stop foo.service        # stop a service
# systemctl restart foo.service     # restart a service
# systemctl status foo.service      # show a service's status, running or not
# systemctl enable foo.service      # start a service at boot
# systemctl disable foo.service     # do not start a service at boot
# systemctl is-enabled iptables.service   # check whether a service is enabled at boot
# reboot                            # reboot the host
# shutdown -h now                   # shut down immediately
# source /etc/profile               # apply /etc/profile changes immediately
# yum install net-tools             # install the classic networking tools

2. Installing and Configuring CentOS

2.1 Install CentOS
1) Boot from CentOS-7.0-1406-x86_64-DVD.iso.
2) Select "Install CentOS 7" and press Enter.
3) Choose a language: English is the default; Chinese is fine for learning, but use English in a production environment.
4) Configure the network and hostname: set the hostname to master, switch the network on, and configure a manual (static) IPv4 address.
5) Choose the installation destination; select manual partitioning with standard partitions, click "Click here to create them automatically", click Done, and accept the changes.
6) Set the root password (Jit123 in this guide).
7) Reboot; installation is complete.

2.2 Configure IP
2.2.1 Check the current IP
# ip addr
or
# ip link

2.2.2 Configure the IP address and gateway
# cd /etc/sysconfig/network-scripts             # enter the network config directory
# find ifcfg-em*                                # locate the NIC config file, e.g. ifcfg-em1
# vi ifcfg-em1                                  # edit the NIC config file
or
# vi /etc/sysconfig/network-scripts/ifcfg-em1   # edit the NIC config file
Settings:
BOOTPROTO=static        # static for a fixed IP, dhcp for a dynamic one
ONBOOT=yes              # bring the interface up at boot
IPADDR=172.16.18.102    # IP address
NETMASK=255.255.255.0   # netmask
GATEWAY=172.16.18.1
DNS1=219.149.194.55
# systemctl restart network.service   # restart networking

2.2.3 Configure hosts
# vi /etc/hosts
Add:
172.16.18.102 master
172.16.18.103 slave1
172.16.18.104 slave2
172.16.18.105 slave3
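With the addresses in place, a quick loop can confirm that every node from the table above has a hosts entry before moving on. This is an illustrative sketch, not part of the original procedure: HOSTS_FILE defaults to a demo file so it can be dry-run anywhere; point it at /etc/hosts (as root) on the real machines.

```shell
# Check that every cluster node has a hosts entry.
# HOSTS_FILE defaults to a demo copy for a dry run; use /etc/hosts on real nodes.
HOSTS_FILE="${HOSTS_FILE:-./hosts.demo}"
[ -f "$HOSTS_FILE" ] || cat > "$HOSTS_FILE" <<'EOF'
172.16.18.102 master
172.16.18.103 slave1
172.16.18.104 slave2
172.16.18.105 slave3
EOF
STATUS=ok
for pair in "172.16.18.102 master" "172.16.18.103 slave1" \
            "172.16.18.104 slave2" "172.16.18.105 slave3"; do
  grep -q "^$pair\$" "$HOSTS_FILE" || { echo "missing entry: $pair"; STATUS=missing; }
done
echo "hosts check: $STATUS"
```

The same loop run on each node catches the common mistake of updating /etc/hosts on master only.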
2.3 Disable the firewall
# systemctl status firewalld.service    # check the firewall status
# systemctl stop firewalld.service      # stop the firewall
# systemctl disable firewalld.service   # do not start the firewall at boot

2.4 Time synchronization
# yum install -y ntp        # install the NTP tools
# ntpdate cn.pool.ntp.org   # sync the clock from an NTP pool

2.5 Install and configure the JDK
2.5.1 Remove the bundled OpenJDK
A fresh CentOS install ships with OpenJDK; `java -version` prints something like:
  java version "1.6.0"
  OpenJDK Runtime Environment (build 1.6.0-b09)
  OpenJDK 64-Bit Server VM (build 1.6.0-b09, mixed mode)
It is best to remove OpenJDK before installing the Oracle JDK. First list what is installed:
# rpm -qa | grep java
which shows, for example:
  java-1.4.2-gcj-compat-1.4.2.0-40jpp.115
  java-1.6.0-openjdk-1.6.0.0-1.7.b09.el5
Remove the packages:
# rpm -e --nodeps java-1.4.2-gcj-compat-1.4.2.0-40jpp.115
# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.7.b09.el5
Related queries:
# rpm -qa | grep gcj
# rpm -qa | grep jdk
If rpm cannot find the OpenJDK packages, remove them with yum instead:
# yum -y remove java-1.4.2-gcj-compat-1.4.2.0-40jpp.115
# yum -y remove java-1.6.0-openjdk-1.6.0.0-1.7.b09.el5

2.5.2 Install the JDK
Upload jdk-8u121-linux-x64.gz to root's home directory, then:
# mkdir /home
# tar -zxvf jdk-8u121-linux-x64.gz -C /home/
# rm -rf jdk-8u121-linux-x64.gz

2.5.3 Copy the JDK to the other hosts
# scp -r /home root@slave1:/home/hadoop
# scp -r /home root@slave2:/home/hadoop
# scp -r /home root@slave3:/home/hadoop

2.5.4 Configure the JDK environment variables on every host
# vi /etc/profile
Add:
export JAVA_HOME=/home/jdk1.8.0_121
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# source /etc/profile   # apply the changes
# java -version         # verify the Java version

Create a hadoop user (run on every host):
# groupadd hadoop            # create the hadoop group
# useradd -g hadoop hadoop   # create the hadoop user in the hadoop group
# passwd hadoop              # set its password
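Since the same three export lines must land in /etc/profile on all four machines, it can help to generate them once and copy the snippet around like the JDK itself. A sketch; the jdk-profile.sh filename is made up for illustration:

```shell
# Generate the JDK environment lines once, then append the file to
# /etc/profile on each node.
JDK_HOME="/home/jdk1.8.0_121"
cat > jdk-profile.sh <<EOF
export JAVA_HOME=${JDK_HOME}
export PATH=\$JAVA_HOME/bin:\$PATH
export CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar
EOF
sh -n jdk-profile.sh && echo "snippet parses"
```

Generating the snippet from one variable means a later JDK upgrade only touches JDK_HOME.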
2.6 Passwordless SSH
Check the ssh service on each host:
# systemctl status sshd.service                # check the sshd status
# yum install openssh-server openssh-clients   # install ssh; skip if already installed
# systemctl start sshd.service                 # start sshd; skip if already running

Generate a key pair on every host:
# su - hadoop               # switch to the hadoop user
# ssh-keygen -t rsa -P ''   # generate the key pair (press Enter three times)

On slave1:
# cp ~/.ssh/id_rsa.pub ~/.ssh/slave1.id_rsa.pub
# scp ~/.ssh/slave1.id_rsa.pub hadoop@master:~/.ssh
On slave2:
# cp ~/.ssh/id_rsa.pub ~/.ssh/slave2.id_rsa.pub
# scp ~/.ssh/slave2.id_rsa.pub hadoop@master:~/.ssh
On slave3:
# cp ~/.ssh/id_rsa.pub ~/.ssh/slave3.id_rsa.pub
# scp ~/.ssh/slave3.id_rsa.pub hadoop@master:~/.ssh
On master:
# cd ~/.ssh
# cat id_rsa.pub >> authorized_keys
# cat slave1.id_rsa.pub >> authorized_keys
# cat slave2.id_rsa.pub >> authorized_keys
# cat slave3.id_rsa.pub >> authorized_keys
# scp authorized_keys hadoop@slave1:~/.ssh
# scp authorized_keys hadoop@slave2:~/.ssh
# scp authorized_keys hadoop@slave3:~/.ssh
Then set the key file's permissions on every host:
# su - hadoop
# chmod 600 ~/.ssh/authorized_keys
Test passwordless login:
# ssh slave1   # type "yes" at the first login; if no password prompt follows, the setup works

3. Installing and Configuring Hadoop

3.1 Install Hadoop
Upload hadoop-2.7.4.tar.gz to root's home directory, then:
# tar -zxvf hadoop-2.7.4.tar.gz -C /home/hadoop
# rm -rf hadoop-2.7.4.tar.gz
# mkdir /home/hadoop/hadoop-2.7.4/tmp
# mkdir /home/hadoop/hadoop-2.7.4/logs
# mkdir /home/hadoop/hadoop-2.7.4/hdf
# mkdir /home/hadoop/hadoop-2.7.4/hdf/data
# mkdir /home/hadoop/hadoop-2.7.4/hdf/name

3.1.1 Configure hadoop-env.sh
Edit etc/hadoop/hadoop-env.sh and define:
  # set to the root of your Java installation
  export JAVA_HOME=/home/jdk1.8.0_121

3.1.2 Edit yarn-env.sh
Replace the line
  # export JAVA_HOME=/home/y/libexec/jdk1.7.0/
with
  export JAVA_HOME=/home/jdk1.8.0_121

3.1.3 Edit slaves
# vi /home/hadoop/hadoop-2.7.4/etc/hadoop/slaves
Delete "localhost" and add:
slave1
slave2
slave3

3.1.4 Edit core-site.xml
# vi /home/hadoop/hadoop-2.7.4/etc/hadoop/core-site.xml
Settings (fs.default.name is the deprecated alias of fs.defaultFS; both work in Hadoop 2.7):
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/hadoop-2.7.4/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
    <!-- read/write buffer size in bytes; 131072 bytes = 128 KB -->
  </property>
</configuration>
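A typo in any of these XML files stops the daemons from starting with an unhelpful stack trace, so it is worth checking well-formedness after each edit. A sketch using python3's bundled XML parser; CONF_DIR defaults to a demo directory with one sample file so it can be tried anywhere, and on the cluster you would point it at /home/hadoop/hadoop-2.7.4/etc/hadoop.

```shell
# Verify that every *-site.xml under CONF_DIR is well-formed XML.
CONF_DIR="${CONF_DIR:-./conf.demo}"
mkdir -p "$CONF_DIR"
[ -f "$CONF_DIR/core-site.xml" ] || cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
EOF
RESULT=""
for f in "$CONF_DIR"/*-site.xml; do
  if python3 -c 'import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])' "$f" 2>/dev/null
  then RESULT="$RESULT ok:$(basename "$f")"
  else RESULT="$RESULT broken:$(basename "$f")"
  fi
done
echo "$RESULT"
```

Running this once before copying the config tree to the slaves is much cheaper than chasing a start-up failure across four machines.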
3.1.5 Edit hdfs-site.xml
# vi /home/hadoop/hadoop-2.7.4/etc/hadoop/hdfs-site.xml
Settings:
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-cluster1</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/hadoop-2.7.4/hdf/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/hadoop-2.7.4/hdf/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <!-- number of replicas; 1 is enough for a pseudo-distributed setup -->
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

3.1.6 Edit mapred-site.xml
# cp /home/hadoop/hadoop-2.7.4/etc/hadoop/mapred-site.xml.template /home/hadoop/hadoop-2.7.4/etc/hadoop/mapred-site.xml
# vi /home/hadoop/hadoop-2.7.4/etc/hadoop/mapred-site.xml
Settings:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>

3.1.7 Edit yarn-site.xml
# vi /home/hadoop/hadoop-2.7.4/etc/hadoop/yarn-site.xml
Settings:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>

3.2 Copy Hadoop to the other hosts
# scp -r /home/hadoop/hadoop-2.7.4 hadoop@slave1:/home/hadoop
# scp -r /home/hadoop/hadoop-2.7.4 hadoop@slave2:/home/hadoop
# scp -r /home/hadoop/hadoop-2.7.4 hadoop@slave3:/home/hadoop

3.3 Configure the Hadoop environment variables on every host
# su - root
# vi /etc/profile
Add:
export HADOOP_HOME=/home/hadoop/hadoop-2.7.4
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_LOG_DIR=/home/hadoop/hadoop-2.7.4/logs
export YARN_LOG_DIR=$HADOOP_LOG_DIR
# source /etc/profile   # apply the changes

3.4 Format the NameNode
# cd /home/hadoop/hadoop-2.7.4/sbin
# hdfs namenode -format

3.5 Start Hadoop
# cd /home/hadoop/hadoop-2.7.4/sbin
# start-all.sh
Check that Hadoop is up in a browser:
http://172.16.18.102:50070          # HDFS NameNode web UI
http://172.16.18.102:8088/cluster   # YARN ResourceManager web UI
Check the processes:
# jps
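Rather than eyeballing the jps output on four machines, the expected daemon set can be checked with a small helper. A sketch: check_daemons is a made-up name, and it reads jps output from stdin so it can be demonstrated offline; on a live node you would run `jps | check_daemons master` (or slave).

```shell
# check_daemons ROLE: read `jps` output on stdin and confirm the daemons
# this guide expects for that role are present.
check_daemons() {
  role="$1"
  out="$(cat)"
  case "$role" in
    master) want="NameNode SecondaryNameNode ResourceManager" ;;
    *)      want="DataNode NodeManager" ;;
  esac
  for d in $want; do
    echo "$out" | grep -q " $d\$" || { echo "MISSING: $d"; return 1; }
  done
  echo "all $role daemons running"
}
# Offline demo with sample master output:
printf '2212 ResourceManager\n1917 NameNode\n2078 SecondaryNameNode\n' | check_daemons master
```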
If the master host shows ResourceManager, SecondaryNameNode and NameNode, the start-up succeeded, for example:
2212 ResourceManager
2484 Jps
1917 NameNode
2078 SecondaryNameNode
If each slave host shows DataNode and NodeManager, it succeeded as well, for example:
17153 DataNode
17334 Jps
17241 NodeManager
Stop Hadoop with:
# stop-all.sh

4. Installing and Configuring ZooKeeper

4.1 Configure the ZooKeeper environment variables
# vi /etc/profile
export ZOOKEEPER_HOME=/home/hadoop/zookeeper-3.4.6
export PATH=$ZOOKEEPER_HOME/bin:$PATH
# source /etc/profile

4.2 Configure ZooKeeper
1) Download ZooKeeper from the official website.
2) Run ZooKeeper on slave1, slave2 and slave3:
   slave1 172.16.18.103
   slave2 172.16.18.104
   slave3 172.16.18.105
3) Upload zookeeper-3.4.6.tar.gz to the root directory of any of those servers and unpack it:
# tar -zxvf zookeeper-3.4.6.tar.gz -C /home/hadoop
4) Create a zookeeper-data directory under the ZooKeeper directory, and copy conf/zoo_sample.cfg to zoo.cfg:
# cp /home/hadoop/zookeeper-3.4.6/conf/zoo_sample.cfg zoo.cfg
5) Edit zoo.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/hadoop/zookeeper-3.4.6/zookeeper-data
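The text above stops partway through zoo.cfg. A three-node ensemble also needs a client port, one server.N line per node, and a myid file inside each node's dataDir; the sketch below shows those pieces using ZooKeeper's standard ports and this guide's node names (these lines are not from the original text, and the .demo paths keep it dry-runnable; on the real nodes edit conf/zoo.cfg and the dataDir configured above).

```shell
# Remaining zoo.cfg lines for a 3-node ensemble, written to demo files.
ZK_DATA="${ZK_DATA:-./zookeeper-data.demo}"   # real path: /home/hadoop/zookeeper-3.4.6/zookeeper-data
mkdir -p "$ZK_DATA"
cat > zoo.cfg.extra <<'EOF'
clientPort=2181
server.1=slave1:2888:3888
server.2=slave2:2888:3888
server.3=slave3:2888:3888
EOF
# Each node needs a unique id in dataDir/myid matching its server.N line:
echo 1 > "$ZK_DATA/myid"   # use 2 on slave2 and 3 on slave3
cat zoo.cfg.extra
```

After starting each node with zkServer.sh start, zkServer.sh status reports whether that node is the leader or a follower.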