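Hadoop Detailed Installation and Configuration

Learning Hadoop, Step One: Setting Up the Base Environment

1. Download and install

Install ssh:

    sudo apt-get install openssh-server openssh-client

3. Set up vsftpd

    sudo apt-get update
    sudo apt-get install vsftpd

For configuration, see the reference. To start, stop, and restart the service:

    sudo /etc/vsftpd start      # start
    sudo /etc/vsftpd stop       # stop
    sudo /etc/vsftpd restart    # restart

4. Install the JDK

    sudo chown -R hadoop:hadoop /opt
    cp /soft/ /opt
    sudo vi /etc/profile

Add an alias for unpacking archives to /etc/profile:

    alias untar='tar -zxvf'

Reload the profile and unpack the JDK. Because source is a shell builtin, "sudo source /etc/profile" fails; run it without sudo:

    source /etc/profile
    untar jdk*

Environment variable configuration

    vi /etc/profile

Add the following at the end of the profile file:

    # set java environment
    export JAVA_HOME=/opt/
    export CLASSPATH=.:$JAVA_HOME/lib/:$JAVA_HOME/lib/
    export PATH=$JAVA_HOME/bin:$PATH

Save and exit when done. To apply the change without rebooting:

    source /etc/profile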
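The profile edit above can also be scripted. What follows is a minimal sketch rather than the author's method: it writes the three export lines into a scratch file instead of /etc/profile and loads them with `.` (the POSIX spelling of source). The JDK directory `/opt/jdk` and the function name `append_java_env` are hypothetical placeholders, and the CLASSPATH is simplified to the lib directory only.

```shell
#!/bin/sh
# Sketch: append the Java environment block to a profile-style file once.
# ASSUMPTION: /opt/jdk is a placeholder; use the directory your JDK unpacked to.
JDK_DIR="/opt/jdk"

# append_java_env FILE (hypothetical helper):
# add the export lines unless the marker comment is already present.
append_java_env() {
    profile="$1"
    if grep -q '^# set java environment$' "$profile" 2>/dev/null; then
        return 0    # already configured; appending again would duplicate lines
    fi
    # $JDK_DIR is expanded now; \$JAVA_HOME is written literally into the file.
    cat >> "$profile" <<EOF
# set java environment
export JAVA_HOME=$JDK_DIR
export CLASSPATH=.:\$JAVA_HOME/lib
export PATH=\$JAVA_HOME/bin:\$PATH
EOF
}

# Demonstrate on a scratch file so that /etc/profile is left untouched.
demo_profile="$(mktemp)"
append_java_env "$demo_profile"
append_java_env "$demo_profile"   # second call is a no-op

# Load the settings into the current shell, as 'source /etc/profile' would.
. "$demo_profile"
echo "JAVA_HOME=$JAVA_HOME"
```

Guarding on the marker comment keeps repeated runs from stacking duplicate export lines, which is easy to do by accident when editing the profile by hand on three machines.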
Verify that the installation succeeded:

    java -version

Other issues:

    "unable to resolve host": see the reference for the fix.
    Boot hangs at "Starting sendmail": see the reference for the fix.
    "E: Unable to locate package vsftpd" when installing software: see the reference.
    vi/vim usage guide: see the reference.

Clone the master virtual machine to node1 and node2

Set the hostname of each machine: master on master, node1 on node1, node2 on node2. (When node1 and node2 boot, the system assigns them incrementing IPs by default, so nothing needs to be changed manually.)

On each machine, edit the IPs and hostnames in /etc/hosts, including the IPs and hostnames of the other nodes.

Configure passwordless ssh login

    hadoop@node1:~$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    Generating public/private dsa key pair.
    Created directory /home/hadoop/.ssh.
    Your identification has been saved in /home/hadoop/.ssh/id_dsa.
    Your public key has been saved in /home/hadoop/.ssh/.
    The key fingerprint is:
    SHA256:B8vBju/uc3kl/v9lrMqtltttttCcXgRkQPbVoU hadoop@node1
    The key's randomart image is:
    [randomart image]
    hadoop@node1:~$ cd .ssh
    hadoop@node1:~/.ssh$ ll
    total 16
    drwx------  2 hadoop hadoop 4096 Jul 24 20:31 ./
    drwxr-xr-x 18 hadoop hadoop 4096 Jul 24 20:31 ../
    -rw-------  1 hadoop hadoop  668 Jul 24 20:31 id_dsa
    -rw-r--r--  1 hadoop hadoop  602 Jul 24 20:31

The public key is then appended to authorized_keys. The second listing confirms it: authorized_keys has the same 602-byte size as the public key.

    hadoop@node1:~/.ssh$ cat authorized_keys
    hadoop@node1:~/.ssh$ ll
    total 20
    drwx------  2 hadoop hadoop 4096 Jul 24 20:32 ./
    drwxr-xr-x 18 hadoop hadoop 4096 Jul 24 20:31 ../
    -rw-rw-r--  1 hadoop hadoop  602 Jul 24 20:32 authorized_keys
    -rw-------  1 hadoop hadoop  668 Jul 24 20:31 id_dsa
    -rw-r--r--  1 hadoop hadoop  602 Jul 24 20:31

Single-machine loopback test of passwordless ssh login:

    hadoop@node1:~/.ssh$ ssh localhost
    The authenticity of host localhost () can't be established.
    ECDSA key fingerprint is SHA256:daO0dssyqt12tt9yGUauImOh6tt6A1SgxzSfSmpQqJVEiQTxas.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added localhost (ECDSA) to the list of known hosts.
    Welcome to Ubuntu (GNU/Linux x86_64)
     * Documentation:
    packages can be updated.
    178 updates are security updates.
    New release LTS available.
    Run do-release-upgrade to upgrade to it.
    Last login: Sun Jul 24 20:21:39 2016 from
    hadoop@node1:~$ exit
    logout
    Connection to localhost closed.
    hadoop@node1:~/.ssh$

This output means the setup succeeded. Do the same on the other two nodes.

Let the master node log in to the two slave nodes over SSH without a password

    hadoop@node1:~/.ssh$ scp hadoop@master:~/.ssh/ ./
    The authenticity of host master () can't be established.
    ECDSA key fingerprint is SHA256:daO0dssyqtt9yGUuImOh646A1SgxzSfatSmpQqJVEiQTxas.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added master, (ECDSA) to the list of known hosts.
    hadoop@master's password:
     100% 603 s 00:00
    hadoop@node1:~/.ssh$ cat authorized_keys

The transcript above shows node1 using scp to log in to master remotely and copy master's public key file into the current directory. This step requires password authentication.
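Appending a public key to authorized_keys, as done on node1 above, is repeated on every node, so it can be wrapped in a small helper. This is a sketch under stated assumptions, not the author's procedure: `authorize_key` is a hypothetical function name, the key text is a dummy string rather than a real key, and the demo works in a scratch directory instead of ~/.ssh.

```shell
#!/bin/sh
# Sketch: append a public key to authorized_keys only if it is not there yet.
# authorize_key PUBKEY_FILE SSH_DIR   (hypothetical helper, not part of OpenSSH)
authorize_key() {
    pubkey_file="$1"
    ssh_dir="$2"
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"      # sshd may refuse keys if these are group/world-writable
    touch "$auth_file"
    chmod 600 "$auth_file"

    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Demo in a scratch directory with a dummy key line (NOT a real key).
demo_dir="$(mktemp -d)"
printf '%s\n' 'ssh-dss AAAAB3DUMMYKEY hadoop@master' > "$demo_dir/master_key.pub"

authorize_key "$demo_dir/master_key.pub" "$demo_dir/dot_ssh"
authorize_key "$demo_dir/master_key.pub" "$demo_dir/dot_ssh"   # no duplicate added

key_count="$(grep -c . "$demo_dir/dot_ssh/authorized_keys")"
echo "authorized_keys now holds $key_count key(s)"
```

In the real flow the first argument would be the public key file copied from master with scp, and the second would be "$HOME/.ssh". Recent OpenSSH releases disable DSA keys by default, so on a newer system an RSA or Ed25519 key may be needed in place of -t dsa.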
Next, master's public key file is appended to authorized_keys. If nothing goes wrong, master can then connect to node1 over ssh without a password. On the master node:

    hadoop@master:~/.ssh$ ssh node1
    The authenticity of host node1 () can't be established.
    ECDSA key fingerprint is SHA256:daO0dssyqt9yGUuImOh3466A1SttgxzSfSmpQqJVEiQTxas.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added node1, (ECDSA) to the list of known hosts.
    Welcome to Ubuntu (GNU/Linux x86_64)
     * Documentation:
    packages can be updated.
    178 updates are security updates.
    New release LTS available.
    Run do-release-upgrade to upgrade to it.
    Last login: Sun Jul 24 20:39:30 2016 from
    hadoop@node1:~$ exit
    logout
    Connection to node1 closed.
    hadoop@master:~/.ssh$

As the output shows, the first connection to node1 asks for a "yes" confirmation, so master cannot yet connect unattended. After entering yes the login succeeds, and exit returns to the master node. One more check remains: run ssh node1 again. If it no longer asks for "yes", passwordless login works:

    hadoop@master:~/.ssh$ ssh node1
    Welcome to Ubuntu (GNU/Linux x86_64)
     * Documentation:
    packages can be updated.
    178 updates are security updates.
    New release LTS available.
    Run do-release-upgrade to upgrade to it.
    Last login: Sun Jul 24 20:47:20 2016 from
    hadoop@node1:~$ exit
    logout
    Connection to node1 closed.
    hadoop@master:~/.ssh$

master can now log in to node1 over ssh without a password. Repeat the same procedure for node2.

On the surface, passwordless login to both nodes is now configured, but the same work still has to be done for the master node itself. This step is a little confusing, and I cannot fully explain the reason. Reportedly it is needed on real physical nodes because the jobtracker may be placed on another node rather than on master. One concrete reason: the start scripts appear to reach master over ssh like any other host, as the "master: starting namenode" line in the step 8 output suggests.

Passwordless ssh login test for master itself:

    hadoop@master:~/.ssh$ scp hadoop@master:~/.ssh/ ./
    The authenticity of host master () can't be established.
    ECDSA key fingerprint is SHA256:daO0dssttqt9yGUuImOahtt166AgxttzSfSmpQqJVEiQTxas.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added master (ECDSA) to the list of known hosts.
     100% 603 s 00:00
    hadoop@master:~/.ssh$ cat authorized_keys
    hadoop@master:~/.ssh$ ssh master
    Welcome to Ubuntu (GNU/Linux x86_64)
     * Documentation:
    packages can be updated.
    178 updates are security updates.
    New release LTS available.
    Run do-release-upgrade to upgrade to it.
    Last login: Sun Jul 24 20:39:24 2016 from
    hadoop@master:~$ exit
    logout
    Connection to master closed.

SSH passwordless login is now fully configured.

Unpack hadoop

Then update the environment variables:

    vi /etc/profile

    export JAVA_HOME=/opt/
    export CLASSPATH=.:$JAVA_HOME/lib/:$JAVA_HOME/lib/
    export HADOOP_HOME=/opt/hadoop
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS=
    alias untar='tar -zxvf'
    alias viprofile='vi /etc/profile'
    alias sourceprofile='source /etc/profile'
    alias catprofile='cat /etc/profile'
    alias cdhadoop='cd /opt/hadoop/'
    alias startdfs=$HADOOP_HOME/sbin/
    alias startyarn=$HADOOP_HOME/sbin/
    alias stopdfs=$HADOOP_HOME/sbin/
    alias stopyarn=$HADOOP_HOME/sbin/

    source /etc/profile

Step 6: Modify the configuration

Seven files under $HADOOP_HOME/etc/hadoop/ need to be modified, the last of which is $HADOOP_HOME/etc/hadoop/slaves. Here $HADOOP_HOME is the hadoop root directory.

a) In two of these files, the main change is the directory after JAVA_HOME: set it to where the JDK actually lives on this machine.

    vi etc/hadoop/ (and vi etc/hadoop/)

Find the line below and change it to your actual JDK directory:

    export JAVA_HOME=/opt/

In addition, it is recommended to add this line:

    export HADOOP_PREFIX=/opt/hadoop

b) Modify the file following the content below:

    /opt/hadoop/tmp

Note: if the /opt/hadoop/tmp directory does not exist, create it manually with mkdir first. For the complete list of parameters, see the reference.

c)

    :50020
    :50075
    2

Note: the value 2 is the number of data replicas, which should generally not exceed the number of datanode nodes. For the complete list of parameters, see the reference.

d)

    yarn

For the complete list of parameters, see the reference.

e)

    mapreduce_shuffle

For the complete list of parameters, see the reference.

Also, many parameters have been marked as deprecated between hadoop releases; see the reference for details.

Leave the last file, slaves, alone for now (you can rename it first with mv slaves ). Once the configuration above is done, NameNode can be tested on master. First format it:

    $HADOOP_HOME/bin/hdfs namenode -format

    ...
    16/07/25 20:34:42 INFO : Allocated new BlockPoolId: BP-
    16/07/25 20:34:42 INFO : Storage directory /opt/hadoop/tmp/dfs/name has been successfully formatted.
    16/07/25 20:34:43 INFO : Going to retain 1 images with txid >= 0
    16/07/25 20:34:43 INFO : Exiting with status 0
    16/07/25 20:34:43 INFO : SHUTDOWN_MSG:
    /*
    SHUTDOWN_MSG: Shutting down NameNode at master/
    */

When you see this output, formatting succeeded. Next, start the service:

    $HADOOP_HOME/sbin/

After startup completes, run jps (or ps -ef | grep ...) to list the processes. If you see these two processes, the master node is basically OK:

    5161 SecondaryNameNode
    4989 NameNode

Then run:

    $HADOOP_HOME/sbin/

When it finishes, run jps again to list the processes:

    5161 SecondaryNameNode
    5320 ResourceManager
    4989 NameNode

If you see these 3 processes, yarn is OK as well.

f) Modify /opt/hadoop/etc/hadoop/slaves

If you renamed the file earlier with mv slaves , first run mv slaves to restore the name. Then edit it with vi slaves and enter:

    node1
    node2

Save and exit. Finally, run the following to stop the services started earlier:

    $HADOOP_HOME/sbin/
    $HADOOP_HOME/sbin/

Step 7: Copy the hadoop directory on master to node1 and node2

Still on the master machine:
    cd                          # go to the home directory first
    cd /opt
    zip -r hadoop
    scp -r hadoop@node1:/opt/
    scp -r hadoop@node2:/opt/
    unzip

Note: the hadoop temporary directory (tmp) and data directory (data) on node1 and node2 still have to be created manually first.

Step 8: Verify

On the master node, start the services again:

    $HADOOP_HOME/sbin/
    $HADOOP_HOME/sbin/

    hadoop@master:/opt/hadoop/sbin$
    Starting namenodes on master
    master: starting namenode, logging to /opt/hadoop/logs/
    node1: starting datanode, logging to /opt/hadoop/logs/
    node2: starting datanode, logging to /opt/hadoop/logs/
    Starting secondary namenodes
    : starting secondarynamenode, logging to /opt/hadoop/logs/

    hadoop@master:/opt/hadoop/sbin$
    starting yarn daemons
    starting resourcemanager, logging to /opt/hadoop/logs/
    node1: starting nodemanager, logging to /opt/hadoop/logs/
    node2: starting nodemanager, logging to /opt/hadoop/logs/

If all goes well, the master node has the following 3 processes:

    ps -ef | grep ResourceManager
    ps -ef | grep SecondaryNameNode
    ps -ef | grep NameNode

    7482 ResourceManager
    7335 SecondaryNameNode
    7159 NameNode

node1 and node2 have the following 2 processes:

    ps -ef | grep DataNode
    ps -ef | grep NodeManager

    2296 DataNode
    2398 NodeManager

The status can also be viewed in a browser, and bin/hdfs dfsadmin -report shows the state of HDFS.
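The process check above can be automated. The following sketch is an illustration, not part of the original procedure: `check_daemons` is a hypothetical helper that takes jps-style text and a list of expected daemon names. It compares the second column exactly, because a plain grep for NameNode is also satisfied by a SecondaryNameNode line.

```shell
#!/bin/sh
# Sketch: verify that every expected daemon name appears in jps-style output.
# check_daemons "JPS_OUTPUT" NAME...   (hypothetical helper)
# Returns 0 when all names are present, 1 when at least one is missing.
check_daemons() {
    jps_output="$1"
    shift
    missing=0
    for name in "$@"; do
        # Exact match on column 2; awk exits 0 only if the name was seen.
        if printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "ok: $name"
        else
            echo "MISSING: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Sample listing taken from the master-node process list shown above.
master_sample="7482 ResourceManager
7335 SecondaryNameNode
7159 NameNode"

check_daemons "$master_sample" NameNode SecondaryNameNode ResourceManager
master_status=$?
```

On a live node the first argument would be "$(jps)" instead of the sample text, with DataNode and NodeManager as the expected names on node1 and node2.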