Spark Programming: A Detailed Analysis
10.10.20.18, slave2: 10.10.20.19
2. Edit the hosts file on each of the three servers as follows:
vim /etc/hosts
127.0.0.1 localhost
10.10.10.88 spark-master
10.10.11.18 spark-slave1
10.10.11.19 spark-slave2
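Editing the file by hand on three machines invites duplicate or missing entries. As a minimal sketch (not part of the original steps), the same entries could be added with a small function that skips hostnames already present. `HOSTS_FILE` and `add_host` are names made up for this example; the file defaults to a temporary file so the sketch can be tried without touching /etc/hosts.

```shell
# HOSTS_FILE is a hypothetical variable for this sketch; it defaults to a
# temporary file so the example is harmless. On a real server it would be
# /etc/hosts (root required).
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# add_host IP NAME
# Append "IP NAME" unless a non-comment line already maps that name.
add_host() {
    if ! awk -v n="$2" '
        !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$HOSTS_FILE"
    then
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

add_host 10.10.10.88 spark-master
add_host 10.10.11.18 spark-slave1
add_host 10.10.11.19 spark-slave2
add_host 10.10.11.19 spark-slave2   # repeated call adds nothing

cat "$HOSTS_FILE"
```

Running it a second time leaves the file unchanged, which makes it safe to reuse on each server.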
3. Set up passwordless login by generating a public/private key pair
Log in to the master machine and run the following command:
[root@spark-master /]# ssh-keygen -t rsa
Keep pressing Enter until the key is generated:
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/root/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Nn0sHWtXvvMtgVaDyr2NlBha3RCDiHo4MNgpWkxFUOg root@VM_10_45_centos
The key's randomart image is:
+---[RSA 2048]----+
|=oB+...o.|
|oO.....|
|.+oo..+.|
|.E+..oooo+o|
|oS+o=*+...|
|...++*...|
|o++|
|oo+|
|.o|
+----[SHA256]-----+
[root@spark-master /]#
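The interactive prompts above can also be avoided. The following is a rough sketch, assuming the OpenSSH `ssh-keygen` client is installed: `-N ""` gives an empty passphrase and `-f` names the output file, so nothing is asked. `KEY_DIR` and `KEY_FILE` are hypothetical names pointing at a temporary directory rather than /root/.ssh, so an existing key is never overwritten.

```shell
# KEY_FILE is a hypothetical path for this sketch; the key generated in
# this article lives at /root/.ssh/id_rsa.
KEY_DIR="$(mktemp -d)"
KEY_FILE="$KEY_DIR/id_rsa"

if ! command -v ssh-keygen >/dev/null 2>&1; then
    echo "ssh-keygen not found; install the OpenSSH client first" >&2
elif [ ! -f "$KEY_FILE" ]; then
    # -t rsa -b 2048 : key type and size, matching the output shown above
    # -N ""          : empty passphrase, so no passphrase prompt
    # -f FILE        : output file, so no "Enter file" prompt
    # -q             : suppress the banner and randomart
    ssh-keygen -t rsa -b 2048 -N "" -f "$KEY_FILE" -q
    # Print the fingerprint of the new public key.
    ssh-keygen -l -f "$KEY_FILE.pub"
fi
```

Skipping generation when the file already exists avoids the "Overwrite (y/n)?" question seen in the transcript.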
4. Copy the public key file generated in the previous step to the slave1 and slave2 servers:
[root@spark-master data]# scp /root/.ssh/id_rsa.pub root@10.10.11.18:/data/
[root@spark-master data]# scp /root/.ssh/id_rsa.pub root@10.10.11.19:
Then log in to slave1 and slave2 in turn and append the public key to the authorized keys file:
[root@spark-slave1 /]# cd data
[root@spark-slave1 data]# ls
id_rsa.pub
[root@spark-slave1 data]# cat id_rsa.pub >> /root/.ssh/authorized_keys
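Appending with `>>` adds the key again every time the command is repeated, and sshd, with its default StrictModes setting, may ignore an authorized_keys file whose permissions are too open. A small sketch of a more careful import follows; `install_key` is a hypothetical helper, and the demo uses a temporary directory and a placeholder key line rather than a real key.

```shell
# install_key PUBKEY_FILE SSH_DIR
# Append the key to SSH_DIR/authorized_keys unless that exact line is
# already present, then restrict permissions on the directory and file.
install_key() {
    mkdir -p "$2"
    touch "$2/authorized_keys"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -- "$(cat "$1")" "$2/authorized_keys"; then
        cat "$1" >> "$2/authorized_keys"
    fi
    chmod 700 "$2"
    chmod 600 "$2/authorized_keys"
}

# Demo in a temporary directory with a placeholder key line. On the
# slaves the arguments would be /data/id_rsa.pub and /root/.ssh.
demo="$(mktemp -d)"
echo 'ssh-rsa AAAAexample root@spark-master' > "$demo/id_rsa.pub"
install_key "$demo/id_rsa.pub" "$demo/.ssh"
install_key "$demo/id_rsa.pub" "$demo/.ssh"   # second run adds nothing
```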
Return to the master server and test whether the authorization works:
# Test slave1
[root@spark-master ~]# ssh spark-slave1
Last login: Mon Oct 15 15:46:27 2018 from 10.10.10.88
# Test slave2
[root@spark-master ~]# ssh spark-slave2
56:30 2018 from 10.10.10.88
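The manual test above opens an interactive session on each slave. As a sketch of a scripted check, the function below relies on the `BatchMode=yes` option, which makes ssh fail instead of asking for a password, so the exit status shows whether key-based login works. `check_login` is a name invented for this example.

```shell
# check_login HOST
# BatchMode=yes disables password prompts, so success here means the key
# was accepted. ConnectTimeout bounds the wait for unreachable hosts.
check_login() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true; then
        echo "$1: passwordless login OK"
    else
        echo "$1: passwordless login FAILED"
        return 1
    fi
}

# On the master one might then run:
#   for host in spark-slave1 spark-slave2; do check_login "$host"; done
```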
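5. Install the JDK
Check whether a JDK is installed, and install one first if it is not. The JDK was already present on my machines, so the installation steps are omitted here; if you need them, please look them up online.
[root@spark-master ~]# java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
[root@spark-master_centos ~]#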
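When the same check has to run on all three machines, it can help to extract just the version string. The snippet below is only a sketch: `java_version_of` is a hypothetical helper that pulls the quoted version out of the first line of output. Note that `java -version` writes to standard error, which is why the call redirects it.

```shell
# java_version_of LINE
# Extract the quoted version from a line such as: java version "1.8.0_152"
java_version_of() {
    printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p'
}

# java -version prints to stderr, hence the 2>&1 redirection.
if command -v java >/dev/null 2>&1; then
    java_version_of "$(java -version 2>&1 | head -n 1)"
else
    echo "java not found; install a JDK first" >&2
fi
```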
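6. Install and configure Scala
1. Download
Scala download page: https://www.scala-lang.org/download/. Find the version you want; I chose scala-2.12.7.tgz.
My download kept failing; after searching online, I found a suggestion to change the address to
2. Install and configure
# Switch to the Scala installation directory
[root@spark-master ~]# cd /opt/scala/
# Unpack the archive
[root@spark-master scala]# tar -xvf scala-2.12.7.tgz -C /opt/scala/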
scala-2.12.7/
scala-2.12.7/man/
scala-2.12.7/man/man1/
scala-2.12.7/man/man1/fsc.1
scala-2.12.7/man/man1/scalac.1
scala-2.12.7/man/man1/scalap.1
scala-2.12.7/man/man1/scaladoc.1
scala-2.12.7/man/man1/scala.1
scala-2.12.7/doc/
scala-2.12.7/doc/licenses/
scala-2.12.7/doc/licenses/mit_tools.tooltip.txt
scala-2.12.7/doc/licenses/mit_jquery.txt
scala-2.12.7/doc/licenses/bsd_asm.txt
scala-2.12.7/doc/licenses/bsd_jline.txt
scala-2.12.7/doc/licenses/apache_jansi.txt
scala-2.12.7/doc/License.rtf
scala-2.12.7/doc/README
scala-2.12.7/doc/LICENSE.md
scala-2.12.7/doc/tools/
scala-2.12.7/doc/tools/scala.html
scala-2.12.7/doc/tools/css/
scala-2.12.7/doc/tools/css/style.css
scala-2.12.7/doc/tools/index.html
scala-2.12.7/doc/tools/scaladoc.html
scala-2.12.7/doc/tools/scalap.html
scala-2.12.7/doc/tools/scalac.html
scala-2.12.7/doc/tools/images/
scala-2.12.7/doc/tools/images/external.gif
scala-2.12.7/doc/tools/images/scala_logo.png
scala-2.12.7/doc/tools/fsc.html
scala-2.12.7/bin/
scala-2.12.7/bin/scalap.bat
scala-2.12.7/bin/scala
scala-2.12.7/bin/scalac.bat
scala-2.12.7/bin/fsc.bat
scala-2.12.7/bin/scaladoc.bat
scala-2.12.7/bin/scala.bat
scala-2.12.7/bin/scalap
scala-2.12.7/bin/scalac
scala-2.12.7/bin/fsc
scala-2.12.7/bin/scaladoc
scala-2.12.7/lib/
scala-2.12.7/lib/scala-library.jar
scala-2.12.7/lib/scala-compiler.jar
scala-2.12.7/lib/jline-2.14.6.jar
scala-2.12.7/lib/scala-reflect.jar
scala-2.12.7/lib/scalap-2.12.7.jar
scala-2.12.7/lib/scala-swing_2.12-2.0.3.jar
scala-2.12.7/lib/scala-parser-combinators_2.12-1.0.7.jar
scala-2.12.7/lib/scala-xml_2.12-1.0.6.jar
# Edit the configuration file
[root@spark-master scala]# vim /etc/profile
# Add the following environment variables to the file
export SCALA_HOME=/opt/scala/scala-2.12.7
export PATH=$PATH:$SCALA_HOME/bin
# Press ESC, then type :wq! to save and exit the editor
# Apply the environment variables
[root@spark-master scala]# source /etc/profile
# Check that the configuration works
[root@spark-master scala]# scala -version
Scala code runner version 2.12.7 -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.
[root@spark-master scala]#
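The profile edit can also be scripted instead of done in vim. A minimal sketch follows, assuming nothing beyond the two export lines given above: `PROFILE` and `append_once` are hypothetical names, and the file defaults to a temporary file rather than /etc/profile so the example is safe to try.

```shell
# PROFILE is a hypothetical variable; it defaults to a temporary file.
# On the servers in this article it would be /etc/profile.
PROFILE="${PROFILE:-$(mktemp)}"

# append_once LINE
# Add LINE to the profile only if that exact line is not already there.
append_once() {
    grep -qxF -- "$1" "$PROFILE" || printf '%s\n' "$1" >> "$PROFILE"
}

append_once 'export SCALA_HOME=/opt/scala/scala-2.12.7'
append_once 'export PATH=$PATH:$SCALA_HOME/bin'
append_once 'export PATH=$PATH:$SCALA_HOME/bin'   # no duplicate is written

# Load the settings into the current shell and confirm the result.
. "$PROFILE"
echo "SCALA_HOME=$SCALA_HOME"
```

The single quotes keep `$PATH` and `$SCALA_HOME` unexpanded in the file, so they are resolved each time the profile is loaded.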
After installation and configuration are complete, repeat the same steps on the other two slave machines.
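Since passwordless login is already in place, the repetition could be driven from the master. The loop below is a sketch under assumptions: it supposes the archive is still at /opt/scala/scala-2.12.7.tgz on the master and that the slaves should use the same directory. `DRY_RUN` and `run` are hypothetical names; by default the commands are only printed, not executed.

```shell
# DRY_RUN=1 (the default in this sketch) only prints each command;
# set DRY_RUN=0 on the master to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]
# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

for host in spark-slave1 spark-slave2; do
    run ssh "root@$host" mkdir -p /opt/scala
    run scp /opt/scala/scala-2.12.7.tgz "root@$host:/opt/scala/"
    run ssh "root@$host" tar -xf /opt/scala/scala-2.12.7.tgz -C /opt/scala/
done
```

The /etc/profile changes and the `scala -version` check still have to be carried out on each slave afterwards.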
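II. Distributed installation and configuration of Hadoop
1. Download Hadoop
Hadoop can be downloaded from the Apache website; I chose version 2.8.5: https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz. When downloading, choose the file named in the form hadoop-2.x.y.tar.gz, which is the prebuilt binary release; the other archive, whose name contains src, is the Hadoop source code and must be compiled before use.
2. Upload the installation file to the target servers
First upload the Hadoop installation file to the /opt/hadoop directory on the master server, then copy it to slav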