Hadoop 2.4.1 Cluster Configuration
1. Test environment:
A 4-node cluster with 3 ZooKeeper (ZK) nodes. The hosts file and the role assignment for each node are as follows:
hosts:
192.168.66.91 master
192.168.66.92 slave1
192.168.66.93 slave2
192.168.66.94 slave3
Role assignment (V = the role runs on that node, - = it does not):
Node    ActiveNN  StandbyNN  DN  JournalNode  Zookeeper  FailoverController
master  V         -          -   V            V          V
slave1  -         V          V   V            V          V
slave2  -         -          V   V            V          -
slave3  -         -          V   -            -          -
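ZooKeeper and the JournalNodes both work by majority, which is why three of each are deployed here. As a rough illustration of that arithmetic (the function names are hypothetical and not part of Hadoop or ZooKeeper), a sketch for an ensemble of n nodes might look like this:

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Number of nodes that can be down while a majority is still alive."""
    return n - majority(n)


if __name__ == "__main__":
    # Compare the 3-node ensemble used in this cluster with larger ones.
    for n in (3, 4, 5):
        print(n, "nodes: majority", majority(n), "tolerates", tolerated_failures(n))
```

With 3 nodes, one may fail. A fourth node would not raise that number, which is the usual argument for odd ensemble sizes.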
2. hadoop-env.sh: only the following three settings need to be changed
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_07
# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
# the user that will run the hadoop daemons. Otherwise there is the
# potential for a symlink attack.
export HADOOP_PID_DIR=/home/yarn/Hadoop/hadoop-2.4.1/hadoop_pid_dir
export HADOOP_SECURE_DN_PID_DIR=/home/yarn/Hadoop/hadoop-2.4.1/hadoop_pid_dir
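The NOTE in hadoop-env.sh asks for a PID directory that only the daemon user can write to. One way to spot-check this is sketched below. The helper name is hypothetical, and the check is deliberately narrow: it looks only at the group/other write bits and does not verify who owns the directory.

```python
import os
import stat


def writable_only_by_owner(path: str) -> bool:
    """Return True if path is a directory that group and others cannot write to."""
    info = os.stat(path)
    if not stat.S_ISDIR(info.st_mode):
        return False
    # Any group- or world-write bit makes the directory too permissive.
    return (info.st_mode & (stat.S_IWGRP | stat.S_IWOTH)) == 0


if __name__ == "__main__":
    # Path taken from HADOOP_PID_DIR above; adjust it for your own layout.
    pid_dir = "/home/yarn/Hadoop/hadoop-2.4.1/hadoop_pid_dir"
    if os.path.isdir(pid_dir):
        print(pid_dir, "ok" if writable_only_by_owner(pid_dir) else "too permissive")
    else:
        print(pid_dir, "does not exist yet")
```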
3. core-site.xml, complete file
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You may obtain
a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless
required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
OR CONDITIONS OF ANY KIND, either express or implied. See the License for
the specific language governing permissions and limitations under the License.
See accompanying LICENSE file. -->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://myhadoop</value>
    <description>NameNode URI, in the form hdfs://host:port/. If NN HA is
    enabled, set this to the logical name of the cluster instead; see my
    blog for details.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/yarn/Hadoop/hadoop-2.4.1/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
    <description>Size of read/write buffer used in SequenceFiles.</description>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
    <description>Note: once ZK is configured, ZK must be started before
    formatting or starting the NameNode; otherwise a connection error is
    reported.</description>
  </property>
</configuration>
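Because ZK has to be running before the NameNode is formatted or started, it can help to probe the quorum addresses first. The following is a minimal sketch, not a Hadoop tool: both function names are hypothetical, and an open TCP port only shows that something is listening, not that the ZK ensemble is healthy.

```python
import socket


def parse_quorum(quorum: str) -> list:
    """Split 'host1:port1,host2:port2' into [(host1, port1), (host2, port2)]."""
    servers = []
    for entry in quorum.split(","):
        # Assumes every entry carries an explicit port, as in the file above.
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    quorum = "master:2181,slave1:2181,slave2:2181"  # ha.zookeeper.quorum
    for host, port in parse_quorum(quorum):
        print(f"{host}:{port}", "up" if is_listening(host, port) else "DOWN")
```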
4. hdfs-site.xml, complete file
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You may obtain
a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless
required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
OR CONDITIONS OF ANY KIND, either express or implied. See the License for
the specific language governing permissions and limitations under the License.
See accompanying LICENSE file. -->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- NN HA related configuration **BEGIN** -->
  <property>
    <name>dfs.nameservices</name>
    <value>myhadoop</value>
    <description>Comma-separated list of nameservices. Must be the same as
    the logical name used in fs.defaultFS in core-site.xml.</description>
  </property>
  <property>
    <name>dfs.ha.namenodes.myhadoop</name>
    <value>nn1,nn2</value>
    <description>The prefix for a given nameservice, contains a comma-separated
    list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myhadoop.nn1</name>
    <value>master:8020</value>
    <description>RPC address for namenode1 of myhadoop</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myhadoop.nn2</name>
    <value>slave1:8020</value>
    <description>RPC address for namenode2 of myhadoop</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.myhadoop.nn1</name>
    <value>master:50070</value>
    <description>The address and the base port where the dfs namenode1 web ui
    will listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.myhadoop.nn2</name>
    <value>slave1:50070</value>
    <description>The address and the base port where the dfs namenode2 web ui
    will listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.myhadoop.nn1</name>
    <value>master:53310</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.myhadoop.nn2</name>
    <value>slave1:53310</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
    <description>Whether automatic failover is enabled. See the HDFS High
    Availability documentation for details on automatic HA
    configuration.</description>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.myhadoop</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    <description>Configure the name of the Java class which will be used
    by the DFS Client to determine which NameNode is the current Active,
    and therefore which NameNode is currently serving client requests.
    This class is the client's access proxy and is the key to making HA
    transparent to clients.</description>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    <description>Fencing method used to cut off the previously active
    NameNode during a failover (here, over SSH).</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/yarn/.ssh/id_rsa</value>
    <description>Location of the SSH private key used by sshfence.</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>1000</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/journal/</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master:8485;slave1:8485;slave2:8485/hadoop-journal</value>
    <description>A directory on shared storage between the multiple namenodes
    in an HA cluster. This directory will be written by the active and read
    by the standby in order to keep the namespaces synchronized. This directory
    does not need to be listed in dfs.namenode.edits.dir above. It should be
    left empty in a non-HA cluster.</description>
  </property>
  <!-- NN HA related configuration **END** -->
  <!-- NameNode related configuration **BEGIN** -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/name</value>
    <description>Path on the local filesystem where the NameNode stores
    the namespace and transactions logs persistently. If this is a
    comma-delimited list of directories then the name table is replicated
    in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>1048576</value>
    <description>HDFS block size in bytes. The value used here, 1048576, is
    1 MB, which is the minimum block size. A 128 MB block size for large
    file-systems would be 134217728.</description>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>10</value>
    <description>More NameNode server threads to handle RPCs from large
    number of DataNodes.</description>
  </property>
  <!--
  <property>
    <name>dfs.namenode.hosts</name>
    <value>master</value>
    <description>If necessary, use this to control the list of allowable datanodes.</description>
  </property>
  <property>
    <name>dfs.namenode.hosts.exclude</name>
    <value>slave1,slave2,slave3</value>
    <description>If necessary, use this to control the list of excluded datanodes.</description>
  </property>
  -->
  <!-- NameNode related configuration **END** -->
  <!-- DataNode related configuration **BEGIN** -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/data</value>
    <description>Comma separated list of paths on the local filesystem of
    a DataNode where it should store its blocks. If this is a
    comma-delimited list of directories, then data will be stored in all
    named directories, typically on different devices.</description>
  </property>
  <!-- DataNode related configuration **END** -->
</configuration>
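In the HA section, the nameservice from dfs.nameservices and the NameNode IDs from dfs.ha.namenodes.myhadoop must reappear as the suffix of every per-NameNode key, so a mistyped ID is easy to miss. The sketch below shows one possible consistency check. The helper names are hypothetical, and it checks only the rpc-address and http-address keys, which is an assumption about what is worth checking rather than a complete validator.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text) -> dict:
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_ha_keys(props: dict) -> list:
    """List per-NameNode address keys implied by the HA settings but absent."""
    missing = []
    for ns in props.get("dfs.nameservices", "").split(","):
        ns = ns.strip()
        if not ns:
            continue
        for nn in props.get("dfs.ha.namenodes." + ns, "").split(","):
            nn = nn.strip()
            if not nn:
                continue
            for prefix in ("dfs.namenode.rpc-address", "dfs.namenode.http-address"):
                key = f"{prefix}.{ns}.{nn}"
                if key not in props:
                    missing.append(key)
    return missing


if __name__ == "__main__":
    # Deliberately broken sample: the last key uses "n2" instead of "nn2".
    sample = """<configuration>
      <property><name>dfs.nameservices</name><value>myhadoop</value></property>
      <property><name>dfs.ha.namenodes.myhadoop</name><value>nn1,nn2</value></property>
      <property><name>dfs.namenode.rpc-address.myhadoop.nn1</name><value>master:8020</value></property>
      <property><name>dfs.namenode.http-address.myhadoop.nn1</name><value>master:50070</value></property>
      <property><name>dfs.namenode.rpc-address.myhadoop.n2</name><value>slave1:8020</value></property>
    </configuration>"""
    for key in missing_ha_keys(load_properties(sample)):
        print("missing:", key)
```

To check a real file, read it in binary mode and pass the bytes to load_properties, which lets the parser honor the encoding declared in the XML header.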
5. yarn-site.xml
<?xml version="1.0"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You may obtain
a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless
required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
OR CONDITIONS OF ANY KIND, either express or implied. See the License for
the