Hadoop Configuration

Date: 2019-10-30
This article walks through Hadoop configuration: a worked setup example, practical tips, a summary of the key points, and things to watch out for. It should serve as a useful reference for anyone setting up a cluster.

I. Hadoop (a working JDK must already be configured)

Download hadoop-2.6.0-cdh5.6.0.tar.gz from https://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.6.0.tar.gz, then unpack it: tar -zxvf hadoop-2.6.0-cdh5.6.0.tar.gz

1. Register all nodes: vi /etc/hosts, adding one line per node in the form IP + hostname.
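For illustration, a minimal /etc/hosts layout, assuming one master and two slaves (the addresses and hostnames are placeholders; substitute your own):

192.168.1.100   master
192.168.1.101   slave1
192.168.1.102   slave2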

2. Create a hadoop user: useradd hadoop, then set its password with passwd hadoop.

3. Set up passwordless SSH (a consolidated sketch follows these steps):

(1) su hadoop, then ssh-keygen -t rsa, then cp id_rsa.pub authorized_keys;

(2) scp /root/.ssh/id_rsa.pub root@master:/root/.ssh/id_rsa_slave1.pub (an example; in practice the public key of every slave node must be copied to the master);

(3) cat id_rsa_slave1.pub >> authorized_keys (merge each slave key into the master's authorized_keys);

(4) scp authorized_keys root@slave1:/root/.ssh/ (distribute the merged file back to every slave node);
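Putting the four steps together, a sketch of the full exchange for one slave. The steps above mix the hadoop account with /root paths; this sketch keeps everything under the hadoop account, and the hostnames master and slave1 are the same example names as before:

# on every node, as the hadoop user
ssh-keygen -t rsa                              # accept the defaults

# on each slave: push its public key to the master under a unique name
scp ~/.ssh/id_rsa.pub hadoop@master:~/.ssh/id_rsa_slave1.pub

# on the master: seed authorized_keys with its own key, then merge each slave key
cd ~/.ssh
cp id_rsa.pub authorized_keys
cat id_rsa_slave1.pub >> authorized_keys
chmod 600 authorized_keys                      # sshd rejects key files with loose permissions

# distribute the merged file back to every slave
scp authorized_keys hadoop@slave1:~/.ssh/

# verify from the master: should log in without a password prompt
ssh hadoop@slave1 hostname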

4. Edit the Hadoop environment script hadoop-env.sh:

export JAVA_HOME=                 # fill in the path to your JDK

export HADOOP_HOME=               # fill in the path to your Hadoop install

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib"
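Filled in, these lines might look as follows; the JDK path is an assumption for illustration, and the Hadoop path matches the tarball unpacked above (adjust both to your machine):

export JAVA_HOME=/usr/java/jdk1.8.0_151
export HADOOP_HOME=/usr/hadoop-2.6.0-cdh5.6.0
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib"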

5. Edit the configuration files: core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml

(1) core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://master:9001</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/temp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>

(2) hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/dfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/dfs/datanode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.master</name>
    <value>master:50070</value>
  </property>
</configuration>
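The local paths named above, plus hadoop.tmp.dir (/opt/temp) from core-site.xml, should exist and be writable on every node before the first start; a sketch, assuming the hadoop user runs the daemons:

mkdir -p /opt/dfs/namenode /opt/dfs/datanode /opt/temp
chown -R hadoop:hadoop /opt/dfs /opt/temp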

(3) yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
  </property>
  <property>
    <description>Whether to enable log aggregation</description>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
</configuration>
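With yarn.log-aggregation-enable set to true, the logs of finished applications can later be fetched through the yarn CLI (the application ID below is a placeholder, printed when a job is submitted):

yarn logs -applicationId <application-id>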

(4) mapred-site.xml 

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
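Note that Hadoop 2.x tarballs typically ship only a template for this file; if mapred-site.xml does not exist yet, copy the template first (the path assumes the HADOOP_HOME from step 4):

cd ${HADOOP_HOME}/etc/hadoop
cp mapred-site.xml.template mapred-site.xml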

6. Edit the node list file slaves: list the hostname of each slave node, one per line (see the example below).
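With the example hostnames used earlier, etc/hadoop/slaves would contain just:

slave1
slave2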

7. vi /etc/profile:

export HADOOP_HOME=               # same Hadoop install path as in hadoop-env.sh

export PATH=$PATH:${HADOOP_HOME}/bin

source /etc/profile
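To confirm that the PATH change took effect, the hadoop command should now resolve from any directory:

hadoop version        # prints the Hadoop build/version details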

8. Copy the Hadoop directory to every slave node:

scp -r /usr/hadoop-2.6.0-cdh5.6.0 root@<slave-hostname>:/usr   (the directory matches the tarball unpacked above; fill in each slave's hostname)
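With the example hostnames, the copy to both slaves can be done in one small loop:

for host in slave1 slave2; do
  scp -r /usr/hadoop-2.6.0-cdh5.6.0 root@${host}:/usr
done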

9. Format HDFS (on the master, first run only):

hadoop namenode -format
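In Hadoop 2 the non-deprecated spelling of this command is hdfs namenode -format; either form, on success, should log a line along the lines of:

INFO common.Storage: Storage directory /opt/dfs/namenode has been successfully formatted.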

10. Done.

Original article: https://www.cnblogs.com/xiennnnn/p/11764675.html