The most detailed and comprehensive Hadoop HDFS cluster installation guide on the whole web

Big data learning monk · 2022-02-13 07:18:38


A detailed and comprehensive walkthrough of the HDFS installation process and environment configuration. Feel free to bookmark it.

1. Environmental preparation

Note: all of the following steps are performed as root.

1.1 Virtual machine preparation

Clone three virtual machines (linux01, linux02, linux03), then modify each machine's network configuration:

vi /etc/sysconfig/network-scripts/ifcfg-eth0

Configure the corresponding host's network card rules:

vi /etc/udev/rules.d/70-persistent-net.rules

Permanently change the hostname:

 vi /etc/sysconfig/network

Configure hostname mapping:

vi /etc/hosts
[root@linux01 /]# cat /etc/hosts
linux01 linux02 linux03 linux04 linux05
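Each line of /etc/hosts maps an IP address to a hostname; the addresses were lost from the listing above. A sketch of the intended shape (the 192.168.1.x addresses are placeholders; substitute your VMs' real IPs):

```
192.168.1.101 linux01
192.168.1.102 linux02
192.168.1.103 linux03
192.168.1.104 linux04
192.168.1.105 linux05
```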

Turn off the firewall:

service iptables stop

Check the firewall status:

service iptables status
iptables: Firewall is not running.

1) The memory allocated to the hosts linux01, linux02 and linux03 is 10 GB, 2 GB and 2 GB respectively.
2) On linux01, create the apps and software directories under /opt.

1.2 Passwordless SSH login

Configure linux01 for passwordless login to all three hosts: linux01, linux02 and linux03.
(1) Generate the public/private key pair:

[root@linux01 .ssh]$ ssh-keygen -t rsa

Then press Enter three times. Two files will be generated: id_rsa (the private key) and (the public key).
(2) Copy the public key to the target machine for password free login

[root@linux01 .ssh]$ ssh-copy-id linux01
[root@linux01 .ssh]$ ssh-copy-id linux02
[root@linux01 .ssh]$ ssh-copy-id linux03

1.3 Install the JDK (on all three hosts)

1) On linux01, create the apps and software folders under /opt:

[root@linux01 opt]# mkdir apps
[root@linux01 opt]# mkdir software

2) Use SecureCRT to upload jdk-8u144-linux-x64.tar.gz into /opt/software on linux01.
3) In /opt/software, check whether the package was uploaded successfully:

[root@linux01 software]$ ls

4) Extract the JDK into /opt/apps:

[root@linux01 software]$ tar -zxvf jdk-8u144-linux-x64.tar.gz -C /opt/apps/

5) Configure the JDK environment variables.
(1) First get the JDK path:

[root@linux01 jdk1.8.0_144]$ pwd
/opt/apps/jdk1.8.0_144

(2) Open the /etc/profile file:

[root@linux01 software]$ vi /etc/profile

Append the JDK path at the end of the profile file:

export JAVA_HOME=/opt/apps/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin

(3) Save and exit.


(4) Reload the modified file so it takes effect:

[root@linux01 jdk1.8.0_144]$ source /etc/profile

6) Test whether the JDK installed successfully:

[root@linux01 jdk1.8.0_144]# java -version
java version "1.8.0_144"

7) Distribute the JDK and the environment variables from linux01 to the other two hosts, linux02 and linux03:

[root@linux01 opt]# xsync /opt/apps/
[root@linux01 opt]# xsync /etc/profile
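Note that xsync is not a standard Linux command; in tutorials like this it is a small rsync wrapper script placed on the PATH. A minimal sketch, shown here as a shell function (the host list linux02 linux03 and the exact behavior are assumptions; in practice it is saved as an executable script, e.g. /usr/local/bin/xsync):

```shell
# Hypothetical xsync: copy each argument to the other cluster hosts
# with rsync, recreating the same absolute path. Requires the
# passwordless SSH set up in section 1.2 and rsync on every node.
xsync() {
  if [ $# -lt 1 ]; then
    echo "Not Enough Arguments!"
    return 1
  fi
  for host in linux02 linux03; do
    echo "==== $host ===="
    for file in "$@"; do
      if [ -e "$file" ]; then
        # Resolve the parent directory so rsync recreates the
        # same absolute path on the target host.
        pdir=$(cd -P "$(dirname "$file")" && pwd)
        fname=$(basename "$file")
        rsync -av "$pdir/$fname" "$host:$pdir"
      else
        echo "$file does not exist!"
      fi
    done
  done
}
```

With this on the PATH, `xsync /opt/apps/` copies the whole apps directory to linux02 and linux03 in one command.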

Then run source once on linux02 and linux03 respectively:

[root@linux02 ~]$ source /etc/profile
[root@linux03 ~]# source /etc/profile

1.4 Disable SELinux

Security-Enhanced Linux (SELinux) is a Linux kernel module and one of Linux's security subsystems.

SELinux's structure and configuration are quite complicated, so to avoid all kinds of errors it is advisable to disable it. There are two ways to do so:
(1) Temporary shutdown:

[root@linux01 ~]# setenforce 0

This method only lasts for the current boot and is lost after the machine restarts, so the second method is recommended.
(2) Permanent disable
Modify the configuration file /etc/selinux/config:

[root@linux01 ~]# vim /etc/selinux/config

Change SELINUX=enforcing to SELINUX=disabled.
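After the edit, the relevant lines of /etc/selinux/config should look like this (SELINUXTYPE shown with its CentOS default):

```
SELINUX=disabled
SELINUXTYPE=targeted
```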

(3) Sync the /etc/selinux/config configuration file to the other hosts:

[root@linux01 ~]# xsync /etc/selinux/config

(4) Reboot the linux01, linux02 and linux03 hosts:

[root@linux01 ~]# reboot
[root@linux02 ~]# reboot
[root@linux03 ~]# reboot

2 HDFS installation

2.1 Upload the Hadoop installation package

rz hadoop-2.8.5.tar.gz

2.2 Extract the Hadoop package into the target directory

tar -zxf hadoop-2.8.5.tar.gz -C /opt/apps/

2.3 Modify the Hadoop configuration files

2.3.1 Modify

vi /opt/apps/hadoop-2.8.5/etc/hadoop/

Configure the JDK installation directory:

# The java implementation to use.
export JAVA_HOME=/opt/apps/jdk1.8.0_144/

2.3.2 Modify hdfs-site.xml

vi /opt/apps/hadoop-2.8.5/etc/hadoop/hdfs-site.xml

<!-- Configure which machine the namenode runs on -->
<!-- Configure the namenode metadata storage directory -->
<!-- Configure the datanode data storage directory -->
<!-- Configure which machine the secondary namenode runs on -->
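The property bodies behind these comments were lost from the original post. A sketch of what a matching hdfs-site.xml typically contains, assuming standard Hadoop 2.x property names and the /opt/hdpdata directory that section 3.1 later refers to; the namenode machine itself is normally fixed by fs.defaultFS in core-site.xml, and linux02 as the secondary namenode host is an assumption:

```xml
<configuration>
  <!-- namenode metadata storage directory -->
  <property>
    <name></name>
    <value>/opt/hdpdata/name</value>
  </property>
  <!-- datanode data storage directory -->
  <property>
    <name></name>
    <value>/opt/hdpdata/data</value>
  </property>
  <!-- which machine the secondary namenode runs on -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>linux02:50090</value>
  </property>
</configuration>
```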

2.3.3 Modify core-site.xml

By default, hdfs dfs commands (e.g. hdfs dfs -df, which checks filesystem usage) operate on the local Linux filesystem. The purpose of modifying this configuration file is to make the client operate on the HDFS cluster's distributed filesystem instead.

vi /opt/apps/hadoop-2.8.5/etc/hadoop/core-site.xml

<!-- Change the hdfs dfs client's default from the local filesystem to the HDFS distributed filesystem -->
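A sketch of the corresponding core-site.xml entry, with fs.defaultFS pointing the client at the namenode on linux01; port 9000 is a common convention, an assumption not taken from the original post:

```xml
<configuration>
  <!-- Make hdfs dfs operate on the HDFS cluster instead of the local filesystem -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://linux01:9000</value>
  </property>
</configuration>
```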

2.3.4 Synchronize the modified Hadoop folder to each node

for i in 2 3; do scp -r hadoop-2.8.5 linux0$i:$PWD; done

Or synchronize them one by one:

scp -r /opt/apps/hadoop-2.8.5 linux02:$PWD
scp -r /opt/apps/hadoop-2.8.5 linux03:$PWD

3 HDFS cluster startup

3.1 Format the namenode

[root@linux01 bin]# pwd
/opt/apps/hadoop-2.8.5/bin

In the directory above, run:

./hdfs namenode -format

After formatting, you can go into /opt/hdpdata and see the newly created name directory.

3.2 Start the namenode

[root@linux01 sbin]# pwd
/opt/apps/hadoop-2.8.5/sbin

In the current directory, run:

./ start namenode

3.3 Start a datanode on each node

sbin/ start datanode

3.4 After starting, access the Hadoop web page

Open the NameNode web UI in a browser (in Hadoop 2.x it listens on port 50070 by default, i.e. http://linux01:50070).

4 Batch-starting the HDFS cluster

4.1 Configure the slaves file

At startup, HDFS reads the slaves file; datanodes are started on the hosts configured in it.

vi /opt/apps/hadoop-2.8.5/etc/hadoop/slaves
[root@linux01 hadoop]# cat slaves
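The output of cat slaves was lost above. Assuming datanodes run on all three machines, the file lists one hostname per line:

```
linux01
linux02
linux03
```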

4.2 Configure the system environment variables

To make running hdfs commands convenient, add the Hadoop location to the environment variables. As root, edit /etc/profile:

vi /etc/profile
export JAVA_HOME=/opt/apps/jdk1.8.0_144/
export HADOOP_HOME=/opt/apps/hadoop-2.8.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile

After that we can run any command under Hadoop's bin and sbin directories from any directory.

4.3 Run /

These scripts can be executed from any directory: starts the whole Hadoop cluster with one click, and shuts it down.

The steps above are the detailed procedure for installing a Hadoop cluster. If you found this useful, please bookmark it and follow me ~
The next chapter covers commonly used hdfs commands.
