The most detailed and comprehensive Hadoop HDFS cluster installation guide on the whole web

Big data learning monk · 2022-02-13 07:18:38


A detailed and comprehensive guide to the HDFS installation process and environment configuration. Feel free to bookmark it.

1. Environment preparation

Note: all of the following steps are performed as the root user.

1.1 Virtual machine preparation

Clone three virtual machines (linux01, linux02, linux03) and modify each machine's network configuration:

vi /etc/sysconfig/network-scripts/ifcfg-eth0

Configure the IP address for each host:

vi /etc/udev/rules.d/70-persistent-net.rules

Permanently change the hostname:

 vi /etc/sysconfig/network

Configure the hostname-to-IP mappings:

vi /etc/hosts
[root@linux01 /]# cat /etc/hosts
192.168.133.3 linux01
192.168.133.4 linux02
192.168.133.5 linux03
192.168.133.6 linux04
192.168.133.7 linux05

Turn off the firewall:

service iptables stop

Check the firewall status:

service iptables status

iptables: Firewall is not running.

1) Allocate 10 GB, 2 GB, and 2 GB of memory to linux01, linux02, and linux03 respectively.
2) On linux01, create the apps and software directories under /opt.

1.2 SSH passwordless login

Configure passwordless login from linux01 to all three hosts linux01, linux02, and linux03.
(1) Generate the public and private keys:

[root@linux01 .ssh]$ ssh-keygen -t rsa

Then press Enter three times; two files will be generated: id_rsa (the private key) and id_rsa.pub (the public key).
(2) Copy the public key to each target machine to enable passwordless login:

[root@linux01 .ssh]$ ssh-copy-id linux01
[root@linux01 .ssh]$ ssh-copy-id linux02
[root@linux01 .ssh]$ ssh-copy-id linux03
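The three ssh-copy-id calls above can be wrapped in a small helper. This is only a convenience sketch; the function name and the loop are my own additions, not part of the original setup:

```shell
# Hypothetical helper: push linux01's public key to every host passed in.
# ssh-copy-id will still prompt for each host's password one time.
setup_passwordless_ssh() {
  # generate an RSA key pair with an empty passphrase if none exists yet
  [ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa -q
  local host
  for host in "$@"; do
    ssh-copy-id "$host"
  done
}
# usage: setup_passwordless_ssh linux01 linux02 linux03
```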

1.3 Install the JDK (on all three hosts)

1) Create the apps and software folders under /opt on linux01:

[root@linux01 opt]# mkdir apps
[root@linux01 opt]# mkdir software

2) Use SecureCRT to upload jdk-8u144-linux-x64.tar.gz to the /opt/software directory on linux01.
3) Check the /opt/software directory to confirm the package was uploaded successfully:

[root@linux01 software]$ ls
jdk-8u144-linux-x64.tar.gz

4) Extract the JDK into the /opt/apps directory:

[root@linux01 software]$ tar -zxvf jdk-8u144-linux-x64.tar.gz -C /opt/apps/

5) Configure the JDK environment variables.
(1) First get the JDK path:

[root@linux01 jdk1.8.0_144]$ pwd
/opt/apps/jdk1.8.0_144

(2) Open the /etc/profile file:

[root@linux01 software]$ vi /etc/profile

Append the JDK path at the end of the profile file:

#JAVA_HOME
export JAVA_HOME=/opt/apps/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin

(3) Exit after saving

:wq

(4) Reload the profile so the changes take effect:

[root@linux01 jdk1.8.0_144]$ source /etc/profile

6) Test whether the JDK was installed successfully:

[root@linux01 jdk1.8.0_144]# java -version
java version "1.8.0_144"

7) Distribute the JDK and environment variables from linux01 to the linux02 and linux03 hosts:

[root@linux01 opt]# xsync /opt/apps/
[root@linux01 opt]# xsync /etc/profile
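xsync is not a standard command; it is typically a user-written rsync wrapper that copies a path to the same location on every other node. If you do not already have one, a minimal sketch might look like this (the host list and rsync options are assumptions, not from the original):

```shell
# xsync (sketch): copy the given files/directories to the same absolute
# path on linux02 and linux03. Requires rsync and passwordless ssh.
xsync() {
  local host path dir
  for path in "$@"; do
    # resolve the absolute parent directory of the path being synced
    dir=$(cd -P "$(dirname "$path")" && pwd)
    for host in linux02 linux03; do
      rsync -av "$dir/$(basename "$path")" "$host:$dir/"
    done
  done
}
# usage: xsync /opt/apps/ /etc/profile
```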

Then run source once on each of linux02 and linux03:

[root@linux02 ~]$ source /etc/profile
[root@linux03 ~]# source /etc/profile

1.4 Disable SELinux

Security-Enhanced Linux (SELinux) is a Linux kernel module and one of Linux's security subsystems.

SELinux's structure and configuration are quite complicated, so to avoid all sorts of errors it is advisable to disable it. There are two ways to do so:
(1) Temporarily disable:

[root@linux01 ~]# setenforce 0

However, this method only lasts for the current boot and stops working after the machine restarts, so the second method is recommended.
(2) Permanently disable
Modify the configuration file /etc/selinux/config:

[root@linux01 ~]# vim /etc/selinux/config

Change SELINUX=enforcing to SELINUX=disabled:

SELINUX=disabled
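The manual edit above can also be done non-interactively with sed. The demonstration below runs against a throwaway copy of the file; the temp-file setup is only for illustration, on a real host you would point sed at /etc/selinux/config itself:

```shell
# Demonstrate the edit on a scratch copy rather than the real config file.
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"

# The actual one-liner: flip enforcing -> disabled.
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' "$cfg"

result=$(grep '^SELINUX=' "$cfg")
echo "$result"          # SELINUX=disabled
rm -f "$cfg"
```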

(3) Sync the /etc/selinux/config file to the other hosts:

[root@linux01 ~]# xsync /etc/selinux/config

(4) Reboot the linux01, linux02, and linux03 hosts:

[root@linux01 ~]# reboot
[root@linux02 ~]# reboot
[root@linux03 ~]# reboot

2 HDFS installation

2.1 Upload the Hadoop installation package

rz hadoop-2.8.5.tar.gz

2.2 Extract the Hadoop package into the target directory

tar -zxf hadoop-2.8.5.tar.gz -C /opt/apps/

2.3 Modify the Hadoop configuration files

2.3.1 Modify hadoop-env.sh

vi /opt/apps/hadoop-2.8.5/etc/hadoop/hadoop-env.sh

Configure the JDK directory (the path from section 1.3):

# The java implementation to use.
export JAVA_HOME=/opt/apps/jdk1.8.0_144/

2.3.2 Modify hdfs-site.xml

vi /opt/apps/hadoop-2.8.5/etc/hadoop/hdfs-site.xml

<!-- Address of the namenode -->
<property>
  <name>dfs.namenode.rpc-address</name>
  <value>linux01:9000</value>
</property>
<!-- Namenode metadata storage directory -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/opt/hdpdata/name/</value>
</property>
<!-- Datanode data storage directory -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/opt/hdpdata/data/</value>
</property>
<!-- Address of the secondary namenode -->
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>linux02:50090</value>
</property>

2.3.3 Modify core-site.xml

By default, hdfs dfs client commands (for example hdfs dfs -df, which shows file system usage) operate on the local Linux file system; the purpose of this configuration change is to make them operate on the HDFS distributed file system instead.

vi /opt/apps/hadoop-2.8.5/etc/hadoop/core-site.xml

<!-- Change the default file system of the hdfs dfs client from the local file system to HDFS -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://linux01:9000/</value>
</property>
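One caveat worth adding: the property snippets shown here and in hdfs-site.xml must sit inside the file's <configuration> root element, or Hadoop will fail to parse the file. A minimal complete core-site.xml would look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://linux01:9000/</value>
  </property>
</configuration>
```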

2.3.4 Sync the modified Hadoop folder to each node

From the /opt/apps directory:

for i in 2 3; do scp -r hadoop-2.8.5 linux0$i:$PWD; done

You can also synchronize one host at a time:

scp -r /opt/apps/hadoop-2.8.5 linux02:$PWD
scp -r /opt/apps/hadoop-2.8.5 linux03:$PWD

3 HDFS cluster startup

3.1 Initialize the namenode

[root@linux01 bin]# pwd
/opt/apps/hadoop-2.8.5/bin

Run the following command in the directory above:

./hdfs namenode -format

After initialization you can see the newly created name directory under /opt/hdpdata.

3.2 Start the namenode

[root@linux01 sbin]# pwd
/opt/apps/hadoop-2.8.5/sbin

Run the following in the current directory:

./hadoop-daemon.sh start namenode

3.3 Start a datanode on each node

sbin/hadoop-daemon.sh start datanode

3.4 After startup, visit the Hadoop web page:

http://linux01:50070


4 HDFS cluster batch startup

4.1 Configure the slaves file

At startup, HDFS reads the slaves file; datanodes are started on the hosts configured in it (note that linux01 therefore runs both the namenode and a datanode):

vi /opt/apps/hadoop-2.8.5/etc/hadoop/slaves
[root@linux01 hadoop]# cat slaves
linux01
linux02
linux03

4.2 Configure the system environment variables

To make running hdfs commands convenient, edit /etc/profile as root and add the Hadoop directories to the environment variables:

vi /etc/profile

export JAVA_HOME=/opt/apps/jdk1.8.0_144/
export HADOOP_HOME=/opt/apps/hadoop-2.8.5
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source /etc/profile

After that, any command under Hadoop's bin and sbin directories can be run from any directory.

4.3 Run start-dfs.sh

start-dfs.sh can be executed from any directory to start the whole Hadoop cluster with one command; stop-dfs.sh shuts the cluster down.
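After start-dfs.sh finishes, it is worth confirming that the daemons actually came up on every node. A small sketch follows; the host list and the assumption that jps is on each host's PATH are mine, not from the original:

```shell
# check_cluster (sketch): list the Java daemons running on each host.
# Expect NameNode on linux01, SecondaryNameNode on linux02,
# and a DataNode on every host listed in the slaves file.
check_cluster() {
  local host
  for host in linux01 linux02 linux03; do
    echo "== $host =="
    ssh "$host" 'jps | grep -v Jps'
  done
}
# usage: check_cluster
```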

The steps above cover installing a Hadoop cluster in detail. If you found them useful, please bookmark this page and follow me~
The next post covers commonly used HDFS commands.

Copyright: author [Big data learning monk]. Please include the original link when reprinting, thank you: https://en.javamana.com/2022/02/202202130718365823.html