1. Basic knowledge of HDFS

Code sheep 2022-02-13 07:07:49 Views: 515


1. HDFS basics

1. HDFS basics
2. HDFS operations
3. HDFS upload and download process

1.1 Traditional file storage
File system: a way to store and manage data.
Traditional storage: the program is one thing and the data is another; data is transferred to the program for processing.
1.2 Concepts of distributed file storage (key points)

Reflection: how would you implement a distributed file storage system? What features would it need (functions, advantages, effects)?
(Reference video: 「Why is Chinese online disk so difficult」)

  • Distributed

    Distributed storage can scale out almost without limit, supporting massive data volumes.


  • Block storage

    Files are split into blocks that can be operated on in parallel, improving efficiency.


  • Replica mechanism

    Redundant storage keeps data safe even if a node fails.


  • Metadata management

    Metadata records where each file and its blocks live, so files can be located quickly.
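
The four ideas above can be sketched locally with ordinary files; this is a toy simulation under assumed names (`bigfile.bin`, `blk_`, `node1/` are all illustrative, not HDFS internals) of what real HDFS does across machines:

```shell
# Mimic HDFS-style block storage and replication on a single machine.
dd if=/dev/zero of=bigfile.bin bs=1024 count=1024 2>/dev/null  # a 1 MiB "big" file
split -b 262144 bigfile.bin blk_   # cut it into 256 KiB "blocks": blk_aa .. blk_ad
mkdir -p node1 node2 node3         # three directories standing in for DataNodes
for b in blk_a?; do                # replica mechanism: 3 copies of every block
  cp "$b" node1/; cp "$b" node2/; cp "$b" node3/
done
ls node1 | wc -l                   # each "node" now holds all 4 blocks
```

A metadata service (the NameNode in HDFS) would record which blocks make up `bigfile.bin` and which nodes hold each copy, which is what makes fast lookup possible.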


1.3 HDFS (Hadoop Distributed File System)

Core architectural goal: solve the problem of how to store big data, via distributed storage.
"Distributed" means HDFS is a storage system that spans multiple machines.

1.4 HDFS design goals

Design goals: fault detection and automatic recovery, with quick repair; high throughput for data access matters more than low latency.
Typical applications stream their data: HDFS is designed for batch processing, not interactive use.
To the user it looks like a single machine, but underneath it is a distributed master-slave cluster.

# Reflection: what to do about small files on upload? (many small files
# waste storage and are slow to read)
# Answer: merge them while uploading.
# appendToFile appends local file contents to the end of an HDFS file
# (in the shell, >> appends and > overwrites)
echo 1 >> 1.txt
echo 2 >> 2.txt
echo 3 >> 3.txt
hadoop fs -put 1.txt /
hadoop fs -cat /1.txt
hadoop fs -appendToFile 2.txt 3.txt /1.txt
hadoop fs -cat /1.txt
# appendToFile in practice: upload and merge small local files into one
# large HDFS file, easing the small-file problem
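
The `hadoop fs` commands above need a running cluster, but the shell redirection they rely on can be checked locally; a minimal sketch (`demo.txt` is a hypothetical file name):

```shell
# > overwrites the target file, >> appends to it
echo first > demo.txt      # create/overwrite: file contains one line
echo second >> demo.txt    # append: file now contains two lines
cat demo.txt
echo replaced > demo.txt   # overwrite: previous contents are gone
cat demo.txt               # prints "replaced"
```

`hadoop fs -appendToFile` plays the role of `>>` on the HDFS side: it adds to the end of an existing file instead of replacing it.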
1.5 HDFS application scenarios

Good fit: large files, streaming data access, write-once-read-many workloads, low-cost deployment (cheap PCs, high fault tolerance).
Poor fit: many small files, interactive data access, frequent arbitrary modification, low-latency processing.

1.6 HDFS replicas

The default block size is 128 MB (134,217,728 bytes) and the default replication factor is 3 (1 original + 2 replicas). Replication is redundant: only one third of the cluster's raw capacity holds unique data and the other two thirds hold backups, so storage utilization is low.
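
To make those defaults concrete, a small sketch of the arithmetic (the 300 MB file size is an assumed example, not from the text):

```shell
# Assumed example: a 300 MB file under the default settings above
BLOCK=$((128 * 1024 * 1024))               # block size: 134217728 bytes
REPLICAS=3                                 # default replication factor
FILE=$((300 * 1024 * 1024))                # example file: 300 MB
BLOCKS=$(( (FILE + BLOCK - 1) / BLOCK ))   # ceiling division: 3 blocks (128+128+44 MB)
RAW=$(( FILE * REPLICAS / 1024 / 1024 ))   # raw capacity consumed: 900 MB
echo "$BLOCKS blocks, $RAW MB raw storage"
```

The 300 MB of data occupies 900 MB of raw disk across the cluster, which is the one-third utilization figure mentioned above.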

Copyright: author [Code sheep]. Please include the original link when reprinting, thank you. https://en.javamana.com/2022/02/202202130707465844.html