HDFS transparent encryption: usage, keystore and Hadoop KMS, encryption zones, key concepts and architecture, KMS configuration

yida&yueda · 2022-02-13 07:51:36 · Views: 339


HDFS Transparent Encryption Usage, Keystore and Hadoop KMS, Encryption Zones

Data in HDFS is saved as blocks on the local disks of the DataNodes, but those blocks are stored in plain text. If you go under the operating system and access a block directory directly, the Linux `cat` command will show the contents directly, in the clear.

Let's go straight to the directory where a DataNode stores its blocks locally and view a block's content directly:

/export/data/hadoop-3.1.4/dfs/data/current/BP-1748151750-192.168.227.1511608259905540/current/finalized/subdir0/subdir0/

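The same check can be scripted. Below is a minimal sketch that scans a directory tree for `blk_*` files and prints their leading bytes; since the DataNode path above is cluster-specific, the demo runs against a temporary directory containing a stand-in block file (the block name and content are made up for illustration):

```python
import tempfile
from pathlib import Path

def peek_blocks(data_dir: Path, limit: int = 64) -> dict:
    """Return the first `limit` bytes of every blk_* file under data_dir."""
    return {p.name: p.read_bytes()[:limit]
            for p in sorted(data_dir.rglob("blk_*")) if p.is_file()}

# Stand-in for a DataNode data directory such as
# .../dfs/data/current/BP-*/current/finalized/subdir0/subdir0
with tempfile.TemporaryDirectory() as d:
    demo = Path(d) / "subdir0"
    demo.mkdir()
    (demo / "blk_1073741825").write_bytes(b"hello hdfs in plain text")
    print(peek_blocks(Path(d)))
    # → {'blk_1073741825': b'hello hdfs in plain text'} — readable with no decryption at all
```

Pointing `peek_blocks` at a real DataNode data directory on an unencrypted cluster would show the same thing: file contents in the clear.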

1.2 Background and Application

1.2.1 Common encryption levels

  • Application-layer encryption

    This is the most secure and most flexible approach. What gets encrypted is ultimately controlled by the application, so it can precisely reflect the user's requirements. However, writing applications that implement their own encryption is generally difficult.

  • Database-layer encryption

    Similar to application-layer encryption. Most database vendors offer some form of encryption, but there can be performance problems, and, for example, indexes cannot be encrypted.

  • Filesystem-layer encryption

    This approach has little performance impact, is transparent to applications, and is generally easy to deploy. However, it may not fully satisfy an application's fine-grained requirements and policies.

  • Disk-layer encryption

    Easy to deploy and high-performance, but quite inflexible; it only protects against data being stolen at the physical level.

HDFS transparent encryption sits between the database layer and the filesystem layer. It performs well and is transparent to existing applications. HDFS encryption can defend against attacks at or below the filesystem, also known as OS-level attacks: the operating system and disk only ever interact with encrypted data, because the data has already been encrypted by HDFS.

1.2.2 Application scenarios

Data encryption is mandated by many governments, financial institutions, and regulators around the world in order to meet privacy and other security requirements. For example, the card payment industry adopted the Payment Card Industry Data Security Standard (PCI DSS) to improve information security. Other examples include requirements of the U.S. Federal Information Security Management Act (FISMA) and the Health Insurance Portability and Accountability Act (HIPAA). Encrypting data stored in HDFS can help your organization comply with such regulations.

1.3 Introduction to transparent encryption

HDFS transparent encryption (Transparent Encryption) supports end-to-end transparent encryption. Once enabled, files in the designated HDFS directories are encrypted and decrypted transparently, with no changes to user application code. End-to-end means that encryption and decryption can only be performed by the client. For files in an encryption zone, HDFS stores only ciphertext, and the key used to encrypt each file is itself stored encrypted. Even if an unauthorized user copies the files at the operating-system level, all they get is ciphertext; nothing is readable.

HDFS transparent encryption has the following characteristics:

  • Only the HDFS client can encrypt or decrypt data.

  • Keys are managed outside of HDFS. HDFS cannot access unencrypted data or the encryption keys. Administering HDFS and administering keys are separate duties, taken on by different user roles (HDFS administrator, key administrator), which ensures that no single user has unrestricted access to both the data and the keys.

  • The operating system and HDFS interact only with encrypted HDFS data, which mitigates threats at the OS and filesystem levels.

  • HDFS uses the Advanced Encryption Standard in counter mode (AES-CTR). AES-CTR supports 128-bit encryption keys (the default), or 256-bit keys if the Java Cryptography Extension (JCE) Unlimited Strength policy files are installed.
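In counter (CTR) mode, the cipher generates a keystream from the key, a nonce, and a block counter; ciphertext is plaintext XORed with the keystream, so the same operation both encrypts and decrypts, and any offset can be processed independently. A toy sketch of that structure (SHA-256 stands in for the AES block primitive because the Python standard library has no AES; this illustrates the mode, not the actual HDFS implementation):

```python
import hashlib

def ctr_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` keystream bytes from (key, nonce, counter) blocks."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream (both are the same operation)."""
    ks = ctr_keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

ct = ctr_xor(b"secret-key", b"nonce", b"hello hdfs")
pt = ctr_xor(b"secret-key", b"nonce", ct)  # applying it a second time decrypts
print(pt)  # b'hello hdfs'
```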

1.4 Key concepts and architecture of transparent encryption

1.4.1 Encryption area and key

HDFS transparent encryption introduces a new concept: the encryption zone. An encryption zone is a special directory whose contents are transparently encrypted when files are written and transparently decrypted when files are read.

When an encryption zone is created, it is associated with an encryption zone key (EZ key). EZ keys are stored in a keystore external to HDFS. Each file within an encryption zone has its own data encryption key (DEK). HDFS never handles a DEK directly; instead, each DEK is encrypted with its zone's EZ key to form an encrypted data encryption key (EDEK), and HDFS only ever handles EDEKs. The client asks for the EDEK to be decrypted and then uses the resulting DEK to read and write data.

The relationship among the EZ key, DEK, and EDEK: each file's DEK is encrypted with the zone's EZ key to produce that file's EDEK.
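A toy sketch of this wrapping relationship (XOR with a key-derived pad stands in for real key encryption; all names are illustrative):

```python
import os
import hashlib

def wrap(ez_key: bytes, dek: bytes) -> bytes:
    """EDEK = DEK masked with a pad derived from the EZ key (toy stand-in)."""
    pad = hashlib.sha256(ez_key).digest()[:len(dek)]
    return bytes(a ^ b for a, b in zip(dek, pad))

unwrap = wrap  # XOR masking is its own inverse

ez_key = os.urandom(16)   # per-zone key, kept in the external keystore / KMS
dek = os.urandom(16)      # per-file key
edek = wrap(ez_key, dek)  # what HDFS actually stores and handles
print(unwrap(ez_key, edek) == dek)  # True: only an EZ-key holder recovers the DEK
```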

1.4.2 Keystore and Hadoop KMS

The place where keys are stored is called a keystore. Integrating HDFS with an external, enterprise-grade keystore is the first step in deploying transparent encryption, because separating the duties of the key administrator from those of the HDFS administrator is a central aspect of this feature. However, most keystores are not designed for the encryption/decryption request rates seen by a Hadoop workload.

For this reason, Hadoop developed a new service, the Hadoop Key Management Server (KMS), which acts as a proxy between HDFS clients and the keystore. Both the keystore and HDFS clients must interact with the KMS through Hadoop's KeyProvider API.

The main responsibilities of the KMS are:

1. Providing access to the stored encryption zone keys (EZ keys)
2. Generating new EDEKs, which are stored on the NameNode
3. Decrypting EDEKs for HDFS clients

1.4.3 Access files in the encrypted area

1.4.3.1 The process of writing an encrypted file


Precondition: an HDFS encryption zone (a directory) has been created; at the same time, a key and its EZ key have been created in the KMS service, and the zone has been linked to that key.

1. The client asks the NameNode (NN) to create a new file inside an HDFS encryption zone.
2. The NN asks the KMS for an EDEK for this file; the KMS generates a new EDEK using the corresponding EZ key and sends it to the NN.
3. The NN writes this EDEK into the file's metadata.
4. The NN sends the EDEK to the client.
5. The client sends the EDEK to the KMS for decryption; the KMS decrypts it into a DEK using the corresponding EZ key and returns the DEK to the client.
6. The client encrypts the file content with the DEK and sends it to the DataNodes for storage.

The DEK is the key that encrypts and decrypts a single file, while the EZ key stored in the KMS is the key that encrypts and decrypts all of those file keys (DEKs). The EZ key is therefore the more sensitive piece of data: it is used only inside the KMS (DEK encryption and decryption happen only in KMS memory) and is never passed outside. The HDFS servers only ever see EDEKs, so they cannot decrypt the files in an encryption zone.
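The six steps above can be sketched with toy roles (illustrative only; a XOR "cipher" stands in for real encryption, and only the key handling is modeled). Note that outside the KMS, the DEK appears in the clear only on the client:

```python
import os
import hashlib

def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a key-derived pad (illustration only)."""
    pad = (hashlib.sha256(key).digest() * (len(data) // 32 + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, pad))

class ToyKMS:
    """Holds EZ keys; DEKs exist in the clear only inside this class."""
    def __init__(self):
        self.ez_keys = {}
    def create_ez_key(self, zone):
        self.ez_keys[zone] = os.urandom(16)
    def generate_edek(self, zone):
        # Step 2: mint a fresh DEK and hand back only its encrypted form
        return xor_crypt(self.ez_keys[zone], os.urandom(16))
    def decrypt_edek(self, zone, edek):
        # Step 5: unwrap the EDEK back into a DEK for the client
        return xor_crypt(self.ez_keys[zone], edek)

kms = ToyKMS()
kms.create_ez_key("/zone")
edek = kms.generate_edek("/zone")            # steps 1-2: NN asks the KMS for an EDEK
metadata = {"/zone/helloWorld": edek}        # step 3: the NN stores only the EDEK
dek = kms.decrypt_edek("/zone", metadata["/zone/helloWorld"])  # steps 4-5
ciphertext = xor_crypt(dek, b"helloitcast")  # step 6: client encrypts; DataNode stores this
print(xor_crypt(dek, ciphertext))            # decrypting again recovers b'helloitcast'
```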

1.4.3.2 The process of reading and decrypting files

The read path mirrors the write path, except that the NN reads the EDEK directly from the encrypted file's metadata and returns it to the client; the client then sends the EDEK to the KMS to obtain the DEK, and uses it to decrypt the content as it reads.

EDEK encryption and decryption happen entirely on the KMS. More importantly, a client that requests the creation or decryption of an EDEK never handles the EZ key. Only the KMS can use an EZ key to create and decrypt EDEKs.

1.5 KMS Configuration

1.5.1 Stop the HDFS cluster

Run stop-dfs.sh on node1.

1.5.2 Generate a key

[root@node1 ~]# keytool -genkey -alias 'yida'
Enter keystore password:
Re-enter new password:
What is your first and last name?
[Unknown]:
What is the name of your organizational unit?
[Unknown]:
What is the name of your organization?
[Unknown]:
What is the name of your City or Locality?
[Unknown]:
What is the name of your State or Province?
[Unknown]:
What is the two-letter country code for this unit?
[Unknown]:
Is CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown correct?
[no]: yes
Enter key password for <yida>
(RETURN if same as keystore password):
Re-enter new password:

1.5.3 Configure kms-site.xml

Configuration file path: /export/server/hadoop-3.1.4/etc/hadoop/kms-site.xml

<configuration>
  <property>
    <name>hadoop.kms.key.provider.uri</name>
    <value>jceks://file@/${user.home}/kms.jks</value>
  </property>
  <property>
    <name>hadoop.security.keystore.java-keystore-provider.password-file</name>
    <value>kms.keystore.password</value>
  </property>
  <property>
    <name>dfs.encryption.key.provider.uri</name>
    <value>kms://http@node1:16000/kms</value>
  </property>
  <property>
    <name>hadoop.kms.authentication.type</name>
    <value>simple</value>
  </property>
</configuration>

The password file (kms.keystore.password, containing the keystore password) is looked up via the classpath, so it should be placed in the Hadoop configuration directory.

1.5.4 kms-env.sh

export KMS_HOME=/export/server/hadoop-3.1.4
export KMS_LOG=${KMS_HOME}/logs/kms
export KMS_HTTP_PORT=16000
export KMS_ADMIN_PORT=16001

1.5.5 Modify core-site.xml and hdfs-site.xml

core-site.xml

<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://http@node1:16000/kms</value>
</property>

hdfs-site.xml

<property>
  <name>dfs.encryption.key.provider.uri</name>
  <value>kms://http@node1:16000/kms</value>
</property>

Synchronize the configuration files to the other nodes.

1.5.5.1 Start the KMS service

hadoop --daemon start kms


1.5.5.2 Start the HDFS cluster

start-dfs.sh

1.6 Using transparent encryption

1.6.1 Create a key

Switch to the normal user allenwoon:

su allenwoon
hadoop key create yida
hadoop key list -metadata

1.6.2 Create an encrypted area

Operate as the root superuser:

# As the superuser, create a new empty directory and make it an encryption zone
hadoop fs -mkdir /zone
hdfs crypto -createZone -keyName yida -path /zone
# chown it to the normal user
hadoop fs -chown allenwoon:allenwoon /zone

1.6.3 Test encryption effect

Operate as the normal user:

# As the normal user, put a file and read it back
echo helloitcast >> helloWorld
hadoop fs -put helloWorld /zone
hadoop fs -cat /zone/helloWorld
# As the normal user, get the file's encryption info
hdfs crypto -getFileEncryptionInfo -path /zone/helloWorld


If you fetch the file's block directly from a DataNode, you will find that the data can no longer be read: it is ciphertext.

copyright: author[yida&yueda]. Please include the original link when reprinting, thank you. https://en.javamana.com/2022/02/202202130751311543.html