JOEL-T99 2022-01-26 16:14:58
Hive follows a client/server (C/S) architecture.
Hive's data is stored in HDFS. HQL statements are translated into MapReduce, Tez, or Spark jobs, which then run on the Hadoop cluster.
Hive has three modes of operation: embedded mode, local mode, and remote mode.
Embedded mode (Local/Embedded Metastore Database, Derby): generally used only for practice and testing. At run time Hive creates a Derby log file and a metastore_db directory in the directory it is started from.
Local mode (Local/Embedded Metastore Server): metadata is stored in a MySQL database. In this mode, every time hiveserver2 is started it also starts an embedded metastore service internally.
With many clients, each one opens its own connection, which puts more pressure on MySQL.
Remote mode (Remote Metastore Server): metadata is stored in a MySQL database. The metastore service is split out of the Hive service and deployed separately, so the metastore and Hive services run in different processes. This decouples the architecture further, which helps keep Hive stable and improves service efficiency.
Remote mode supports multiple clients connecting at the same time, and clients do not need to know the MySQL username and password. They only need to connect to the metastore service, which gives better manageability and security.
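As a rough illustration of how the three modes differ in configuration, the sketch below classifies a set of hive-site.xml style properties. It assumes the standard property names javax.jdo.option.ConnectionURL and hive.metastore.uris; the function name, host names, and sample values are hypothetical.

```python
# Default JDBC URL Hive uses when no metastore database is configured.
DEFAULT_DERBY_URL = "jdbc:derby:;databaseName=metastore_db;create=true"


def classify_metastore_mode(props):
    """Guess the metastore mode from hive-site.xml style properties.

    Hypothetical helper for illustration; it is not part of Hive.
    """
    uris = props.get("hive.metastore.uris", "").strip()
    if uris:
        # Clients talk to a separately deployed metastore process over Thrift.
        return "remote"
    url = props.get("javax.jdo.option.ConnectionURL", DEFAULT_DERBY_URL)
    # An in-process Derby database (not a Derby network server) means embedded mode.
    if url.startswith("jdbc:derby:") and not url.startswith("jdbc:derby://"):
        return "embedded"
    # Otherwise the metastore runs inside Hive but stores data in an external database.
    return "local"


embedded = {}
local = {"javax.jdo.option.ConnectionURL": "jdbc:mysql://db-host:3306/hive"}
remote = {"hive.metastore.uris": "thrift://metastore-host:9083"}

for name, props in [("embedded", embedded), ("local", local), ("remote", remote)]:
    print(name, "->", classify_metastore_mode(props))
```

Note that in the remote case the client configuration carries no database credentials at all, which is the security benefit described above.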
Parser (SQL Parser): converts the SQL string into an abstract syntax tree (AST), usually with a third-party library such as ANTLR, then analyzes the AST to check, for example, whether the table exists, whether the fields exist, and whether the SQL contains semantic errors.
Compiler (Physical Plan): compiles the AST into a logical execution plan.
Optimizer (Query Optimizer): optimizes the logical execution plan.
Executor (Execution): converts the logical execution plan into a physical plan that can run. For Hive that means MR, Tez, or Spark jobs.
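The four stages above can be sketched as a toy pipeline. This is a deliberately tiny model, not Hive's implementation: the regular-expression parser, the CATALOG dictionary standing in for the metastore, and every function name are assumptions made for illustration.

```python
import re

# Hypothetical catalog standing in for the metastore: table -> columns.
CATALOG = {"employees": ["id", "name", "salary"]}


def parse(sql):
    """Parser: turn 'SELECT a, b FROM t [WHERE cond]' into a tiny AST (a dict)."""
    m = re.fullmatch(
        r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)(?:\s+WHERE\s+(.+?))?\s*;?\s*",
        sql,
        flags=re.IGNORECASE,
    )
    if m is None:
        raise SyntaxError("unsupported statement: " + sql)
    columns = [c.strip() for c in m.group(1).split(",")]
    return {"columns": columns, "table": m.group(2), "where": m.group(3)}


def analyze(ast):
    """Semantic check: the table and every column must exist in the catalog."""
    if ast["table"] not in CATALOG:
        raise ValueError("table not found: " + ast["table"])
    known = CATALOG[ast["table"]]
    for col in ast["columns"]:
        if col != "*" and col not in known:
            raise ValueError("column not found: " + col)
    return ast


def compile_logical(ast):
    """Compiler: build a logical plan as a list of operators, scan first."""
    plan = [("scan", ast["table"])]
    if ast["where"]:
        plan.append(("filter", ast["where"]))
    plan.append(("project", ast["columns"]))
    return plan


def optimize(plan):
    """Optimizer: drop a projection of '*', since it keeps every column anyway."""
    return [op for op in plan if op != ("project", ["*"])]


def to_physical(plan):
    """Executor: in this toy, scan/filter/project all fit in one map-only stage."""
    return [{"stage": 1, "type": "map-only", "operators": [name for name, _ in plan]}]


ast = analyze(parse("SELECT name FROM employees WHERE salary > 1000"))
physical = to_physical(optimize(compile_logical(ast)))
print(physical)
```

A real compiler produces a tree of operators and may split the plan into several dependent stages; the single map-only stage here only shows where each phase hands over to the next.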
The user interfaces for accessing Hive include CLI, JDBC/ODBC, HWI, and Thrift.
CLI (Command Line Interface): the command-line interface. Starting the CLI also starts a Hive instance.
JDBC/ODBC: access Hive from programs, for example from Java through JDBC.
HWI (Hive Web Interface): access Hive through a browser.
Thrift: a software framework for cross-language services developed at Facebook. Hive integrates it as a service (HiveServer/HiveServer2).
The difference between HiveServer and HiveServer2:
Both allow remote clients to access Hive without starting the CLI. HiveServer supports only a single client at a time. In Hive 0.11.0 the module was rewritten as HiveServer2, which supports multiple concurrent clients and provides better support for open API clients (JDBC, ODBC, ...).
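One visible difference for a JDBC client is the connection URL scheme: jdbc:hive2:// for HiveServer2 versus jdbc:hive:// for the original HiveServer. The sketch below only builds such URLs; the helper name and host names are hypothetical, and port 10000 is assumed as the usual default.

```python
def hive_jdbc_url(host, port=10000, database="default", server_version=2):
    """Build a JDBC connection URL for HiveServer (1) or HiveServer2 (2).

    Hypothetical helper for illustration; the host names below are placeholders.
    """
    if server_version == 2:
        scheme = "jdbc:hive2"
    elif server_version == 1:
        scheme = "jdbc:hive"
    else:
        raise ValueError("server_version must be 1 or 2")
    return f"{scheme}://{host}:{port}/{database}"


print(hive_jdbc_url("hs2-host"))
print(hive_jdbc_url("old-host", server_version=1))
```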
Metadata (Metastore): includes the table name, the database the table belongs to (the default is default), the table owner, the column and partition fields, the table type (internal or external), the table's data directory, and so on.
Metadata is stored in a Derby database by default, but in practice MySQL is usually used instead.
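To make the list of metadata fields concrete, here is a minimal sketch of one table's record. The class, its field names, and the sample table are illustrative assumptions and do not mirror the metastore's actual schema; MANAGED_TABLE and EXTERNAL_TABLE are the labels Hive uses for internal and external tables.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TableMetadata:
    """Toy record of what the metastore keeps for one table (illustrative only)."""
    name: str
    owner: str
    location: str                      # table data directory, e.g. on HDFS
    database: str = "default"          # tables land in 'default' unless stated
    table_type: str = "MANAGED_TABLE"  # internal table; or "EXTERNAL_TABLE"
    columns: List[Tuple[str, str]] = field(default_factory=list)         # (name, type)
    partition_keys: List[Tuple[str, str]] = field(default_factory=list)  # (name, type)

    def qualified_name(self):
        """Return the table name prefixed with its database."""
        return f"{self.database}.{self.name}"


# Sample table; every value here is made up for the example.
logs = TableMetadata(
    name="access_logs",
    owner="hive",
    location="/user/hive/warehouse/access_logs",
    columns=[("ip", "string"), ("url", "string")],
    partition_keys=[("dt", "string")],
)
print(logs.qualified_name(), logs.table_type)
```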
1️⃣: The client submits an HQL statement to the Driver (any database driver, such as JDBC or ODBC) for execution.
2️⃣: The Driver has the query parsed and its syntax validated.
3️⃣: The compiler sends a metadata request to the Metastore.
4️⃣: The Metastore sends the required metadata back to the compiler as a response.
5️⃣: The compiler checks the requirements and sends the plan back to the Driver.
6️⃣: The Driver sends the execution plan to the execution engine.
7️⃣: The execution engine submits the job to the JobTracker (on the NameNode), which assigns it to TaskTrackers (on the DataNodes), where the MapReduce work is carried out. While the job runs, the execution engine can perform metadata operations through the Metastore.
8️⃣: The execution engine receives the results from the DataNodes.
9️⃣: The execution engine sends the results to the Driver.
1️⃣0️⃣: The Driver sends the results to the Hive interface.
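The ten steps above can be replayed as an ordered list of hand-offs between components. This is a simplified model for reading the flow, not Hive code; the function name and the component labels are assumptions chosen to match the step descriptions.

```python
def run_query(hql):
    """Return the ordered hand-offs for one query (simplified model, not Hive code)."""
    trace = []

    def hop(sender, receiver, payload):
        # Record one message passed from sender to receiver.
        trace.append((sender, receiver, payload))

    hop("Client", "Driver", hql)                              # step 1
    hop("Driver", "Compiler", "parse and validate")           # step 2
    hop("Compiler", "Metastore", "metadata request")          # step 3
    hop("Metastore", "Compiler", "metadata")                  # step 4
    hop("Compiler", "Driver", "execution plan")               # step 5
    hop("Driver", "ExecutionEngine", "execution plan")        # step 6
    hop("ExecutionEngine", "JobTracker", "MapReduce job")     # step 7
    hop("DataNode", "ExecutionEngine", "results")             # step 8
    hop("ExecutionEngine", "Driver", "results")               # step 9
    hop("Driver", "HiveInterface", "results")                 # step 10
    return trace


for number, (sender, receiver, payload) in enumerate(run_query("SELECT 1"), start=1):
    print(f"{number:2d}. {sender} -> {receiver}: {payload}")
```

The trace makes it easy to see that the Driver sits at the center: it receives the query, receives the plan, and receives the results before anything goes back to the user.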
Copyright: author [JOEL-T99]. Please include the original link when reprinting, thank you. https://en.javamana.com/2022/01/202201261614556800.html