MySQL learning notes -- creating high-performance indexes

wwz_ henu 2022-02-13 05:26:20 阅读数:482

mysql learning notes creating high-performance

One . Basics

The index is implemented in the storage engine , Instead of server implementation . There is no unified index standard , Different engines that support the same index , The underlying implementation may also be different .

B-Tree Indexes

Most indexes are B-Tree Indexes . Each leaf node contains a pointer to the child node of the next page . Use B-Tree This term , Because MySQL stay CREATE TABLE And other statements . The underlying implementation of different engines is different ,NDB It uses T-Tree data structure ,InnoDB It uses B+Tree data structure . The storage engine is used in different ways B-Tree.MyISAM Use prefix compression technology , Make the index smaller , Index rows through the physical location of the data ;InnoDB Store in the original data format , The indexed row is referenced by the primary key .B-Tree It usually means that data is stored in order ; The index columns are stored in order , So it's good for finding range data ;

Applicable queries : Full match ; Match the leftmost prefix ; Match column prefix ; Match range value ; Exactly match one column and range to another ; Queries that only access the index ; It can also be used for ORDER BY;

B-Tree Limit : If you don't start with the leftmost column of the index , Index cannot be used ; You can't skip columns in an index ; If there is a range query for a column in the query , Then all columns to the right of this column cannot be found using index optimization ;

If in A、B、C Create... On three columns B-Tree Indexes , be A、AB、ABC、AC( The effect is the same A) You can use index ; if A Query for range , be AB\AC\ABC in A The right column cannot use index ;

Hash Indexes

Implementation based on hash table , Only queries that exactly match all columns of the index will take effect . For each row of data , The search engine will generate a hash code for it , The index stores the hash code in the index , At the same time, the pointer to each row of data is saved in the hash table . Only Memory Explicit support Hash Indexes ; The index itself only stores the corresponding hash value , So the structure is very compact , The search speed is very fast ;

characteristic : Hash index only contains hash value and row pointer , Do not store field values , You cannot use index values to avoid reading rows ; Index data is not sorted by index value , Can't be used to sort ; Partial index matching lookup is not supported ; Only equivalence comparison is supported ,=、in、<=>, No range queries are supported ,>、< etc. ; Possible hash conflicts , Need to traverse all row pointers in the linked list , This causes the query speed to slow down ; In case of hash conflict , Index maintenance costs are high ;

If the storage engine does not support hash indexing , A pseudo hash index can be created . example , If necessary, according to URL lookup , and URL longer , New columns can be created URL_CRC=CRC32(URL), You can use triggers to maintain this column . To avoid hash conflicts , At the same time, compare the original value ;

Spatial data index

MyISAM Support spatial data index , Can be used to store geographic data . Prefix free query , Data will be indexed from all dimensions ; You can use any dimension to combine queries ;

Full-text index

Apply to MATCH AGAINST operation

Two . advantage

  • Reduce the amount of data that the server needs to scan
  • Help server avoid sorting and temporary tables
  • Will be random I/O Change to order I/O
A very small watch , Full table scanning is more effective ; Medium and large table indexes are more effective ; Extra large watch , The cost of building and using indexes also increases , At this point, you can partition 、 table .

3、 ... and . Index strategy

  1. Independent columns : An index column cannot be part of an expression , Nor can it be a function parameter .
  2. Prefix index and index selectivity : If the column is a very long character column , The first part of the characters that can be indexed , But it will reduce the selectivity of the index ; Index selectivity refers to the ratio of non repeated index values to the total number of data tables , Highly selective indexes can filter out more rows when querying ; The only index selectivity is 1; Cannot use prefix index for ORDER BY and GROUP BY, Can't do overlay scanning ;
  3. You can invert the string and store , To use the suffix index ;
  4. Multi column index : Create separate indexes on multiple columns , It can not effectively improve the query efficiency ; if WHERE There are AND or OR Or both ( Independent indexes are used in conditions ),MySQL Will use index merge .
  5. Select the appropriate index column order :
  6. Cluster index : It's not a separate index type , It's how data is stored . Data rows and adjacent key values are stored closely together ; A table can only have one clustered index , Because you can't put data rows in two places at the same time ; Not all engines support clustered indexes ,InnoDB Support , Aggregate data through primary keys , If there is no primary key, select a unique non empty index to replace , If there is no such index, a primary key will be implicitly defined ; This index cannot be created separately , There is an engine to automatically create ;
  7. Overlay index : An index contains the field values of all queries , Overwrite index ;
  8. Using index scans to sort :
Add an index to the prefix of a field ,5 Is the prefix length , This length can be obtained by comparing the selectivity value
ALTER TABLE table1 ADD key(field1(5))

Four . Maintain indexes and tables

Check whether the meter is damaged

CHECK TABLE table_name
Try to fix
REPAIR TABLE table_name
Update index statistics

       The query optimizer passes records_in_range() and info() Two API Understand the index value distribution information of the storage engine , To decide how to use the index .

  Regenerate Statistics

ANALYZE TABLE table_name
Reduce index and data fragmentation

  • Line fragment : Data rows are stored in fragments in multiple places .
  • Between the lines : Logically sequential pages , Or rows are not stored sequentially on disk . It has a great impact on operations such as full table scanning and clustered index scanning .
  • The remaining space debris : There is a lot of space left in the data page .
about MyISAM, All three fragments can appear .InnoDB There will be no short line fragments . Can pass OPTIMIZE TABLE Or export and then import to sort out the data .
For not supporting OPTIMIZE TABLE The engine of , Can pass ALTER TABLE Modify the table engine to the current engine to organize the table .
ALTER TABLE table_name ENGINE=engine_name


copyright:author[wwz_ henu],Please bring the original link to reprint, thank you.