MySQL learning notes -- schema and data type optimization

wwz_ henu 2022-02-13 05:26:26 阅读数:692

mysql learning notes schema data

One . Choose the optimized data type

Smaller ones are usually better : Less disk usage 、CPU、 Memory , What is needed in handling CPU Less cycles , It should be noted that increasing the range of data types is time-consuming

Simple is better : The processing of simple data types usually requires less CPU cycle , Integers are less expensive than string operations , Use MySQL Built in types to store dates and times instead of using strings , Use integer storage IP Address

Try to avoid NULL:NULL Is the default property of the column , Normally, you should specify NOT NULL; contain NULL The column of ,MySQL Harder to optimize , Expected to make index 、 Index statistics and values are more complex ; for NULL Your columns take up more space , It also needs special treatment ; for NULL When the columns of are indexed , Each index record requires additional bytes , stay MyISAM In the engine , It may cause a fixed size index to become a variable size index .

When selecting a data type for a column , First select the large type ( integer 、 character string 、 Date, etc. ); secondly , Choose a specific type , precision 、 Length etc. ;

1. Integer types

TINYINT、SMALLINT、MEDIUMINT、INT、BIGINT, Respectively take up 8(1 byte )、16(2 byte )、24(3 byte )、32(4 byte )、64(8 byte ) position , The storage range is -2^n-1-----2^n-1-1, among n It's a number .

To develop int(11) Meaningless for most applications , And int(1) The occupied space is the same .

2. Real number type / floating-point

MySQL Both support precise type , Also supports imprecise types .

FLOAT(4 byte ) and DOUBLE(8 byte ) Support the use of standard floating-point operations for approximate calculations .

DECIMAL Used to store exact decimals , Requires additional space and computing overhead , Mostly used to store financial data . stay 5.0 And above , This type supports accurate calculation . Maximum allowed 65 A digital . You can expand the number by a corresponding multiple , Use BIGINT Storage DECIMAL.

3. character string

VARCHAR: Store variable length strings . More space saving than fixed length strings , Use only the necessary space , Shorter strings take up less space . If the table uses ROW_FORMAT=FIXED, Each line will use a fixed length . Use 1~2 Byte storage string length , If the column length is less than or equal to 255 Use 1 Bytes , Otherwise, use two bytes ; example : Use latin1 Character set ,VARCHAR(10) Occupy 11 Bytes ,VARCHAR(1000) Occupy 1002 Bytes ; As the behavior grows , The update may require additional work , If the space occupied by the row increases , Different engines handle it differently ,MyISAM Split into different pieces to store ,InnoDB You need to split the page and put the rows into one page ; Save with trailing spaces

CHAR: Store fixed length string . It is suitable for storing very short strings or strings with basically the same length . Delete the last space when storing ; How data is stored depends on the storage engine ;Memory The engine only supports fixed length lines

And CHAR Follow VARCHAR Similarly, there is BINARY Follow VARBINARY, They store binary strings , It stores bytecode instead of characters , Filled with \0 Not spaces

BLOB and TEXT: It uses binary mode and character mode to store large string data .TINYTEXT/SMALLTEXT/TEXT/MEDIUMTEXT/LONGTEXT,TINYBLOB/SMALLBLOB/BLOG/MEDIUMBLOG/LONGBLOB.BLOB yes SMALLBLOB A synonym for ,TEXT yes SMALLINT A synonym for .MySQL Put each TEXT and BLOB Treat as a separate object ; The storage engine performs special processing when storing ; When the value is too high ,InnoDB Will use external storage areas , Stored in the line is 1~4 A pointer of bytes ;BLOB No collation or character set ;TEXT There are sorting rules and character sets ; Only for the top of each column max_sort_length Sort by length , You can modify this setting or use ORDER BY SUSTRING(column,length); The full length cannot be indexed , You can't use these indexes to eliminate sorting ;

ENUM: You can use this type instead of the commonly used string type to reduce space consumption . Avoid using ENUM Store numbers ; Sort by internally stored integers instead of defined strings ; Can be used explicitly in queries field(column,'value1','value2'); modify ENUM You need to ALTER TABLE, It can take a long time ;MySQL Save the enumeration value as an integer , Therefore, when taking the column value, you need to search to convert it into a string , So enumeration has some overhead ;

DATETIME and TIMESTAMP:DATETIME Large storage range 1001 year ~9999 year , Accuracy to seconds , Encapsulate the date into YYYYMMDDHHMMSS Integer in format , It's not about time zone , Use 8 byte ;TIMESTAMP What is preserved is 1970 year 1 month 1 The number of seconds since midnight , Use 4 Bytes , Storage 1970~2038 Year Date , The display depends on the time zone ; You can use FROM_UNIXTIME() Convert timestamp to time ;UNIX_TIMESTAMP() Convert dates to timestamps ;TIMESTAMP More efficient ;

Bit data :

        BIT,bit(1) Express 1 position ,bit(2) Express 2 position , Maximum 64 position ;MyISAM Will pack all BIT Column , therefore 15 Separate BIT Column only 15 position ;InnoDB For each BIT The column uses a minimum integer sufficient to store , Therefore, the space occupation cannot be reduced ;

        SET, Similar to a that can have multiple values ENUM, have access to find_in_filed(),field() Such a function to query

4. Select the identifier

It is important to choose the appropriate data type for the identification column . It is more likely to use the identity column to compare with other columns , For example, association operation , Or find other columns by identifying Columns ; It should be consistent with the data type in the column in the associated table ; Type to match exactly , Include UNSIGNED; On the premise of meeting current and future needs , The smallest type should be selected ; The integer type is usually the best data type for identifying columns , Fast and can use AUTO_INCREMENT; Do not use ENUM and SET type ; Avoid using string types ;

Two .Schema Design traps

  • Too many columns : for example ORM

Too many connections :MySQL Limit the maximum number of... Per association operation 61 Tables , A single query is best placed in 12 Make a connection within ;

Omnipotent enumeration : Over enumeration

Enumeration in disguise : Use SET Instead of ENUM

3、 ... and . Paradigms and anti paradigms

Normalization : Each fact appears and only appears once .

Anti normalization : data redundancy , Appear many times .

Paradigm advantages :

  • The update operation is faster than the anti normal form
  • The watch is smaller , Less memory
  • Because the data are rarely repeated , Less need for distinct and GROUP BY

Paradigm shortcomings :

  • Usually you need an association table

The advantages of anti paradigm :

  • It can avoid Association

The disadvantages of anti paradigm :

  • If no association query is required , Low efficiency
In practice , Mix the two ways
copyright:author[wwz_ henu],Please bring the original link to reprint, thank you.