[DB] MariaDB Encoding

LifeCoding 2023. 2. 10. 11:06

Checking MariaDB's default charset

MariaDB [(none)]> show variables like 'c%';
+----------------------------------+----------------------------+
| Variable_name                    | Value                      |
+----------------------------------+----------------------------+
| character_set_client             | utf8mb4                    |
| character_set_connection         | utf8mb4                    |
| character_set_database           | utf8mb4                    |
| character_set_filesystem         | binary                     |
| character_set_results            | utf8mb4                    |
| character_set_server             | utf8mb4                    |
| character_set_system             | utf8mb3                    |
| character_sets_dir               | /usr/share/mysql/charsets/ |
| check_constraint_checks          | ON                         |
| collation_connection             | utf8mb4_general_ci         |
| collation_database               | utf8mb4_general_ci         |
| collation_server                 | utf8mb4_general_ci         |
| column_compression_threshold     | 100                        |
| column_compression_zlib_level    | 6                          |
| column_compression_zlib_strategy | DEFAULT_STRATEGY           |
| column_compression_zlib_wrap     | OFF                        |
| completion_type                  | NO_CHAIN                   |
| concurrent_insert                | AUTO                       |
| connect_timeout                  | 10                         |
| core_file                        | OFF                        |
+----------------------------------+----------------------------+

Checking a database's encoding

MariaDB [(none)]> SELECT default_character_set_name, DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE schema_name = "wordpress";
+----------------------------+------------------------+
| default_character_set_name | DEFAULT_COLLATION_NAME |
+----------------------------+------------------------+
| utf8mb4                    | utf8mb4_general_ci     |
+----------------------------+------------------------+

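Checking a table's encoding

Look at the Collation column in the output. Non-string columns (bigint, datetime, int) show NULL there because a collation applies only to character data.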

MariaDB [wordpress]> SHOW FULL COLUMNS FROM wp_posts;
+-----------------------+---------------------+--------------------+------+-----+---------------------+----------------+---------------------------------+---------+
| Field                 | Type                | Collation          | Null | Key | Default             | Extra          | Privileges                      | Comment |
+-----------------------+---------------------+--------------------+------+-----+---------------------+----------------+---------------------------------+---------+
| ID                    | bigint(20) unsigned | NULL               | NO   | PRI | NULL                | auto_increment | select,insert,update,references |         |
| post_author           | bigint(20) unsigned | NULL               | NO   | MUL | 0                   |                | select,insert,update,references |         |
| post_date             | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_date_gmt         | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_content          | longtext            | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_title            | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_excerpt          | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_status           | varchar(20)         | utf8mb4_unicode_ci | NO   |     | publish             |                | select,insert,update,references |         |
| comment_status        | varchar(20)         | utf8mb4_unicode_ci | NO   |     | open                |                | select,insert,update,references |         |
| ping_status           | varchar(20)         | utf8mb4_unicode_ci | NO   |     | open                |                | select,insert,update,references |         |
| post_password         | varchar(255)        | utf8mb4_unicode_ci | NO   |     |                     |                | select,insert,update,references |         |
| post_name             | varchar(200)        | utf8mb4_unicode_ci | NO   | MUL |                     |                | select,insert,update,references |         |
| to_ping               | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| pinged                | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_modified         | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_modified_gmt     | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_content_filtered | longtext            | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_parent           | bigint(20) unsigned | NULL               | NO   | MUL | 0                   |                | select,insert,update,references |         |
| guid                  | varchar(255)        | utf8mb4_unicode_ci | NO   |     |                     |                | select,insert,update,references |         |
| menu_order            | int(11)             | NULL               | NO   |     | 0                   |                | select,insert,update,references |         |
| post_type             | varchar(20)         | utf8mb4_unicode_ci | NO   | MUL | post                |                | select,insert,update,references |         |
| post_mime_type        | varchar(100)        | utf8mb4_unicode_ci | NO   |     |                     |                | select,insert,update,references |         |
| comment_count         | bigint(20)          | NULL               | NO   |     | 0                   |                | select,insert,update,references |         |
+-----------------------+---------------------+--------------------+------+-----+---------------------+----------------+---------------------------------+---------+

Difference between utf8 and utf8mb4

utf8 and utf8mb4 are character sets that MySQL and MariaDB use to store text data. Both are based on UTF-8; they differ in how many bytes per character they accept.

UTF-8 itself is a variable-width encoding that represents each Unicode character in 1 to 4 bytes. The character set that MySQL and MariaDB have historically called utf8 (also named utf8mb3, as in the character_set_system row above) stores at most 3 bytes per character. That covers the Basic Multilingual Plane, which includes most characters used in the world's written languages, but not 4-byte characters.

utf8mb4 is the full implementation of UTF-8. It stores up to 4 bytes per character, so it can also hold supplementary characters such as emoji. Characters that fit in 3 bytes or fewer take the same space in both character sets. The difference is that utf8mb4 reserves up to 4 bytes per character wherever a maximum size has to be computed, which lowers the number of characters that fit under byte limits such as index key lengths.

In general, utf8mb4 is the better choice for new databases, because a 4-byte character written to a utf8 (utf8mb3) column is rejected or truncated, depending on the SQL mode. utf8mb4 does not require a more powerful database setup; the cost is limited to the larger worst-case size per character.
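The byte widths involved can be checked outside the database. The following is a minimal Python sketch, not MariaDB code: the helper names utf8_width and fits_in_utf8mb3 are made up for illustration, and the check assumes that utf8 (utf8mb3) accepts exactly the characters whose UTF-8 encoding is at most 3 bytes long, that is, code points up to U+FFFF.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 UTF-8 bytes (code point <= U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# Latin letter, accented letter, Hangul syllable, emoji
for sample in ["A", "\u00e9", "\ud55c", "\U0001F600"]:
    print(repr(sample), utf8_width(sample), fits_in_utf8mb3(sample))
```

Only the last sample, the emoji, needs 4 bytes, so it is the only one that would require a utf8mb4 column.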

Difference between utf8mb4_general_ci and utf8mb4_unicode_ci

utf8mb4_general_ci and utf8mb4_unicode_ci are collations for the utf8mb4 character set in MySQL and MariaDB.

A collation determines how text data is compared and sorted in a database. The collation you choose affects comparison operators, such as equal to (=), not equal to (<> or !=), greater than (>), and less than (<), as well as the order produced by ORDER BY.

utf8mb4_general_ci is a case-insensitive collation that uses simplified comparison rules: each character is mapped to a single weight, one character at a time. It does not support expansions or contractions, so for example it treats ß as equal to s.

utf8mb4_unicode_ci, on the other hand, is a case-insensitive collation that implements the Unicode Collation Algorithm. It supports expansions, so ß compares equal to ss, and it orders text from a wide range of languages more accurately. This makes it more suitable for applications such as multilingual websites.

In general, utf8mb4_general_ci is somewhat faster than utf8mb4_unicode_ci because its comparisons are simpler, but it gives less accurate results for characters that need expansions or language-aware ordering. utf8mb4_unicode_ci is more accurate at the cost of somewhat slower comparisons and sorts.
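The effect of expansions on comparisons can be illustrated with a toy model in Python. This is only an analogy, not MariaDB's actual algorithm: the two one-entry weight tables and the helper names sort_key and equal_under are hypothetical, and real collations use far larger weight tables. The one-to-one table mimics the general_ci style (ß treated like s), and the expansion table mimics the unicode_ci style (ß treated like ss).

```python
# Toy weight tables (hypothetical; a real collation covers all of Unicode).
ONE_TO_ONE = {"ß": "s"}   # general_ci style: one character -> one weight
EXPANSION = {"ß": "ss"}   # unicode_ci style: one character -> several weights


def sort_key(text: str, table: dict) -> str:
    """Lowercase the text (case-insensitive), then replace each character by its weight."""
    return "".join(table.get(ch, ch) for ch in text.lower())


def equal_under(table: dict, a: str, b: str) -> bool:
    """Compare two strings by their sort keys under the given weight table."""
    return sort_key(a, table) == sort_key(b, table)


print(equal_under(ONE_TO_ONE, "Straße", "STRASE"))   # True
print(equal_under(ONE_TO_ONE, "Straße", "STRASSE"))  # False
print(equal_under(EXPANSION, "Straße", "STRASSE"))   # True
```

Under the one-to-one table, Straße matches the misspelling STRASE but not STRASSE; under the expansion table it matches STRASSE, which is the linguistically expected result.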

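The choice between utf8mb4_general_ci and utf8mb4_unicode_ci depends on the specific needs of your application and the text data that you need to store in your database. Note that in the output above, the wordpress database default is utf8mb4_general_ci while the text columns of wp_posts use utf8mb4_unicode_ci: a column's own collation takes precedence over the database default.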
