Checking MariaDB's default character set
MariaDB [(none)]> show variables like 'c%';
+----------------------------------+----------------------------+
| Variable_name                    | Value                      |
+----------------------------------+----------------------------+
| character_set_client             | utf8mb4                    |
| character_set_connection         | utf8mb4                    |
| character_set_database           | utf8mb4                    |
| character_set_filesystem         | binary                     |
| character_set_results            | utf8mb4                    |
| character_set_server             | utf8mb4                    |
| character_set_system             | utf8mb3                    |
| character_sets_dir               | /usr/share/mysql/charsets/ |
| check_constraint_checks          | ON                         |
| collation_connection             | utf8mb4_general_ci         |
| collation_database               | utf8mb4_general_ci         |
| collation_server                 | utf8mb4_general_ci         |
| column_compression_threshold     | 100                        |
| column_compression_zlib_level    | 6                          |
| column_compression_zlib_strategy | DEFAULT_STRATEGY           |
| column_compression_zlib_wrap     | OFF                        |
| completion_type                  | NO_CHAIN                   |
| concurrent_insert                | AUTO                       |
| connect_timeout                  | 10                         |
| core_file                        | OFF                        |
+----------------------------------+----------------------------+
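The pattern 'c%' also matches unrelated variables such as connect_timeout and core_file. As a rough sketch, a WHERE clause can narrow the output to the character set and collation variables only:

```sql
-- Show only the character set and collation variables.
-- Note: "_" is a single-character wildcard in LIKE, which is harmless here.
SHOW VARIABLES
WHERE Variable_name LIKE 'character_set%'
   OR Variable_name LIKE 'collation%';
```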
Checking a database's encoding
MariaDB [(none)]> SELECT default_character_set_name, DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE schema_name = "wordpress";
+----------------------------+------------------------+
| default_character_set_name | DEFAULT_COLLATION_NAME |
+----------------------------+------------------------+
| utf8mb4                    | utf8mb4_general_ci     |
+----------------------------+------------------------+
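To compare every database on the server at once, the same view can be queried without the schema filter. This is a minimal sketch that reuses only the columns shown in the query above:

```sql
-- List the default character set and collation of every database.
SELECT schema_name, default_character_set_name, default_collation_name
FROM information_schema.SCHEMATA
ORDER BY schema_name;
```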
Checking a table's encoding
Look at the Collation column in the output.
MariaDB [wordpress]> SHOW FULL COLUMNS FROM wp_posts;
+-----------------------+---------------------+--------------------+------+-----+---------------------+----------------+---------------------------------+---------+
| Field                 | Type                | Collation          | Null | Key | Default             | Extra          | Privileges                      | Comment |
+-----------------------+---------------------+--------------------+------+-----+---------------------+----------------+---------------------------------+---------+
| ID                    | bigint(20) unsigned | NULL               | NO   | PRI | NULL                | auto_increment | select,insert,update,references |         |
| post_author           | bigint(20) unsigned | NULL               | NO   | MUL | 0                   |                | select,insert,update,references |         |
| post_date             | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_date_gmt         | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_content          | longtext            | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_title            | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_excerpt          | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_status           | varchar(20)         | utf8mb4_unicode_ci | NO   |     | publish             |                | select,insert,update,references |         |
| comment_status        | varchar(20)         | utf8mb4_unicode_ci | NO   |     | open                |                | select,insert,update,references |         |
| ping_status           | varchar(20)         | utf8mb4_unicode_ci | NO   |     | open                |                | select,insert,update,references |         |
| post_password         | varchar(255)        | utf8mb4_unicode_ci | NO   |     |                     |                | select,insert,update,references |         |
| post_name             | varchar(200)        | utf8mb4_unicode_ci | NO   | MUL |                     |                | select,insert,update,references |         |
| to_ping               | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| pinged                | text                | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_modified         | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_modified_gmt     | datetime            | NULL               | NO   |     | 0000-00-00 00:00:00 |                | select,insert,update,references |         |
| post_content_filtered | longtext            | utf8mb4_unicode_ci | NO   |     | NULL                |                | select,insert,update,references |         |
| post_parent           | bigint(20) unsigned | NULL               | NO   | MUL | 0                   |                | select,insert,update,references |         |
| guid                  | varchar(255)        | utf8mb4_unicode_ci | NO   |     |                     |                | select,insert,update,references |         |
| menu_order            | int(11)             | NULL               | NO   |     | 0                   |                | select,insert,update,references |         |
| post_type             | varchar(20)         | utf8mb4_unicode_ci | NO   | MUL | post                |                | select,insert,update,references |         |
| post_mime_type        | varchar(100)        | utf8mb4_unicode_ci | NO   |     |                     |                | select,insert,update,references |         |
| comment_count         | bigint(20)          | NULL               | NO   |     | 0                   |                | select,insert,update,references |         |
+-----------------------+---------------------+--------------------+------+-----+---------------------+----------------+---------------------------------+---------+
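SHOW FULL COLUMNS reports per-column collations, and here the columns use utf8mb4_unicode_ci even though the database default is utf8mb4_general_ci. As a sketch, the table-level default and the per-column settings can also be read from information_schema, assuming the same wordpress schema and wp_posts table as above:

```sql
-- Default collation of each table in the schema.
SELECT table_name, table_collation
FROM information_schema.TABLES
WHERE table_schema = 'wordpress';

-- Character set and collation of the text columns of one table.
-- Non-text columns (bigint, datetime, ...) have a NULL collation and are skipped.
SELECT column_name, character_set_name, collation_name
FROM information_schema.COLUMNS
WHERE table_schema = 'wordpress'
  AND table_name = 'wp_posts'
  AND collation_name IS NOT NULL;
```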
The difference between utf8 and utf8mb4
utf8 and utf8mb4 are character sets that MySQL and MariaDB use to store text data.
Despite its name, utf8 in MySQL and MariaDB has historically not been full UTF-8. It is an alias for utf8mb3, which stores each character in 1 to 3 bytes and therefore covers only the Basic Multilingual Plane (code points up to U+FFFF). That includes the characters of most of the world's written languages, but not emoji or other supplementary characters.
utf8mb4 is full UTF-8: each character takes 1 to 4 bytes, so it can also store supplementary characters such as emoji. Characters that fit in 3 bytes or fewer are encoded identically in both character sets, so text that is valid in utf8 takes the same number of bytes in utf8mb4.
In general, utf8mb4 is the safer default for new databases, and it is what the server above uses. The cost is that MariaDB has to budget up to 4 bytes per character instead of 3, which matters mainly for index key length limits on long string columns. It does not require a more powerful database setup.
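The variable width is easy to observe, because LENGTH() counts bytes while CHAR_LENGTH() counts characters. The following is a small sketch, assuming the client terminal sends UTF-8 and the connection uses utf8mb4:

```sql
-- Make sure the connection character set is utf8mb4 for this session.
SET NAMES utf8mb4;

-- LENGTH() returns bytes, CHAR_LENGTH() returns characters.
SELECT LENGTH('a')       AS ascii_bytes,
       LENGTH('한')      AS hangul_bytes,
       LENGTH('😀')      AS emoji_bytes,
       CHAR_LENGTH('😀') AS emoji_chars;
```

Under those assumptions the query should report 1, 3, 4, and 1: the emoji is a single character that needs 4 bytes, which is why it cannot be stored in a 3-byte utf8 column.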
The difference between utf8mb4_general_ci and utf8mb4_unicode_ci
utf8mb4_general_ci and utf8mb4_unicode_ci are collations for the utf8mb4 character set in MySQL and MariaDB.
A collation determines how text data is compared and sorted in a database. The collation you choose affects comparison operations such as equal to (=), not equal to (<> or !=), greater than (>), and less than (<), as well as the order produced by ORDER BY.
utf8mb4_general_ci is a case-insensitive collation that uses simplified rules: it compares strings one character at a time and does not support expansions, where one character is equivalent to a sequence of characters. For example, it treats ß as equal to s.
utf8mb4_unicode_ci is also case-insensitive, but it follows the Unicode Collation Algorithm, so it does support expansions. For example, it treats ß as equal to ss. This makes it more suitable for applications that handle text in many languages, such as multilingual websites, and it is the collation the wp_posts columns above use.
In general, utf8mb4_general_ci is somewhat faster than utf8mb4_unicode_ci, but it sorts and compares less accurately across languages. The speed difference is often small in practice.
The choice between utf8mb4_general_ci and utf8mb4_unicode_ci depends on the specific needs of your application and the text data that you need to store in your database.
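The difference shows up when a single character is compared against a sequence. The following sketch forces each collation with an explicit COLLATE clause, assuming a utf8mb4 connection:

```sql
-- Make sure the string literals are interpreted as utf8mb4.
SET NAMES utf8mb4;

-- Compare the German sharp s under both collations.
SELECT 'ß' = 'ss' COLLATE utf8mb4_unicode_ci AS unicode_ci_ss,
       'ß' = 'ss' COLLATE utf8mb4_general_ci AS general_ci_ss,
       'ß' = 's'  COLLATE utf8mb4_general_ci AS general_ci_s;
```

Under those assumptions the three columns should come back as 1, 0, and 1: utf8mb4_unicode_ci expands ß to ss, while utf8mb4_general_ci maps it to a single s.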