MySQL and UTF-8 - locales and some advanced settings

Default locales, and collation in MySQL since 4.1 is latin1_swedish_ci, it works well but if you have some polish texts it’s not so good, there is way to change defaults, and it is quite easy.

After installation of MySQL edit your config file (default location of this file is /etc/mysql/my.cnf).

  1. Find [mysqld] section in my.cnf

  2. Add this entries to it:

    [mysqld]
    #
    # Enable Polish collocation and default utf8 encoding
    # Change mysql default character set
    #
    character_set_server=utf8
    collation_server=utf8_polish_ci
    default-character-set=utf8
    
    init-connect="SET NAMES utf8;"
    # It makes
    # SET character_set_client = utf8;
    # SET character_set_results = utf8;
    # SET character_set_connection = utf8;
    

    You can also add option: skip-character-set-client-handshake to enforce using of utf8 encoding in database and connection, to read more about this you can visit mysql manual.

  3. Type: /etc/init.d/mysql restart

  4. You’re done ;)

This setting alters only databases which will be created.

Changing character sets in VARCHAR data

To change charset in table loaded with text data, you could run SQL query like:

ALTER DATABASE `DATABASE_NAME` DEFAULT CHARACTER SET utf8 COLLATE utf8_polish_ci;

Or in batch: Snippet below comes from stackoverflow.com (visit it for more details) (if you’re using some indices you should drop it).

mysql --database=dbname -B -N -e "SHOW TABLES" \
| awk '{print "ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8 COLLATE utf8_polish_ci;"}' \
| mysql --database=dbname &

If you want to check current CHARACTER_SET and COLLATION

SHOW VARIABLES LIKE "character_set%";
SHOW VARIABLES LIKE "collation%";

or

SHOW CREATE DATABASE `DATABASE_NAME`;

Comments

comments powered by Disqus