MySQL: Forcing UTF-8 Compliance for All Connections

The problem most people face when setting up a UTF-8 database in MySQL is that unless the client (PHP, C++, etc.) calls ‘SET NAMES’ before issuing any queries, the client connection will in most cases default to latin1.
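To see why that default matters, here is a minimal Python sketch of what happens when UTF-8 bytes are read as latin1. It does not talk to MySQL at all; the helper name misread_as_latin1 is made up for this illustration, and Python's "latin-1" codec is only an approximation of MySQL's latin1 character set.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were latin-1."""
    return text.encode("utf-8").decode("latin-1")


original = "café"

# The two UTF-8 bytes of "é" (0xC3 0xA9) are read as two separate latin-1 characters.
garbled = misread_as_latin1(original)
print(garbled)  # cafÃ©

# Reversing the mistake recovers the text, as long as no bytes were lost along the way.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # café
```

This is the familiar "Ã©" garbage that tends to show up when one side of the connection assumes latin1 and the other sends UTF-8.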

However, as of MySQL 5.x or higher you can set init_connect in the my.cnf file.

This runs a defined series of commands / queries every time a non-superuser connects (so if you are using root to connect to your MySQL database, stop reading now and slap yourself HARD).

For example:

[mysqld]
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set=utf8
character-set-server=utf8
collation-server=utf8_general_ci
skip-character-set-client-handshake

UPDATE 04/09/09

My MySQL version (5.0.45 x64) only picks up the last init_connect entry.

In that case, use this example instead:

[mysqld]
init_connect='SET collation_connection = utf8_general_ci; SET NAMES utf8;'
# Note: MySQL 5.5 and later reject default-character-set under [mysqld]; drop this line there.
default-character-set=utf8
character-set-server=utf8
collation-server=utf8_general_ci

Restart MySQL and check that mysqld.log shows no errors (or check the Event Viewer if you are using Windows).

Every client connection will now default to UTF-8 encoding rather than latin1, removing the need to add a SET NAMES call on every connection.

This will work for PHP, C++, Ruby, etc., as the client encoding is now handled server side, rather than waiting on the client to issue a SET NAMES command.

UPDATE 30/03/09: Added “skip-character-set-client-handshake”, which ignores the client's request to set the connection charset. This info is courtesy of “wardo”: https://word.wardosworld.com/?p=164

UPDATE 10/09/09

I have been having some issues getting this to work. The workaround is to add this config as a single line:

init_connect='SET collation_connection = utf8_general_ci; SET NAMES utf8;'
