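I'm having trouble inserting and reading UTF-8 content from my database. Every check I run suggests that the content in my database should be UTF-8 encoded, yet it appears to be Latin-1 encoded. The data was originally imported by a PHP script run from the CLI.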
Zend Framework Version: 1.10.5
mysql-server-5.0: 5.0.51a-3ubuntu5.7
php5-mysql: 5.2.4-2ubuntu5.10
apache2: 2.2.8-1ubuntu0.16
libapache2-mod-php5: 5.2.4-2ubuntu5.10
-mysql:
mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> SHOW VARIABLES LIKE 'collation%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_bin        |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+
-database: created with
CREATE DATABASE mydb CHARACTER SET utf8 COLLATE utf8_bin;
CREATE SCHEMA `mydb` DEFAULT CHARACTER SET utf8 COLLATE utf8_bin ;

mysql> status;
--------------
mysql  Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2

Connection id:          7
Current database:       mydb
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.51a-3ubuntu5.7-log (Ubuntu)
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:            /var/run/mysqld/mysqld.sock
Uptime:                 9 min 45 sec
-sql: before doing the inserts I run
SET names 'utf8';
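-php: before doing the inserts I use utf8_encode() and mb_detect_encoding(), which gives me 'UTF-8'. After retrieving the content from the db and before sending it to the user, mb_detect_encoding() also gives 'UTF-8'.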
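One thing that may be worth ruling out here: utf8_encode() always treats its input as ISO-8859-1, so calling it on a string that is already UTF-8 encodes it a second time, and the result is still valid UTF-8. The sketch below is only an illustration under that assumption, not a diagnosis of this setup. The variable names are made up, and mb_convert_encoding() stands in for utf8_encode() because it performs the same ISO-8859-1 to UTF-8 conversion.

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "\xC3\xA9";

// Same conversion utf8_encode() performs: every input byte is read as one
// ISO-8859-1 character and re-encoded as UTF-8.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // c3a9
echo bin2hex($double), "\n"; // c383c2a9

// The double-encoded string still validates as UTF-8, so a 'UTF-8' result
// from an encoding check does not prove the text is encoded correctly.
$stillValid = mb_check_encoding($double, 'UTF-8');
var_dump($stillValid); // bool(true)

// Converting back to ISO-8859-1 undoes the extra layer.
$restored = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
echo bin2hex($restored), "\n"; // c3a9
```

Comparing bin2hex() of a known accented character before the insert and after the select shows directly whether the bytes changed along the way.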
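The only way I can get the content to display correctly is to set the content type to Latin-1 (if I sniff the traffic, I can see the Content-Type header with ISO-8859-1):
ini_set('default_charset', 'ISO-8859-1');
This test shows that the content comes out as Latin-1. I don't understand why. Does anyone have any idea?
Thanks.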
Well, I've found that SET NAMES isn't all that great. Take a peek at the documentation...
What I usually do is execute 4 queries:
SET CHARACTER SET 'UTF8';
SET character_set_database = 'UTF8';
SET character_set_connection = 'UTF8';
SET character_set_server = 'UTF8';
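Give that a try and see if it works for you...
Oh, and keep in mind that all UTF-8 characters <= 127 are also valid ISO-8859-1 characters. So if your stream only contains characters <= 127, mb_detect_encoding will fall back on the charset with the higher prevalence (which is "UTF-8" by default)...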
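A minimal sketch of that last point, using mb_check_encoding() because its result does not depend on the detection order. The sample strings and variable names are invented for illustration.

```php
<?php
$ascii  = "hello";      // only bytes <= 127
$latin1 = "caf\xE9";    // "café" encoded as ISO-8859-1

// Pure ASCII is valid in both encodings, so detection cannot tell them apart.
$asciiAsUtf8   = mb_check_encoding($ascii, 'UTF-8');
$asciiAsLatin1 = mb_check_encoding($ascii, 'ISO-8859-1');
var_dump($asciiAsUtf8, $asciiAsLatin1); // bool(true) bool(true)

// A lone 0xE9 byte is not a valid UTF-8 sequence, so only strings with
// bytes above 127 give an encoding check something to decide on.
$latin1AsUtf8 = mb_check_encoding($latin1, 'UTF-8');
var_dump($latin1AsUtf8); // bool(false)
```

So a detection test is only meaningful on input that contains at least one non-ASCII character.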