Details
- Bug
- Resolution: Incomplete
- P2: Important
- None
- 4.5.1, 4.7.1
- None
Description
QTextCodec::codecForLocale()->fromUnicode() produces Latin-1 on UTF-8 locales on Linux
Tested with two versions of gcc:
- gcc (Gentoo 4.3.2-r3 p1.6, pie-10.1.5) 4.3.2
- gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3
Also seen on openSUSE 11.4 with Qt 4.7.1. In that case the bug was found through toLocal8Bit() rather than fromUnicode(), but the cause is obviously the same.
Steps to reproduce / test case
QTextCodec::codecForLocale()->fromUnicode() appears to produce Latin-1 in UTF-8-based locales on Linux (tested with en_US.UTF-8 and fr_FR.UTF-8). The following program demonstrates the problem:
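    #include <QFile>
    #include <QTextCodec>

    int main(int, char **)
    {
        // Encode U+00E9 ("é", given here as UTF-8 bytes) with the locale codec.
        QByteArray array = QTextCodec::codecForLocale()->fromUnicode(QString::fromUtf8("\xc3\xa9"));

        // Write the resulting bytes to the file "out".
        // Explicit checks are used instead of Q_ASSERT, because Q_ASSERT
        // expands to nothing when QT_NO_DEBUG is defined, which would skip
        // the open() and write() calls in release builds.
        QFile out(QString::fromUtf8("out"));
        if (!out.open(QIODevice::WriteOnly | QIODevice::Truncate))
            return 1;
        if (out.write(array) != array.size())
            return 1;
        return 0;
    }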
Obtained output:
The file "out" contains a single byte: e9.
Expected output:
The file "out" should contain the two-byte sequence c3 a9.
Remarks:
- e9 and c3 a9 are the encodings of "é" in Latin-1 and UTF-8, respectively.
- Replacing codecForLocale() with codecForName("UTF-8") in the program yields the expected output.
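The byte values in the first remark can be checked by hand. The following is a minimal sketch in plain C++ (no Qt) that encodes the code point U+00E9 both ways. The helper names toLatin1 and toUtf8TwoByte are hypothetical and only cover the ranges stated in their comments.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode a code point as Latin-1.
// Only U+0000..U+00FF are representable, one byte each.
std::string toLatin1(std::uint32_t cp)
{
    assert(cp <= 0xFF);
    return std::string(1, static_cast<char>(cp));
}

// Hypothetical helper: encode a code point in U+0080..U+07FF as UTF-8.
// That range always takes two bytes: 110xxxxx 10xxxxxx.
std::string toUtf8TwoByte(std::uint32_t cp)
{
    assert(cp >= 0x80 && cp <= 0x7FF);
    std::string out;
    out += static_cast<char>(0xC0 | (cp >> 6));   // leading byte, top 5 bits
    out += static_cast<char>(0x80 | (cp & 0x3F)); // continuation byte, low 6 bits
    return out;
}
```

For U+00E9, toLatin1 yields the single byte e9 and toUtf8TwoByte yields c3 a9, matching the obtained and expected outputs above.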
More information
Tested on:
- Qt 4.5.0-0ubuntu4.2 on Ubuntu
- Qt 4.5.1 with iconv support on Gentoo
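The report does not state which encoding the C runtime itself reports on the tested systems. A possible diagnostic, sketched below with POSIX calls only (no Qt), is to query the codeset before and after adopting the locale from the environment. A C or C++ program starts in the "C" locale until setlocale() is called. Whether that has any bearing on the behaviour reported here is an assumption, not something the report confirms. The function names currentCodeset and codesetFromEnvironment are hypothetical.

```cpp
#include <clocale>
#include <string>
#include <langinfo.h>

// Hypothetical helper: name of the character encoding of the C runtime's
// current LC_CTYPE locale, as reported by POSIX nl_langinfo().
std::string currentCodeset()
{
    return nl_langinfo(CODESET);
}

// Hypothetical helper: adopt the locale named by the environment
// (LC_ALL, LC_CTYPE, LANG) and return the resulting codeset.
// Returns an empty string if the requested locale cannot be installed.
std::string codesetFromEnvironment()
{
    if (!std::setlocale(LC_ALL, ""))
        return std::string();
    return currentCodeset();
}
```

Printing currentCodeset() at program start and codesetFromEnvironment() afterwards, on one of the affected systems, would show whether the runtime sees a UTF-8 codeset for en_US.UTF-8 or fr_FR.UTF-8.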