Package: general
Severity: normal

I have a problem with MySQL (5.0.32-Debian_7etch5-log, Debian etch distribution).
This report is best viewed on a wide screen.

It's all about Cyrillic characters:
comparing е to ё and
comparing и to й

characters (first code is cp1251):
0xE5 = U+0435 : CYRILLIC SMALL LETTER IE
0xC5 = U+0415 : CYRILLIC CAPITAL LETTER IE
0xB8 = U+0451 : CYRILLIC SMALL LETTER IO
0xA8 = U+0401 : CYRILLIC CAPITAL LETTER IO
0xE8 = U+0438 : CYRILLIC SMALL LETTER I
0xC8 = U+0418 : CYRILLIC CAPITAL LETTER I
0xE9 = U+0439 : CYRILLIC SMALL LETTER SHORT I
0xC9 = U+0419 : CYRILLIC CAPITAL LETTER SHORT I

from http://www.unicode.org/Public/MAPPIN...OWS/CP1251.TXT
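
As a cross-check outside MySQL, the mapping above can be reproduced with a short Python sketch. It assumes Python's built-in "cp1251" codec agrees with the Unicode mapping table cited above; the names CP1251_BYTES and describe are only illustrative.

```python
import unicodedata

# cp1251 byte values taken from the character list above
CP1251_BYTES = [0xE5, 0xC5, 0xB8, 0xA8, 0xE8, 0xC8, 0xE9, 0xC9]


def describe(byte_value):
    """Return (character, 'U+XXXX', Unicode name) for one cp1251 byte."""
    ch = bytes([byte_value]).decode("cp1251")
    return ch, "U+%04X" % ord(ch), unicodedata.name(ch)


# Map every byte to its decoded character, code point and Unicode name
mapping = {b: describe(b) for b in CP1251_BYTES}

for b, (ch, code_point, name) in mapping.items():
    print("0x%02X = %s : %s (%s)" % (b, code_point, name, ch))
```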

I compared several collations in the cp1251 and utf8 character sets.
The SQL SELECT looks like this (the cp1251 hex codes of each pair are on the line beneath):
select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
B8 A8, E9 C9, 7A 5A, E5 B8, E8 E9, C8 C9, E5 B8, C5 A8, E8 E9, A8 B8, C9 E9, C1 E1, 4C 6C

Compare case sensitive collations
---------------------------------

mysql> set names cp1251 collate cp1251_general_cs;
mysql> select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
| 'ё' = 'Ё' | 'й' = 'Й' | 'z' = 'Z' | 'е' = 'ё' | 'и' = 'й' | 'И' = 'Й' | 'е' >= 'ё' | 'Е' = 'Ё' | 'и' > 'й' | 'Ё' > 'ё' | 'Й' > 'й' | 'Б' > 'б' | 'L' > 'l' |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
|         0 |         0 |         0 |         0 |         0 |         0 |          0 |         0 |         0 |         0 |         0 |         0 |         0 |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
Correct

mysql> set names utf8 collate utf8_bin;
mysql> select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
| 'ё' = 'Ё' | 'й' = 'Й' | 'z' = 'Z' | 'е' = 'ё' | 'и' = 'й' | 'И' = 'Й' | 'е' >= 'ё' | 'Е' = 'Ё' | 'и' > 'й' | 'Ё' > 'ё' | 'Й' > 'й' | 'Б' > 'б' | 'L' > 'l' |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
|         0 |         0 |         0 |         0 |         0 |         0 |          1 |         0 |         0 |         0 |         0 |         0 |         0 |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
Wrong!

The 1 (true) in column 7 of the second result is an error. It should be 0 (false); this is a sorting issue.
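
To make the expected ordering concrete, here is a minimal Python sketch that compares the encoded bytes of е and ё directly. It assumes that a binary collation orders strings by their encoded byte values, which is only an approximation of what the server does internally; the variable names are illustrative.

```python
small_ie = "\u0435"  # е, cp1251 0xE5
small_io = "\u0451"  # ё, cp1251 0xB8

# Encoded forms of both letters in each character set
utf8_ie = small_ie.encode("utf-8")
utf8_io = small_io.encode("utf-8")
cp1251_ie = small_ie.encode("cp1251")
cp1251_io = small_io.encode("cp1251")

# Byte-wise 'е' >= 'ё' in each encoding
utf8_ge = utf8_ie >= utf8_io
cp1251_ge = cp1251_ie >= cp1251_io

print("utf-8 :", utf8_ie.hex(), ">=", utf8_io.hex(), "->", utf8_ge)
print("cp1251:", cp1251_ie.hex(), ">=", cp1251_io.hex(), "->", cp1251_ge)
```

Compared as UTF-8 bytes, е sorts before ё, which matches the 0 expected in column 7. Compared as raw cp1251 bytes, the order is the opposite.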


Compare case insensitive collations
-----------------------------------

mysql> set names cp1251 collate cp1251_general_ci;
mysql> select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
| 'ё' = 'Ё' | 'й' = 'Й' | 'z' = 'Z' | 'е' = 'ё' | 'и' = 'й' | 'И' = 'Й' | 'е' >= 'ё' | 'Е' = 'Ё' | 'и' > 'й' | 'Ё' > 'ё' | 'Й' > 'й' | 'Б' > 'б' | 'L' > 'l' |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
|         1 |         1 |         1 |         0 |         0 |         0 |          0 |         0 |         0 |         0 |         0 |         0 |         0 |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
Correct

mysql> set names utf8 collate utf8_general_ci;
mysql> select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
| 'ё' = 'Ё' | 'й' = 'Й' | 'z' = 'Z' | 'е' = 'ё' | 'и' = 'й' | 'И' = 'Й' | 'е' >= 'ё' | 'Е' = 'Ё' | 'и' > 'й' | 'Ё' > 'ё' | 'Й' > 'й' | 'Б' > 'б' | 'L' > 'l' |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
|         0 |         0 |         1 |         0 |         0 |         0 |          1 |         0 |         0 |         0 |         0 |         0 |         0 |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
Wrong in columns 1, 2 and 7 (counting from 1). The first 3 columns should be 1 and the rest 0.

mysql> set names utf8 collate utf8_unicode_ci;
mysql> select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
| 'ё' = 'Ё' | 'й' = 'Й' | 'z' = 'Z' | 'е' = 'ё' | 'и' = 'й' | 'И' = 'Й' | 'е' >= 'ё' | 'Е' = 'Ё' | 'и' > 'й' | 'Ё' > 'ё' | 'Й' > 'й' | 'Б' > 'б' | 'L' > 'l' |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
|         1 |         1 |         1 |         1 |         1 |         1 |          1 |         1 |         0 |         0 |         0 |         0 |         0 |
+-----------+-----------+-----------+-----------+-----------+-----------+------------+-----------+-----------+-----------+-----------+-----------+-----------+
Wrong in columns 4, 5, 6, 7 and 8 (counting from 1). The first 3 columns should be 1 and the rest 0.

How to repeat:
set names cp1251 collate cp1251_general_cs;
select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
set names utf8 collate utf8_bin;
select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
set names cp1251 collate cp1251_general_ci;
select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
set names utf8 collate utf8_general_ci;
select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'б', 'L' > 'l';
set names utf8 collate utf8_unicode_ci;
select 'ё' = 'Ё', 'й' = 'Й', 'z' = 'Z', 'е' = 'ё', 'и' = 'й', 'И' = 'Й', 'е' >= 'ё', 'Е' = 'Ё', 'и' > 'й', 'Ё' > 'ё', 'Й' > 'й', 'Б' > 'b', 'L' > 'l';
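
Since the Cyrillic literals are easily damaged by mail and terminal encodings, the SELECT statement can also be rebuilt from the cp1251 hex codes listed earlier in this report. The following Python sketch does that; HEX_CODES, OPERATORS and build_select are illustrative names, and the operator list is copied from the statement itself.

```python
# cp1251 hex codes of the 13 character pairs, as listed in this report
HEX_CODES = "B8 A8 E9 C9 7A 5A E5 B8 E8 E9 C8 C9 E5 B8 C5 A8 E8 E9 A8 B8 C9 E9 C1 E1 4C 6C"

# Comparison operator used for each pair, in order
OPERATORS = ["=", "=", "=", "=", "=", "=", ">=", "=", ">", ">", ">", ">", ">"]


def build_select(hex_codes, operators):
    """Decode the cp1251 bytes and assemble the comparison SELECT."""
    chars = bytes.fromhex(hex_codes).decode("cp1251")
    terms = []
    for i, op in enumerate(operators):
        left, right = chars[2 * i], chars[2 * i + 1]
        terms.append("'%s' %s '%s'" % (left, op, right))
    return "select " + ", ".join(terms) + ";"


statement = build_select(HEX_CODES, OPERATORS)
print(statement)
```

The printed statement is a Unicode string; it still has to reach the server in the encoding announced by the preceding set names.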

-- System Information:
Debian Release: 4.0
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-4-686
Locale: LANG=ru_RU.UTF-8, LC_CTYPE=ru_RU.UTF-8 (charmap=CP1251)


