fleet

empayre/fleet

mirror of https://github.com/empayre/fleet.git synced 2024-11-06 08:55:24 +00:00

Commit Graph

Author	SHA1	Message	Date
Roberto Dip

9896d591c4

ensure duplicates are removed before enforcing collations (#10814)

Related to #10787, this tries to find duplicates in the tables marked as
high likelihood in the issue.

This successfully accounts for unique keys that contain leading/trailing
whitespace and use a collation whose pad attribute is set to `NO PAD`
(which treats whitespace like any other character instead of ignoring
it).
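
As a rough illustration of the whitespace case, here is a minimal Python sketch that groups unique-key tuples differing only by trailing ASCII spaces, which is how a `PAD SPACE` collation would compare them. The function name and the sample rows are hypothetical, and this only approximates the pad behavior; it does not model case or accent folding, and it is not the check implemented in this commit.

```python
from collections import defaultdict


def find_pad_space_collisions(rows):
    """Group key tuples that become equal once trailing spaces are ignored.

    Hypothetical helper: approximates PAD SPACE comparison by stripping
    trailing U+0020 from every column of the key.
    """
    groups = defaultdict(list)
    for row in rows:
        normalized = tuple(col.rstrip(" ") for col in row)
        groups[normalized].append(row)
    # Only groups with more than one member would violate the unique key.
    return {key: members for key, members in groups.items() if len(members) > 1}


# Made-up sample keys: (name, version, vendor)
sample_rows = [
    ("zchunk-libs", "1.2.1", "vendor"),
    ("zchunk-libs", "1.2.1", "vendor "),
    ("curl", "8.0.0", "vendor"),
]
collisions = find_pad_space_collisions(sample_rows)
```

Any group returned here would need its extra rows removed before the collation change could succeed.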

I haven't found a way to successfully detect the same scenario for
special Unicode characters, for example:

```
mysql> SELECT TABLE_NAME, TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'software';
+------------+--------------------+
| TABLE_NAME | TABLE_COLLATION    |
+------------+--------------------+
| software   | utf8mb4_general_ci |
+------------+--------------------+
1 row in set (0.01 sec)

mysql> select vendor COLLATE utf8mb4_unicode_ci from software where name = 'zchunk-libs' GROUP BY vendor COLLATE utf8mb4_unicode_ci;
+-----------------------------------+
| vendor COLLATE utf8mb4_unicode_ci |
+-----------------------------------+
| vendor                            |
| vendor?                           |
+-----------------------------------+
2 rows in set (0.01 sec)

mysql> ALTER TABLE `software` CONVERT TO CHARACTER SET `utf8mb4` COLLATE `utf8mb4_unicode_ci`;
ERROR 1062 (23000): Duplicate entry 'zchunk-libs-1.2.1-rpm_packages--vendor\2007-x86_64' for key 'unq_name'
```
> **Note** that the `?` in "vendor?" is a Unicode character
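
One possible way to at least surface such values outside the database is sketched below in Python. It flags non-ASCII separator and format characters using the standard `unicodedata` module. This is a heuristic only: it does not reproduce MySQL's collation weights, the function name is hypothetical, and it assumes the `\2007` in the error message above stands for U+2007.

```python
import unicodedata

# Unicode general categories for space/line/paragraph separators and
# invisible format characters.
SUSPICIOUS_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def suspicious_chars(value):
    """Return (index, code point, name) for each non-ASCII separator or format character."""
    hits = []
    for index, ch in enumerate(value):
        if ord(ch) > 0x7F and unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


print(suspicious_chars("vendor\u2007"))  # [(6, 'U+2007', 'FIGURE SPACE')]
```

Values flagged this way could then be reviewed by hand, since two keys that differ only by such a character may still collide under the target collation.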

2023-03-29 13:31:24 -03:00

1 Commits