MySQL does not allow indexing the full value of BLOB, TEXT, and very long VARCHAR columns, because the data they contain can be huge; the resulting index would be so large that it would bring little benefit.
MySQL requires that you define the first N characters to be indexed. The trick is to choose an N that is long enough to give good selectivity, but short enough to save space. The prefix should be long enough to make the index nearly as useful as it would be if you’d indexed the whole column.
Before we go further, let us define an important term. Index selectivity is the ratio of the number of distinct indexed values to the total number of rows. Here is an example test table:
+-----+-----------+
| id | value |
+-----+-----------+
| 1 | abc |
| 2 | abd |
| 3 | adg |
+-----+-----------+
If we index only the first character (N=1), the index will look like this:
+---------------+-----------+
| indexedValue | rows |
+---------------+-----------+
| a | 1,2,3 |
+---------------+-----------+
In this case, index selectivity is IS = 1/3 ≈ 0.33.
Let us now see what happens if we increase the number of indexed characters to two (N=2).
+---------------+-----------+
| indexedValue | rows |
+---------------+-----------+
| ab | 1,2 |
| ad | 3 |
+---------------+-----------+
In this scenario IS = 2/3 ≈ 0.67, which means we increased index selectivity, but we have also increased the size of the index. The trick is to find the minimal N that results in maximal index selectivity.
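The same ratios can be computed directly in SQL. The following is a minimal sketch, assuming the three example rows are stored in a table named test (a hypothetical name) with columns id and value:

```sql
-- Hypothetical table holding the three example rows from above.
-- `value` is quoted because VALUE is a (non-reserved) MySQL keyword.
CREATE TABLE test (
    id      INT PRIMARY KEY,
    `value` VARCHAR(16) NOT NULL
);

INSERT INTO test (id, `value`) VALUES (1, 'abc'), (2, 'abd'), (3, 'adg');

-- Selectivity = distinct indexed values / total rows
SELECT
    COUNT(DISTINCT LEFT(`value`, 1)) / COUNT(*) AS sel_n1,   -- 1 distinct prefix over 3 rows
    COUNT(DISTINCT LEFT(`value`, 2)) / COUNT(*) AS sel_n2,   -- 2 distinct prefixes over 3 rows
    COUNT(DISTINCT `value`)          / COUNT(*) AS sel_full  -- 3 distinct values over 3 rows
FROM test;
```

The sel_full column is the upper bound: no prefix can be more selective than the whole column.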
There are two approaches to calculating N for your database table. I will demonstrate one of them on this database dump.
Let’s say we want to index the last_name column of the employees table, and we want to find the smallest N that gives the best index selectivity.
First, let us identify the most frequent last names:
select count(*) as cnt, last_name
from employees
group by employees.last_name
order by cnt desc
limit 0,15;
+-----+-------------+
| cnt | last_name |
+-----+-------------+
| 226 | Baba |
| 223 | Coorg |
| 223 | Gelosh |
| 222 | Farris |
| 222 | Sudbeck |
| 221 | Adachi |
| 220 | Osgood |
| 218 | Neiman |
| 218 | Mandell |
| 218 | Masada |
| 217 | Boudaillier |
| 217 | Wendorf |
| 216 | Pettis |
| 216 | Solares |
| 216 | Mahnke |
+-----+-------------+
15 rows in set (0.64 sec)
As you can see, the last name Baba is the most frequent one. Now we are going to find the most frequently occurring last_name prefixes, beginning with five-letter prefixes.
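The query behind the five-letter output is not shown. Assuming it has the same shape as the N=9 query used later in this post, it would look roughly like this:

```sql
-- Assumed form of the five-letter prefix query; it mirrors the N=9 query
-- shown later, with the prefix length changed from 9 to 5.
select count(*) as cnt, left(last_name,5) as prefix
from employees
group by prefix
order by cnt desc
limit 0,15;
```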
+-----+--------+
| cnt | prefix |
+-----+--------+
| 794 | Schaa |
| 758 | Mande |
| 711 | Schwa |
| 562 | Angel |
| 561 | Gecse |
| 555 | Delgr |
| 550 | Berna |
| 547 | Peter |
| 543 | Cappe |
| 539 | Stran |
| 534 | Canna |
| 485 | Georg |
| 417 | Neima |
| 398 | Petti |
| 398 | Duclo |
+-----+--------+
15 rows in set (0.55 sec)
Every prefix occurs far more often than any full last name, which means we have to increase N until the counts are almost the same as in the previous example.
Here are the results for N=9:
select count(*) as cnt, left(last_name,9) as prefix
from employees
group by prefix
order by cnt desc
limit 0,15;
+-----+-----------+
| cnt | prefix |
+-----+-----------+
| 336 | Schwartzb |
| 226 | Baba |
| 223 | Coorg |
| 223 | Gelosh |
| 222 | Sudbeck |
| 222 | Farris |
| 221 | Adachi |
| 220 | Osgood |
| 218 | Mandell |
| 218 | Neiman |
| 218 | Masada |
| 217 | Wendorf |
| 217 | Boudailli |
| 216 | Cummings |
| 216 | Pettis |
+-----+-----------+
Here are the results for N=10:
+-----+------------+
| cnt | prefix |
+-----+------------+
| 226 | Baba |
| 223 | Coorg |
| 223 | Gelosh |
| 222 | Sudbeck |
| 222 | Farris |
| 221 | Adachi |
| 220 | Osgood |
| 218 | Mandell |
| 218 | Neiman |
| 218 | Masada |
| 217 | Wendorf |
| 217 | Boudaillie |
| 216 | Cummings |
| 216 | Pettis |
| 216 | Solares |
+-----+------------+
15 rows in set (0.56 sec)
These are very good results: the counts are essentially the same as for full last names. This means that we can create an index on the last_name column that covers only its first 10 characters. In the table definition, last_name is defined as VARCHAR(16), so we save up to 6 bytes (or more if the last name contains multi-byte UTF-8 characters) per entry. This table has 1637 distinct values; multiplied by 6 bytes, that is about 9.6 KB, and imagine how this number would grow if the table contained millions of rows.
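As a rough cross-check before creating the index, the selectivity of several prefix lengths can be compared with that of the full column in a single query. This is a sketch under the assumptions of this post (table employees, column last_name); the index name idx_last_name_prefix is made up for illustration:

```sql
-- Compare prefix selectivity with full-column selectivity.
-- The closer a prefix value is to sel_full, the less is lost by truncating.
SELECT
    COUNT(DISTINCT LEFT(last_name, 5))  / COUNT(*) AS sel_5,
    COUNT(DISTINCT LEFT(last_name, 9))  / COUNT(*) AS sel_9,
    COUNT(DISTINCT LEFT(last_name, 10)) / COUNT(*) AS sel_10,
    COUNT(DISTINCT last_name)           / COUNT(*) AS sel_full
FROM employees;

-- Create the prefix index on the first 10 characters.
-- idx_last_name_prefix is a hypothetical index name.
ALTER TABLE employees ADD INDEX idx_last_name_prefix (last_name(10));
```

Keep in mind that a prefix index cannot serve as a covering index for the column, because the index does not store the full value.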
You can read about other ways of calculating N in my post Prefixed indexes in MySQL.
MySQL Error 1170 (42000): BLOB/TEXT Column Used in Key Specification Without a Key Length
ERROR 1170 (42000): BLOB/TEXT column ‘field_name’ used in key specification without a key length
The error happens because MySQL can index only the first N characters of a BLOB or TEXT column. It mainly occurs when you try to use a column of type TEXT or BLOB, or one of their variants (TINYBLOB, MEDIUMBLOB, LONGBLOB, TINYTEXT, MEDIUMTEXT, and LONGTEXT), as a primary key or in an index. Without a length value, MySQL cannot determine how much of the variable-size column to put in the key, and so cannot guarantee its uniqueness. So, when a BLOB or TEXT column is indexed, the value of N must be supplied in the key specification so that MySQL can determine the key length. Note that MySQL does not enforce a length limit on the TEXT or BLOB column itself, so declaring the column as TEXT(88) will not fix the error.
The error will also appear when you try to convert a column from a non-TEXT, non-BLOB type such as VARCHAR or ENUM into a TEXT or BLOB type while the column is already part of a unique constraint or index. The ALTER TABLE statement will fail.
The solution is to remove the TEXT or BLOB column from the index or unique constraint, or to set another field as the primary key. If you can’t do that and want to place a limit on the TEXT or BLOB column, try using the VARCHAR type with a length limit. Before MySQL 5.0.3, VARCHAR was limited to a maximum of 255 characters; later versions allow up to 65,535 bytes, subject to the maximum row size and the character set. The limit must be specified explicitly in parentheses right after the type, i.e. VARCHAR(200) limits the column to 200 characters.
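The VARCHAR route could look like the sketch below. The table name pages and the column name slug are hypothetical, and the length of 200 is simply the figure used above; it assumes no existing value is longer than 200 characters and that 200 characters fit within the storage engine's key length limit for the column's character set.

```sql
-- Hypothetical table: pages, with a TEXT column slug that must be unique.
-- This would fail with ERROR 1170 because no key length is given:
--   ALTER TABLE pages ADD UNIQUE KEY uq_slug (slug);

-- Check the longest existing value before shrinking the column.
SELECT MAX(CHAR_LENGTH(slug)) AS longest_slug FROM pages;

-- Convert the column to a bounded VARCHAR, then index the whole column.
ALTER TABLE pages MODIFY slug VARCHAR(200) NOT NULL;
ALTER TABLE pages ADD UNIQUE KEY uq_slug (slug);
```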
Sometimes Error 1170 appears even though you don’t use a TEXT or BLOB type in your table. This can happen when you specify a VARCHAR column as the primary key but set its length too high. In MySQL versions before 5.0.3, VARCHAR accepts at most 255 characters, so a declaration such as VARCHAR(512) makes MySQL silently convert the column to a TEXT type, which then fails with error 1170 on key length if the column is used as a primary key or in a unique or non-unique index. To solve this problem on those versions, specify a size of 255 or less for the VARCHAR field.
About the Author: LK
Scenario:
A table has a primary key field with the varchar(255) data type. Sometimes 255 characters are not enough, so the field type is changed to text. During the change, the error below occurs:
BLOB/TEXT column ‘column_id’ used in key specification without a key length
Error:
MySQL error: key specification without a key length
Reason for this issue:
1. MySQL can index only the first N characters of a BLOB or TEXT column.
2. The error occurs when a primary key or indexed column has one of the following types:
- TEXT or BLOB
- TINYBLOB
- MEDIUMBLOB
- LONGBLOB
- TINYTEXT
- MEDIUMTEXT
- LONGTEXT
3. A BLOB or TEXT type carries no length specification, so without a key length MySQL cannot guarantee the uniqueness of a column of such dynamic size.
Fix 1:
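Use an integer AUTO_INCREMENT surrogate key column as the primary key, and put a UNIQUE constraint on the text column if uniqueness is still needed. For a TEXT column that constraint still requires a prefix length (see Fix 4).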
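A minimal sketch of this layout follows. The table name documents, the column names, and the prefix length of 191 are all assumptions chosen for illustration; 191 characters stays under 767 bytes even with a 4-byte character set such as utf8mb4.

```sql
-- Hypothetical table using a surrogate primary key instead of the text column.
CREATE TABLE documents (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    body TEXT NOT NULL,
    PRIMARY KEY (id),
    -- A key on a TEXT column still needs a prefix length.
    -- Uniqueness is enforced on the first 191 characters only.
    UNIQUE KEY uq_body_prefix (body(191))
);
```

Because the unique key covers only the prefix, two rows whose text differs after the 191st character would be rejected as duplicates.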
Fix 2:
1. Create the TEXT field without the unique constraint.
2. Add a sibling VARCHAR field that is unique and contains an MD5 or SHA1 hash of the TEXT field.
3. Calculate and store the hash whenever the TEXT field is written, and check uniqueness against the hash field.
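The three steps above can be sketched as follows. The table and column names are hypothetical, and CHAR(40) is used in place of VARCHAR because a SHA1 hex digest always has exactly 40 characters.

```sql
-- Hypothetical table: the TEXT column carries no key; its hash does.
CREATE TABLE documents_hashed (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    body      TEXT NOT NULL,
    body_hash CHAR(40) NOT NULL,  -- SHA1 hex digest of body
    PRIMARY KEY (id),
    UNIQUE KEY uq_body_hash (body_hash)
);

-- Store the hash together with the text; a duplicate body violates uq_body_hash.
INSERT INTO documents_hashed (body, body_hash)
VALUES ('some long text', SHA1('some long text'));

-- Look up by hash (uses the index), then compare the text itself
-- to rule out a hash collision.
SELECT id
FROM documents_hashed
WHERE body_hash = SHA1('some long text')
  AND body = 'some long text';
```

The application (or a trigger) has to keep body_hash in sync whenever body changes.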
Fix 3:
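To index a TEXT field for text searching, use a FULLTEXT index; FULLTEXT is an index type, not a data type. Older MySQL versions support FULLTEXT indexes only with the MyISAM storage engine; InnoDB supports them since MySQL 5.6. A FULLTEXT index does not enforce uniqueness.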
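A FULLTEXT index needs no prefix length. The sketch below uses a hypothetical articles table and the MyISAM engine named in this fix:

```sql
-- Hypothetical table with a FULLTEXT index on a TEXT column.
CREATE TABLE articles (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    content TEXT NOT NULL,
    PRIMARY KEY (id),
    FULLTEXT KEY ft_content (content)  -- no key length required
) ENGINE=MyISAM;

-- FULLTEXT indexes are queried with MATCH ... AGAINST, not with = or LIKE.
SELECT id
FROM articles
WHERE MATCH(content) AGAINST('prefix index');
```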
Fix 4:
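The solution is to specify the index length.
ALTER TABLE [table_name] ADD INDEX (content(255));
ALTER TABLE wikitechy_table ADD UNIQUE (emp_name(767), emp_address(767));
NOTE: 767 bytes is the maximum index prefix length for InnoDB tables that use the REDUNDANT or COMPACT row format; the limit is 3072 bytes for the DYNAMIC or COMPRESSED row format and 1000 bytes for MyISAM. Prefix lengths in an index definition are counted in characters for TEXT columns and in bytes for BLOB columns, so with a multi-byte character set fewer than 767 characters fit.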
Fixes are applicable to the following versions of MySQL:
- MySQL 3.23
- MySQL 4.0
- MySQL 4.1
- MySQL 5.0
- MySQL 5.7
When I try to load a dump, I receive an error:
ERROR 1170 (42000) at line 225099: BLOB/TEXT column 'query' used in key specification without a key length
Line 225099:
create table toc_piwik_log_profiling (
query text not null,
count int(10) unsigned,
sum_time_ms float,
UNIQUE query (query)
);
What’s causing the issue, and how do I fix it?
asked Dec 4, 2013 at 11:28
You cannot define an index on TEXT / BLOB columns without specifying the index length:
BLOB and TEXT columns also can be indexed, but a prefix length must be given.
Logically, that is because these data types can hold huge values and, therefore, an index cannot be built over the whole field.
answered Dec 4, 2013 at 11:32
Alma Do
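Applied to the table from the question, giving the unique key a prefix length would look like the sketch below. The length of 255 is an assumption, not a value taken from the dump, and it has to fit the storage engine's key length limit for the table's character set.

```sql
-- Same table as in the question, with a prefix length on the unique key.
-- 255 is an assumed length; uniqueness is then enforced on the first
-- 255 characters of query only.
create table toc_piwik_log_profiling (
  query text not null,
  count int(10) unsigned,
  sum_time_ms float,
  UNIQUE query (query(255))
);
```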
The solution for me was changing the column type from text to varchar(64).
answered Jul 2, 2021 at 15:43
Eternal21