MySQL Full-Text Search

Search is one of the most important features for the success of any business.

MySQL keeps enhancing its search feature and adding new capabilities.

Here are some updates that should add real value to your knowledge base.

Introduction to MySQL Full-Text Search

This section gives you a brief introduction to MySQL full-text search and its features.
MySQL supports text searching by using the LIKE operator and regular expressions. However, when the text column is large and the number of rows in a table grows, these methods have limitations:

Performance: MySQL has to scan the whole table to find the text that matches the pattern in the LIKE operator or in the regular expression.
Flexible search: with the LIKE operator and regular expression search, it is difficult to write a flexible search query, e.g., to find products whose description contains car but not classic.
Relevance ranking: there is no way to specify which rows in the result set are more relevant.

Because of those limitations, MySQL added a feature called full-text search. Technically, MySQL indexes the words of the columns enabled for full-text search and performs searches efficiently based on this index. MySQL uses a sophisticated algorithm to determine the rows that best match the search query.

The following are some important features of MySQL full-text search:

Native SQL-like interface: you use SQL-like statements to run full-text searches.
Fully dynamic index: MySQL automatically updates the index of the text column whenever the data changes. You don’t need to rebuild the index periodically.
Moderate index size: it does not take much memory to store the index.
Last but not least, it is fast even for complex search queries.

Notice that not all storage engines support full-text searching. In MySQL version 5.6 or later, only MyISAM and InnoDB storage engines support full-text search.

1. Defining FULLTEXT Indexes for MySQL Full-Text Searching

Before performing a full-text search in a column of a table, you must index its data. MySQL re-indexes the data whenever the data of the column changes. In MySQL, the full-text index is a kind of index named FULLTEXT.

MySQL indexes and re-indexes data in full-text search enabled columns automatically. MySQL version 5.6 or later allows you to define a full-text index for a column with the CHAR, VARCHAR or TEXT data type in MyISAM or InnoDB tables. Notice that MySQL has supported full-text indexes in InnoDB tables since version 5.6.

MySQL allows you to define a FULLTEXT index by using the CREATE TABLE statement when you create the table, or the ALTER TABLE or CREATE INDEX statement for existing tables.

Defining FULLTEXT index using CREATE TABLE statement

Typically, you define the FULLTEXT index for a column when you create a new table using the CREATE TABLE statement as follows:

CREATE TABLE table_name(
column1 data_type,
column2 data_type,
column3 data_type,

PRIMARY KEY(key_column),
FULLTEXT (column1,column2,..)
);

You put a comma-separated list of the columns that you want to include in the full-text index in parentheses after the FULLTEXT keyword.

The following statement creates a new table named posts that has FULLTEXT index for the post_content column.

CREATE TABLE posts (
id int(4) NOT NULL AUTO_INCREMENT,
title varchar(255) NOT NULL,
post_content text,
PRIMARY KEY (id),
FULLTEXT KEY post_content (post_content)
);

Defining FULLTEXT index for existing tables

In case you already have existing tables and want to define full-text indexes, you can use the ALTER TABLE statement or the CREATE INDEX statement.

Defining FULLTEXT index using ALTER TABLE statement

The following syntax defines a FULLTEXT index using the ALTER TABLE statement:

ALTER TABLE  table_name
ADD FULLTEXT(column_name1, column_name2,…)

You put the table_name after ALTER TABLE, followed by the ADD FULLTEXT clause that defines a FULLTEXT index for one or more columns.

For example, we can define a FULLTEXT index for the productDescription and productLine columns in the products table in the sample database as follows:

ALTER TABLE products
ADD FULLTEXT(productDescription,productLine)

Defining FULLTEXT index using CREATE INDEX statement

You can also use CREATE INDEX statement to create FULLTEXT index for existing tables. See the following syntax:

CREATE FULLTEXT INDEX index_name
ON table_name(idx_column_name,…)

The following statement creates a FULLTEXT index for the addressLine1 and addressLine2 columns of the offices table.

CREATE FULLTEXT INDEX address
ON offices(addressLine1,addressLine2)

Notice that for a table that has many rows, it is faster to load the data into a table that has no FULLTEXT index first and then create the FULLTEXT index than to load a large amount of data into a table that already has a FULLTEXT index.

Removing full-text search columns

To remove a FULLTEXT index, you just delete the index using the ALTER TABLE … DROP INDEX statement. For example, the following statement removes the address FULLTEXT index in the offices table:

ALTER TABLE offices
DROP INDEX address

In this tutorial, we have shown you how to define and remove FULLTEXT indexes that support full-text search in MySQL.

2. MySQL Natural Language Search Feature

Introduction to MySQL natural language search

In natural language searches, MySQL looks for rows or documents that are relevant to the free-text natural human language query, for example “How to use MySQL full-text search”.

Relevance is a non-negative floating-point number. When the relevance is zero, it means that there is no similarity. MySQL computes the relevance based on various factors including the number of words in the record, the number of unique words in the record, the total number of words in the collection, and the number of documents (rows) that contain a particular word.

To perform natural language searches, you use the MATCH() and AGAINST() functions. The MATCH() function specifies the columns that you want to search, and the AGAINST() function specifies the search expression to be used.

MySQL natural language search example

We will use the products table in the sample database for the demonstration.

First, we need to enable full-text search in the productLine column of the products table using the ALTER TABLE ADD FULLTEXT statement:
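A minimal sketch of that statement, assuming the products table from the sample database already exists and has no FULLTEXT index on productLine yet:

```sql
-- Assumption: products exists and productLine is a CHAR, VARCHAR, or TEXT column.
-- MySQL picks an index name automatically when none is given.
ALTER TABLE products
ADD FULLTEXT(productLine);
```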

Second, we can search for products whose product lines contain the word Classic. We use the MATCH() and AGAINST() functions as in the following query:
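Such a query could look like the sketch below. The selected columns are an illustrative choice, and the query assumes a FULLTEXT index already exists on productLine alone:

```sql
-- Natural language search (the default mode) for the word Classic.
SELECT productName, productLine
FROM products
WHERE MATCH(productLine) AGAINST('Classic');
```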

[Figure: products with classic product line]

To search for products whose product line contains the Classic or Vintage keyword, you can perform the following query:
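A sketch of such a query, under the same assumption of a FULLTEXT index on productLine. In natural language mode, a row matches if it contains any of the listed words:

```sql
-- Rows whose productLine contains Classic, Vintage, or both.
SELECT productName, productLine
FROM products
WHERE MATCH(productLine) AGAINST('Classic Vintage');
```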

[Figure: products with classic and vintage product line]

The AGAINST() function uses the IN NATURAL LANGUAGE MODE search modifier by default, so you can omit it in the query. There are other search modifiers, e.g., IN BOOLEAN MODE for Boolean text searches.

You can explicitly use the IN NATURAL LANGUAGE MODE search modifier in your query as follows:
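A sketch of the explicit form, again assuming the products table and a FULLTEXT index on productLine:

```sql
-- Same search as the default mode, with the modifier spelled out.
SELECT productName, productLine
FROM products
WHERE MATCH(productLine)
      AGAINST('Classic Vintage' IN NATURAL LANGUAGE MODE);
```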

By default, MySQL performs searches in the case-insensitive fashion. However, you can instruct MySQL to perform case-sensitive searches using binary collation for indexed columns.

Sort the result set by relevance

A very important feature of full-text search is how MySQL ranks the rows in the result set based on their relevance. When the MATCH() function is used in the WHERE clause, MySQL returns the rows that are more relevant first.

The following example shows you how MySQL sorts the result set by the relevance.

First, we enable the full-text search feature for the productName column of the products table.
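One way to do this, assuming productName does not have a FULLTEXT index yet:

```sql
-- Adds a FULLTEXT index on productName; skip this if the index already exists.
ALTER TABLE products
ADD FULLTEXT(productName);
```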

Second, we search for products whose names contain Ford and/or 1932 using the following query:
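A sketch of that search, assuming the FULLTEXT index on productName is in place. The selected columns are an illustrative choice:

```sql
-- With MATCH() in the WHERE clause and no ORDER BY,
-- natural language mode returns the most relevant rows first.
SELECT productName, productLine
FROM products
WHERE MATCH(productName) AGAINST('1932 Ford');
```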

Let’s examine the output:

[Figure: MySQL full-text search sorted by relevance]

The products whose names contain both 1932 and Ford are returned first, followed by the products whose names contain only the Ford keyword.

There are some important points you should remember when using full-text search:

  • The minimum length of a keyword that the full-text search engine indexes depends on the storage engine: by default it is 4 characters for MyISAM (ft_min_word_len) and 3 characters for InnoDB (innodb_ft_min_token_size). If you search for a shorter keyword, e.g., car or cat on a MyISAM table with the default setting, you will not get any results.
  • Stop words are ignored. For MyISAM, MySQL defines the list of stop words in the MySQL source code distribution file storage/myisam/ft_static.c
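The limits in the two points above depend on the storage engine and on server configuration. The following sketch, assuming MySQL 5.6 or later, shows one way to inspect the values in effect on your server:

```sql
-- Minimum indexed word length for MyISAM FULLTEXT indexes (default 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Minimum token size for InnoDB FULLTEXT indexes (default 3).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Default stop word list used by InnoDB FULLTEXT indexes.
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;
```

Changing either length variable requires a server restart and a rebuild of the affected FULLTEXT indexes before it takes effect.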

In this tutorial, we have shown you how to use the MATCH() and AGAINST() functions to perform natural language searches in MySQL.

3. MySQL Boolean Full-Text Searches Feature

This section introduces MySQL Boolean full-text search and its main features. We will give you some examples to help you understand the concept better.

Summary: in this tutorial, you will learn how to perform Boolean full-text searches in MySQL. In addition, you will learn how to use Boolean operators to form very complex search queries.

Besides the natural language search, MySQL supports an additional form of full-text search called Boolean full-text search. In Boolean mode, MySQL searches for words rather than for concepts, as it does in the natural language search.

MySQL allows you to perform search based on very complex queries in the Boolean mode along with Boolean operators. This is why the full-text search in Boolean mode is suitable for experienced users.

To perform a full-text search in Boolean mode, you use the IN BOOLEAN MODE modifier in the AGAINST expression. The following example shows you how to search for products whose names contain the word Truck.
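A minimal sketch of such a query, assuming the products table from the sample database with a FULLTEXT index on productName. The selected columns are an illustrative choice:

```sql
-- Boolean mode search for rows whose productName contains the word Truck.
SELECT productName, productLine
FROM products
WHERE MATCH(productName) AGAINST('Truck' IN BOOLEAN MODE);
```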

[Figure: MySQL Boolean text search, product names with the keyword Truck]

Two products whose product names contain the word Truck are returned.

To find the products whose names contain the word Truck but not the word Pickup, you can use the exclude Boolean operator (-), which removes rows that contain the Pickup keyword from the result, as in the following query:
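A sketch of that query, under the same assumption of a FULLTEXT index on productName:

```sql
-- Truck must be matched; any row that contains Pickup is excluded.
SELECT productName, productLine
FROM products
WHERE MATCH(productName) AGAINST('Truck -Pickup' IN BOOLEAN MODE);
```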

[Figure: MySQL Boolean text search with the exclude operator]

MySQL boolean full-text search operators

The following table illustrates the full-text search Boolean operators and their meanings:

Operator    Description
+           Include, word must be present.
-           Exclude, word must not be present.
>           Include, and increase ranking value.
<           Include, and decrease ranking value.
()          Group words into subexpressions (allowing them to be included, excluded, ranked, and so forth as a group).
~           Negate a word’s ranking value.
*           Wildcard at end of word.
""          Defines a phrase (as opposed to a list of individual words, the entire phrase is matched for inclusion or exclusion).

The following examples illustrate how to use boolean full-text operators in search query:

To search for rows that contain at least one of the two words mysql or tutorial:

'mysql tutorial'

To search for rows that contain both words, mysql and tutorial:

'+mysql +tutorial'

To search for rows that contain the word "mysql", and rank rows that also contain "tutorial" higher:

'+mysql tutorial'

To search for rows that contain the word "mysql" but not "tutorial":

'+mysql -tutorial'

To search for rows that contain the word "mysql", and rank a row lower if it also contains the word "tutorial":

'+mysql ~tutorial'

To search for rows that contain the words "mysql" and "tutorial", or "mysql" and "training", in whatever order, but rank the rows that contain "mysql tutorial" higher than those that contain "mysql training":

'+mysql +(>tutorial <training)'

To find rows that contain words starting with "my", such as "mysql", "myyahoo", etc., you use the following:

'my*'

MySQL boolean full-text search main features

  • MySQL does not automatically sort rows in order of decreasing relevance in Boolean full-text search.
  • To perform Boolean queries, InnoDB tables require all columns of the MATCH expression to have a FULLTEXT index. Notice that MyISAM tables do not require this, although the search is quite slow.
  • MySQL does not support multiple Boolean operators on a search word on InnoDB tables, e.g., '++mysql'. MySQL will return an error if you do so. However, MyISAM behaves differently. It ignores the other operators and uses the operator that is closest to the search word, for example, '+-mysql' will become '-mysql'.
  • InnoDB full-text search does not support a trailing plus (+) or minus (-) sign. It only supports a leading plus or minus sign. MySQL will report an error if your search word is 'mysql+' or 'mysql-'. In addition, the following combinations of a leading plus with a wildcard or minus are invalid: +*, +-
  • The 50% threshold is not applied. By the way, the 50% threshold means that if a word appears in more than 50% of the rows, MySQL will ignore it in the search result.
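Because Boolean mode does not sort by relevance on its own, you can request the ordering explicitly. The sketch below assumes the products table with a FULLTEXT index on productName. The search string and the score alias are illustrative:

```sql
-- MATCH() in the SELECT list returns the relevance value for each row,
-- which ORDER BY then uses to put the best matches first.
SELECT productName,
       MATCH(productName) AGAINST('+Truck Pickup' IN BOOLEAN MODE) AS score
FROM products
WHERE MATCH(productName) AGAINST('+Truck Pickup' IN BOOLEAN MODE)
ORDER BY score DESC;
```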

In this tutorial, we have shown you how to perform full-text searches with Boolean mode in MySQL.

4. Using MySQL Query Expansion Feature

This section shows you an important feature of MySQL full-text search known as query expansion.

Summary: in this tutorial, you will learn about MySQL query expansion to widen the search results based on automatic relevance feedback.

Introduction to MySQL Query Expansion

In some cases, users want to search for information based on the knowledge that they have. Users use their knowledge to define keywords to search for information, and typically those keywords are too short. To help users to find what they want based on the too-short keywords, MySQL full-text search engine introduces a concept called query expansion.

The query expansion is used to widen the search result of the full-text searches based on automatic relevance feedback (or blind query expansion). Technically, MySQL full-text search engine performs the following steps when the query expansion is used:

  • First, MySQL full-text search engine looks for all rows that match the search query.
  • Second, it checks all rows in the search result and finds the relevant words.
  • Third, it performs a search again but based on the relevant words instead of the original keywords provided by the users.

From an application perspective, you can use query expansion when the search results are too few. You perform the search again with query expansion to offer users more information that is related and relevant to what they are looking for.

To use the query expansion, you use the WITH QUERY EXPANSION search modifier in the AGAINST() function. The following illustrates the syntax of the query using the WITH QUERY EXPANSION search modifier.
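A sketch of the general form, where table1, column1, column2, and keyword are placeholders rather than names from the sample database. It assumes one FULLTEXT index that covers exactly the columns listed in MATCH():

```sql
-- Placeholder names: replace table1, column1, column2, and keyword with your own.
SELECT column1, column2
FROM table1
WHERE MATCH(column1, column2)
      AGAINST('keyword' WITH QUERY EXPANSION);
```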

MySQL Query Expansion Example

Let’s look at an example of query expansion to see how it works.

We will use the productName column of the products table to demonstrate the query expansion feature. First, we enable the full-text search index for this column.
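One way to create that index, assuming the products table exists and productName has no FULLTEXT index yet:

```sql
-- Skip this step if productName already has a FULLTEXT index.
ALTER TABLE products
ADD FULLTEXT(productName);
```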

Second, we search for products whose names contain the term 1992 without using query expansion.
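A sketch of the plain search, assuming the FULLTEXT index on productName is in place:

```sql
-- Natural language search without query expansion.
SELECT productName
FROM products
WHERE MATCH(productName) AGAINST('1992');
```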

As you see, the search result has 2 products whose product names contain the term 1992.

Third, we can widen the search result by using query expansion as the following statement:
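A sketch of the expanded search, which differs from the plain search only in the WITH QUERY EXPANSION modifier:

```sql
-- MySQL searches for 1992 first, then searches again using words
-- taken from the most relevant rows of that first result.
SELECT productName
FROM products
WHERE MATCH(productName) AGAINST('1992' WITH QUERY EXPANSION);
```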

We got more rows in the search result when we used query expansion. The first two rows are the most relevant, and the other rows come from relevant keywords found in the first two rows, e.g., Ferrari.

Notice that blind query expansion tends to increase noise significantly by returning non-relevant results. It is highly recommended that you use query expansion only when the searched keyword is short.

In this tutorial, we have introduced you to MySQL query expansion to widen the search results when the keywords provided by users are short.
