Article

Table of Contents
Theme:
Was this article helpful?
Try Vultr Today with

$50 Free on Us!

Want to contribute?

You could earn up to $300 by adding new articles!

How to Use MySQL Regular Expressions

Author: Francis Ndungu

Last Updated: Thu, Jul 15, 2021
MySQL and MariaDB Programming

Introduction

MySQL provides a powerful way to perform complex text-based searches through the regular expressions function REGEXP. This string matching algorithm expresses a search through a sequence of symbols and characters to find a pattern in a text.

You can use regular expressions to match, locate, and manage text-based records in a MySQL database, saving a lot of time when working with complex searches in a huge database.

Apart from searching, you can also use REGEXP with a combination of other MySQL functions to replace and rearrange fields in a database table. You can also extract substrings from long text, split a string into human-readable tokens and verify/validate the structure of a string to check if it meets your set format.

This guide illustrates the power of the MySQL REGEXP function using examples. After completing it, you'll be able to implement different combinations of regular expressions in any MySQL database to achieve your desired results.

Prerequisites

To complete this tutorial, you need the following:

1. Create a Test Database

SSH to your server and log in to MySQL as root.

$ sudo mysql -u root -p

Key in your MySQL root password and press ENTER to continue. Then, once you get the mysql> prompt, type the command below to create a test database.

mysql> CREATE DATABASE regex_db;

Output.

Query OK, 1 row affected (0.01 sec)

Switch to the database.

mysql> USE regex_db;

Output.

Database changed

Next, create a customers table.

mysql> CREATE table customers (
         customer_id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
         first_name VARCHAR(50),
         last_name  VARCHAR(50),
         email      VARCHAR(255)
       ) ENGINE = InnoDB;

Output.

Query OK, 0 rows affected (0.04 sec)

Insert some records into the customers table.

mysql> INSERT INTO customers (first_name, last_name, email) VALUES ('JANE', 'SMITH', 'jane@example.com');
       INSERT INTO customers (first_name, last_name, email) VALUES ('MARY', 'ROE', '-');
       INSERT INTO customers (first_name, last_name, email) VALUES ('JOHN', 'DOE', 'john_doe@example.com');
       INSERT INTO customers (first_name, last_name, email) VALUES ('MARY', 'FREDRICK', '');
       INSERT INTO customers (first_name, last_name, email) VALUES ('BANNIE', 'ISAAC', 'bannie_isaac@example.com');

Output.

...
Query OK, 1 row affected (0.01 sec)

Execute the SELECT statement below to verify the records.

mysql> SELECT
       customer_id,
       first_name,
       last_name,
       email
       FROM customers;

You should see the following list of customers.

+-------------+------------+-----------+--------------------------+
| customer_id | first_name | last_name | email                    |
+-------------+------------+-----------+--------------------------+
|           1 | JANE       | SMITH     | jane@example.com         |
|           2 | MARY       | ROE       | -                        |
|           3 | JOHN       | DOE       | john_doe@example.com     |
|           4 | MARY       | FREDRICK  |                          |
|           5 | BANNIE     | ISAAC     | bannie_isaac@example.com |
+-------------+------------+-----------+--------------------------+
5 rows in set (0.00 sec)

With the sample database and table in place, you'll now test the different regular expressions supported by MySQL.

2. Match the Beginning of a String

To match a record beginning with a certain pattern, use the Caret character ^. For instance, to match all first_name's that start with the letter J, execute the command below.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE first_name REGEXP '^J';

You should see a list of customers matching your pattern, as shown below.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           1 | JANE       | SMITH     |
|           3 | JOHN       | DOE       |
+-------------+------------+-----------+
2 rows in set (0.00 sec)

While this is for demonstration purposes, you can use the above example in a real-life scenario. For instance, if you're in the manufacturing industry, you can use this approach to select all batch numbers that begin with certain characters from a database to recall defective items.

In the next step, you'll see how you can match the end of a string.

3. Match End of a String

Use the Dollar $ character to match the position of a string right after the last character. For instance, to match all last_name's ending with the letters OE, run the statement below.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE last_name REGEXP 'OE$';

Only MARY ROE's AND JOHN DOE's names end with the letters OE and they should be listed as follows.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           2 | MARY       | ROE       |
|           3 | JOHN       | DOE       |
+-------------+------------+-----------+
2 rows in set (0.00 sec)

You can use this approach, for instance, to retrieve all customers' records from a specific zip code if their address information is stored in a single column.

To test this, add address and zip columns to the customers table.

mysql> ALTER TABLE customers
       ADD address VARCHAR(255)
       AFTER email ,
       ADD zip VARCHAR(20)
       AFTER address;

Output.

Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

Next, update the customer's address information.

mysql> UPDATE customers SET address ='123 FIRST RD. TOWN 1 AZ 8659' WHERE customer_id = '1';
       UPDATE customers SET address ='456 2ND RD. TOWN 2 AZ 12345' WHERE customer_id = '2';
       UPDATE customers SET address ='789 3RD RD. TOWN 3 AZ 8659' WHERE customer_id = '3';
       UPDATE customers SET address ='555 FIRST RD. TOWN 1 AZ 8659' WHERE customer_id = '4';
       UPDATE customers SET address ='987 FIFTH RD TOWN 2 1414' WHERE customer_id = '5';

Output.

...
Rows matched: 1  Changed: 1  Warnings: 0

Make sure the table is updated.

mysql> SELECT * FROM customers;

Output

+-------------+------------+-----------+--------------------------+------------------------------+------+
| customer_id | first_name | last_name | email                    | address                      | zip  |
+-------------+------------+-----------+--------------------------+------------------------------+------+
|           1 | JANE       | SMITH     | jane@example.com         | 123 FIRST RD. TOWN 1 AZ 8659 | NULL |
|           2 | MARY       | ROE       | -                        | 456 2ND RD. TOWN 2 AZ 12345  | NULL |
|           3 | JOHN       | DOE       | john_doe@example.com     | 789 3RD RD. TOWN 3 AZ 8659   | NULL |
|           4 | MARY       | FREDRICK  |                          | 555 FIRST RD. TOWN 1 AZ 8659 | NULL |
|           5 | BANNIE     | ISAAC     | bannie_isaac@example.com | 987 FIFTH RD TOWN 2 1414     | NULL |
+-------------+------------+-----------+--------------------------+------------------------------+------+
5 rows in set (0.00 sec)

As you can see, the address details, including the street, town, state, and zip, are held in one column. For example, to get all customers from the zip code 8659, execute the command below.

mysql> SELECT
       customer_id,
       first_name,
       last_name,
       address,
       zip
       FROM customers
       WHERE address REGEXP '8659$';

Output.

+-------------+------------+-----------+------------------------------+------+
| customer_id | first_name | last_name | address                      | zip  |
+-------------+------------+-----------+------------------------------+------+
|           1 | JANE       | SMITH     | 123 FIRST RD. TOWN 1 AZ 8659 | NULL |
|           3 | JOHN       | DOE       | 789 3RD RD. TOWN 3 AZ 8659   | NULL |
|           4 | MARY       | FREDRICK  | 555 FIRST RD. TOWN 1 AZ 8659 | NULL |
+-------------+------------+-----------+------------------------------+------+
3 rows in set (0.00 sec)

You can also use MySQL REGEXP results in an UPDATE statement to edit records. For instance, in this case, you can now update the customer's zip code column to the correct value using the statement below.

mysql> UPDATE customers
       SET zip = '8659'
       WHERE address REGEXP '8659$';

Output.

Rows matched: 3  Changed: 3  Warnings: 0

Confirm the new changes.

mysql> SELECT
       *
       FROM customers;

Output

+-------------+------------+-----------+--------------------------+------------------------------+------+
| customer_id | first_name | last_name | email                    | address                      | zip  |
+-------------+------------+-----------+--------------------------+------------------------------+------+
|           1 | JANE       | SMITH     | jane@example.com         | 123 FIRST RD. TOWN 1 AZ 8659 | 8659 |
|           2 | MARY       | ROE       | -                        | 456 2ND RD. TOWN 2 AZ 12345  | NULL |
|           3 | JOHN       | DOE       | john_doe@example.com     | 789 3RD RD. TOWN 3 AZ 8659   | 8659 |
|           4 | MARY       | FREDRICK  |                          | 555 FIRST RD. TOWN 1 AZ 8659 | 8659 |
|           5 | BANNIE     | ISAAC     | bannie_isaac@example.com | 987 FIFTH RD TOWN 2 1414     | NULL |
+-------------+------------+-----------+--------------------------+------------------------------+------+
5 rows in set (0.00 sec)

As you can see, the MySQL REGEXP function is essential when it comes to finding and rearranging text-based column values. In the next step, you'll search for values that match characters enclosed with square brackets.

4. Match Characters Between Square Brackets

You can match any character listed between the square brackets in MySQL using the regular expression. For instance, to list all first_name's containing the character H or Y, use the syntax below.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE first_name REGEXP '[HY]';

Output.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           2 | MARY       | ROE       |
|           3 | JOHN       | DOE       |
|           4 | MARY       | FREDRICK  |
+-------------+------------+-----------+
3 rows in set (0.00 sec)

Similarly, use the command below to locate all last_name's that contain either the characters R or D followed by OE.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE last_name REGEXP '[RD]OE';

Output.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           2 | MARY       | ROE       |
|           3 | JOHN       | DOE       |
+-------------+------------+-----------+
2 rows in set (0.00 sec)

The two searches above are helpful when you're not sure about the exact name you're looking for in a database, but you at least happen to know some of the characters it contains. Next, you'll search a range of characters.

5. Match a Range of Characters

You can use a dash between a range of characters to locate a string using REGEXP. For instance, to retrieve all first_name's containing letters a-d, use the syntax below.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE first_name REGEXP '[a-d]';

Output.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           1 | JANE       | SMITH     |
|           2 | MARY       | ROE       |
|           4 | MARY       | FREDRICK  |
|           5 | BANNIE     | ISAAC     |
+-------------+------------+-----------+
4 rows in set (0.00 sec)

To match any zip codes that contain numbers 0-9, run the SQL command below.

mysql> SELECT
       customer_id,
       first_name,
       last_name,
       zip
       FROM customers
       WHERE zip REGEXP '[0-9]';

Output.

+-------------+------------+-----------+------+
| customer_id | first_name | last_name | zip  |
+-------------+------------+-----------+------+
|           1 | JANE       | SMITH     | 8659 |
|           3 | JOHN       | DOE       | 8659 |
|           4 | MARY       | FREDRICK  | 8659 |
+-------------+------------+-----------+------+
3 rows in set (0.00 sec)

In a production environment, you can use the above approaches to validate records. For instance, if you'd like to send text messages to your customers, you can filter records that only contain numeric phone numbers. Next, you'll match substrings.

6. Match a Substring

You can match a substring at any position in MySQL using regular expressions. For instance, find all names that contain the substring AN in that order by running the query below.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE first_name REGEXP 'AN';

Output.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           1 | JANE       | SMITH     |
|           5 | BANNIE     | ISAAC     |
+-------------+------------+-----------+
2 rows in set (0.00 sec)

Next, find a substring at a specific position by using the syntax below.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE first_name REGEXP '^..NE';

Output.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           1 | JANE       | SMITH     |
+-------------+------------+-----------+
1 row in set (0.00 sec)

In most cases, you'll use substring to locate and replace erroneously entered data in MySQL. Next, you'll see how to use the character classes.

7. Match POSIX Character Classes

The MySQL REGEXP clause supports searching by character classes. This means you can search and return records that contain either alphabetical or numerical characters or even both.

For instance, use the syntax below to look for alphanumeric characters in the first_name field.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE first_name REGEXP '[:alnum:]';

Since all first_name's contain alphanumeric characters, they should be listed as follows.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           1 | JANE       | SMITH     |
|           2 | MARY       | ROE       |
|           3 | JOHN       | DOE       |
|           4 | MARY       | FREDRICK  |
|           5 | BANNIE     | ISAAC     |
+-------------+------------+-----------+
5 rows in set (0.01 sec)

To search the alphabetical characters only, use the syntax below.

mysql> SELECT
       customer_id,
       first_name,
       last_name
       FROM customers
       WHERE first_name REGEXP '[:alpha:]' ;

Output.

+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           1 | JANE       | SMITH     |
|           2 | MARY       | ROE       |
|           3 | JOHN       | DOE       |
|           4 | MARY       | FREDRICK  |
|           5 | BANNIE     | ISAAC     |
+-------------+------------+-----------+
5 rows in set (0.00 sec)

To search digits only, for instance, to locate zip codes with numeric characters, run the SQL command below.

mysql> SELECT
       customer_id,
       first_name,
       last_name,
       zip
       FROM customers
       WHERE zip REGEXP '[:digit:]';

Output.

+-------------+------------+-----------+------+
| customer_id | first_name | last_name | zip  |
+-------------+------------+-----------+------+
|           1 | JANE       | SMITH     | 8659 |
|           3 | JOHN       | DOE       | 8659 |
|           4 | MARY       | FREDRICK  | 8659 |
+-------------+------------+-----------+------+
3 rows in set (0.00 sec)

Although this is not a definitive list of all MySQL regular expressions, it should help you grasp the basic concepts.

Conclusion

In this guide, you've created a sample database and a table. You've also tested the MySQL REGEXP function using different combinations. Use the knowledge in this guide when you want to perform complex searches in your next MySQL project.

Want to contribute?

You could earn up to $300 by adding new articles