Pinterest Pixel

Simplify Data Analysis with Regex in Excel – #1 Cheat Sheet for Excel Users

Regex in Excel is a very powerful tool in data analysis that is used to simplify and... read more

Free Practice Workbook
John Michaloudis
Posted on

Overview

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users | MyExcelOnline Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users | MyExcelOnline

Regex in Excel is a very powerful tool in data analysis that is used to simplify and streamline various tasks. You can use regex to extract, replace or match substrings based on user-defined patterns.

While many of us have relied on text functions like FIND, LEFT, and SUBSTITUTE in Excel to perform operations on text strings. It is important to understand that these functions come with their limitations. You can use Visual Basic for Applications (VBA) in Excel to surpass this limitation. It can provides enhanced flexibility and time-saving benefits.

In this article, we will be exploring the following topics in detail –

Let’s dive into the article to understand and unlock the potential of Regex in Excel!

Download the Excel Workbook below to follow along and understand how to use Regex in Excel –
download excel workbookRegEx-Function-in-Excel.xlsm

 

Regex in Excel

Regex is a sequence of characters consisting of letters, numbers, special characters, operators, and constructs. It is a powerful tool that enables users to describe and identify patterns within a text. You can define rules using regex to extract, replace, or match patterns based on certain criteria.

For example, [from–to] means you can match any character mentioned in the bracket.

  • [0-9] – Any single digit from 0 to 9
  • [a-z] – Any single lowercase letter
  • [A-Z] – Any single uppercase letter

Although Excel lacks a built-in regex function, you can use regex with the help of Visual Basic for Applications (VBA).

 

How to Use VBA for Regex

You can use VBA to create custom Regex Functions in Excel. Let’s see how it can be done:

STEP 1: Press Alt + F11 to open the window for Microsoft Visual Basic for Applications.

Or, Go to Developer > Visual Basic.

Show & Hide Field List in Excel Pivot Table

STEP 2: Go to Insert > Module and copy-paste the following code in it.

Show & Hide Field List in Excel Pivot Table

STEP 3: Download this file and copy-paste the code into the module section –

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 4: Press Ctrl + S to save the Excel Workbook.

STEP 5: In the dialog box, select No to save the WorkBook as Macro enabled.

Show & Hide Field List in Excel Pivot Table

STEP 6: In the Save as dialog box, select Excel Macro-Enabled Workbook from Save as type dropdown. Click Save.

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

This Macro-Enabled Workbook will now have three custom Regex Functions saved and ready to be used. Let’s take a look at them.

 

Match Patterns using Regex

The custom function that has been created using VBA to match patterns is used MatchPatternUsingRegex. The syntax of this function is –

=MatchPatternUsingRegex(input_range, pattern, [match_case])

where,

  • input_range – the range of cells you want to search for the pattern. Required.
  • pattern – the regular expression pattern you want to match. Required.
  • match_case – specify whether the pattern should be case-sensitive or not (default is True). Optional.

In this example, we want to check if the email address mentioned in column A belongs to the domain “@gmail.com”.

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

The pattern that we will be using to check the same is –

[a-zA-Z0-9._%+-]+@gmail\.com

  • [a-zA-Z0-9._%+-]+ This part matches one or more occurrences of any of these lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), period (.), underscore (_), percent sign (%), plus sign (+), or hyphen/minus sign (-). The pattern will match these characters before the @ sign.
  • gmail\.com This part will match the exact text string “@gmail.com”.

Follow the steps below to match patterns using Regex in Excel –

STEP 1: Enter the formula.

=MatchPatternUsingRegex(

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 2: Select the cell containing the text you want to check.

=MatchPatternUsingRegex(A2,

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 3: Type the pattern you want to check.

=MatchPatternUsingRegex(A2,”[a-zA-Z0-9._%+-]+@gmail\.com”)

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

In the result, you can see that all the email addresses that belong to Gmail provide the value TRUE, else FALSE.

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

 

Replace Patterns using Regex

The custom function that has been created using VBA to replace patterns is used ReplaceUsingRegex. The syntax of this function is –

=ReplaceUsingRegex(text, pattern, replacement)

where,

  • text – the cells that contain the text string. Required.
  • pattern – the regular expression pattern you want to replace. Required.
  • replacement – the text that you want to replace the pattern with. Required.

In this example, we have a list of names but it contains unwanted characters like exclamation mark (!), @, full stop (.), digits (0-9), etc. We will use this function to replace all unwanted characters with an empty string.

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

The pattern that we will be using is “[^a-zA-Z/s]”. It means any character that is not mentioned in the bracket i.e. lowercase letters (a-z), uppercase letters (A-Z), and space (/s).

Follow the steps below to replace patterns using Regex in Excel –

STEP 1: Enter the formula.

=ReplaceUsingRegex(

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 2: Select the cell containing the text string.

=ReplaceUsingRegex(A2,

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 3: Type the pattern you want to replace.

=ReplaceUsingRegex(A2,”[^a-zA-Z/s]”

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 4: Type an empty string.

=ReplaceUsingRegex(A2,”[^a-zA-Z/s]”,””)

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

In the result, you can see that all the characters except lowercase letters (a-z), uppercase letters (A-Z), and space (/s) have been replaced by an empty string.

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

 

Extract Patterns using Regex

The custom function that has been created using VBA to extract patterns from text strings is ExtractUsingRegex. The syntax of this function is –

=ExtractUsingRegex(text, pattern)

where,

  • text – the cells that contain the text string. Required.
  • pattern– regular expression pattern you want to extract. Required.

In this example, we have a list of names and want to extract the last name. Here, we will be using the pattern “\w+$” that will help us extract the end of the string.

  • /w – indicates any character including uppercase letters, lowercase letters, digits, and underscore.
  • + – indicates one or more characters
  • $ – indicates the end of the string.

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

Follow the steps below to extract patterns using Regex in Excel –

STEP 1: Enter the formula.

=ExtractUsingRegex(

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 2: Select the cell containing the text string.

=ExtractUsingRegex(A2,

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

STEP 3: Type the pattern you want to extract.

=ExtractUsingRegex(A2,”\w+$”)

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

In the result, we can see that the end of the string i.e. the last name has been extracted.

Simplify Data Analysis with Regex in Excel - #1 Cheat Sheet for Excel Users

 

Conclusion

Regex is an extremely powerful tool that can be] used to make data analysis tasks simpler and streamlined. While Excel provides built-in text functions, they have limitations.

VBA in Excel is a flexible and time-saving alternative for Excel users. VBA for regex can be used for matching patterns, replacing patterns, and extracting patterns. By using custom regex functions created through VBA, users can apply complex pattern matching and manipulation to text data in Excel.

If you like this Excel tip, please share it
Founder & Chief Inspirational Officer at

John Michaloudis is a former accountant and finance analyst at General Electric, a Microsoft MVP since 2020, an Amazon #1 bestselling author of 4 Microsoft Excel books and teacher of Microsoft Excel & Office over at his flagship MyExcelOnline Academy Online Course.

See also  The Ultimate Guide to Excel Data Analysis Toolpak

Get Video Training

Advance your Microsoft Excel & Office Skills with the MyExcelOnline Academy!

Dramatically Reduce Repetition, Stress, and Overtime!
Exponentially Increase Your Chances of a Promotion, Pay Raise or New Job!

Learn in as little as 5 minutes a day or on your schedule.

Learn More!

Share to...