JavaScript: Regular Expressions
Page 23
RegExp
Introduction
Learn to apply searches and matches with forty JavaScript features for regular expressions. When complete, try mixing and matching features and functionality to harness the power of regular expressions.
Enjoy this comprehensive short tutorial, with working
examples of JavaScript's RegEx
modifiers, groups, quantifiers, metadata,
matching and searching.
Overview
This page covers regular expressions as originally implemented with ECMAScript 3 and additions to RegExp with ECMAScript 2018.
RegExp abbreviates Regular Expressions.
Regular Expressions apply patterns of characters
to search and replace text.
A RegEx Object
includes a pattern
with properties and methods.
This tutorial covers methods match()
and search()
.
RegEx also processes with methods matchAll(), replace(), replaceAll(), split()
and String methods. Be sure to try them!
Modifiers
/<pattern>/<modifiers>
See the following three modifiers, g, i
and m
, in various examples on this page.
Modifier - g: Global
g
means search
for all matches.
Modifier - i: Ignore Case
i
means ignore case.
Modifier - m: Multiple Lines
m
means search multiple lines.
Modifier - u: Unicode Characters
u
means search string including Unicode properties.
Modifier - s: Display NewLine in Dot Searches
s
displays new line characters, between two characters
when aplied to the dot search. For example, m.n
.
Groups
RegEx groups allow developers to combine sets of searches with brackets, brackets and caret, or parenthesis and pipe.
Brackets
Brackets such as [abc]
, look
for any of the characters between brackets.
You can substitute abc
with
other individual characters, digits, punctuation or symbols.
In the following case, look for characters
a, b
or c
.
The results would Not include capital letters between A and Z,
punctuation, symbols and digits.
[abc]
Brackets with Range
You might also search for a range of elements.
For example
[A-Z]
,
returns only capital letters from a given string.
The group, [0-9]
returns only digits.
Brackets with Caret
Apply the caret symbol, ^
to
search for everything except
the elements within brackets.
The caret symbol ^
indicates NOT.
The following query looks for only characters, symbols or punctuation, and not digits.
[^0-9]
.
The folowing query also looks for any characters
that do not appear within the brackets.
In this case look for characters
d - z, A - Z, 0 - 9
.
[^abc]
Parenthesis and Pipe
The pipe symbol, |
represents
a logical or. In this case.
(x|y)
tells RegEx to
search for either x OR y.
List of Group Symbols
- Return all results containing either x or z - (x|y)
- Return all results only containing x, y or z - [xyz]
- Return all results not containing x, y or z - [^xyz]
- Return all results from either set of brackets - ([a-z]|[1-2])
Group Examples
Group - [abc],[^abc]
Group a,b,c with global modifier g
.
let p = /[abc]/g; let s = "mushroom gourmet ice cream 29.99"; let r = s.match(p);
Find just characters a, b and c.
Group NOT a, b, c with global modifier g
.
let p = /[^abc]/g; let s = "mushroom gourmet ice cream 29.99"; let r = s.match(p);
Find everything but a, b and c.
Group - (x|y), With & Without Modifier i
.
Group with method match()
displays just e or m.
The word Mushroom
is capitalized, without the modifier, i
,
the letter M
should not return from this search.
let p = /(m|c)/g; let s = "Mushroom gourmet ice cream 29.99"; let r = s.match(p);
Find every m
or c
.
Group with method match()
displays just e or m.
The word Mushroom
is capitalized, with the modifier, i
,
the letter M
should return from this search.
let p = /(m|c)/gi; let s = "Mushroom gourmet ice cream 29.99"; let r = s.match(p);
Find every m
or c
, including capital letters.
Group - [0-9],[^0-9]
Return just digits.
let p = /[0-9]/g; let s = "mushroom gourmet ice cream 29.99"; let r = s.match(p);
Return all but digits.
let p = /[^0-9]/g; let s = "mushroom gourmet ice cream 29.99"; let r = s.match(p);
Quantifiers
List of Quantifiers
- Selection followed by zero or more occurrences - *
- Last occurence of a selection - +
- All occurrences of selection - ?
- Return sequence of exactly n element count - {n}
- Return sequences between n and m element count - {n,m}
- Return sequence of at least n element count - {n,}
- Return sequence at end of search string - $
- Return substring at start of string with method
match()
- ^ - Return index or -1 if string begins with substring, using
search()
- ^ - Return substring at start of string - ^
- Search for word A, followed by word B
/A(?=B)
- ?= - Search for word A, when NOT followed by word B:
/A(?=B)
- ?!
Quantifiers - * +
* - Find go
followed by zero or more o
characters per word.
let p = /go*/g; let s = "mushroom gourmet good ice cream 29.99"; let r = s.match(p);
+ - Find index, or position, of last ea
. Method seaarch()
returns the indices only.
let p = /ea+/; let s = "mushroom gourmet ice cream 29.99"; let r = s.search(p);
Quantifiers - ?,{n}
? - All occurrences of g
followed by zero or more o
s.
let p = /go?/g; let s = "great go gourment goods."; let r = s.match(p);
{n} - Find sequence of n digits or characters.
let pattern = /\d{2}/g; let s = "mushroom gourmet ice cream 29.99. + Mint icecream 1.99. + Chocolate ice cream $1000.00"; let r = s.match(p);
\d indicates searching for digits. See every sequence of two digits:
Quantifiers - {n,m},{n,}
{n,m}- Search for sequence of n to m (3 - 4 here) word characters (non digit).
let p = /D{3,4}/g; let s = "gourmet goods, goodies, yipeee!"; let r = s.match(p);
Notice commas, eclamations points are considered non digits then returned
with sequences s, g
and es,,
and yipe
and ee!
.
However SPACES are not considered non digit values. SPACES are
ignored in es,,
which is really es,SPACE,
,
{n,} - Find sequence of at least n characters or digits.
let pattern = /\w{8,}/gm; let s = "mushroom gourmet ice cream 29.99."; + "Mint icecream 1.99. "; + "Chocolate ice cream $1000.00"; let r = s.match(p);
\w word characters with 8 or more characters. Include Multiline with the modifier m
.
Quantifiers - $, Sets within Parenthesis ([set]|[set])
{n,} - Find sequence at end of text. Finds last sequence regardless of /g
global flag.
let pattern = /cream$/g; let s = "mushroom ice cream cream"; let r = s.match(p);
The sought term must have NO OTHER characters at the end of the string. Not even a period. Sought term must be absolute last set of elements.
([a-z]|[1-2]) Return all lower case letters and digits 1 through 2.
let pattern = /([a-z]|[1-2])/g; let s = "mushroom ice cream $2.99"; let r = s.match(p);
Quantifier - ^
We've already seen ^
applied within brackets. Here let's look
at ^
in other contexts.
In brackets RegEx searches for terms NOT matching the values in brackets.
In between forward slashes RegEx searches for terms AT THE START of
the given string.
^
finds values at the start of a given string.
$
finds values at the end of a given string.
^ - Find the substring, mush
if it starts the string, with method match()
.
let pattern = /^mush/g; let s = "mushroom ice cream cream 29.99."; let r = s.match(p);
^ - Return 0
if mush
starts a string. Return -1
if mush
doesn't start a string, with method search()
.
let pattern = /^mush/g; let s = "ice cream mushroom only 29.99."; let r = s.search(p);
Quantifiers - ?!, ?=
?= - Search for word A, followed by word B: /A(?=B)
let pattern = /ice(?= cream)/gm; let s = "Mushroom ice cream, "+ "Chocolate ice cream," + "Lime ice sorbet."; let r = s.match(p);
Return just those instances of ice
where its followed by cream
. However apply multiple line searches with modifier m
.
?! - Search for word A, when NOT followed by word B: /A(?=B)
let pattern = /ice(?! cream)/g; let s = "Mushroom ice cream, "+ "Chocolate ice cream," + "Lime ice sorbet."; let r = s.match(p);
Return just those instances of ice
where its not followed by cream
. Use modifier m
for multiple line searches, again.
Special Characters (Metacharacters)
Special characters, also called Metacharacters
,
apply specific restrictions to search queries.
Tap a box, below, to execute code for
each metacharacter.
The JavaScript file behind each box,
regex.js.
List of Special Characters (Metacharacters)
- Replace One Character: .
- Word Search: \w
- NOT Word Search: \W
- Find Digits: \d
- Find Non Digits: \D
- Find Spaces: \s
- Find Non Spaces: \S
- Start of Word: \b
- Find Non Start of Word: \B
- Find Null Character: \0
- Find New Line: \n
- Find Form Feed: \f
- Find Carriage Return: \r
- Find Tab: \t
- Find Vertical Tab: \v
- Find One Hex Character: \xxx
- Find Two Decimal Hex Characters: \xdd
- Find One Hex Character: \uxxxx
Special Characters: Match
Find the alpha numeric match, of a search term.
Call the method match()
. Tap a box!
Also see regex.js
function fRegex()
for details.
Metacharacters - m.n,\uxxxx
Dot: Replace one character.
let p = /m.n/g; let s = "mint mond"; let r = s.match(p);
Returns m, n,
and any character between m, n
, including m
and n
.
Find character by Hex value.
let p = /\u006D/g; //6D represents 'm'. let s = "mint mond"; let r = s.match(p);
Returns each instance of the character m
.
Metacharacters - \w, \W
Word characters: a-z, A-Z, 0-9 plus underscore:
let p = /\w/g; let s = "andrew jackson $20.00"; let r = s.match(p);
Not Word Characters:
let p = /\W/g; let s = "andrew jackson $20.00"; let r = s.match(p);
Commas indicate both $
, space and .
(decimal point), are not considered words.
Metacharacters - \d, \D
Find Digits:
let p = /\d/g; let s = "andrew jackson $20.00"; let r = s.match(p);
Find NON Digits:
let p = /\D/g; let s = "andrew jackson $20.00"; let r = s.match(p);
Metacharacters - \s,\S
Find whitespace
let p = /\s/g; let s = "andrew jackson $20.00"; let r = s.match(p);
This example displays the word whitespace,
for each individual empty space.
Find not whitespace
let p = /\S/g; let s = "andrew jackson $20.00"; let r = s.match(p);
Returns each character with a comma separated list. Doesn't return spaces.
Metacharacters - \b,\B
Word which starts with ja
.
let p = /\bja/g; let s = "andrew james jackson $20.00"; let r = s.match(p);
Word with son
, where son
does not start the word.
let p = /\Bson/g; let s = "andrew jackson killed a Bison."; let r = s.match(p);
Metacharacters - \0,\n
Search for NULL
let p = /\0/g; let s = Where O Death is thy \0Sting?"; let r = s.search(p);
Search for \n newline.
let p = /\n/g; let s = "Don't \ntalk politics."; let r = s.search(p);
Metacharacters - \f,\r
Search form feed \f.
let p = /\f/; let s = "never hurt \fanyone."; let r = s.match(p);
Search carriage return \r
let p = /\r/g; let s = "abcdefghijklm\rnopqrstuvwxyz"; let r = s.match(p);
Metacharacters - \t,\v
let p = /\t/g; let s = "Learn Programming\t7Thunders.biz."; let r = s.match(p);
Vertical Tab
let p = /\v/g; let s = "Learn to Code\v7Thunders.biz." let r = s.match(p);
Metacharacters - \xxx,\xdd
Search for hex number B = x42
let p = /\x42/g; let s = "Andrew Jackson nursed a Bison."; let r = s.match(p);
Search for octal number B = 102
let p = /\102/g; let s = "Andrew Jackson nursed a Bison."; let r = s.match(p);
Special Characters: Search
Find the index, or position, of a search term.
Call the method search()
. Tap a box!
Apply modifiers, groups, quantifiers and special characters
with the search()
method, as you would with the
match()
method, as demonstrated above.
Method match()
returns substring matches.
Method search()
returns the starting index of substring matches.
Find some examples, below. Also see regex.js
function fRegexSearch()
for details.
Match - \B and -b
First occurrence of word, but not at start of a word.
let p = /\Bdr/; let s = "andrew drake jackson $20.00"; let r = s.search(p);
Last occurrence of a word
let p = /\bdr/; let s = "andrew drake jackson $20.00"; let r = s.search(p);
ECMAScript 2018 New Features
Learn about
four new features from ECMAScript 2018.
The new features include
unicode property escapes with (\p{...})
,
lookbehind assertions with (?<= )
and
(?<! )
, as well as the
dotAll
flag.
DotAll Flag
You saw the Metacharacters - m.n,\uxxxx explanation and examples previously. The single dot returns every substring with matching characters on either side of the dot. The dot itself represents almost any string element.
dotAll Flag with match()
The dotAll Flag, s
, allows developers
to display all textual content, even invisible content,
which include elements surrounding the dot metacharacter.
With the match()
method, the s
dotAll
flag allows the dot to match line terminators, between elements.
dotAll
The RegEx dotAll
property allows developers
to determine whether or not invisible newline
elements separate dot
parameters.
Check the RegEx dotAll
property, to determine
if a newline character exists between any dot
RegEx strings.
Unicode Property Escapes
Unicode property escapes helpfully match emojis, punctuations, letters and characters from specific languages or scripts.
Emoji, Punctuation, Language Elements
Appy RegEx to search for a very wide range of emoji. Look online for many different properties including emoji, punctuation, and elements from various languages. The list below displays a few.
Emoji Short List
- Bicyclist Emoji - 🚴
- Pedestrian Emoji - 🚶
- Clock Emoji - ⌚
- Rainbow Emoji - 🌈
- Unicode Property Escapes - (\p{...})
- DotAll Flag
- Look Behind Assertions - (?<= )
- Look Behind Assertions - (?<! )
Look Behind Assertions (?<= ), (?<! )
First occurrence of word, but not at start of a word.
let p = /\Bdr/; let s = "andrew drake jackson $20.00"; let r = s.search(p);
Last occurrence of a word
let p = /\bdr/; let s = "andrew drake jackson $20.00"; let r = s.search(p);
Unicode Escapes, Emoji
Return just the emoji. A smile with heart-eyes represents the {Emoji_Presentation}
in this case.
// Search for emoji. let p = /\p{Emoji_Presentation}/gu; let s ="I feel 😍"; let r = s.match(p);
Unicode property escapes require the modifier, u
.
Substitute a wide range of Emoji.
Return just the emoji. A rainbow represents the {Emoji_Presentation}
in this case.
// Search for emoji. let p = /\p{Emoji_Presentation}/gu; let s ="It is on the way 🌈"; let r = s.match(p);
Unicode property escapes require the modifier, u
.
Substitute a wide range of Emoji.
DotAll Flag = s
Returns true
with the dotAll flag.
let reg = "monday mint ice cream m/nn./s"; let r = reg.dotAll;
Returns false
without the dotAll flag.
let reg = /monday mint ice cream mnn.; let r = reg.dotAll;
The DotAll flag, s
returns the m\nn
dot substring.
The newline character's between the m and the n.
let s = "monday mint ice cream m\nn."; let reg = /m.n/gs; let r = s.match(reg);
Given string includes a newline character \n
.
With the modifier s
, at the end of the
RegEx expression, we will see the last three character combination
of m,unknown character,n
.
No DotAll flag, s
, does not return the
m\nn
dot substring, with the newline character.
let s = "monday mint ice cream m\nn."; let reg = /m.n/g; let r = s.match(reg);
Given string includes a newline character \n
.
Without the modifier s
, at the end of the
RegEx expression, we won't see the last three character combination
of m,unknown character,n
.
JavaScript Source Code
See the functions applied to this page in JavaScript file, regex.js.
Most colored boxes, on this page simply call
fRegex(sId,s,reg)
or fRegexSearch(sId,s,reg)
,
with parameters sId, s
and reg
.
Parameter sId
is an identifier to
an HTML element on this page.
Parameter s
is a string for
RegEx to process.
Parameter reg
is a JavaScript formated regular expression.
HTML Example
The following markup demonstrates
a typical example on this page.
A teal
colored
box executes function fRegex()
,
when you tap the box.
Function fRegex()
parameters include the ID of
an HTML element within the box,
a string to process, and
a regular expression.
The same format was applied for each example.
Colors and search parameters were changed
to highlight various RegEx features.
The pre
element shows
you which string and which regular expression
will process.
See JavaScript file, regex.js
for three functions applied to examples on this page.
<div class="teal box" onclick="fRegex( 'idL', 'andrew jackson $20.00', reg=/\d/g)" > <p class="bld" > Find Digits: </p> <pre> let p = /\d/g; let s = "andrew jackson $20.00"; let r = s.match(p); </pre> <p id="idL"></p> </div>
Summary
You learned to apply searches and matches with forty JavaScript features for regular expressions. Now try mixing and matching features and functionality to harness the power of regular expressions.
I hope you enjoyed this comprehensive short tutorial, with working
examples of JavaScript's RegEx
modifiers, groups, quantifiers, metadata,
matching and searching.
Overview
This page covered regular expressions as originally implemented with ECMAScript 3 and additions to RegExp with ECMAScript 2018.
RegExp abbreviates Regular Expressions.
Regular Expressions apply patterns of characters
to search and replace text.
A RegEx Object
includes a pattern
with properties and methods.
This tutorial covered methods match()
and search()
.
RegEx also includes methods matchAll(), replace(), replaceAll(), split()
and String methods. Be sure to try them!
Learn JavaScript
JavaScript's the foundation of Web developer and Website design skills. This free and unique JavaScript tutorial includes some new or seldom used, but useful features.