Regular Expression

| Character class | Description | Pattern | Matches |
| --- | --- | --- | --- |
| [character_group] | Matches any single character in character_group. By default, the match is case-sensitive. | [ae] | “a” in “gray”; “a”, “e” in “lane” |
| [^character_group] | Negation: Matches any single character that is not in character_group. By default, characters in character_group are case-sensitive. | [^aei] | “r”, “g”, “n” in “reign” |
| [first-last] | Character range: Matches any single character in the range from first to last. | [A-Z] | “A”, “B” in “AB123” |
| . | Wildcard: Matches any single character except \n. To match a literal period character (. or \u002E), you must precede it with the escape character (\.). | a.e | “ave” in “nave”; “ate” in “water” |
| \p{name} | Matches any single character in the Unicode general category or named block specified by name. | \p{Lu}; \p{IsCyrillic} | “C”, “L” in “City Lights”; “Д”, “Ж” in “ДЖem” |
| \P{name} | Matches any single character that is not in the Unicode general category or named block specified by name. | \P{Lu}; \P{IsCyrillic} | “i”, “t”, “y” in “City”; “e”, “m” in “ДЖem” |
| \w | Matches any word character. | \w | “I”, “D”, “A”, “1”, “3” in “ID A1.3” |
| \W | Matches any non-word character. | \W | “ ”, “.” in “ID A1.3” |
| \s | Matches any white-space character. | \w\s | “D ” in “ID A1.3” |
| \S | Matches any non-white-space character. | \s\S | “ _” in “int __ctr” |
| \d | Matches any decimal digit. | \d | “4” in “4 = IV” |
| \D | Matches any character other than a decimal digit. | \D | “ ”, “=”, “ ”, “I”, “V” in “4 = IV” |
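
The character classes listed above can be tried out directly. The sketch below uses Python's built-in `re` module, which is an assumption on my part: the table does not name a regex engine, and `\p{name}` / `\P{name}` are left out because Python's `re` does not support them. The helper name `matches` is hypothetical and exists only for this illustration.

```python
import re


def matches(pattern, text):
    """Return every non-overlapping match of pattern in text, left to right."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# (pattern, input text) pairs taken from the Pattern and Matches columns.
EXAMPLES = [
    (r"[ae]", "lane"),        # character group
    (r"[^aei]", "reign"),     # negated character group
    (r"[A-Z]", "AB123"),      # character range
    (r"a.e", "water"),        # wildcard: any character except \n
    (r"\.", "ID A1.3"),       # escaped period matches only a literal "."
    (r"\w", "ID A1.3"),       # word characters
    (r"\W", "ID A1.3"),       # non-word characters
    (r"\w\s", "ID A1.3"),     # word character followed by white space
    (r"\s\S", "int __ctr"),   # white space followed by non-white space
    (r"\d", "4 = IV"),        # decimal digits
    (r"\D", "4 = IV"),        # everything except decimal digits
]

for pattern, text in EXAMPLES:
    print(f"{pattern!r:10} in {text!r:12} -> {matches(pattern, text)}")
```

Running it prints one line per pattern, so each result can be compared against the Matches column.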

| Assertion | Description | Pattern | Matches |
| --- | --- | --- | --- |
| ^ | The match must start at the beginning of the string or line. | ^\d{3} | “901” in “901-333-” |
| $ | The match must occur at the end of the string or before \n at the end of the line or string. | -\d{3}$ | “-333” in “-901-333” |
| \A | The match must occur at the start of the string. | \A\d{3} | “901” in “901-333-” |
| \Z | The match must occur at the end of the string or before \n at the end of the string. | -\d{3}\Z | “-333” in “-901-333” |
| \z | The match must occur at the end of the string. | -\d{3}\z | “-333” in “-901-333” |
| \G | The match must occur at the point where the previous match ended. | \G\(\d\) | “(1)”, “(3)”, “(5)” in “(1)(3)(5)[7](9)” |
| \b | The match must occur on a boundary between a \w (alphanumeric) and a \W (nonalphanumeric) character. | \b\w+\s\w+\b | “them theme”, “them them” in “them theme them them” |
| \B | The match must not occur on a \b boundary. | \Bend\w*\b | “ends”, “ender” in “end sends endure lender” |
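
The assertions listed above can be checked the same way. This is again a sketch in Python's built-in `re` module, which is my assumption rather than the engine the table was written for. Two differences matter: Python's `re` has no `\G`, so the code emulates it by anchoring each attempt at the end of the previous match, and `\Z` / `\z` are left out because Python spells the end-of-string anchors differently. The helper names `matches` and `contiguous_matches` are hypothetical.

```python
import re


def matches(pattern, text, flags=0):
    """Return every non-overlapping match of pattern in text, left to right."""
    return [m.group(0) for m in re.finditer(pattern, text, flags)]


def contiguous_matches(pattern, text):
    """Emulate \\G: each match must begin exactly where the previous one ended."""
    compiled = re.compile(pattern)
    found, pos = [], 0
    while True:
        m = compiled.match(text, pos)      # match() anchors the attempt at pos
        if m is None or m.end() == pos:    # stop on failure or an empty match
            return found
        found.append(m.group(0))
        pos = m.end()


# ^ and \A agree on a single-line string.
print(matches(r"^\d{3}", "901-333-"))
print(matches(r"\A\d{3}", "901-333-"))

# In multiline mode ^ also matches after each \n, while \A does not.
print(matches(r"^\d{3}", "901-333-\n444", re.MULTILINE))
print(matches(r"\A\d{3}", "901-333-\n444", re.MULTILINE))

# $ matches at the end of the string, or just before a final \n.
print(matches(r"-\d{3}$", "-901-333"))
print(matches(r"-\d{3}$", "-901-333\n"))

# Emulated \G: matching stops at "[7]", so "(9)" is never reached.
print(contiguous_matches(r"\(\d\)", "(1)(3)(5)[7](9)"))

# \b requires a word boundary; \B requires that there is none.
print(matches(r"\b\w+\s\w+\b", "them theme them them"))
print(matches(r"\Bend\w*\b", "end sends endure lender"))
```

The multiline pair is the one worth noting: it is the only case here where `^` and `\A` give different results.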