A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Such patterns are typically used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation.
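
The three uses named above can be sketched with Python's standard `re` module. This is only an illustrative sketch: the sample text, the four-digit rule, and the helper name `is_four_digits` are assumptions made for the example, not part of the original notes.

```python
import re

text = "The colour gray and the color grey."

# "find": collect every occurrence of either spelling
found = re.findall(r"gr[ae]y", text)

# "find and replace": normalise "colour" to "color"
replaced = re.sub(r"colou?r", "color", text)

# input validation: accept only strings made of exactly four digits
def is_four_digits(s: str) -> bool:
    return re.fullmatch(r"[0-9]{4}", s) is not None
```

`re.fullmatch` is used for validation because the whole input must fit the pattern, whereas `re.findall` and `re.sub` look for the pattern anywhere inside the string.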

Regular Expressions - Relation to Finite-State Automata (FSA)

  • a regular expression is one way of describing a finite-state automaton
  • any regular expression can be implemented as a finite-state automaton
  • any finite-state automaton can be described by a regular expression
  • a regular expression is one way of characterizing a particular kind of formal language called a regular language
  • both regular expressions and finite-state automata can be used to describe regular languages
  • a regular grammar is another way of describing regular languages
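
To make the equivalence concrete, here is a minimal sketch of a hand-built deterministic automaton for the pattern ab*c, compared against Python's `re` module on a few words. The state numbering, the `TRANSITIONS` table, and the function name `dfa_accepts` are hypothetical choices for this illustration.

```python
import re

# Hand-built automaton for the pattern ab*c (illustrative only).
# State 0: start; state 1: read "a" and any number of "b"; state 2: accepting.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
ACCEPTING = {2}

def dfa_accepts(s: str) -> bool:
    """Run the automaton over s and report whether it ends in an accepting state."""
    state = 0
    for ch in s:
        key = (state, ch)
        if key not in TRANSITIONS:
            return False  # no transition defined: reject
        state = TRANSITIONS[key]
    return state in ACCEPTING

# The automaton and the regular expression agree on every sample word.
for word in ["ac", "abc", "abbc", "ab", "bc", "abcc"]:
    assert dfa_accepts(word) == (re.fullmatch(r"ab*c", word) is not None)
```

The loop on state 1 plays the role of the `*` operator: it lets the automaton consume zero or more "b" characters before the final "c".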

Regular Expressions - Operators

syntax

description

example use

example matches

|

boolean “or” separates alternatives

gray|grey

  • gray
  • grey

[]

square brackets match any one of the enclosed characters; another way of writing | between single characters

gr[ae]y

  • gray
  • grey

[A-Z]

an upper case letter

[a-z]

a lower case letter

[0-9]

a single digit

[^ ]

a caret means negation only when it is the first character inside []

[

()

parentheses group elements and define the scope and precedence of the operators

gr(a|e)y

  • gray
  • grey

?

indicates zero or one occurrences of the preceding element

colou?r

  • color
  • colour

*

indicates zero or more occurrences of the preceding element

ab*c

  • ac
  • abc
  • abbc

+

indicates one or more occurrences of the preceding element

ab+c

  • abc
  • abbc

{n}

the preceding item is matched exactly n times

{min,}

the preceding item is matched min or more times

{min,max}

the preceding item is matched at least min times, but not more than max times

.

wildcard: matches any single character (in many engines, any character except a newline by default)

a.b

  • acb
  • a-b

^

anchors the match at the beginning of the string (or line)

$

anchors the match at the end of the string (or line)
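
The operators in the table can be checked with a short script using Python's `re` module. This is a sketch rather than a reference: the helper `matches` is a hypothetical name, and the patterns for the negated class, the counted quantifiers, and the anchors (`gr[^a]y`, `ab{2}c`, `ab{2,}c`, `ab{1,2}c`, `^gr`, `ey$`) are examples made up for illustration, since the table gives none for those rows.

```python
import re

def matches(pattern: str, s: str) -> bool:
    """Return True when the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# alternation, character class, and grouping all accept both spellings
for p in (r"gray|grey", r"gr[ae]y", r"gr(a|e)y"):
    assert matches(p, "gray") and matches(p, "grey")

# negated class: any character except "a" in the third position
assert matches(r"gr[^a]y", "grey") and not matches(r"gr[^a]y", "gray")

# quantifiers
assert matches(r"colou?r", "color") and matches(r"colou?r", "colour")
assert matches(r"ab*c", "ac") and matches(r"ab*c", "abbc")
assert matches(r"ab+c", "abc") and not matches(r"ab+c", "ac")
assert matches(r"ab{2}c", "abbc") and not matches(r"ab{2}c", "abc")
assert matches(r"ab{2,}c", "abbbc")
assert matches(r"ab{1,2}c", "abc") and not matches(r"ab{1,2}c", "abbbc")

# wildcard
assert matches(r"a.b", "acb") and matches(r"a.b", "a-b")

# anchors, tested with re.search so that position is what matters
assert re.search(r"^gr", "grey sky") and not re.search(r"^gr", "a grey sky")
assert re.search(r"ey$", "sky grey") and not re.search(r"ey$", "grey sky")
```

Note that Python treats a brace quantifier containing a space, such as `{1, 2}`, as literal text rather than a repetition count, so the bounds are written without spaces here.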