What is Regex?
A regular expression (regex or regexp) is a pattern that describes a set of strings. It's like a search query on steroids—instead of searching for exact text, you search for patterns.
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/
That mess above? It matches most everyday email addresses, though not every address the email standards allow. Regex is powerful, but it has a reputation for being write-only code.
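To see what that pattern accepts and rejects, here is a minimal sketch in JavaScript. The sample addresses and the names `emailPattern`, `samples`, and `results` are made up for illustration; they are not part of any library.

```javascript
// The email pattern from above, stored once so it can be reused.
const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Hypothetical inputs, each chosen to exercise a different part of the pattern.
const samples = [
  "ada@example.com",                 // plain address
  "first.last+tag@mail.example.org", // dots and a plus sign in the local part
  "no-at-sign.example.com",          // missing @
  "ada@example",                     // missing the dot and top-level domain
  "ada@example.c",                   // top-level domain shorter than 2 letters
];

// Pair every sample with whether the pattern accepts it.
const results = samples.map((s) => [s, emailPattern.test(s)]);
console.log(results);
```

The first two samples pass and the last three fail, which lines up with the three parts of the pattern: a local part, an `@`, and a domain ending in a dot plus at least two letters.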
Basic Syntax
| Pattern | Matches | Example |
|---|---|---|
| `hello` | Literal text | "hello" in "hello world" |
| `.` | Any single character (except a newline, by default) | "h.t" matches "hat", "hit", "hot" |
| `\d` | Any digit (0-9) | "\d\d\d" matches "123" |
| `\w` | Word character (a-z, A-Z, 0-9, _) | "\w+" matches "hello_123" |
| `\s` | Whitespace (space, tab, newline) | "hello\sworld" matches "hello world" |
| `^` | Start of string | "^hello" matches "hello world" |
| `$` | End of string | "world$" matches "hello world" |
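A quick way to get a feel for the table is to run each pattern against a sample string. The sketch below does that with JavaScript's `RegExp.prototype.test`; the sample strings and the names `checks` and `allPass` are assumptions made for illustration.

```javascript
// Each entry is [pattern, input, expected result of pattern.test(input)].
const checks = [
  [/hello/, "hello world", true],         // literal text found anywhere
  [/h.t/, "hot", true],                   // . stands in for one character
  [/h.t/, "ht", false],                   // . must consume exactly one character
  [/\d\d\d/, "abc123", true],             // three digits in a row
  [/\w+/, "hello_123", true],             // letters, digits, and underscore
  [/hello\sworld/, "hello\tworld", true], // \s also matches a tab
  [/^hello/, "say hello", false],         // ^ pins the match to the start
  [/world$/, "hello world", true],        // $ pins the match to the end
];

// True only if every pattern behaves as the table describes.
const allPass = checks.every(([re, input, expected]) => re.test(input) === expected);
console.log(allPass);
```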
Quantifiers
| Pattern | Meaning | Example |
|---|---|---|
| `*` | 0 or more | a* matches "", "a", "aaa" |
| `+` | 1 or more | a+ matches "a", "aaa" (not "") |
| `?` | 0 or 1 | colou?r matches "color", "colour" |
| `{3}` | Exactly 3 | \d{3} matches "123" |
| `{2,4}` | 2 to 4 | \d{2,4} matches "12", "123", "1234" |
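Quantifiers are easiest to reason about when the pattern is anchored, because then the count has to account for the whole string. A small JavaScript sketch, with `wholeString` and `partial` as illustrative names:

```javascript
// With ^ and $ the quantifier must cover the entire input.
const wholeString = {
  star: [/^a*$/.test(""), /^a*$/.test("aaa")],                        // zero is allowed
  plus: [/^a+$/.test(""), /^a+$/.test("aaa")],                        // zero is not
  optional: [/^colou?r$/.test("color"), /^colou?r$/.test("colouur")], // at most one u
  exact: [/^\d{3}$/.test("123"), /^\d{3}$/.test("1234")],             // exactly three
  range: [/^\d{2,4}$/.test("1"), /^\d{2,4}$/.test("1234")],           // two to four
};

// Without anchors the quantifier takes as many characters as it is
// allowed to and the remaining input is simply ignored.
const partial = "12345".match(/\d{2,4}/)[0];

console.log(wholeString, partial);
```

Here `partial` is `"1234"`: the unanchored `\d{2,4}` matches inside a five-digit string, which is why validation patterns usually need `^` and `$`.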
Character Classes
```regex
[abc]     # Matches a, b, or c
[^abc]    # Matches anything except a, b, or c
[a-z]     # Matches any lowercase letter
[A-Za-z]  # Matches any letter
[0-9]     # Matches any digit (same as \d)
```
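Character classes show their value once you combine them with the global flag to pull out or strip groups of characters. The following JavaScript sketch uses made-up input strings and variable names:

```javascript
// Collect every vowel, one match per character.
const vowels = "regular expression".match(/[aeiou]/g);

// A negated class removes everything that is NOT a digit.
const digitsOnly = "a1b2c3".replace(/[^0-9]/g, "");

// [a-z] skips capitals, so each word loses its first letter.
const lowerRuns = "Hello World".match(/[a-z]+/g);

// Adding A-Z to the class keeps the words whole.
const words = "Hello World".match(/[A-Za-z]+/g);

console.log(vowels, digitsOnly, lowerRuns, words);
```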
Common Patterns
```regex
# Email (simplified)
[\w.-]+@[\w.-]+\.\w+

# URL
https?://[\w.-]+(/[\w.-]*)*

# Phone (US)
\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

# Date (YYYY-MM-DD)
\d{4}-\d{2}-\d{2}

# IP Address (simplified)
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

# Hex Color
#[0-9A-Fa-f]{6}
```
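These patterns check the shape of the text, not its meaning, so for validation they are usually anchored and paired with an ordinary code check. A hedged JavaScript sketch follows; the `patterns` object and the `isIPv4` helper are hypothetical names, and the anchors are an addition to the patterns listed above.

```javascript
// The patterns from above, wrapped in ^...$ so they must match the whole input.
const patterns = {
  usPhone: /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/,
  date: /^\d{4}-\d{2}-\d{2}$/,
  ip: /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/,
  hexColor: /^#[0-9A-Fa-f]{6}$/,
};

// The simplified IP pattern accepts 999.1.1.1, so this hypothetical helper
// adds a numeric range check on each of the four parts.
function isIPv4(text) {
  return patterns.ip.test(text) && text.split(".").every((part) => Number(part) <= 255);
}

console.log(patterns.usPhone.test("(415) 555-1234")); // shape of a US number
console.log(patterns.date.test("2024-13-45"));        // right shape, impossible date
console.log(patterns.ip.test("999.1.1.1"), isIPv4("999.1.1.1"));
console.log(patterns.hexColor.test("#1a2B3c"));
```

Note that the date pattern happily accepts month 13 and day 45. Regex confirms the format; checking that the value makes sense is a separate step.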
Groups and Capturing
Parentheses create groups that capture matched text:
```regex
# Capture area code from phone number
\((\d{3})\) \d{3}-\d{4}
# Input: "(415) 555-1234"
# Group 1 captures: "415"

# Named groups (JavaScript, PCRE, .NET; Python writes them as (?P<year>...))
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
```
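In JavaScript, captured groups come back on the match result: numbered groups by index and named groups on the `groups` property. The sketch below reuses the two patterns above; the variable names and sample inputs are assumptions for illustration.

```javascript
// Numbered group: index 0 is the whole match, index 1 is the first group.
const phoneMatch = "(415) 555-1234".match(/\((\d{3})\) \d{3}-\d{4}/);
const areaCode = phoneMatch[1];

// Named groups are exposed on match.groups.
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const { year, month, day } = "2024-03-15".match(datePattern).groups;

// Named groups can also be referenced in a replacement string as $<name>.
const usDate = "2024-03-15".replace(datePattern, "$<month>/$<day>/$<year>");

console.log(areaCode, year, month, day, usDate);
```

Named groups cost a few extra characters, but they keep the code readable when a pattern later gains or loses a group and the numeric indexes shift.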
Where You'll See This
- Form validation - Email, phone, credit card patterns
- Search and replace - IDE find/replace, sed, grep
- Log parsing - Extracting timestamps, IPs, errors
- Web scraping - Pulling data from HTML
- URL routing - Express, Django, Rails routes
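Log parsing is the easiest of these to show in a few lines. The sketch below assumes a made-up log format (timestamp, level, IP address, message, separated by single spaces); `logLine` and `parseLogLine` are hypothetical names, and real logs will need their own pattern.

```javascript
// Assumed format: "<ISO timestamp> <LEVEL> <IPv4> <message>".
// (?:...) is a non-capturing group: it groups without creating a capture.
const logLine =
  /^(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) (?<level>[A-Z]+) (?<ip>\d{1,3}(?:\.\d{1,3}){3}) (?<message>.*)$/;

// Returns a plain object with the named fields, or null if the line
// does not follow the assumed format.
function parseLogLine(line) {
  const match = line.match(logLine);
  return match ? { ...match.groups } : null;
}

const entry = parseLogLine("2024-03-15T09:30:00Z ERROR 10.0.0.7 Connection refused");
console.log(entry);
```

Returning `null` for lines that do not match keeps malformed input from silently turning into half-filled records.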
Common Gotchas
⚠️ Escape Special Characters
Characters like . * + ? ^ $ [ ] ( ) { } | \ have special meaning. To match them literally, escape them with a backslash: \. matches a period.
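Escaping matters most when a pattern is built from text you did not write yourself. Here is a minimal JavaScript sketch; `escapeRegExp` is a hypothetical helper name for a widely used idiom, not a function this article assumes is built in.

```javascript
// Unescaped, the dot matches any character, so "3514" slips through.
const unescapedDot = /3.14/.test("3514");
const escapedDot = /3\.14/.test("3514");

// Hypothetical helper: put a backslash in front of every special character.
// In the replacement string, $& stands for the matched character.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Building a pattern from literal text that contains special characters.
const label = "$9.99 (sale)";
const rawPattern = new RegExp(label);                // $ and ( ) keep their special meaning
const safePattern = new RegExp(escapeRegExp(label)); // every character is literal

console.log(unescapedDot, escapedDot);
console.log(rawPattern.test("Now $9.99 (sale)!"), safePattern.test("Now $9.99 (sale)!"));
```

The raw pattern never matches here because its leading `$` asserts the end of the string, while the escaped version finds the label as plain text.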
- Greedy by default - .* matches as much as possible. Use .*? for non-greedy.
- Different flavors - JavaScript, Python, and PCRE regex have subtle differences.
- Backtracking - Complex patterns can be slow. (a+)+ on "aaaaaaaaaaaaaaaaaaaaaa!" is catastrophic.
- Not for HTML - Don't parse HTML with regex. Use a proper parser.
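The greedy and backtracking gotchas are easier to see in code. The JavaScript sketch below uses a made-up sample string, and it only runs the nested pattern on short inputs, so it stays fast.

```javascript
const text = 'say "hi" and "bye"';

// Greedy: .* runs to the end, then backs up to the LAST closing quote.
const greedy = text.match(/".*"/)[0];

// Non-greedy: .*? stops at the FIRST closing quote it can.
const lazy = text.match(/".*?"/)[0];

// A negated class often says the same thing more directly.
const allQuoted = text.match(/"[^"]*"/g);

// (a+)+ and a+ accept exactly the same strings, but the flat version has
// only one way to split up a run of a's, so a failed match cannot explode.
const nested = /^(a+)+$/;
const flat = /^a+$/;
const sameAnswers = ["a", "aaa", "aaa!"].every((s) => nested.test(s) === flat.test(s));

console.log(greedy, lazy, allQuoted, sameAnswers);
```

When a pattern nests one quantifier inside another, it is worth asking whether a flatter pattern describes the same strings.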
In Code
```javascript
// Test if string matches
/\d+/.test("abc123") // true

// Find first match
"abc123".match(/\d+/) // ["123"]

// Find all matches
"a1b2c3".match(/\d/g) // ["1", "2", "3"]

// Replace
"hello world".replace(/world/, "regex") // "hello regex"
```
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." - Jamie Zawinski