Translate any regular expression into plain English with a color-coded breakdown of every token. This free regex to plain English translator parses capture groups, quantifiers, character classes, lookahead and lookbehind assertions, anchors, alternation, and more. Paste a regex pattern to instantly visualize its structure — a powerful regular expression breakdown tool for understanding complex pattern matching. All processing runs client-side in your browser.
Regular expressions are the closest thing programming has to a universal mini-language. The same pattern, more or less, runs in your editor's find dialog, your shell's grep, your application's request validator, your CDN's URL rewriter, your database's LIKE upgrade, and your log analysis pipeline. The bad news: "more or less" is doing a lot of work in that sentence. JavaScript regex, Python re, PCRE, .NET, Go's RE2, and POSIX BRE/ERE all differ in features and even in the meaning of common syntax. This guide is a tour through what each token means, where dialects diverge, and the catastrophic backtracking trap that has caused outages at Stack Overflow (2016) and Cloudflare (2019).
| Token | Meaning | Notes |
|---|---|---|
| . | Any character except newline | Set the s (dotall) flag to include newlines. |
| ^ / $ | Start / end of string | With the m (multiline) flag, they match the start/end of each line. |
| \d \w \s | Digit, word char, whitespace | In JS, \d and \w stay ASCII-only even with the u flag, while \s already covers Unicode whitespace. Use \p{...} with the u flag for Unicode-aware classes. |
| \D \W \S | Negation of the above | |
| \b | Word boundary (zero-width) | Between \w and \W or string edge. |
| [abc] / [^abc] | Character set / negated set | Inside a set, most metacharacters lose their special meaning. |
| [a-z] | Range | Based on Unicode code point order. |
| ? * + | 0–1, 0–∞, 1–∞ | Greedy by default. |
| {n} {n,} {n,m} | Exactly n / at least n / between n and m | Greedy by default. |
| ?? *? +? | Lazy versions of the above | Match as few characters as possible. |
| (...) | Capturing group | Backreference with \1 in the pattern, $1 in the replacement. |
| (?:...) | Non-capturing group | Groups without saving the match; slightly faster. |
| (?<name>...) | Named capture (JS / .NET / PCRE; Python's re uses (?P<name>...)) | Reference with \k<name> in the pattern or $<name> in a JS replacement. |
| (?=...) / (?!...) | Positive / negative lookahead | Zero-width: does not consume input. |
| (?<=...) / (?<!...) | Positive / negative lookbehind | JS 2018+, supported in Node 10+ and all modern browsers. |
| a\|b | Alternation | Lowest precedence; wrap in a group when needed. |
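
A few of the tokens above are easier to judge by running them. The following is a minimal sketch using Python's stdlib re module; the sample strings and variable names are made up for illustration, and Python spells named groups (?P<name>...) rather than (?<name>...).

```python
import re

html = "<a>hi</a>"

# Greedy: .* grabs as much as possible, so the match spans the whole string.
greedy = re.search(r"<.*>", html).group()

# Lazy: .*? stops at the first closing bracket it can.
lazy = re.search(r"<.*?>", html).group()

# Backreference: \1 must repeat whatever group 1 captured (a doubled word).
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Named capture, in Python's (?P<name>...) spelling.
year = re.search(r"(?P<year>\d{4})", "released in 2018").group("year")

print(greedy, lazy, doubled, year)
```

The greedy and lazy searches differ only by one ?, which is why the lazy form (or a negated class such as <[^>]*>) is the usual choice for tag-like input.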
| Flag | Effect | JS | Python | PCRE |
|---|---|---|---|---|
| g | Global match (find all) | Yes | n/a (use findall) | n/a |
| i | Case-insensitive | Yes | Yes | Yes |
| m | Multiline (^ and $ per line) | Yes | Yes | Yes |
| s | Dotall (. matches newline) | Yes (ES2018+) | Yes | Yes |
| u | Unicode-aware | Yes | Default in Python 3 | Yes |
| x | Extended (allow whitespace and comments in pattern) | No | Yes | Yes |
| v | Unicode-set (set operations) | Yes (ES2024+) | No | No |
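
As a rough illustration of the i, m, s, and x flags in the table, here is a sketch in Python's stdlib re, where they are spelled re.IGNORECASE, re.MULTILINE, re.DOTALL, and re.VERBOSE. The sample text and variable names are assumptions made up for the example.

```python
import re

text = "Foo\nbar foo\nFOO"

# i: case-insensitive matching finds all three spellings.
all_foo = re.findall(r"foo", text, flags=re.IGNORECASE)

# m: ^ matches at the start of every line, not just the start of the string.
line_starts = re.findall(r"^foo", text, flags=re.IGNORECASE | re.MULTILINE)

# s: without it the dot refuses to cross the newline; with it the match succeeds.
without_s = re.search(r"Foo.bar", text)
with_s = re.search(r"Foo.bar", text, flags=re.DOTALL)

# x: whitespace and comments inside the pattern are ignored.
date = re.compile(
    r"""
    (?P<year>\d{4}) -   # four-digit year
    (?P<month>\d{2}) -  # two-digit month
    (?P<day>\d{2})      # two-digit day
    """,
    flags=re.VERBOSE,
)
month = date.fullmatch("2026-05-02").group("month")

print(all_foo, line_starts, without_s, with_s is not None, month)
```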
The dialect minefield: JavaScript's g flag turns the same regex into a stateful object where each call to .exec() advances .lastIndex. Python instead exposes re.match, re.search, re.findall, and re.finditer as separate functions. PCRE has neither, leaving iteration to the host language.
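
On the Python side of that comparison, the separate entry points look roughly like this. It is a sketch with a made-up log line and pattern; a compiled Python pattern carries no lastIndex-style cursor, so repeated calls return the same result.

```python
import re

log = "GET /a 200, GET /b 404, GET /c 500"
status = re.compile(r"\b(\d{3})\b")

first = status.search(log).group(1)    # first match anywhere in the string
at_start = status.match(log)           # only tries at position 0, so None here
every = status.findall(log)            # list of the captured strings
spans = [m.span(1) for m in status.finditer(log)]  # match objects, one per hit

# No hidden state: calling findall again gives the same answer.
again = status.findall(log)

print(first, at_start, every, spans)
```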
The most expensive regex bugs do not produce wrong matches; they produce runaway work. The failure mode is called catastrophic backtracking, and it happens when a backtracking engine has to try a quadratic or even exponential number of paths before it can declare a non-match. The two ingredients:

- Nested or overlapping quantifiers such as (a+)+, (.*)*, or (\w+)*.
- An input that almost matches but fails near the end, so the engine retries every way of dividing the text among the quantifiers.

The Stack Overflow outage of July 2016 came from a whitespace-trimming regex applied to post text: ^[\s]+|[\s]+$. Given a long run of whitespace followed by non-whitespace, the engine retried the match from every starting position in the run before giving up. Cloudflare's July 2019 outage was the same class of bug in a WAF rule. The defensive habits:

- Flatten nested quantifiers: rewrite (\w+)+ as \w+.
- Use atomic groups (?>...) (PCRE, .NET, Java) or possessive quantifiers *+ ++ (PCRE, Java) to prevent backtracking into the group.

Every junior developer writes ^[\w.-]+@[\w.-]+\.\w+$ at some point and ships a bug. A regex for the complete RFC 5321 / 5322 grammar runs to over 6,000 characters, because the grammar allows internal-domain literals, IP literals (user@[10.0.0.1]), quoted local parts ("a@b"@example.com), and Unicode (用户@例子.中国). The one-paragraph rule:

- For a quick syntactic check, use /^[^@\s]+@[^@\s]+\.[^@\s]+$/.
- For anything stricter, use a maintained library: email-validator in Python, email-addresses in Node.js, commons-validator in Java.

Habits that keep a regex maintainable:

- Use the x (extended) flag in any language that supports it (Python, PCRE, Ruby) so you can split a long regex across lines with comments.
- Use named groups such as (?<year>\d{4}) instead of bare numbered groups. The replacement string and the future reader both benefit.
- Prefer character classes to alternation: [abc] beats (?:a|b|c).
- Test interactively: regex101.com, regexr.com, and this explainer help; for production, fuzz with long repeating input to catch backtracking.
- Lint: the ESLint rule regexp/no-super-linear-backtracking catches the (a+)+ family at lint time.

Common mistakes:

- Forgetting to escape the dot. For example.com the regex matches example-com too.
- Forgetting the m flag. ^foo only matches at string start unless multiline is on.
- Using .* across newlines without s. The match silently stops at the first \n.
- Greedy matching. <.*> on <a>hi</a> grabs the entire string. Use <.*?> or <[^>]*>.
- Misreading $ in replacements. In str.replace(/(\d+)/, '$1!'), the $ belongs to the replacement template, not the regex.
- Parsing structured data with regex. Use a real parser (such as JSON.parse); regex will produce surprises.
- The g flag is stateful. Reusing a global regex between call sites without resetting lastIndex causes mysteriously skipped matches.

The same named-group date match in JavaScript, Python, Go, and Rust:

```javascript
// Named groups land on m.groups (ES2018+).
const m = "2026-05-02".match(/^(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})$/);
console.log(m.groups.y, m.groups.m, m.groups.d);
```

```python
import re
# fullmatch anchors both ends, so no ^ or $ is needed.
m = re.fullmatch(r'(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})', "2026-05-02")
print(m.group('y'), m.group('m'), m.group('d'))
```

```go
// Inside a function, with "fmt" and "regexp" imported; SubexpIndex needs Go 1.15+.
re := regexp.MustCompile(`^(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})$`)
m := re.FindStringSubmatch("2026-05-02")
fmt.Println(m[re.SubexpIndex("y")], m[re.SubexpIndex("m")], m[re.SubexpIndex("d")])
```

```rust
use regex::Regex;

fn main() {
    // Captures can be indexed by group name.
    let re = Regex::new(r"^(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})$").unwrap();
    let caps = re.captures("2026-05-02").unwrap();
    println!("{} {} {}", &caps["y"], &caps["m"], &caps["d"]);
}
```

Search results for "regex explainer", "regex to plain english", and "what does this regex mean" return a mix of static cheat-sheets and interactive tools. Three things separate the good from the noise: dialect awareness (PCRE / JavaScript / Python / Java differ in lookbehind, possessive quantifiers, named groups), color-coded token highlighting (versus a wall of explanation text), and live match preview against sample strings. Here is how the most-used regex explainer tools compare in 2026:
| Tool | Multi-dialect | Color-coded breakdown | Live match preview | Free tier | Cost |
|---|---|---|---|---|---|
| FreeDevTool Regex Explainer | JS + PCRE + Python + Java + .NET | Yes | Yes | Free, no signup | Free |
| regex101.com | PCRE/PCRE2 + JS + Python + Go + Java | Hover-tooltip | Yes | Free + paid Pro | Freemium |
| regexr.com | JS only | Side panel | Yes | Free | Free |
| regextester.com | JS only | Limited | Yes | Free, ad-funded | Free |
| extendsclass.com/regex-tester | JS + PCRE | Limited | Yes | Free, ad-funded | Free |
| VS Code Regex Previewer extension | JS | Inline IDE | Live as you type | Free | Free |
Take the pattern ^(?=.*[A-Z])(?=.*\d).{8,}$ and break it left to right by token. ^ = start of string. (?=.*[A-Z]) = positive lookahead asserting that at least one uppercase letter exists somewhere ahead (zero-width, so it consumes nothing). (?=.*\d) = second lookahead, at least one digit ahead. .{8,} = any 8 or more characters. $ = end of string. Combined: "the entire string is at least 8 characters long AND contains at least one uppercase letter AND at least one digit." This is a typical password-strength regex. Paste it into the explainer above and each lookahead, character class, and quantifier highlights with its meaning. Read complex regexes in this order: anchors first (^ $), then lookarounds ((?=) (?!) (?<=) (?<!)), then capture groups (()), then character classes ([] \d \w \s), then quantifiers (* + ? {n,m}).
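
Assembled from the tokens described above, the pattern is ^(?=.*[A-Z])(?=.*\d).{8,}$. The following is a minimal sketch of it in Python's stdlib re; the function name and the sample candidates are hypothetical, chosen so that each one fails a different condition.

```python
import re

# At least 8 characters, at least one uppercase letter, at least one digit.
PASSWORD = re.compile(r"^(?=.*[A-Z])(?=.*\d).{8,}$")

def is_strong(candidate: str) -> bool:
    """Return True when the candidate satisfies all three conditions."""
    return PASSWORD.search(candidate) is not None

for candidate in ["Passw0rdX", "password1", "PASSWORD", "Sh0rt"]:
    # Only the first passes: the others lack an uppercase letter,
    # a digit, or the minimum length respectively.
    print(candidate, is_strong(candidate))
```

Because both lookaheads are zero-width, they all start from the same position, which is what lets the three conditions combine with a logical AND.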
| Feature | PCRE / PCRE2 | JavaScript (ES2018+) | Python (re) | Java |
|---|---|---|---|---|
| Lookbehind | Fixed length per alternative (bounded variable-length in PCRE2 10.43+) | Variable-length (Chrome 62+) | Fixed-length only | Bounded variable-length |
| Named groups | (?P<name>...) or (?<name>...) | (?<name>...) | (?P<name>...) | (?<name>...) |
| Possessive quantifiers | ++ *+ ?+ | No | Python 3.11+ only | ++ *+ ?+ |
| Atomic groups | (?>...) | No | Python 3.11+ only | (?>...) |
| Recursive patterns | (?R) or (?0) | No | No (use regex module) | No |
| Unicode property escapes | \p{...} | \p{...} (with u flag) | No (use regex module) | \p{...} |
The same pattern can behave differently across engines. The classic gotcha: a lookbehind whose alternatives differ in length, such as (?<=a|bc), works in PCRE, JS, and Java but fails in Python's stdlib re with "look-behind requires fixed-width pattern". Pick the dialect dropdown matching your runtime when explaining a pattern; "regex" is not a single language.
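
One cheap way to check a pattern against a specific dialect is to try compiling it there. The sketch below does this for Python's stdlib re only; the helper name compiles is hypothetical, and the patterns are illustrative.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's stdlib re accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Lookbehind alternatives of different widths: rejected by the stdlib re.
variable_width = compiles(r"(?<=a|bc)x")

# JS-style named group versus Python-style named group.
js_named = compiles(r"(?<year>\d{4})")
py_named = compiles(r"(?P<year>\d{4})")

print(variable_width, js_named, py_named)
```

The same idea carries over to other runtimes: compile the pattern in the engine you will deploy on before trusting an explanation written for another one.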
Named-group syntax is one example: (?P<name>) works in Python but is (?<name>) in JS.

Pair the regex explainer with the Regex Tester for live match testing on real input, the String Escape Tool for escaping regex metacharacters in source code, and the Code & Text Tools hub for the broader text-manipulation toolkit.
Inherited a 200-character regex from a previous developer? Paste it here and the explainer translates it to plain English with color-coded breakdowns of every group, quantifier, character class, lookaround, and anchor. Useful for code review, learning, and figuring out why your pattern doesn't match what you thought.
Flags can be written inline (for example (?i)), and you can toggle them via the flag buttons.

Common mistakes:

- .* matching too much. By default .* grabs as much as possible. Use .*? (lazy) for the shortest match, e.g. inside HTML tags.
- Unescaped dot. A bare . matches any character. Use \. for a literal period.
- Forgetting the m flag. ^/$ match start/end of string, not line, unless you set multiline mode.
- Catastrophic backtracking. (a+)+ on long input can hang for seconds. Use atomic groups, possessive quantifiers, or rewrite to be linear.
- Ignoring dialect differences, for example between Python re and Ruby Regexp. Lookbehinds, named groups, and Unicode escapes vary. Test in the dialect you'll deploy.
- Over-strict email validation. Use a simple check (.+@.+\..+) and verify deliverability separately.

Shorthand escapes such as \d, \w, and \s match character classes (digits, word characters, whitespace). Quantifiers like *, +, and ? control how many times the preceding token repeats. Anchors like ^ and $ mark the start and end of the string. Parentheses () create capturing groups, and square brackets [] define character sets. Use this regex explainer tool to instantly translate any pattern into plain English.

\d matches any digit (0–9), equivalent to [0-9] in JavaScript and other ASCII-mode engines. \w matches any word character (letters, digits, underscore), equivalent to [a-zA-Z0-9_] in the same engines. \s matches whitespace (space, tab, newline). Their uppercase counterparts \D, \W, \S match the opposite: non-digit, non-word character, and non-whitespace respectively. The dot . matches any character except newline (unless the s flag is set).

The asterisk * means "zero or more": the preceding element may appear any number of times or not at all. The plus + means "one or more": the preceding element must appear at least once. For example, \d* matches an empty string or any sequence of digits, while \d+ requires at least one digit. Add ? after either (*?, +?) to make them lazy, matching as few characters as possible instead of as many.

Capturing groups are written with parentheses (). They group parts of a pattern together (to apply quantifiers or alternation) and capture the matched text for backreferences (\1, \2) or replacement strings ($1, $2). Non-capturing groups (?:...) group without saving the match, which is more efficient. Named groups (?<name>...) allow referencing by name instead of number. Groups are numbered left to right by their opening parenthesis.

(?=...) is a positive lookahead: it asserts that what follows matches the pattern. (?!...) is a negative lookahead: it asserts that what follows does NOT match. (?<=...) is a positive lookbehind: it checks what precedes. (?<!...) is a negative lookbehind: it asserts that the preceding text does NOT match. They are essential for complex pattern matching like password validation or extracting text between delimiters.

All tools run in your browser, no signup required, nothing sent to a server.
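
Returning to the catastrophic backtracking mistake listed earlier: the effect is easy to observe on a small scale. The sketch below, in Python's stdlib re, times the nested pattern (a+)+$ against a flattened equivalent on inputs that almost match. The helper name and input sizes are assumptions chosen to keep the run short; exact timings depend on the machine, so none are quoted here.

```python
import re
import time

NESTED = re.compile(r"(a+)+$")  # nested quantifier: blows up on near-misses
FLAT = re.compile(r"a+$")       # accepts the same strings, without nesting

def seconds_to_reject(pattern: re.Pattern, n: int) -> float:
    """Time how long the pattern takes to reject 'a' * n followed by 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = pattern.match(text)  # the trailing 'b' makes a match impossible
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed

for n in (12, 14, 16, 18):
    nested = seconds_to_reject(NESTED, n)
    flat = seconds_to_reject(FLAT, n)
    print(f"n={n:2d} nested={nested:.4f}s flat={flat:.6f}s")
```

On a backtracking engine the nested column should grow much faster than the flat one as n increases, which is the signal to look for when fuzzing a production pattern with long repeating input.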