Top 20 Super‑Advanced Regex Extraction in Excel (With Ready‑to‑Use Patterns & Formulas)
24 Aug 2025
Top 20 Super‑Advanced Regex Extraction in Excel (With Ready‑to‑Use Patterns & Formulas)

 Why Regex in Excel?

When text is messy—logs, emails, WhatsApp exports, bank statements—regular expressions (regex) give you superpowers to find and extract exactly what you need. Below are the 20 most useful real‑world patterns, each paired with a copy‑ready Excel/Sheets formula and a quick example.

 Excel note: If you’re on Google Sheets, REGEXEXTRACT works natively.
In Microsoft 365, you can create a one‑time LAMBDA named REGEXEXTRACT using Office Scripts or a community LAMBDA (or do the same in Power Query). I’ve included a quick setup at the end.


Quick Reference Matrix

# Use Case Regex Pattern Example Text → Result Formula (Sheets/Excel‑LAMBDA)
1 Extract 6‑digit PIN/OTP \b\d{6}\b “Your OTP is 495768” → 495768 =REGEXEXTRACT(C2,"\b\d{6}\b")
2 Email Address [\w\.\-]+@[\w\.\-]+ “Contact john.doe@example.com” → john.doe@example.com =REGEXEXTRACT(C3,"[\w\.\-]+@[\w\.\-]+")
3 Invoice No (INV‑1234) INV-\d+ “Invoice INV-2451 is ready” → INV‑2451 =REGEXEXTRACT(C4,"INV-\d+")
4 Date DD/MM/YYYY \b\d{2}/\d{2}/\d{4}\b “on 21/08/2023” → 21/08/2023 =REGEXEXTRACT(C5,"\b\d{2}/\d{2}/\d{4}\b")
5 Date YYYY‑MM‑DD \b\d{4}-\d{2}-\d{2}\b “Date: 2023-09-15” → 2023‑09‑15 =REGEXEXTRACT(C6,"\b\d{4}-\d{2}-\d{2}\b")
6 Time HH:MM \b\d{2}:\d{2}\b “Meeting at 14:30” → 14:30 =REGEXEXTRACT(C7,"\b\d{2}:\d{2}\b")
7 Mobile (10‑digit India) \b\d{10}\b “Call me at 9876543210” → 9876543210 =REGEXEXTRACT(C8,"\b\d{10}\b")
8 Price with ₹ & commas ₹?\s?\d{1,3}(?:,\d{3})*(?:\.\d{2})? “The cost is ₹1,29,999.00” → ₹1,29,999.00 =REGEXEXTRACT(C9,"₹?\s?\d{1,3}(?:,\d{3})*(?:\.\d{2})?")
9 URL https?://[^\s>]+ “Visit https://example.com” → https://example.com =REGEXEXTRACT(C10,"https?://[^\s>]+")
10 Hashtag #\w+ “Trending: #ExcelTips” → #ExcelTips =REGEXEXTRACT(C11,"#\w+")
11 Twitter/X Handle @\w+ “Follow me @vikaljain” → @vikaljain =REGEXEXTRACT(C12,"@\w+")
12 Hex Color Code #?[0-9a-fA-F]{3,6}\b “Color: #FF5733” → #FF5733 =REGEXEXTRACT(C13,"#?[0-9a-fA-F]{3,6}\b")
13 Words Starting with Capital \b[A-Z][a-z]+\b “Meet Vikal Jain at Delhi” → Vikal =REGEXEXTRACT(C14,"\b[A-Z][a-z]+\b")
14 IPv4 Address \b(?:\d{1,3}\.){3}\d{1,3}\b “Server: 192.168.1.1” → 192.168.1.1 =REGEXEXTRACT(C15,"(?:\d{1,3}\.){3}\d{1,3}")
15 Indian PIN Code \b\d{6}\b “Send to PIN 110034” → 110034 =REGEXEXTRACT(C16,"\b\d{6}\b")
16 Alphabets Only \b[A-Za-z]+\b “123 Delhi Road” → Delhi =REGEXEXTRACT(C17,"\b[A-Za-z]+\b")
17 Numbers Only \b\d+\b “Order 12 apples” → 12 =REGEXEXTRACT(C18,"\b\d+\b")
18 Domain Name `(?<=:// ^)([^/]+)` https://abc.co.in/page” → abc.co.in
19 Month‑Year (MM‑YYYY) \b\d{2}-\d{4}\b “Month: 08-2024” → 08‑2024 =REGEXEXTRACT(C20,"\b\d{2}-\d{4}\b")
20 First Word of Sentence ^\w+ “Hello world!” → Hello =REGEXEXTRACT(C21,"^\w+")

???? Detailed Walkthrough (with Tips & Variations)

1) 6‑Digit OTP/PIN

Pattern: \b\d{6}\b
Why: Pins/OTPs are usually 6 digits in India.
Tip: For 4–8 digits, use \b\d{4,8}\b.

2) Email Address

Pattern: [\w\.\-]+@[\w\.\-]+
Tip: For stricter TLDs: [\w\.\-]+@[\w\.\-]+\.[A-Za-z]{2,}

3) Branded Invoice IDs (INV‑1234)

Pattern: INV-\d+
Variations: Replace INV- with your brand prefix (e.g., BILL-\d+).

4) Date in DD/MM/YYYY

Pattern: \b\d{2}/\d{2}/\d{4}\b
Normalize: Wrap with DATE(VALUE(RIGHT(x,4)),VALUE(MID(x,4,2)),VALUE(LEFT(x,2))).

5) Date in YYYY‑MM‑DD

Pattern: \b\d{4}-\d{2}-\d{2}\b
Why: ISO format—easy to sort.

6) Time HH:MM

Pattern: \b\d{2}:\d{2}\b
12‑hour support: \b(0?\d|1\d|2[0-3]):[0-5]\d\b

7) Indian Mobile (10 digits)

Pattern: \b\d{10}\b
With country code: (?:\+?91[-\s]?)?\b[6-9]\d{9}\b

8) Price with ₹ & Commas

Pattern: ₹?\s?\d{1,3}(?:,\d{3})*(?:\.\d{2})?
Rupee only: prefix ₹\s?
Note: Converts as text—use SUBSTITUTE to remove /, before VALUE().

9) URL

Pattern: https?://[^\s>]+
Domain only: see #18.

10) Hashtag

Pattern: #\w+
Multiple: =TEXTJOIN(", ",TRUE,REGEXEXTRACTS(cell,"#\w+")) (custom function or Power Query).

11) Twitter/X Handle

Pattern: @\w+
Exclude email parts: use a negative lookbehind if needed: (?<!\w)@\w+.

12) Hex Color

Pattern: #?[0-9a-fA-F]{3,6}\b
Force full 6: #[0-9a-fA-F]{6}\b.

13) Capitalized Words

Pattern: \b[A-Z][a-z]+\b
All proper nouns in a sentence: run REGEXEXTRACTS (multi‑match) or use Power Query.

14) IPv4 Address

Pattern: \b(?:\d{1,3}\.){3}\d{1,3}\b
Strict 0–255: \b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b.

15) Indian PIN Code

Pattern: \b\d{6}\b
Starting 1–9 only: \b[1-9]\d{5}\b.

16) Alphabets Only

Pattern: \b[A-Za-z]+\b
Names incl. hyphen: \b[A-Za-z][A-Za-z\-]+\b.

17) Numbers Only

Pattern: \b\d+\b
Signed/decimal: [+-]?\d+(?:\.\d+)?.

18) Domain Name from URL

Pattern: (?<=://|^)([^/]+)
Subdomain‑free (root only): (?<=://|^)(?:www\.)?([^/:]+) then clean with LOWER().

19) Month‑Year (MM‑YYYY)

Pattern: \b\d{2}-\d{4}\b
To first day of month: =DATE(VALUE(RIGHT(x,4)),VALUE(LEFT(x,2)),1).

20) First Word of Sentence

Pattern: ^\w+
First token before comma/space: same, or use ^[^,\s]+.


One‑Time Setup: Using REGEXEXTRACT in Microsoft Excel

If you don’t have native regex functions:

Option A — Power Query (No code)

  1. Data → Get Data → From Table/Range

  2. Transform → Extract → Text Between Delimiters or use Column From Examples and let PQ infer a pattern.

  3. For full regex, use Add Column → Custom with Text.RegexReplace/Text.RegexMatch in M (Power Query supports regex via .NET when invoked through custom functions or with add‑ins).

Option B — LAMBDA (Community Function)

You can add a ready LAMBDA named REGEXEXTRACT that wraps the new Office Scripts/VBA call to .NET regex or use a community‑shared pure‑formula version. After adding it once, all formulas above work the same as in Sheets.

If you want, tell me your Excel version and I’ll drop a plug‑and‑play LAMBDA or a tiny VBA UDF for REGEXEXTRACTREGEXTEST, and REGEXREPLACE.


Best Practices & Pitfalls

  • Anchor wisely: Use \b and ^…$ to avoid partial junk.

  • Greedy vs lazy: Prefer +? when extracting minimal segments (e.g., URLs within quotes).

  • Validate numeric ranges: For IPs/dates use stricter patterns where correctness matters.

  • Clean before convert: Strip %, before converting to numbers.

  • Multiple matches: Excel needs a helper (Power Query / VBA / LAMBDA returning an array) for “all matches”.


FAQs

Q1. Will these work in Google Sheets?
Yes. Use them directly with REGEXEXTRACT.

Q2. What about multiple matches (e.g., many hashtags)?
Use Power Query, or a custom function that returns an array; then wrap with TEXTJOIN.

Q3. How do I replace instead of extract?
Use REGEXREPLACE(text, pattern, replacement) (Sheets) or an Excel UDF/LAMBDA.

Q4. Can I validate inputs (true/false)?
Yes—REGEXMATCH (Sheets) or a REGEXTEST UDF/LAMBDA in Excel.


Download / Next Steps

Want this as an Excel template with buttons to test each pattern and a Power Query version? Say “Send the Regex Template” and I’ll generate the files instantly.


Conclusion

With these 20 patterns you can clean, validate, and mine your data in seconds—emails, URLs, dates, money, IDs, social tags, and more. Bookmark the matrix above and start copy‑pasting into your sheets. Happy extracting!








© Copyright 2026 . All rights reserved. Technology Partner - TutorArc