Regex

Tool for testing regex patterns:
regex101.com

Regex Quick Reference

## Basics
- `.` - Any character except a line break
- `^` - Start of the line
- `$` - End of the line
- `*` - 0 or more repetitions
- `+` - 1 or more repetitions
- `?` - 0 or 1 repetition
- `{n}` - Exactly n repetitions
- `{n,}` - At least n repetitions
- `{n,m}` - Between n and m repetitions
- `|` - Or (e.g. `a|b` matches "a" or "b")
- `()` - Grouping (subpatterns)
- `[]` - Character class (e.g. `[a-z]` for lowercase letters)
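
The anchors and quantifiers listed above can be tried out with a few lines of Python. This is only a minimal sketch: the sample strings are made up for illustration and are not part of the reference.

```python
import re

# Hypothetical sample lines, chosen only for illustration.
lines = ["error: disk full", "no error here", "error", "errorrr!"]

# ^ anchors at the start of the line, so only lines beginning with "error" match.
starts_with_error = [s for s in lines if re.search(r"^error", s)]

# $ anchors at the end of the line; r+ requires one or more "r" right before it.
ends_with_r = [s for s in lines if re.search(r"r+$", s)]

# {n,m} bounds the repetition: "erro" followed by two to three "r" characters.
bounded = [s for s in lines if re.search(r"error{2,3}", s)]

print(starts_with_error)
print(ends_with_r)
print(bounded)
```

Note that the quantifier in `error{2,3}` applies only to the last `r`, not to the whole word.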

## Commonly Used Character Classes
- `\d` - Digits (0-9)
- `\D` - Non-digits
- `\w` - Word characters (a-z, A-Z, 0-9, _)
- `\W` - Non-word characters
- `\s` - Whitespace (space, tab, line break)
- `\S` - Non-whitespace
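
As a rough illustration of these shorthand classes, the following sketch applies them to a made-up string (the text and variable names are assumptions for the example only).

```python
import re

# Hypothetical input containing digits, a tab, and an underscore.
text = "Order 66:\tship 3 crates_now"

digits = re.findall(r"\d+", text)    # runs of digits
words = re.findall(r"\w+", text)     # runs of word characters (the underscore counts)
parts = re.split(r"\s+", text)       # split on any run of whitespace
non_word = re.findall(r"\W", text)   # single characters that are not word characters

print(digits)
print(words)
print(parts)
print(non_word)
```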

Basic Regex

- Period operator `.` - Matches any one single character.
- List operator `[ ]`, `[^ ]` - Defines a list or range of literal characters that can match one character. If the first character is the negation operator `^`, it matches any character that is not in the list.
- Asterisk operator `*` - Matches zero or more instances of the previous character.
- Front anchor operator `^` - If `^` is the first character in the pattern, then the entire pattern must be present at the beginning of the line to match. If `^` is not the first character, it is treated as an ordinary literal `^` character.
- Back anchor operator `$` - If `$` is the last character in the pattern, then the pattern must be at the end of the line to match; otherwise it is treated as a literal `$` character.

Extended Regex

- Grouping operator `( )` - Groups characters together to form a subpattern.
- Asterisk operator `*` - Previous character (or subpattern) is present zero or more times.
- Plus operator `+` - Previous character (or subpattern) is present one or more times.
- Question mark operator `?` - Previous character (or subpattern) is present zero or one time (but not more).
- Curly brace operator `{,}` - Specifies minimum, maximum, or exact number of matches of the previous character (or subpattern).
- Alternation operator `|` - Logical OR of choices. For example, `abc|def|xyz` matches abc or def or xyz.

Subpatterns ()

- `xyz+` - Matches the xy string followed by one or more of the z character
- `(xyz)+` - Matches one or more copies of the xyz string
- `xyz?` - Matches the xy string followed by zero or one of the z character
- `x(yz)?` - Matches the x character followed by zero or one of the yz string
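
The difference between quantifying a single character and quantifying a subpattern can be checked with a short Python sketch. `re.fullmatch` is used so that the whole test string has to match; the test strings are chosen for illustration.

```python
import re

# Without parentheses the quantifier binds only to the preceding character.
m1 = re.fullmatch(r"xyz+", "xyzzz")      # "xy" followed by three "z"
m2 = re.fullmatch(r"xyz+", "xyzxyz")     # no match: only "z" may repeat

# With parentheses the quantifier applies to the whole subpattern.
m3 = re.fullmatch(r"(xyz)+", "xyzxyz")   # two copies of "xyz"

# An optional subpattern may be absent; its group is then None.
m4 = re.fullmatch(r"x(yz)?", "x")

print(bool(m1), bool(m2), bool(m3), m4.group(1))
```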

OR |

- `abc|xyz` - Matches the abc string or the xyz string
- `ab(c|d|e)` / `ab[cde]` - Matches the ab string followed by a c or d or e character
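
A small Python sketch, with made-up sample strings, suggests that the grouped alternation and the character class accept the same inputs, and that an ungrouped `|` splits the entire pattern.

```python
import re

alternation = re.compile(r"ab(c|d|e)")
char_class = re.compile(r"ab[cde]")

# Hypothetical test strings.
samples = ["abc", "abd", "abe", "abf"]
alt_hits = [s for s in samples if alternation.fullmatch(s)]
class_hits = [s for s in samples if char_class.fullmatch(s)]

# Only the grouped form records which alternative was taken.
captured = alternation.fullmatch("abd").group(1)

# Without grouping, the alternatives are the full strings "abc" and "xyz".
whole_alt = re.fullmatch(r"abc|xyz", "xyz")
not_prefix = re.fullmatch(r"abc|xyz", "abxyz")

print(alt_hits, class_hits, captured)
```

For single characters the class form is usually the simpler choice; the group is needed once the alternatives are longer than one character.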

Character Classes

| Character | Legend | Example | Sample Match |
|---|---|---|---|
| `[ … ]` | One of the characters in the brackets | `[AEIOU]` | One uppercase vowel |
| `[ … ]` | One of the characters in the brackets | `T[ao]p` | Tap or Top |
| `-` | Range indicator | `[a-z]` | One lowercase letter |
| `[x-y]` | One of the characters in the range from x to y | `[A-Z]+` | GREAT |
| `[ … ]` | One of the characters in the brackets | `[AB1-5w-z]` | One of either: A, B, 1, 2, 3, 4, 5, w, x, y, z |
| `[x-y]` | One of the characters in the range from x to y | `[ -~]+` | Characters in the printable section of the ASCII table |
| `[^x]` | One character that is not x | `[^a-z]{3}` | A1! |
| `[^x-y]` | One of the characters not in the range from x to y | `[^ -~]+` | Characters that are not in the printable section of the ASCII table |
| `[\d\D]` | One character that is a digit or a non-digit | `[\d\D]+` | Any characters, including new lines, which the regular dot doesn't match |
| `[\x41]` | Matches the character at hexadecimal position 41 in the ASCII table, i.e. A | `[\x41-\x45]{3}` | ABE |
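
A few of these classes can be tried out in Python. The sketch below uses invented sample text; it contrasts the plain dot with `[\d\D]` and shows a negated class and a hexadecimal range.

```python
import re

# Hypothetical two-line text.
text = "line one\nline two"

# The dot stops at the line break, while [\d\D] matches across it.
dot_match = re.match(r".+", text).group()
any_match = re.match(r"[\d\D]+", text).group()

# A negated class matches every character that is not in the listed range.
non_lower = re.findall(r"[^a-z]", "a1B!c")

# \x41 to \x45 are the ASCII codes for A to E.
hex_range = re.findall(r"[\x41-\x45]", "ABCDEFG")

print(repr(dot_match), repr(any_match), non_lower, hex_range)
```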

Characters

| Character | Legend | Example | Sample Match |
|---|---|---|---|
| `\d` | Most engines: one digit from 0 to 9 | `file_\d\d` | file_25 |
| `\d` | .NET, Python 3: one Unicode digit in any script | `file_\d\d` | file_9੩ |
| `\w` | Most engines: "word character": ASCII letter, digit or underscore | `\w-\w\w\w` | A-b_1 |
| `\w` | Python 3: "word character": Unicode letter, ideogram, digit, or underscore | `\w-\w\w\w` | 字-ま_۳ |
| `\w` | .NET: "word character": Unicode letter, ideogram, digit, or connector | `\w-\w\w\w` | 字-ま‿۳ |
| `\s` | Most engines: "whitespace character": space, tab, newline, carriage return, vertical tab | `a\sb\sc` | a b c |
| `\s` | .NET, Python 3, JavaScript: "whitespace character": any Unicode separator | `a\sb\sc` | a b c |
| `\D` | One character that is not a digit as defined by your engine's `\d` | `\D\D\D` | ABC |
| `\W` | One character that is not a word character as defined by your engine's `\w` | `\W\W\W\W\W` | `*-+=)` |
| `\S` | One character that is not a whitespace character as defined by your engine's `\s` | `\S\S\S\S` | Yoyo |
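
The engine-dependent meaning of `\d` can be observed directly in Python 3. The sketch below assumes a `str` pattern (not `bytes`) and uses the `re.ASCII` flag to switch to the narrower ASCII-only definition.

```python
import re

# "\u0a69" is GURMUKHI DIGIT THREE, a decimal digit outside the ASCII range.
sample = "file_9\u0a69"

# By default, \d in a Python 3 str pattern accepts Unicode decimal digits.
unicode_match = re.fullmatch(r"file_\d\d", sample)

# With re.ASCII, \d is restricted to 0-9, so the same string is rejected.
ascii_match = re.fullmatch(r"file_\d\d", sample, flags=re.ASCII)

print(unicode_match is not None, ascii_match is not None)
```

When input must consist of ASCII digits only, writing `[0-9]` explicitly avoids depending on the engine's definition.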

Quantifiers

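| Quantifier | Legend | Example | Sample Match |
|---|---|---|---|
| `+` | One or more | `Version \w-\w+` | Version A-b1_1 |
| `{3}` | Exactly three times | `\D{3}` | ABC |
| `{2,4}` | Two to four times | `\d{2,4}` | 156 |
| `{3,}` | Three or more times | `\w{3,}` | regex_tutorial |
| `*` | Zero or more times | `A*B*C*` | AAACC |
| `?` | Once or none | `plurals?` | plural |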
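
The quantifier examples can be reproduced with a brief Python sketch; the input strings are illustrative assumptions.

```python
import re

# {2,4} is greedy: it takes up to four digits and never fewer than two.
runs = re.findall(r"\d{2,4}", "7 42 156 98765")

# * allows zero occurrences, so the missing "B" does not prevent a match.
star_match = re.fullmatch(r"A*B*C*", "AAACC")

# ? makes the final "s" optional.
plurals = re.findall(r"plurals?", "plural plurals")

print(runs, bool(star_match), plurals)
```

In the first call the single `7` is skipped and `98765` contributes only `9876`, because the leftover `5` is shorter than the two-digit minimum.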

Commonly Used Patterns

### 1. Email address
`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`
- Matches: `test.email@domain.com`
- Does not match: `test.email@domain,com`

### 2. URL
`https?://[^\s/$.?#].[^\s]*`
- Matches: `https://example.com`
- Matches: `http://sub.domain.net/path?query=1`

### 3. File path (Windows)
`[a-zA-Z]:\\(?:[^<>:"/\\|?*]+\\)*[^<>:"/\\|?*]*`
- Matches: `C:\Users\Name\file.txt`
- Does not match: `C:/Users/Name/file.txt`

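### 4. File path (Linux/Mac)
`(/[^/\0]+)+/?`
- Matches: `/home/user/file.txt`
- Does not match as a whole: `home/user/file.txt` (the pattern has to be anchored with `^` and `$` for this, otherwise it still finds `/user/file.txt` inside the string)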

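### 5. IPv4 address
`\b(?:\d{1,3}\.){3}\d{1,3}\b`
- Matches: `192.168.1.1`
- Also matches, although it is not a valid address: `999.999.999.999` (the pattern checks only the number of digits per group, not the range 0-255)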
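
Because the pattern only describes the shape of an address (four groups of one to three digits), a separate range check is one way to reject values such as `999.999.999.999`. The following is a minimal sketch under that assumption; the function name `is_valid_ipv4` is hypothetical and not part of any library.

```python
import re

# Shape check only: four dot-separated groups of one to three ASCII digits.
IPV4_SHAPE = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}", re.ASCII)


def is_valid_ipv4(candidate: str) -> bool:
    """Return True if candidate looks like a dotted IPv4 address with octets in 0-255."""
    if not IPV4_SHAPE.fullmatch(candidate):
        return False
    # The regex cannot express the numeric range, so each octet is checked here.
    return all(int(octet) <= 255 for octet in candidate.split("."))


print(is_valid_ipv4("192.168.1.1"))
print(is_valid_ipv4("999.999.999.999"))
```

Python's standard library also ships an `ipaddress` module, which may be preferable when the goal is validation rather than pattern matching.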

### 6. Phone number (DE)
`(?:\+49|0)[1-9][0-9]{9,10}`
- Matches: `+4917612345678`
- Matches: `017612345678`

### 7. Postal code (DE)
`\b\d{5}\b`
- Matches: `10115`
- Does not match: `1234`

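### 8. Check JSON formatting
`\{(?:[^{}]|(?R))*\}`
- Matches: `{"key": "value"}`
- Does not match: `{"key": "value"` (the closing `}` is missing)

`(?R)` recurses into the whole pattern and needs an engine with recursion support, such as PCRE. The pattern only checks that braces are balanced, not that the content is valid JSON.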

### 9. Remove HTML tags
`<[^>]+>` → replace with `""`
`<p>Hello</p>` → `Hello`

### 10. Remove blank lines
`^\s*$` → replace with `""` (with multiline mode enabled, so that `^` and `$` match at every line)
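
In Python, one possible way to apply this pattern is with `re.MULTILINE`. In the sketch below the pattern is extended by a trailing `\n` (an addition for this example) so that the line terminator of each blank line is removed as well; the sample text is made up.

```python
import re

# Hypothetical text with an empty line, a spaces-only line, and a tab-only line.
text = "first\n\n   \nsecond\n\t\nthird\n"

# re.MULTILINE lets ^ and $ match at every line, not only at the string ends.
# The trailing \n consumes the line break so the blank line disappears entirely.
cleaned = re.sub(r"^\s*$\n", "", text, flags=re.MULTILINE)

print(cleaned)
```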

Usage Examples

Validate an email address (Python)

    import re

    email = "test.email@domain.com"
    # Anchored with ^ and $ so that the whole string has to match.
    pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

    if re.match(pattern, email):
        print("✅ Valid email")
    else:
        print("❌ Invalid email")

Extract URLs from a text (Python)

    import re

    text = "Visit my site: https://example.com and http://test.de"
    pattern = r"https?://[^\s/$.?#].[^\s]*"

    # findall returns every non-overlapping match as a list of strings.
    urls = re.findall(pattern, text)
    print("Found URLs:", urls)

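Check a Windows file path (Bash)

    # Single quotes keep the backslashes literal. grep -E has no (?:...) groups, so a plain group is used.
    printf '%s\n' 'C:\Users\Name\file.txt' | grep -E '^[a-zA-Z]:\\([^<>:"/\\|?*]+\\)*[^<>:"/\\|?*]*$'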

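Check a Linux file path (Bash)

    # \0 is dropped: inside a POSIX bracket expression the backslash is literal,
    # so [^/\0] would also exclude the characters "\" and "0".
    echo "/home/user/file.txt" | grep -E '^(/[^/]+)+/?$'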

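Validate an IPv4 address (JavaScript)

    // Anchored shape check: four groups of one to three digits.
    const ipPattern = /^(?:\d{1,3}\.){3}\d{1,3}$/;
    const ip = "192.168.1.1";

    // The regex alone would accept 999.999.999.999, so each octet is also range-checked.
    const isValid = ipPattern.test(ip) && ip.split(".").every((octet) => Number(octet) <= 255);

    if (isValid) {
      console.log("✅ Valid IP");
    } else {
      console.log("❌ Invalid IP");
    }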

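Clean up a phone number (DE - Python)

    import re

    number = "My number is +49 176 12345678"
    pattern = r"(?:\+49|0)[1-9][0-9]{9,10}"

    # The pattern allows no spaces, so whitespace is stripped before searching.
    compact = re.sub(r"\s+", "", number)
    match = re.search(pattern, compact)

    if match:
        print("Found number:", match.group())  # Found number: +4917612345678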

Filter postal codes (Bash)

    # grep -E does not support the Perl-style shorthand \d, so [0-9] is used instead.
    echo "Berlin 10115" | grep -Eo '\b[0-9]{5}\b'

Validate a JSON string (Python)

    import json

    data = '{"name": "Max", "age": 25}'
    try:
        # json.loads raises JSONDecodeError if the string is not valid JSON.
        json.loads(data)
        print("✅ Valid JSON")
    except json.JSONDecodeError:
        print("❌ Invalid JSON")

Remove HTML tags (Python)

    import re

    html = "<p>Hello <b>World</b></p>"
    # Replace everything between "<" and the next ">" with an empty string.
    clean_text = re.sub(r"<[^>]+>", "", html)
    print(clean_text)  # Output: Hello World

Remove blank lines from a file (Bash)

    # [[:space:]] is the POSIX character class; \s is a GNU extension.
    sed '/^[[:space:]]*$/d' input.txt > output.txt

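🔥 Additional tips:

- Python: `re.search()` returns a match object for the first occurrence (or `None`), `re.findall()` returns a list of all non-overlapping matches.
- JavaScript: `.match()` returns the matches and `.replace()` substitutes directly. Without the `g` flag, both work on the first match only.
- Bash: `grep -E` for extended regex, `sed` for substitutions.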
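
The difference between the two Python functions mentioned in the tips can be seen in a short sketch; the sample strings are invented for illustration.

```python
import re

# Hypothetical input text.
text = "IDs: 17, 23 and 42"

# search returns a match object for the first occurrence, or None if there is none.
first = re.search(r"\d+", text)
missing = re.search(r"\d+", "no digits")

# findall returns all non-overlapping matches as a list of strings.
all_ids = re.findall(r"\d+", text)

# With more than one capturing group, findall returns a list of tuples instead.
pairs = re.findall(r"(\w)=(\d)", "a=1 b=2")

print(first.group(), first.start(), missing, all_ids, pairs)
```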