Regular expressions, or regex, provide a way to search for patterns within strings. In Python, the `re` module is used for regex-based pattern matching, enabling advanced string manipulation.
Regular Expressions in Python
1. Importing the `re` Module
The `re` module in Python offers functions for working with regular expressions. Import it with:
import re
2. Basic Pattern Matching
Use `re.search()` to find the first match of a pattern within a string:
import re
pattern = r"hello"
text = "hello world"
match = re.search(pattern, text)
if match:
    print("Match found:", match.group())
The `r` before the pattern marks a raw string: Python leaves backslashes in it untouched instead of interpreting them as escape sequences, so they reach the regex engine exactly as written.
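To make the effect of the raw-string prefix concrete, here is a minimal sketch that compares the same pattern written with and without `r`. The variable names and the sample text are illustrative assumptions, not part of the original example.

```python
import re

# Raw string: the backslash survives, so the regex engine sees \b (word boundary).
raw_pattern = r"\bcat\b"
# Regular string: Python converts "\b" to a backspace character (\x08) first.
plain_pattern = "\bcat\b"

text = "the cat sat"
raw_match = re.search(raw_pattern, text)
plain_match = re.search(plain_pattern, text)

print(len(raw_pattern), len(plain_pattern))  # 7 5
print(raw_match is not None)    # True
print(plain_match is not None)  # False
```

The non-raw pattern looks for literal backspace characters around "cat", which the text does not contain, so it finds nothing.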
3. Using `re.findall()`
`re.findall()` returns all non-overlapping matches of a pattern as a list:
text = "cat bat rat mat"
matches = re.findall(r"\b\w+at\b", text)
print(matches) # Output: ['cat', 'bat', 'rat', 'mat']
This example uses the word boundary `\b` to match whole words ending in "at".
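As a hedged extension of the example above, the sketch below shows how the result of `re.findall()` changes when the pattern contains a capturing group, and how `re.finditer()` can be used when the position of each match is also needed. The variable names are illustrative.

```python
import re

text = "cat bat rat mat"

# No capturing group: findall returns the full matches.
words = re.findall(r"\b\w+at\b", text)

# One capturing group: findall returns only what the group captured.
first_letters = re.findall(r"\b(\w)at\b", text)

# finditer yields match objects, which also expose where each match starts.
positions = [(m.group(), m.start()) for m in re.finditer(r"\b\w+at\b", text)]

print(words)          # ['cat', 'bat', 'rat', 'mat']
print(first_letters)  # ['c', 'b', 'r', 'm']
print(positions)      # [('cat', 0), ('bat', 4), ('rat', 8), ('mat', 12)]
```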
4. Replacing Text with `re.sub()`
Use `re.sub()` to replace matches in a string:
text = "I like cats"
new_text = re.sub(r"cats", "dogs", text)
print(new_text) # Output: I like dogs
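Beyond a plain string replacement, `re.sub()` also accepts a `count` argument, backreferences such as `\1` in the replacement, and a function as the replacement. The following is a small sketch of those three variations; the sample strings are made up for illustration.

```python
import re

text = "I like cats, and cats like me"

# count=1 limits the substitution to the first match.
first_only = re.sub(r"cats", "dogs", text, count=1)

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@example")

# The replacement can also be a function that receives each match object.
shouted = re.sub(r"cats", lambda m: m.group().upper(), text)

print(first_only)  # I like dogs, and cats like me
print(swapped)     # example@user
print(shouted)     # I like CATS, and CATS like me
```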
5. Pattern Modifiers
Modifiers control the behavior of regex. Commonly used flags include:
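`re.IGNORECASE` or `re.I`: Case-insensitive matching.
`re.MULTILINE` or `re.M`: Multi-line matching, so that `^` and `$` match at the start and end of each line rather than only at the start and end of the whole string.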
text = "Hello world"
match = re.search(r"hello", text, re.IGNORECASE)
if match:
    print("Case-insensitive match found!")
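The example above only exercises `re.IGNORECASE`. As a complementary sketch, the code below shows what `re.MULTILINE` changes for `^`, and that flags can be combined with `|`. The three-line sample text is an assumption chosen for illustration.

```python
import re

text = "first line\nsecond line\nthird line"

# Without re.MULTILINE, ^ only matches at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Flags can be combined with the | operator.
combined = re.findall(r"^SECOND", text, re.MULTILINE | re.IGNORECASE)

print(default_starts)  # ['first']
print(line_starts)     # ['first', 'second', 'third']
print(combined)        # ['second']
```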
6. Common Regex Patterns
`\d`: Matches any digit (0-9).
`\w`: Matches any word character (a-z, A-Z, 0-9, and _).
`\s`: Matches any whitespace character (space, tab, newline).
`^`: Matches the beginning of a string.
`$`: Matches the end of a string, or just before a newline at the end of the string.
Example:
text = "My phone number is 123-456-7890"
pattern = r"\d{3}-\d{3}-\d{4}"
match = re.search(pattern, text)
if match:
    print("Phone number found:", match.group())
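To see the remaining patterns from the list in action, here is a brief sketch that applies `\d`, `\w`, and `\s` to one string and then uses `^` and `$` to require that an entire string is a phone number. The sample text and variable names are illustrative assumptions.

```python
import re

text = "Order 42 shipped on 2024-01-15"

digits = re.findall(r"\d+", text)  # runs of digits
words = re.findall(r"\w+", text)   # runs of word characters
parts = re.split(r"\s+", text)     # split on runs of whitespace

# ^ and $ anchor the pattern so the whole string must be a phone number.
phone = r"^\d{3}-\d{3}-\d{4}$"
is_phone = re.search(phone, "123-456-7890") is not None
not_phone = re.search(phone, "call 123-456-7890") is not None

print(digits)     # ['42', '2024', '01', '15']
print(words)      # ['Order', '42', 'shipped', 'on', '2024', '01', '15']
print(parts)      # ['Order', '42', 'shipped', 'on', '2024-01-15']
print(is_phone)   # True
print(not_phone)  # False
```

Without the anchors, the same pattern would match a phone number anywhere inside a longer string, as in the unanchored search shown earlier.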