Regular Expressions in Python: A Detailed Guide with Examples
Regular expressions (regex) are a powerful tool for searching, manipulating,
and replacing text in strings. In Python, the re module is used to work with
regular expressions. With regular expressions, you can match patterns in
strings, validate input, and extract data efficiently. Understanding how
regular expressions work in Python is essential for tasks involving text
processing and pattern matching.
What Are Regular Expressions?
A regular expression is a sequence of characters that defines a search
pattern. These patterns are used to match strings in a flexible and efficient
manner. Python's re module provides support for regex, allowing you to perform
operations like:
- Matching patterns
- Searching for specific patterns
- Replacing patterns
- Splitting strings
The re module provides several functions for working with regular expressions:
- re.match(): Matches a pattern at the beginning of the string.
- re.search(): Searches for the first occurrence of a pattern.
- re.findall(): Finds all occurrences of a pattern.
- re.sub(): Replaces occurrences of a pattern with a specified string.
- re.split(): Splits a string by occurrences of a pattern.
Importing the re Module
Before using regular expressions in Python, you need to import the re module.
import re
Basic Example of Using Regular Expressions
Let’s start with a simple example to understand how regular expressions work.
Example:
import re
# Example string
text = "Hello, my name is @PythonBeeTelugu and I am learning Python."
# Define the pattern to search for 'Python'
pattern = r"Python"
# Search for the pattern in the text
match = re.search(pattern, text)
if match:
    print("Pattern found:", match.group())
else:
    print("Pattern not found.")
Output:
Pattern found: Python
In this example, we used the re.search() function to search for the text
"Python" in the string. The search stops at the first occurrence, which here
is the "Python" inside "@PythonBeeTelugu". The group() method returns the
matched string.
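Besides group(), a match object also reports where the match was found. The following is a minimal sketch reusing the same example string; the variable names matched_text, start, and end are chosen here for illustration.

```python
import re

text = "Hello, my name is @PythonBeeTelugu and I am learning Python."

# re.search() returns a match object for the first occurrence, or None
match = re.search(r"Python", text)

if match:
    matched_text = match.group()   # the substring that matched
    start, end = match.span()      # start (inclusive) and end (exclusive) indexes
    print("Matched:", matched_text)
    print("Position:", start, "to", end)
    # Slicing the original string with the span gives back the same substring
    print("Slice equals match:", text[start:end] == matched_text)
```

The reported position points into "@PythonBeeTelugu", not at the standalone word near the end of the sentence, because re.search() returns only the first occurrence.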
Using re.match()
The re.match() function checks whether the regular expression matches at the
beginning of the string. It returns a match object if the pattern is found at
the start of the string, or None if it is not.
Example:
import re
# Example string
text = "Hello, my name is @PythonBeeTelugu."
# Define the pattern
pattern = r"Hello"
# Match the pattern at the beginning
match = re.match(pattern, text)
if match:
    print("Pattern found:", match.group())
else:
    print("Pattern not found.")
Output:
Pattern found: Hello
In this case, the pattern "Hello" is found at the start of the string, so it matches successfully.
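To make the difference from re.search() concrete, here is a small sketch that looks for "name", which appears in the middle of the string rather than at the start. It also uses re.fullmatch(), a function from the same module that is not covered above and that succeeds only when the pattern covers the entire string.

```python
import re

text = "Hello, my name is @PythonBeeTelugu."

# re.match() only looks at the beginning, so "name" is not found
at_start = re.match(r"name", text)

# re.search() scans the whole string, so "name" is found
anywhere = re.search(r"name", text)

# re.fullmatch() requires the pattern to match the entire string
whole = re.fullmatch(r"Hello", text)

print("match() found it:", at_start is not None)
print("search() found it:", anywhere is not None)
print("fullmatch() found it:", whole is not None)
```

Only re.search() succeeds here: re.match() fails because "name" is not at position 0, and re.fullmatch() fails because "Hello" is only part of the string.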
Using re.findall()
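The re.findall() function returns all non-overlapping matches of the pattern
in a string as a list of strings. If the pattern contains capturing groups,
the list holds the group contents instead of the full matches.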
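The shape of the returned list depends on whether the pattern contains capturing groups (parentheses). The sketch below uses a made-up order string to show the three cases; the text and variable names are illustrative only.

```python
import re

# Hypothetical sample text for illustration
text = "Order 12 shipped, order 345 pending, order 6 cancelled."

# No groups: each item is the full matched text
numbers = re.findall(r"\d+", text)

# One group: each item is the text captured by that group
statuses = re.findall(r"\d+ (\w+)", text)

# Two groups: each item is a tuple with one entry per group
pairs = re.findall(r"(\d+) (\w+)", text)

print(numbers)
print(statuses)
print(pairs)
```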
Example:
import re
# Example string
text = "Hello, my name is @PythonBeeTelugu and I am learning Python. Python is great!"
# Define the pattern
pattern = r"Python"
# Find all occurrences of the pattern
matches = re.findall(pattern, text)
print("All occurrences of 'Python':", matches)
Output:
All occurrences of 'Python': ['Python', 'Python']
In this example, re.findall() finds all occurrences of the word "Python" in
the text.
Using re.sub() for Replacing Text
The re.sub() function is used to replace occurrences of a pattern with a
specified string. You can use this to clean or modify strings.
Example:
import re
# Example string
text = "Hello, my name is @PythonBeeTelugu."
# Define the pattern and replacement string
pattern = r"PythonBeeTelugu"
replacement = "PythonMaster"
# Replace the pattern with the replacement string
new_text = re.sub(pattern, replacement, text)
print("Modified text:", new_text)
Output:
Modified text: Hello, my name is @PythonMaster.
In this example, "PythonBeeTelugu" is replaced with "PythonMaster" using
re.sub(). The "@" is not part of the pattern, so it stays in place and
"@PythonBeeTelugu" becomes "@PythonMaster".
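re.sub() has a few options beyond a plain replacement. The sketch below, on a made-up sentence, limits the number of replacements with the count argument, reuses captured groups in the replacement with \1 and \2, and uses the related re.subn() function to get the number of replacements made.

```python
import re

# Hypothetical sample text for illustration
text = "Python is fun. Python is popular. Python is readable."

# count=1 replaces only the first occurrence
first_only = re.sub(r"Python", "Regex", text, count=1)

# \1 and \2 in the replacement refer to the first and second captured groups
swapped = re.sub(r"(\w+) is (\w+)", r"\2 is \1", "Python is fun")

# re.subn() returns the new string together with the number of replacements
new_text, replacements = re.subn(r"Python", "Regex", text)

print(first_only)
print(swapped)
print(new_text, replacements)
```

The replacement string is written as a raw string (r"...") so that the backslashes in \1 and \2 reach the re module unchanged.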
Using re.split() for Splitting a String
The re.split() function splits a string by the occurrences of a pattern. It
returns a list of substrings.
Example:
import re
# Example string
text = "apple,banana,orange,grape"
# Define the pattern to split by comma
pattern = r","
# Split the string
fruits = re.split(pattern, text)
print(fruits)
Output:
['apple', 'banana', 'orange', 'grape']
In this example, the string is split into a list of fruits using re.split().
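For a single fixed delimiter like a comma, the built-in str.split() method would also work. re.split() becomes useful when the delimiter varies. The sketch below assumes a messier, made-up input that mixes commas, semicolons, and spaces, and also shows the maxsplit argument.

```python
import re

# Hypothetical input with inconsistent delimiters
messy = "apple, banana;orange  grape"

# Split on any run of commas, semicolons, or whitespace
fruits = re.split(r"[,;\s]+", messy)

# maxsplit limits the number of splits; the rest of the string stays together
limited = re.split(r",", "apple,banana,orange,grape", maxsplit=2)

print(fruits)
print(limited)
```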
Special Characters in Regular Expressions
Regular expressions in Python use special characters to define search patterns. Here are some of the most common ones:
- .: Matches any character except a newline.
- ^: Matches the start of the string.
- $: Matches the end of the string (or just before a newline at the very end).
- \d: Matches any digit (0-9; for text strings, other Unicode decimal digits as well).
- \D: Matches any non-digit character.
- \w: Matches any word character (letters, digits, and underscore).
- \W: Matches any non-word character.
- \s: Matches any whitespace character.
- \S: Matches any non-whitespace character.
- *: Matches 0 or more repetitions of the preceding character or group.
- +: Matches 1 or more repetitions of the preceding character or group.
- {n}: Matches exactly n repetitions of the preceding character or group.
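The short sketch below tries several of these special characters on one made-up string, so the effect of each can be compared side by side. The sample string and variable names are illustrative only.

```python
import re

# Hypothetical sample string (note the trailing newline)
sample = "Room 42, floor_3\n"

digits = re.findall(r"\d+", sample)      # runs of digits
words = re.findall(r"\w+", sample)       # runs of letters, digits, underscore
spaces = re.findall(r"\s", sample)       # each whitespace character
double_o = re.findall(r"o{2}", sample)   # exactly two "o" characters in a row
four_dot = re.findall(r"4.", sample)     # "4" followed by any one character

starts_with_room = re.search(r"^Room", sample) is not None
# $ also matches just before a newline at the very end of the string
ends_with_3 = re.search(r"3$", sample) is not None

print(digits, words, spaces, double_o, four_dot)
print(starts_with_room, ends_with_3)
```

Note that \w+ treats "floor_3" as a single word because the underscore counts as a word character.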
Example with Special Characters:
import re
# Example string
text = "My phone number is 123-456-7890."
# Define the pattern to match a phone number
pattern = r"\d{3}-\d{3}-\d{4}"
# Search for the pattern in the text
match = re.search(pattern, text)
if match:
    print("Phone number found:", match.group())
else:
    print("No phone number found.")
Output:
Phone number found: 123-456-7890
In this example, the regex pattern \d{3}-\d{3}-\d{4} is used to match a phone
number in the format XXX-XXX-XXXX.
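The same pattern can be extended in two common directions: extracting the individual parts of the number with capturing groups, and validating that an entire input string is a phone number. This is a sketch under assumptions; the names PHONE_PATTERN and is_phone_number are hypothetical, and it relies on re.fullmatch(), which is not covered above.

```python
import re

# Parentheses capture the three parts of the number separately
PHONE_PATTERN = r"(\d{3})-(\d{3})-(\d{4})"


def is_phone_number(candidate):
    """Return True only if the whole string has the form XXX-XXX-XXXX."""
    return re.fullmatch(PHONE_PATTERN, candidate) is not None


text = "My phone number is 123-456-7890."
match = re.search(PHONE_PATTERN, text)

if match:
    # groups() returns the captured parts as a tuple
    area, prefix, line = match.groups()
    print("Parts:", area, prefix, line)

print(is_phone_number("123-456-7890"))
print(is_phone_number("123-456-78901"))
```

re.search() would accept "123-456-78901" because the pattern appears inside it, while re.fullmatch() rejects it. That makes re.fullmatch() the safer choice for input validation.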
Conclusion
Regular expressions in Python are a powerful tool for working with text. By
using the re module, you can perform tasks such as searching for patterns,
replacing text, and splitting strings. Mastering regular expressions can
greatly improve your ability to handle text in your Python projects.