Tech Study

Python RegEx | Python Regular Expressions

Python RegEx allows you to create patterns that match particular strings, search for text within longer strings, and extract particular sections of a string based on those patterns.

A RegEx, or Regular Expression, is a sequence of characters that defines a search pattern. In Python, regular expressions are used to check whether a string contains a specified search pattern.

Example for Python Regex

Let the search pattern be ^t...e$

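This represents a five-character string that starts with 't' and ends with 'e'. Matching is case-sensitive by default, so the examples are written in lowercase.

taste – match

turtle – not a match (six characters)

table – match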

Python re module

Python has a built-in module called re, which is used to work with regular expressions.

Syntax

Import the re module before using any of its functions:

import re

Example:

import re

search_pattern = r'^t...e$'
tester_string = 'taste'

# re.match() returns a match object on success, or None otherwise
result = re.match(search_pattern, tester_string)

if result:
    print("Search successful.")
else:
    print("Search unsuccessful.")

Here we have used the re.match() function to test the search_pattern against the tester_string. If the pattern matches, the function returns a match object; if it does not, the function returns None.
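As a minimal sketch of that behaviour, the snippet below runs the same pattern against the three words from the earlier example. The words list and the results dictionary are illustrative names, not part of the re module.

```python
import re

search_pattern = r'^t...e$'

# Illustrative test words taken from the example above.
words = ['taste', 'turtle', 'table']

results = {}
for word in words:
    # re.match() gives a match object on success and None otherwise,
    # so comparing against None yields a simple True/False flag.
    results[word] = re.match(search_pattern, word) is not None

print(results)  # {'taste': True, 'turtle': False, 'table': True}
```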

There are some more functions defined in the re module which are used when we work with regular expressions.

The most commonly used functions of the re module are described below:

Python re findall

The re.findall() function returns a list of strings containing all non-overlapping matches of the pattern.

# Program to extract all integers from the string 

import re 

tester_string = 'hello 12 hi 89. Howdy 34' 

# raw string, so the backslash reaches the regex engine unchanged
search_pattern = r'\d+'

result = re.findall(search_pattern, tester_string) 

print(result)

Output:

['12', '89', '34']

re.findall() returns an empty list if the search_pattern is not found in the tester_string.
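A short sketch of both cases, using two made-up input strings (the variable names are only for illustration):

```python
import re

search_pattern = r'\d+'

# One input that contains digits and one that does not.
with_digits = re.findall(search_pattern, 'room 101, floor 3')
without_digits = re.findall(search_pattern, 'no numbers here')

print(with_digits)     # ['101', '3']
print(without_digits)  # []
```

Because an empty list is falsy, the result can be tested directly with `if without_digits:` when only the presence of a match matters.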

Python split function

The re.split() function splits the string wherever the pattern matches and returns a list of the pieces between the matches.

Example:

import re 

tester_string = 'Twelve:12 Eighty nine:89.' 

search_pattern = r'\d+'

result = re.split(search_pattern, tester_string)

print(result)

Output:

['Twelve:', ' Eighty nine:', '.']

re.split() returns a list containing the original string if the pattern is not found.
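The sketch below illustrates the no-match case and, as an addition not covered in the text above, the optional maxsplit keyword argument of re.split(), which limits how many splits are made. The variable names are illustrative.

```python
import re

search_pattern = r'\d+'

# No digits in the input, so the whole string comes back in a one-item list.
no_match = re.split(search_pattern, 'no digits here')

# maxsplit=1 stops after the first split; the rest of the string is kept intact.
limited = re.split(search_pattern, 'Twelve:12 Eighty nine:89.', maxsplit=1)

print(no_match)  # ['no digits here']
print(limited)   # ['Twelve:', ' Eighty nine:89.']
```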

Python sub function

The syntax of the re.sub() function is:

re.sub(search_pattern, replace, tester_string)

This function returns a new string in which every match of search_pattern is replaced with the contents of replace.

Example :

import re 

# multiline string 

tester_string = 'xyz 24 de 25 \n f45 6'

# matches all whitespace characters 

search_pattern = r'\s+'

# empty string 

replace = '' 

new_string = re.sub(search_pattern, replace, tester_string) 

print(new_string)

Output:

xyz24de25f456

The original string is returned if the search_pattern is not found.
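A brief sketch of the no-match case follows. It also shows the optional count keyword argument of re.sub(), which is not discussed above and is included here only as an illustration of limiting the number of replacements.

```python
import re

# No digits to replace, so the input string is returned unchanged.
unchanged = re.sub(r'\d+', '#', 'no digits here')

# count=1 replaces only the first run of whitespace.
first_only = re.sub(r'\s+', '', 'xyz 24 de 25', count=1)

print(unchanged)   # no digits here
print(first_only)  # xyz24 de 25
```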

Python search function

The re.search() function takes two arguments: search_pattern and tester_string. It scans the tester_string for the first location where the search_pattern matches. If a match is found, it returns a match object; otherwise it returns None.

match = re.search(search_pattern, tester_string)

Example :

import re 

tester_string = "Python is fun" 

# check if 'Python' is at the beginning 

match = re.search(r'\APython', tester_string)

if match:
    print("pattern found inside the string")
else:
    print("pattern not found")

Output:  pattern found inside the string

Here, the match variable contains a match object.
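One practical difference worth sketching: re.search() looks for the pattern anywhere in the string, while re.match() only tries at the beginning. The example below reuses the string from above; the variable names are illustrative.

```python
import re

tester_string = "Python is fun"

# 'fun' appears at the end of the string, not at the start.
found_anywhere = re.search('fun', tester_string)
at_start_only = re.match('fun', tester_string)

print(found_anywhere is not None)  # True
print(at_start_only is None)       # True
```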

Match Object

A match object is an object containing information about the search and its result.

If there is a match, a match object is returned; otherwise None is returned. A match object has its own attributes and methods, which can be used to get information about the search and the result.

The built-in dir() function can be used to list the methods and attributes of a match object. Some commonly used methods and attributes of match objects are explained below.
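As a small sketch, dir() can be filtered to hide the names that start with an underscore. The exact list printed may vary between Python versions, so no output is shown; public_names is an illustrative variable name.

```python
import re

match = re.search(r'\d+', 'hello 12')

# Keep only the public names, skipping the dunder methods.
public_names = [name for name in dir(match) if not name.startswith('_')]

print(public_names)
```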

Python match group

The Match.group() method returns the part of the string where there is a match.

Example :

import re 

string = '39801 356, 2102 1111' 

# three-digit number, a space, then a two-digit number
pattern = r'(\d{3}) (\d{2})'

# the match variable holds a match object, or None if nothing matched
match = re.search(pattern, string)

if match:
    print(match.group())
else:
    print("pattern not found")

 Output: 801 35
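Because the pattern contains two parenthesized groups, each captured part can also be retrieved individually. The following sketch assumes the same string and pattern as above; the variable names whole, first, second and both are illustrative.

```python
import re

string = '39801 356, 2102 1111'
pattern = r'(\d{3}) (\d{2})'

match = re.search(pattern, string)

if match:
    whole = match.group(0)   # entire match, same as match.group()
    first = match.group(1)   # text captured by the first parentheses
    second = match.group(2)  # text captured by the second parentheses
    both = match.groups()    # all captured groups as a tuple

    print(whole)   # 801 35
    print(first)   # 801
    print(second)  # 35
    print(both)    # ('801', '35')
```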

Start() and End()

The start() method returns the index where the matched substring begins, and end() returns the index just past its last character.

>>> match.start()
2
>>> match.end()
8

Here, the match variable contains a match object. 
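Since end() points one position past the match, the two values can be used directly as slice bounds. A minimal sketch, reusing the string and pattern from the group example (sliced is an illustrative name):

```python
import re

string = '39801 356, 2102 1111'
match = re.search(r'(\d{3}) (\d{2})', string)

# Slicing with start() and end() reproduces the matched text.
sliced = string[match.start():match.end()]

print(sliced)                   # 801 35
print(sliced == match.group())  # True
```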

Python Match Span()

The match.span() method returns a tuple containing the start and end index of the matched part of the string.

>>> match.span()
(2, 8)
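span() also accepts a group number, which gives the position of a single captured group. The sketch below assumes the same string and pattern as in the group example; the variable names are illustrative.

```python
import re

string = '39801 356, 2102 1111'
match = re.search(r'(\d{3}) (\d{2})', string)

whole_span = match.span()    # position of the entire match
first_span = match.span(1)   # position of the first group, '801'
second_span = match.span(2)  # position of the second group, '35'

print(whole_span)   # (2, 8)
print(first_span)   # (2, 5)
print(second_span)  # (6, 8)
```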

Conclusion

With Python RegEx, patterns can be as simple or as complex as needed, and they can be used to search, replace, or split text in different ways.
