Thursday, October 14, 2021

Regular Expressions in Python

A regular expression is a sequence of characters that defines a pattern for matching a string or a set of strings. Like many other programming languages, Python supports regular expressions. This support comes from the built-in re module.

The re module provides several functions; the most commonly used ones are listed below.
  1. re.match
  2. re.search
  3. re.findall
  4. re.sub
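Syntax: re.match(pattern, string). re.search and re.findall take the same two arguments, while re.sub takes re.sub(pattern, repl, string).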

Now let us discuss patterns. As mentioned above, a pattern is a sequence of characters. Patterns can also contain metacharacters, which have a special meaning; the most common ones are given below.

\ Drops the special meaning of the character following it
[] Represents a character class
^ Matches the beginning of the string
$ Matches the end of the string
. Matches any character except a newline
| Means OR (matches either of the patterns separated by it)
? Matches zero or one occurrence
* Matches any number of occurrences (including zero)
+ Matches one or more occurrences
{} Indicates the number of occurrences of the preceding regex to match
() Encloses a group of regex
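
Several of these metacharacters can be tried out directly. The snippet below is only a small sketch; the sample strings and variable names are made up for illustration.

```python
import re

# \ drops the special meaning of ".", so only a literal dot matches
literal_dot = re.findall(r"\.", "a.b.c")

# [] is a character class, + means one or more occurrences
digits = re.findall(r"[0-9]+", "room 12, floor 3")

# ^ anchors at the beginning, $ at the end, ? makes the "u" optional
optional_u = [bool(re.search(r"^colou?r$", w)) for w in ("color", "colour", "colours")]

# | means OR, () groups the alternatives, {} sets the repeat count
repeated = re.search(r"(ab|cd){2}", "xxabcdxx").group()

print(literal_dot)   # ['.', '.']
print(digits)        # ['12', '3']
print(optional_u)    # [True, True, False]
print(repeated)      # abcd
```

The patterns are written as raw strings (r"...") so that Python passes backslashes to the re module unchanged.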

re.match: checks for a match only at the beginning of the string. re.match returns a match object if the pattern matches; otherwise it returns None.

Program:
import re
a = "python concepts with examples"
match = re.match(".*", a)
print(match.group())

Output:
python concepts with examples
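
Since re.match can return None, calling group() on the result without a check may raise an AttributeError. A minimal sketch of the check, reusing the same sample string (the variable names are only illustrative):

```python
import re

a = "python concepts with examples"

# "python" is at the beginning, so re.match returns a match object
at_start = re.match("python", a)

# "concepts" is not at the beginning, so re.match returns None
not_at_start = re.match("concepts", a)

for label, m in (("python", at_start), ("concepts", not_at_start)):
    if m is not None:
        print(label, "->", m.group())
    else:
        print(label, "-> no match")
```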

re.search: checks for a match anywhere in the string and returns a match object for the first match, or None if there is no match.

Program:
import re
a = "python concepts with examples"
match = re.search("python(.*)with", a)
print(match.group())
print(match.group(1))

Output:
python concepts with
 concepts
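
The difference between re.match and re.search can be seen by looking for a word that is not at the beginning of the string. The sketch below also shows one possible way to keep the surrounding spaces out of the captured group; the variable names are illustrative.

```python
import re

a = "python concepts with examples"

# re.match only looks at the beginning, so it finds nothing here
print(re.match("concepts", a))        # None

# re.search scans forward and reports where the match was found
found = re.search("concepts", a)
print(found.group(), found.span())    # concepts (7, 15)

# Putting the spaces outside the parentheses keeps them out of group(1)
topic = re.search(r"python (.*) with", a).group(1)
print(topic)                          # concepts
```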

re.findall: returns all non-overlapping matches of the pattern in the string as a list of strings.

Program:
import re
a = "python concepts with examples python is very fast python is super fast"
match = re.findall("python", a)
print(match)

Output:
['python', 'python', 'python']
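
When the pattern contains groups, re.findall returns the captured groups instead of the whole match. A short sketch of this behaviour on the same sample string (the patterns are chosen only for illustration):

```python
import re

a = "python concepts with examples python is very fast python is super fast"

# No group: each item is the whole match
whole = re.findall(r"python \w+", a)

# One group: each item is only the text captured by the group
words = re.findall(r"python (\w+)", a)

# Two groups: each item is a tuple with one entry per group
pairs = re.findall(r"(\w+) (fast)", a)

print(whole)   # ['python concepts', 'python is', 'python is']
print(words)   # ['concepts', 'is', 'is']
print(pairs)   # [('very', 'fast'), ('super', 'fast')]
```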

re.sub: the 'sub' in the function name stands for substitute. The pattern (1st parameter) is searched for in the given string (3rd parameter), and every match is replaced by repl (2nd parameter). The optional count parameter limits the number of replacements; by default all matches are replaced.

Program:
import re
a = "python concepts with examples python is very fast python is super fast"
match = re.sub("python", "java", a)
print(match)

Output:
java concepts with examples java is very fast java is super fast
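
The count parameter can be illustrated with a small sketch. It also uses re.subn, a related function in the re module that returns the number of replacements made along with the new string; the variable names are illustrative.

```python
import re

a = "python concepts with examples python is very fast python is super fast"

# count=1 replaces only the first match
first_only = re.sub("python", "java", a, count=1)
print(first_only)

# re.subn returns a tuple: (new string, number of replacements)
replaced, n = re.subn("python", "java", a)
print(n)   # 3
```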

Sample Programs:

1. Match the IP Address

Program:
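import re
a = "10.10.101.10"
# One octet: 250-255, 200-249, 100-199, or 0-99
octet = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9]{1,2})"
# Four octets separated by dots; fullmatch requires the whole string to match
regex = octet + r"(\." + octet + r"){3}"
match = re.fullmatch(regex, a)
print(match.group())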

Output:
10.10.101.10
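
To check how an IP address pattern behaves on invalid input, it can be wrapped in a small helper and run against a few candidates. This is a sketch under some assumptions: is_ipv4 is a hypothetical helper name, the octet pattern uses 2[0-4][0-9] so that 200-249 is covered, and re.fullmatch is used so that trailing characters are rejected.

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99 (non-capturing group)
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9]{1,2})"
IPV4 = re.compile(OCTET + r"(?:\." + OCTET + r"){3}")

def is_ipv4(text):
    """Return True if text looks like a dotted-quad IPv4 address."""
    return IPV4.fullmatch(text) is not None

for candidate in ("10.10.101.10", "205.0.0.1", "256.1.1.1", "10.10.10", "10.10.101.10."):
    print(candidate, is_ipv4(candidate))
```

Note that this sketch still accepts octets with a leading zero such as "01". For real validation, the standard library's ipaddress module may be a better fit than a regular expression.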

2. Match the MAC Address

Program:
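import re
a = "3D-F2-C9-A6-B3-4F"
# Six pairs of hex digits separated by "-" or ":"; fullmatch requires the whole string to match
regex = r"[0-9A-Fa-f]{2}([-:][0-9A-Fa-f]{2}){5}"
match = re.fullmatch(regex, a)
print(match.group())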

Output:
3D-F2-C9-A6-B3-4F
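
A stricter variant could also require that the same separator is used throughout the address. The sketch below does this with a backreference (\1), which is one possible approach and not the only one; is_mac is a hypothetical helper name.

```python
import re

# ([-:]) captures the first separator; \1 requires the same one between the remaining pairs
MAC = re.compile(r"[0-9A-Fa-f]{2}([-:])[0-9A-Fa-f]{2}(?:\1[0-9A-Fa-f]{2}){4}")

def is_mac(text):
    """Return True if text is six hex pairs joined by one consistent separator."""
    return MAC.fullmatch(text) is not None

for candidate in ("3D-F2-C9-A6-B3-4F", "3d:f2:c9:a6:b3:4f", "3D-F2:C9-A6-B3-4F", "3D-F2-C9-A6-B3"):
    print(candidate, is_mac(candidate))
```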