A regular expression (also called regex or regexp) is a pattern used for matching strings or text. You specify a regular expression and then run it against an input string to see whether any part of the input matches the pattern. Regular expressions have many use cases: for example, you can do find and replace, or search a big log file for the expression or text you are looking for.
Regular expressions are not limited to any one programming language like Java; they are present everywhere. You use regular expressions in Linux for pattern matching with commands like grep, find, sed, and awk.
Even the vi editor allows you to search for words using regular expressions, and popular Windows text editors like Notepad++ and EditPlus also support regular expressions for search and replace.
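The two use cases mentioned above, searching a log and doing find and replace, can be sketched in a few lines. This is only an illustration using Python's standard re module; the log lines and the sample sentence are made up for the example.

```python
import re

# Hypothetical log lines, invented for this example.
log_lines = [
    "2024-01-05 INFO  Server started",
    "2024-01-05 ERROR Disk quota exceeded",
    "2024-01-06 INFO  Backup finished",
]

# Search: keep only the lines that contain the text ERROR.
errors = [line for line in log_lines if re.search(r"ERROR", line)]
print(errors)  # ['2024-01-05 ERROR Disk quota exceeded']

# Find and replace: swap every occurrence of "colour" for "color".
fixed = re.sub(r"colour", "color", "colour wheel, colour chart")
print(fixed)  # color wheel, color chart
```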
10 Essential Regular Expressions for Programmers
Here are the essential regular expressions every programmer should learn. They are great for common tasks and will also give you the feeling that you are a great programmer.
Exact Search
goat matches any string that has the text goat in it
Starts With
^The matches any string that starts with The
Ends With
end$ matches a string that ends with end
In Between
^The end$ matches only the exact string The end (it starts and ends with The end)
Quantifiers: * + ? and {}
abc* matches a string that has ab followed by zero or more c
abc+ matches a string that has ab followed by one or more c
abc? matches a string that has ab followed by zero or one c
abc{2} matches a string that has ab followed by 2 c
abc{2,} matches a string that has ab followed by 2 or more c
abc{2,5} matches a string that has ab followed by 2 up to 5 c
a(bc)* matches a string that has a followed by zero or more copies of the sequence bc
a(bc){2,5} matches a string that has a followed by 2 up to 5 copies of the sequence bc
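The exact-search, anchor, and quantifier patterns listed so far can be tried out with a short script. This is a minimal sketch using Python's re module; the helper name found and the sample strings are assumptions made for illustration. Note that re.search looks for a match anywhere in the text, while re.fullmatch requires the whole string to fit the pattern, which makes the effect of each quantifier easier to see.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

# Exact search: the text only has to appear somewhere in the string.
print(found(r"goat", "a mountain goat"))   # True

# Anchors: ^ pins the match to the start, $ pins it to the end.
print(found(r"^The", "The quick fox"))     # True
print(found(r"^The", "Not The start"))     # False
print(found(r"end$", "this is the end"))   # True
print(found(r"^The end$", "The end"))      # True
print(found(r"^The end$", "The end."))     # False

# Quantifiers apply to the item just before them (here the letter c).
print(bool(re.fullmatch(r"abc*", "ab")))             # True, zero c is allowed
print(bool(re.fullmatch(r"abc+", "ab")))             # False, needs at least one c
print(bool(re.fullmatch(r"abc?", "abc")))            # True
print(bool(re.fullmatch(r"abc{2}", "abcc")))         # True
print(bool(re.fullmatch(r"abc{2,5}", "abcccccc")))   # False, six c is too many

# Parentheses make the quantifier apply to the whole sequence bc.
print(bool(re.fullmatch(r"a(bc)*", "abcbc")))        # True
print(bool(re.fullmatch(r"a(bc){2,5}", "abc")))      # False, only one copy of bc
```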
OR operator: | or []
a(b|c) matches a string that has a followed by b or c
a[bc] same as previous
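To see that the two forms above accept the same strings, here is a small sketch in Python. The word list is invented for the example. The last lines show one practical difference: | can choose between whole words, while [] only chooses between single characters.

```python
import re

# Sample inputs, invented for this example.
words = ["ab", "ac", "ad", "a"]

# Both patterns accept a followed by exactly one b or one c.
with_alternation = [w for w in words if re.fullmatch(r"a(b|c)", w)]
with_char_class = [w for w in words if re.fullmatch(r"a[bc]", w)]
print(with_alternation)  # ['ab', 'ac']
print(with_char_class)   # ['ab', 'ac']

# The | operator can also pick between multi-character alternatives.
pets = [w for w in ["cats", "dogs", "cows"] if re.fullmatch(r"(cat|dog)s", w)]
print(pets)  # ['cats', 'dogs']
```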
Character classes: \d \w \s and .
\d matches a single character that is a digit
\w matches a word character (alphanumeric character plus underscore)
\s matches a whitespace character (includes tabs and line breaks)
. matches any character
Use the . operator carefully, since a character class or negated character class (which we’ll cover next) is often faster and more precise.
\d, \w and \s also have negated forms: \D, \W and \S respectively.
For example, \D performs the inverse of the match obtained with \d.
\D matches a single non-digit character
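The character classes above can be explored with re.findall, which returns every match in a string. This is a rough sketch in Python with made-up sample text. One detail specific to Python: by default . matches any character except a line break, unless the re.DOTALL flag is passed.

```python
import re

# Sample text, invented for this example; \t is a tab character.
text = "Room 42, floor_3\tEast"

digits = re.findall(r"\d", text)
print(digits)      # ['4', '2', '3']

# \w+ means one or more word characters in a row.
words = re.findall(r"\w+", text)
print(words)       # ['Room', '42', 'floor_3', 'East']

spaces = re.findall(r"\s", text)
print(spaces)      # [' ', ' ', '\t']

# \D is the negation of \d: runs of characters that are not digits.
non_digits = re.findall(r"\D+", "a1b22c")
print(non_digits)  # ['a', 'b', 'c']

# . stands for any single character between a and c.
any_char = re.findall(r"a.c", "abc a-c ac")
print(any_char)    # ['abc', 'a-c']
```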
Important things to Remember
- To be taken literally, the characters ^.[$()|*+?{\ must be escaped with a backslash \, as they have special meaning. For example, \$\d matches a string that has a $ before one digit.
- Notice that you can also match non-printable characters like tabs \t, new-lines \n, and carriage returns \r.
- The quantifiers (* + {}) are greedy operators, so they expand the match as far as they can through the provided text.
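The three points above (escaping, non-printable characters, and greedy matching) can be checked with a short Python sketch. The sample strings are invented for illustration. re.escape is a standard-library helper that adds the backslashes for you when the text to match is not known in advance.

```python
import re

# Escaping: \$ is a literal dollar sign, \d is one digit.
price = re.search(r"\$\d", "Total: $5 only").group()
print(price)            # $5

# Without the backslash, . is a wildcard; with it, only a real dot matches.
loose = re.findall(r"1.5", "1.5 and 125")
strict = re.findall(r"1\.5", "1.5 and 125")
print(loose)            # ['1.5', '125']
print(strict)           # ['1.5']

# re.escape puts a backslash in front of the special characters.
escaped = re.escape("a+b")
print(escaped)          # a\+b

# Non-printable characters: split on a tab, count the line breaks.
fields = re.split(r"\t", "name\tage")
line_breaks = len(re.findall(r"\r?\n", "one\r\ntwo\nthree"))
print(fields)           # ['name', 'age']
print(line_breaks)      # 2

# Greedy quantifier: .* runs to the LAST > it can reach, not the first.
greedy = re.search(r"<.*>", "<b>bold</b>").group()
print(greedy)           # <b>bold</b>
```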
That's all about the essential regular expressions every programmer should know. I have purposely not touched some advanced topics like grouping and back references to keep this article simple, but if you are interested, you can also check a comprehensive regular expression resource to learn more.
Thanks for reading this article so far. If you like this article, then please share it with your friends and colleagues. If you have any questions or feedback, then please drop a note.