Snake case to camel case and back using regular expressions and Python

Raunak Ramakrishnan - May 10 '18 - - Dev Community

Snake case and camel case are naming conventions for variables, functions and classes. Most teams and projects prescribe a particular case in their style guides.

Examples of camel case:

MyClass
MyClassFactory
MyClassFactoryBuilder
MyClassFactoryBuilderImpl
myInstance
myInstance2
abc
patternMatcher

Examples of snake case:

add
matrix_add
diagonal_matrix_add
pseudo_inverse

If we want to convert back and forth between these cases, we must look for the points of interest: the word boundaries. In camel case, each new word starts with a capital letter; in snake case, words are separated by an _.
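As a quick illustration of where these boundaries sit, the sketch below lists the boundary positions in one name of each style. The helper names snake_boundaries and camel_boundaries are made up for this illustration and are not part of the conversion code that follows.

```python
import re

def snake_boundaries(name):
    # Index of every "_" that separates two words.
    return [m.start() for m in re.finditer(r"_", name)]

def camel_boundaries(name):
    # Index of every capital letter that starts a new word.
    # A capital at index 0 starts the name itself, so it is skipped.
    return [m.start() for m in re.finditer(r"[A-Z]", name) if m.start() > 0]

print(snake_boundaries("diagonal_matrix_add"))  # [8, 15]
print(camel_boundaries("MyClassFactory"))       # [2, 7]
```

The regular expressions in the rest of the article find these same positions, and capture the text around them so that it can be rewritten.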

Snake case to camel case

Here is a regular expression that finds the _ and the first letter of the next word:

(.*?)_([a-zA-Z])

This regex has 2 parts:

  1. (.*?) finds everything up to the _.

    • '.' means any character (except a newline).
    • '*' means match 0 or more instances.
    • '?' makes the match non-greedy. We need it because the regex engine tries to match as much as possible by default. With just (.*), the group would extend to the last '_' in the word, so only the final underscore would be converted.
    • '()' stands for a group. A group is a way of saving a part of the match for later.
    • Together, they mean: find all characters up to the first '_' and capture them in a group.
  2. ([a-zA-Z]) finds the first letter after the _. We capture it so that we can convert it to upper case for camel case.
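To make the effect of the '?' concrete, here is a small sketch that runs the non-greedy pattern and a greedy variant on the same word. The names LAZY, GREEDY and upper_after_underscore are chosen only for this comparison. The greedy version still matches, because the engine backtracks, but it only finds the last underscore.

```python
import re

def upper_after_underscore(match):
    # Group 1 is the text before the "_", group 2 is the letter after it.
    return match.group(1) + match.group(2).upper()

LAZY = r"(.*?)_([a-zA-Z])"    # the pattern from the article
GREEDY = r"(.*)_([a-zA-Z])"   # same pattern without the '?'

word = "diagonal_matrix_add"

# What each pattern captures on its first match.
print(re.match(LAZY, word).groups())    # ('diagonal', 'm')
print(re.match(GREEDY, word).groups())  # ('diagonal_matrix', 'a')

# What re.sub produces with each pattern.
print(re.sub(LAZY, upper_after_underscore, word))    # diagonalMatrixAdd
print(re.sub(GREEDY, upper_after_underscore, word))  # diagonal_matrixAdd
```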

The Python code below converts words which are in snake case to camel case:

import re

REG = r"(.*?)_([a-zA-Z])"

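def camel(match):
    # Keep the text before the "_" and upper-case the letter after it.
    return match.group(1) + match.group(2).upper()

def camel_upper(word):
    # Upper camel case for class names: convert the whole word first, then
    # capitalize only its first character. Capitalizing inside the re.sub
    # callback would also capitalize the start of every later match
    # ('diagonal_matrix_add' would become 'DiagonalMAtrixAdd').
    converted = re.sub(REG, camel, word)
    return converted[:1].upper() + converted[1:]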

words = """add
matrix_add
diagonal_matrix_add
pseudo_inverse""".splitlines()

# count defaults to 0, which replaces every match in the word.
results = [re.sub(REG, camel, w) for w in words]
print(results)
# Output:
# ['add', 'matrixAdd', 'diagonalMatrixAdd', 'pseudoInverse']

We use the regex we constructed earlier and the re.sub function to replace each match. Instead of a replacement string, we pass the function camel as the second argument; re.sub calls it with each match object and inserts whatever it returns. camel upper-cases the letter in the second group and keeps the first group unchanged. The underscore disappears because it belongs to neither group. Note that the first letter of a camel case name can be lower case (variables and methods) or upper case (classes). The camel_upper function can be used for class names.
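To see what re.sub actually hands to the replacement function, the sketch below records the groups of every match before converting. The names traced_camel and seen are introduced only for this trace.

```python
import re

REG = r"(.*?)_([a-zA-Z])"

seen = []  # groups of every match, in the order re.sub finds them

def traced_camel(match):
    # Record the two captured groups, then do the usual conversion.
    seen.append(match.groups())
    return match.group(1) + match.group(2).upper()

result = re.sub(REG, traced_camel, "diagonal_matrix_add")

print(seen)    # [('diagonal', 'm'), ('atrix', 'a')]
print(result)  # diagonalMatrixAdd
```

Each match ends right after the letter that follows an underscore, so the second match begins in the middle of a word ('atrix'). The trailing 'dd' is not part of any match and is copied through unchanged.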

Camel case to snake case

Similarly, for converting from camel to snake case, the regex is:

(.+?)([A-Z])
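This pattern uses '+' (one or more) where the earlier one used '*' (zero or more). The sketch below suggests why: with '*', the first group may be empty, so a leading capital letter is itself treated as a word boundary. The names AT_LEAST_ONE, ZERO_OR_MORE and lower_with_underscore are chosen only for this comparison.

```python
import re

def lower_with_underscore(match):
    # Lower-case both groups and put an "_" between them.
    return match.group(1).lower() + "_" + match.group(2).lower()

AT_LEAST_ONE = r"(.+?)([A-Z])"  # the pattern from the article
ZERO_OR_MORE = r"(.*?)([A-Z])"  # variant with '*' instead of '+'

# What each pattern captures on its first match.
print(re.match(AT_LEAST_ONE, "MyClass").groups())  # ('My', 'C')
print(re.match(ZERO_OR_MORE, "MyClass").groups())  # ('', 'M')

# What re.sub produces with each pattern.
print(re.sub(AT_LEAST_ONE, lower_with_underscore, "MyClass"))  # my_class
print(re.sub(ZERO_OR_MORE, lower_with_underscore, "MyClass"))  # _my_class
```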

And the Python code:

import re

REG = r"(.+?)([A-Z])"

def snake(match):
    return match.group(1).lower() + "_" + match.group(2).lower()

words = """MyClass
MyClassFactory
MyClassFactoryBuilder
MyClassFactoryBuilderImpl
myInstance
myInstance2
abc
patternMatcher""".splitlines()

# count defaults to 0, which replaces every match in the word.
results = [re.sub(REG, snake, w) for w in words]

print(results)
# Output
# ['my_class', 'my_class_factory', 'my_class_factory_builder', 'my_class_factory_builder_impl', 'my_instance', 'my_instance2', 'abc', 'pattern_matcher']
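As a rough check that the two conversions undo each other, the sketch below wraps both regexes in helper functions and round-trips the snake case examples. The names to_camel and to_snake are hypothetical wrappers, not functions from the code above. The last two lines show inputs that these simple patterns do not handle: a single capitalized word and a name containing an acronym.

```python
import re

SNAKE_TO_CAMEL = r"(.*?)_([a-zA-Z])"
CAMEL_TO_SNAKE = r"(.+?)([A-Z])"

def to_camel(name):
    # Drop each "_" and upper-case the letter that follows it.
    return re.sub(SNAKE_TO_CAMEL,
                  lambda m: m.group(1) + m.group(2).upper(), name)

def to_snake(name):
    # Insert "_" before each capital that follows other text, lower-casing both parts.
    return re.sub(CAMEL_TO_SNAKE,
                  lambda m: m.group(1).lower() + "_" + m.group(2).lower(), name)

words = ["add", "matrix_add", "diagonal_matrix_add", "pseudo_inverse"]

# Snake -> camel -> snake gives back the original name for these words.
for w in words:
    assert to_snake(to_camel(w)) == w

print([to_camel(w) for w in words])
# ['add', 'matrixAdd', 'diagonalMatrixAdd', 'pseudoInverse']

# Inputs the simple patterns do not handle well.
print(to_snake("Abc"))         # Abc (no match, so the capital is kept)
print(to_snake("HTTPServer"))  # h_tt_pServer
```

If names like these matter in your code base, the patterns need extra cases for them; for the plain examples in this post they are sufficient.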