Normalizing Fancy Text to Normal Text in Laravel

Hafiq Iqmal - Oct 6 - - Dev Community

Article originated from https://medium.com/@hafiqiqmal93/normalizing-fancy-text-to-normal-text-in-laravel-7d9ed56d5a78

Text input from users is no longer plain and predictable. With Unicode available on every smartphone, users now have the luxury (and sometimes the whimsy) to enter text in a variety of styles and formats. From emojis to diacritics, ligatures to full-width characters, the range of “fancy text” can be confusing or difficult for a system to handle. While visually appealing, these text variations pose a significant challenge, particularly in terms of data consistency, searchability, and user experience.

Here is an example of fancy text:



𝘕𝘦𝘪𝘨𝘩𝘣𝘰𝘳 𝘮𝘢𝘬𝘦 𝘢 𝘯𝘦𝘸 𝘤𝘰𝘯𝘯𝘦𝘤𝘵𝘪𝘰𝘯 𝘶𝘯𝘥𝘦𝘳 𝘵𝘰 𝘰𝘶𝘳 𝘮𝘦𝘵𝘦𝘳 𝘢𝘯𝘥 𝘸𝘦 𝘥𝘪𝘴𝘤𝘰𝘷𝘦𝘳𝘦𝘥 𝘪𝘵 𝘣𝘦𝘤𝘢𝘶𝘴𝘦 𝘵𝘩𝘦𝘺 𝘴𝘸𝘪𝘵𝘤𝘩 𝘰𝘧𝘧 𝘵𝘩𝘦 𝘮𝘢𝘪𝘯 𝘮𝘦𝘵𝘦𝘳 𝘢𝘯𝘥 𝘪 𝘨𝘰 𝘥𝘰𝘸𝘯 𝘵𝘰 𝘤𝘩𝘦𝘤𝘬 𝘢𝘯𝘥 𝘴𝘰𝘮𝘦𝘰𝘯𝘦 𝘨𝘰 𝘥𝘰𝘸𝘯 𝘢𝘭𝘴𝘰 𝘵𝘰 𝘰𝘧𝘧 𝘪𝘵 𝘢𝘨𝘢𝘪𝘯 𝘢𝘯𝘥 𝘤𝘭𝘢𝘪𝘮𝘪𝘯𝘨 𝘵𝘩𝘢𝘵𝘪𝘴 𝘵𝘩𝘦𝘪𝘳 𝘮𝘦𝘵𝘦𝘳, 𝘪𝘵 𝘰𝘯𝘭𝘺 𝘩𝘢𝘱𝘱𝘦𝘯𝘴 𝘵𝘩𝘪𝘴 𝘸𝘦𝘦𝘬..𝘯𝘦𝘷𝘸𝘳 𝘪𝘯 𝘵𝘩𝘦 𝘱𝘢𝘴𝘵. 𝘛𝘩𝘦 𝘺𝘦𝘭𝘭𝘰𝘸 𝘩𝘰𝘴𝘦 𝘪𝘴 𝘫𝘶𝘴𝘵 𝘯𝘦𝘸𝘭𝘺 𝘤𝘰𝘯𝘯𝘦𝘤𝘵𝘦𝘥



It looks like italic text, but it is not. These characters actually belong to the Unicode Mathematical Alphanumeric Symbols block.
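To make the difference concrete, one can compare code points. This is a minimal sketch that assumes the mbstring extension is loaded; the sample letter is picked from the example above for illustration only.

```php
<?php
// Compare a plain "N" with the sans-serif italic "N" used in the fancy text.
$plain = 'N';
$fancy = "\u{1D615}"; // MATHEMATICAL SANS-SERIF ITALIC CAPITAL N

printf("U+%04X\n", mb_ord($plain, 'UTF-8')); // U+004E
printf("U+%04X\n", mb_ord($fancy, 'UTF-8')); // U+1D615

// The fancy letter takes four bytes in UTF-8, the plain one takes one.
var_dump(strlen($plain)); // int(1)
var_dump(strlen($fancy)); // int(4)
```

The two letters look alike on screen, but to the system they are entirely different characters, which is why searches and comparisons fail.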

Problem in PHP 💥

One very obvious problem is that PHP cannot JSON-encode malformed UTF-8 characters it receives. In modern web development, where APIs and frontend frameworks use JSON to transport data, this is a problem. If handled incorrectly, such characters can result in data corruption, crashes, or angry users.
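One way such malformed input can arise is when a multi-byte character is cut at a byte boundary. The sketch below is an illustration under that assumption, not a claim about how every fancy string behaves; a well-formed fancy string encodes without error.

```php
<?php
// Cutting a four-byte character after two bytes leaves malformed UTF-8.
$broken = substr("\u{1D615}", 0, 2);

$json  = json_encode(['text' => $broken]);
$error = json_last_error();

var_dump($json);                      // bool(false)
var_dump($error === JSON_ERROR_UTF8); // bool(true)

// One possible fallback: substitute invalid sequences instead of failing.
$safe = json_encode(['text' => $broken], JSON_INVALID_UTF8_SUBSTITUTE);
var_dump($safe !== false);            // bool(true)
```

Checking the return value of `json_encode()` (or passing `JSON_THROW_ON_ERROR`) makes this kind of failure visible instead of silently sending `false` to the client.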

Our goal is simple: come up with a solution that converts every piece of fancy text into normal, readable text.

PHP Normalizer

Normalization forms are pivotal to understanding the normalization process. They cater to different linguistic and technical needs. For instance, NFC combines characters into their composed forms, whereas NFD does the opposite, decomposing composed characters into their constituent parts. NFKC and NFKD go further: they also handle compatibility characters, folding variations of a character into a canonical form. These forms ensure that text comparison, searching, and storage are consistent and reliable.
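The difference between the forms is easier to see on small inputs. The following is a short sketch, assuming the intl extension is installed; the sample characters are chosen for illustration.

```php
<?php
// Requires the intl extension.
$composed   = "\u{00E9}";  // "é" as a single code point
$decomposed = "e\u{0301}"; // "e" followed by a combining acute accent
$ligature   = "\u{FB01}";  // "fi" ligature, a compatibility character

// NFC and NFD convert between the composed and decomposed spellings.
var_dump(Normalizer::normalize($decomposed, Normalizer::FORM_C) === $composed); // bool(true)
var_dump(Normalizer::normalize($composed, Normalizer::FORM_D) === $decomposed); // bool(true)

// The canonical forms leave compatibility characters alone...
var_dump(Normalizer::normalize($ligature, Normalizer::FORM_C) === $ligature);   // bool(true)

// ...while NFKC folds them into plain letters.
var_dump(Normalizer::normalize($ligature, Normalizer::FORM_KC));                // string(2) "fi"
```

Fancy letters from the Mathematical Alphanumeric Symbols block are compatibility characters too, so the compatibility forms (NFKC, NFKD) are the ones that turn them back into plain letters.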

The Solution 🚀

The code snippet below is a good example of how PHP can solve a complex problem with simplicity and efficiency. Let's dissect this solution and understand its components:



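public static function normalizeText(?string $text): ?string
{
    // Nothing to normalize for null or empty input.
    if ($text === null || $text === '') {
        return null;
    }

    // Forms to try, in order. NFC, NFD, NFKC, NFKD and NFKC_CF are aliases
    // of the FORM_* constants, so each form only needs to be listed once.
    // FORM_KC_CF is available as of PHP 7.3.
    $forms = [
        \Normalizer::FORM_C,
        \Normalizer::FORM_D,
        \Normalizer::FORM_KC,
        \Normalizer::FORM_KC_CF,
        \Normalizer::FORM_KD,
    ];

    foreach ($forms as $form) {
        // Return as soon as we find a form the text does not yet satisfy.
        if (!\Normalizer::isNormalized($text, $form)) {
            return \Normalizer::normalize($text, $form);
        }
    }

    // The text is already normalized in every form.
    return $text;
}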



Usage is simple:



$normalText = Utils::normalizeText($YOUR_FANCY_STRING);



You can also register it as a global helper function to make it easier to use. For example:



if (! function_exists('normalize_text')) {
    // Thin wrapper around Utils::normalizeText(); it may return null for empty input.
    function normalize_text(?string $text): ?string
    {
        return Utils::normalizeText($text);
    }
}

// USAGE
$normalText = normalize_text($YOUR_FANCY_STRING);



At its core, this function leverages PHP's **Normalizer** class, part of the Internationalization (intl) extension, to handle the normalization. The **Normalizer** class offers several normalization forms, each tailored to different needs. The function iterates through these forms, checking whether the text is already normalized in a given form using the **isNormalized** function. If it is not, the function normalizes the text to that form and returns the result. Note that it returns at the first form that changes the text, so forms later in the list are not applied to that string.
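Because the loop stops at the first form that changes the text, a string that mixes accented letters with fancy letters may come back with the accents decomposed (NFD) and the fancy letters untouched. If the only goal is to turn fancy text into plain letters, one possible variation is to apply NFKC directly. This is a sketch and not the article's method; the function name `normalize_to_nfkc` is hypothetical, and the intl extension is assumed.

```php
<?php
// Hypothetical helper: the name is illustrative and not part of the Utils class above.
function normalize_to_nfkc(?string $text): ?string
{
    if ($text === null || $text === '') {
        return null;
    }

    // NFKC applies compatibility decomposition, then canonical composition.
    $normalized = Normalizer::normalize($text, Normalizer::FORM_KC);

    // normalize() returns false if an error occurred.
    return $normalized === false ? null : $normalized;
}

$fancy = "\u{1D615}\u{1D626}\u{1D638}"; // "New" written in sans-serif italic letters

var_dump(normalize_to_nfkc($fancy));                              // string(3) "New"
var_dump(normalize_to_nfkc("caf\u{00E9} " . $fancy) === "caf\u{00E9} New"); // bool(true)
```

The trade-off is that NFKC also folds other compatibility characters, such as ligatures and full-width letters, which may or may not be desirable depending on the data being stored.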


Conclusion

While fancy text may add visual appeal to user input, it poses significant challenges for data processing and system interoperability. However, with the adoption of PHP's Normalizer class and the implementation of normalization forms, developers can overcome these challenges and ensure that their applications maintain data consistency and reliability in the face of diverse text inputs.


Do you have any experiences or challenges related to handling fancy text in your projects? How do you currently address such issues, and do you find PHP's Normalizer class useful in your workflow? Let's continue the conversation and share our insights to help each other navigate the complexities of modern web development. 🤜🏼
