Count frequency of characters in a given file in Python

CoderLegion - Jun 6 '21 - - Dev Community

In Python, the collections module provides container datatypes. It implements specialized containers as alternatives to Python's general-purpose built-in containers dict, list, set, and tuple. The module includes a dict subclass for counting hashable objects, called Counter. The Counter class itself is a dictionary subclass with no restrictions on its keys and values: the values are intended to be numbers representing counts, but you can store anything in the value field. Counts are allowed to be any integer value, including zero or negative counts. The Counter class is analogous to bags or multisets in other languages.
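As a quick illustration of these properties, here is a minimal sketch; the sample string "banana" is just an assumed example, not taken from any particular file.

```python
from collections import Counter

# Passing a string counts each character individually.
counts = Counter("banana")
print(counts["a"])   # 3
print(counts["z"])   # 0: a missing key reports a zero count instead of raising KeyError

# Counts may drop to zero or below, for example after subtract().
counts.subtract("bbb")
print(counts["b"])   # -2
```

Looking up a character that never appeared does not add it to the counter, so the lookup is safe to use in checks.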

Count frequency of characters in a given file
There is no need to split the text into words at all; passing a string directly to the Counter updates the counts per character. Open the input file with the flag "r" and the output file with the flag "w". You also need to collect all counts first, and only then write them to the output file:

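```python
from collections import Counter

def count_letters(in_filename, out_filename):
    counts = Counter()
    with open(in_filename, "r") as in_file:
        # Read 8192 characters at a time; read() returns '' at end of file.
        for chunk in iter(lambda: in_file.read(8192), ''):
            counts.update(chunk)  # a string updates the counts per character
    with open(out_filename, "w") as out_file:
        # Counter.items() replaces the Python 2-only iteritems().
        for letter, count in counts.items():
            out_file.write('{}:{}\n'.format(letter, count))
```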


Note that the input file is processed in 8 KB chunks instead of in one go; you can adjust the block size (preferably to a power of 2) to maximise throughput.

If you would like your output file to be sorted by frequency (descending), you can use .most_common() rather than .items() (.iteritems() in Python 2).
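A sorted variant might look like the sketch below. The function name count_letters_sorted and the chunk_size parameter are assumptions made for illustration; the rest follows the same approach as the function above.

```python
from collections import Counter

def count_letters_sorted(in_filename, out_filename, chunk_size=8192):
    """Write 'char:count' lines to out_filename, most frequent first."""
    counts = Counter()
    with open(in_filename, "r") as in_file:
        # read() returns '' at end of file, which stops the iterator.
        for chunk in iter(lambda: in_file.read(chunk_size), ""):
            counts.update(chunk)
    with open(out_filename, "w") as out_file:
        # most_common() with no argument yields every (char, count) pair,
        # ordered from the highest count to the lowest.
        for letter, count in counts.most_common():
            out_file.write("{}:{}\n".format(letter, count))
```

Characters with equal counts are written in the order they were first seen. Whitespace characters such as newlines are counted too and are written to the output as-is.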
