Installation ============ This installs a package that can be used from Python (``import charset_normalizer``). To install for all users on the system, administrator rights (root) may be required. Using PIP --------- Charset Normalizer can be installed from pip:: pip install charset-normalizer You may retrieve the latest unicodedata backport as follow:: pip install charset-normalizer[unicode_backport] From git via master ----------------------- You can install from dev-master branch using git:: git clone https://github.com/Ousret/charset_normalizer.git cd charset_normalizer/ python setup.py install Basic Usage =========== The new way ----------- You may want to get right to it. :: from charset_normalizer import from_bytes, from_path # This is going to print out your sequence once properly decoded print( str( from_bytes( my_byte_str ).best() ) ) # You could also want the same from a file print( str( from_path( './data/sample.1.ar.srt' ).best() ) ) Backward compatibility ---------------------- If you were used to python chardet, we are providing the very same ``detect()`` method as chardet. This function is mostly backward-compatible with Chardet. The migration should be painless. :: from charset_normalizer import detect # This will behave exactly the same as python chardet result = detect(my_byte_str) if result['encoding'] is not None: print('got', result['encoding'], 'as detected encoding') You may upgrade your code with ease. CTRL + R ``from chardet import detect`` to ``from charset_normalizer import detect``.