A flowchart for diagnosing garbled Russian Cyrillic text



The Cyrillic alphabet used in Russia is an alphabet like the Latin one used in English, but its letters fall outside ASCII and are stored differently by each character encoding (in UTF-8, every Cyrillic letter takes two bytes). As a result, Cyrillic text on the web is often garbled when it is decoded with a different encoding from the one it was written in. The flowchart below helps you determine which encoding mismatch produced a given piece of garbled Cyrillic text.
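As a minimal sketch of how this kind of garbling arises, the Python snippet below encodes a sample word and then decodes the same bytes with several other encodings. The sample word and the list of codecs are illustrative assumptions, not something taken from the flowchart.

```python
# Sample word chosen for illustration (an assumption, not from the chart).
text = "Крокодил"

# In UTF-8 every Cyrillic letter occupies two bytes.
raw = text.encode("utf-8")
print(len(text), "characters,", len(raw), "bytes")

# Decoding those bytes with a single-byte Cyrillic codec garbles them.
# errors="replace" avoids exceptions for bytes a codec leaves undefined.
for codec in ("koi8_r", "cp866", "cp1251", "iso8859_5"):
    print(f"{codec:10}", raw.decode(codec, errors="replace"))

# The legacy encodings also disagree with each other about the same letters.
legacy = {
    codec: text.encode(codec)
    for codec in ("koi8_r", "cp866", "cp1251", "iso8859_5")
}
```

Because each codec maps the same letters to different byte values, any pairing of "written as" and "read as" leaves its own recognizable pattern, which is what the flowchart exploits.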

https://vault.pmpc.ru/vf/16011417/6a7b205721142511253e4d581.png



Start with the question 'What does a crocodile look like?' and see what the characters have been converted to. If most of the text is displayed as box-drawing symbols such as '╬╤╪' (the red frame in the chart), the text is KOI8-R that has been read as CP 866.
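The box-drawing case can be reproduced with a short Python sketch. The helper names `garble` and `is_box_symbol` are hypothetical, and the lowercase sample word is an assumption chosen so that every byte lands in CP 866's box-drawing range.

```python
def garble(text: str, actual: str, assumed: str) -> str:
    """Encode text as `actual`, then misread the bytes as `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")


def is_box_symbol(ch: str) -> bool:
    """True for Unicode box-drawing and block-element characters."""
    return 0x2500 <= ord(ch) <= 0x259F


# KOI8-R text misread as CP 866 (sample word is an assumption).
garbled = garble("крокодил", "koi8_r", "cp866")
share = sum(is_box_symbol(c) for c in garbled) / len(garbled)
print(garbled, f"({share:.0%} symbols)")

# Undo the mistake: recover the original bytes, then decode them correctly.
repaired = garbled.encode("cp866").decode("koi8_r")
print(repaired)
```

The repair step works only because both codecs here map every byte to some character, so no information is lost on the way through.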



When letters and symbols appear mixed together, look at which of them predominate. If the letters 'п' and 'я' predominate, the text is UTF-8 read as KOI8-R. If '▓' and uppercase letters predominate, it is UTF-8 read as CP 866. If black squares are displayed, it is CP 866 read as KOI8-R. Decoding the text with its original encoding should eliminate the garbled characters.
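The three mixed patterns can be checked in Python under the assumption that each one comes from encoding with one codec and decoding with another. The `garble` helper and the sample strings are hypothetical and serve only as an illustration.

```python
def garble(text: str, actual: str, assumed: str) -> str:
    """Encode text as `actual`, then misread the bytes as `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")


word = "Крокодил"  # sample word, an assumption

as_koi8 = garble(word, "utf_8", "koi8_r")
as_cp866 = garble(word, "utf_8", "cp866")
print(as_koi8)
print(as_cp866)

# The UTF-8 lead bytes 0xD0 and 0xD1 are the letters 'п' and 'я' in KOI8-R,
# so those two letters make up about half of the garbled text.
lead_share = (as_koi8.count("п") + as_koi8.count("я")) / len(as_koi8)

# The UTF-8 bytes of 'в' end in 0xB2, which CP 866 shows as '▓'.
shade = garble("в", "utf_8", "cp866")

# CP 866 stores 'Ф' as 0x94, which KOI8-R shows as a black square.
square = garble("Ф", "cp866", "koi8_r")
print(shade, square)
```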



If there are few or no symbols, check whether one character is repeated many times.

If the character that appears over and over is the Cyrillic 'Р' or 'С', the text is UTF-8 read as Windows-1251; if it is 'Ð' (a 'D' with a stroke), it is UTF-8 read as Windows-1252; and if it is '◆', it is UTF-16 read as CP 866.
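Finding the repeated character is a simple frequency count. The sketch below is one way to do it in Python, with hypothetical helper names `garble` and `most_repeated` and an assumed sample word; it covers only the two UTF-8 cases.

```python
from collections import Counter


def garble(text: str, actual: str, assumed: str) -> str:
    """Encode text as `actual`, then misread the bytes as `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")


def most_repeated(garbled: str) -> str:
    """Return the most frequent non-whitespace character."""
    counts = Counter(c for c in garbled if not c.isspace())
    return counts.most_common(1)[0][0]


word = "Крокодил"  # sample word, an assumption

# The UTF-8 lead byte 0xD0 becomes Cyrillic 'Р' in Windows-1251
# and 'Ð' in Windows-1252, so that character dominates the output.
top_1251 = most_repeated(garble(word, "utf_8", "cp1251"))
top_1252 = most_repeated(garble(word, "utf_8", "cp1252"))
print(top_1251, top_1252)
```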



If the garbled text contains no symbols, check whether its characters are Cyrillic. If almost all of them are Cyrillic and lowercase letters are displayed as uppercase, the text is Windows-1251 read as KOI8-R, or KOI8-R read as Windows-1251. If uppercase and lowercase letters alternate and '?' is mixed in, it is UTF-8 read as ISO 8859-5.
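Both all-Cyrillic patterns can be reproduced with a small Python sketch. The `garble` helper and the sample words are assumptions made for illustration.

```python
def garble(text: str, actual: str, assumed: str) -> str:
    """Encode text as `actual`, then misread the bytes as `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")


# Windows-1251 and KOI8-R place uppercase and lowercase letters in swapped
# byte ranges, so lowercase text comes out as (different) uppercase letters.
swapped = garble("крокодил", "cp1251", "koi8_r")
swapped_back = garble("крокодил", "koi8_r", "cp1251")
print(swapped, swapped_back)

# Reversing the mistake restores the original word.
repaired = swapped.encode("koi8_r").decode("cp1251")

# UTF-8 read as ISO 8859-5: every lead byte becomes lowercase 'а' or 'б',
# while the second bytes become uppercase letters or unprintable controls
# (which many programs display as '?').
alternating = garble("Крокодил", "utf_8", "iso8859_5")
lead_chars = alternating[::2]
```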



If Cyrillic characters are displayed mixed with other characters, the text is Windows-1251 read as ISO 8859-5 or as CP 866, depending on which symbols are displayed.



If the mixed-in characters are non-Cyrillic letters rather than symbols, check for letters with diacritics such as umlauts. If there are none, the text is CP 866 read as ISO 8859-5. If there are, and non-letter symbols are also present, it is CP 866 read as Windows-1251. Otherwise, look at the case of the displayed letters: if they are uppercase, the text is KOI8-R read as Windows-1252; if lowercase, Windows-1251 read as Windows-1252; and if mixed, ISO 8859-5 read as Windows-1252. Decoding with the original encoding eliminates the garbled characters.
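The Windows-1252 cases in this last branch can be sketched in Python as follows. The helpers `garble`, `has_diacritics` and `letter_case` are hypothetical names, the lowercase sample word is an assumption, and the check is a rough heuristic rather than a full implementation of the chart.

```python
import unicodedata


def garble(text: str, actual: str, assumed: str) -> str:
    """Encode text as `actual`, then misread the bytes as `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")


def has_diacritics(s: str) -> bool:
    """True if any character decomposes into a base letter plus a mark."""
    return any(unicodedata.decomposition(c) for c in s)


def letter_case(s: str) -> str:
    """Classify the letters of s as 'upper', 'lower' or 'mixed'."""
    letters = [c for c in s if c.isalpha()]
    if all(c.isupper() for c in letters):
        return "upper"
    if all(c.islower() for c in letters):
        return "lower"
    return "mixed"


word = "крокодил"  # lowercase sample word, an assumption
results = {}
for actual in ("koi8_r", "cp1251", "iso8859_5"):
    garbled = garble(word, actual, "cp1252")
    results[actual] = (has_diacritics(garbled), letter_case(garbled))
    print(f"{actual:10}", garbled, results[actual])
```

For ordinary lowercase Russian text, the three source encodings land in different parts of the Windows-1252 accented-letter range, which is why the case of the garbled letters is enough to tell them apart.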



in Note, Posted by log1i_yk