Suppose you type a little text into a text file, say “123”. If you open this file in a hex editor you’ll see 3132 33, because the ASCII value for the character ‘1’ is 0x31 in hex, ‘2’ corresponds to 0x32, and ‘3’ corresponds to 0x33. If your file is saved as UTF-8 rather than ASCII, it makes absolutely no difference. By design, UTF-8 encodes the 128 ASCII characters exactly as ASCII does.

Next, add a space and the first three lower-case Greek letters, so the file contains “123 αβγ”. The lower-case Greek alphabet starts at 0x03B1, so these three characters are 0x03B1, 0x03B2, and 0x03B3.

Now let’s look at the file in our hex editor: 3132 3320 CEB1 CEB2 CEB3. The B1, B2, and B3 look familiar, but why do they have “CE” in front rather than “03”? This has to do with the details of UTF-8 encoding. Code points from 0x0080 through 0x07FF take two bytes of the form 110xxxxx 10xxxxxx, so the low 11 bits of 0x03B1 (01110 110001) are split across the two bytes, giving 11001110 = 0xCE and 10110001 = 0xB1.

If we looked at the same file with UTF-16 encoding, which represents each of these characters with 16 bits, the results look more familiar: FEFF 0031 0032 0033 0020 03B1 03B2 03B3. So our ASCII characters (1, 2, 3, and space) are each padded with a couple of zeros, and we see the Unicode values of our Greek letters as we expect. But what’s the FEFF at the beginning? That’s a byte order mark (BOM) that my text editor inserted.
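If you don’t have a hex editor handy, a few lines of Python reproduce the same bytes. This is only a sketch under some assumptions: the file is taken to hold “123 αβγ” (the digits, a space, and the three Greek letters discussed here), and the editor is assumed to have written big-endian UTF-16 with a BOM, which is what the leading FEFF suggests. The helper names `hex_dump` and `utf8_two_byte` are made up for illustration.

```python
import codecs

# Assumed file contents: "123 αβγ" (digits, a space, three Greek letters).
text = "123 \u03b1\u03b2\u03b3"


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def utf8_two_byte(code_point: int) -> bytes:
    """Hand-encode a code point in the two-byte UTF-8 range (0x80..0x7FF).

    The 11 payload bits are split as 110xxxxx 10xxxxxx.
    """
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point is outside the two-byte UTF-8 range")
    lead = 0b11000000 | (code_point >> 6)                 # top 5 bits
    continuation = 0b10000000 | (code_point & 0b00111111)  # low 6 bits
    return bytes([lead, continuation])


utf8_bytes = text.encode("utf-8")
# Assumption: big-endian UTF-16 with a BOM, matching the FEFF prefix.
utf16_bytes = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

print(hex_dump(utf8_bytes))             # 31 32 33 20 CE B1 CE B2 CE B3
print(hex_dump(utf16_bytes))            # FE FF 00 31 00 32 00 33 00 20 03 B1 03 B2 03 B3
print(hex_dump(utf8_two_byte(0x03B1)))  # CE B1
```

The last line shows where the “CE” comes from: shifting 0x03B1 right by six bits leaves 01110, and putting the 110 marker in front of that gives 0xCE, while the low six bits behind a 10 marker give 0xB1.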