  #264  
December 9th 19, 02:40 PM posted to alt.computer.workshop,alt.comp.os.windows-10,alt.comp.freeware
Mayayana
7 Best Alternatives To Microsoft Office Suite - 2019 Edition

"Carlos E.R." wrote

| | Not in the ANSI, it is in the IBMPC charset, 437. A bit different.
| |
|
| Chr 209 and 241 in English codepage are N and n
| with tilde.
|
| I'm not saying that. I say that the so called ANSI that contains some
| European chars is not ANSI, but the IBM-PC version of it, charset 437,
| with 8 bits.
|

Yes. We seem to have a conflict in terminology. On
Windows, "ANSI" means the system of 8-bit charsets
selected by codepage. Any non-Unicode text file on
Windows is treated as ANSI, using 1 byte per character,
not as 7-bit ASCII. The actual characters displayed for
byte values above 127 are decided by the local codepage,
even though most or all codepages match ASCII for
values 0-127.

If I write chr 149 into a text file it will show as a
bullet, because that text file is being read as ANSI text
with the English codepage (Windows-1252). There's nothing
like a ban on using the high bit. All non-Unicode text is
8-bit ANSI text.

If I enter chr 209 it will show as N with tilde.
And when I save that file from Notepad the
default option will be "ANSI". If I send that file to
someone in Russia or Turkey they'll probably see different
characters, because the characters for their language use
part of the post-ASCII byte value range. But I suspect that
on your Spanish computer you'd see what I see, because
Spanish characters and other Western European characters
can all fit into a single ANSI charset.
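
To make the byte values above concrete, here is a minimal Python sketch. It assumes Python's built-in codec names "cp1252" (Western European ANSI), "cp1251" (Cyrillic ANSI) and "cp437" (the IBM PC charset) are reasonable stand-ins for the Windows codepages under discussion; the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(byte_value: int, codepage: str) -> str:
    """Decode one byte under the given codepage and return its Unicode name."""
    char = bytes([byte_value]).decode(codepage)
    return unicodedata.name(char)

# Same byte value, different codepage, different character.
samples = [
    (149, "cp1252"),  # bullet on a Western European system
    (209, "cp1252"),  # N with tilde on a Western European system
    (209, "cp1251"),  # a Cyrillic letter on a Russian system
    (209, "cp437"),   # a box-drawing character in the IBM PC charset
    (165, "cp437"),   # where the IBM PC charset puts N with tilde
]

for value, codepage in samples:
    print(value, codepage, describe(value, codepage))
```

The last two rows suggest why the two sides of this thread disagree: codepage 437 and the Windows "ANSI" codepages both hold N with tilde, but at different byte values.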

Similarly, if you look up the docs for Win32 API conversion
functions like WideCharToMultiByte, you'll see the conversion
options are between Unicode and ANSI. The default codepage
option, CP_ACP, is ANSI in the local codepage.
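
As a rough analogy only, and not the Win32 call itself, Python's `str.encode` can stand in for a Unicode-to-codepage conversion. The function name `wide_to_ansi` and the choice of "cp1252" as the "local" codepage are assumptions for this sketch; the "?" substitution shown is Python's behaviour with `errors="replace"`, and WideCharToMultiByte has its own rules for unmappable characters.

```python
def wide_to_ansi(text: str, codepage: str = "cp1252") -> bytes:
    """Convert Unicode text to single-byte text in the given codepage.

    Characters that have no slot in the codepage come out as b"?"
    (this is Python's errors="replace" behaviour).
    """
    return text.encode(codepage, errors="replace")

# N with tilde and a bullet both fit in the Western European codepage.
print(wide_to_ansi("\u00d1\u2022"))        # bytes 209 and 149

# Cyrillic ZHE has no slot in cp1252, but does in cp1251.
print(wide_to_ansi("\u0416"))
print(wide_to_ansi("\u0416", "cp1251"))
```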

So that's another area where UTF-8 can mess things
up unnecessarily on an English-codepage computer. If
I happen to have ANSI text files with high-byte characters,
those characters will be corrupted when the file is read
as UTF-8.
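
The corruption can be sketched in Python as follows. This assumes the file was saved as Windows-1252 ("cp1252") and is then opened by software that expects UTF-8; the sample text and the name `read_as_utf8` are invented for illustration.

```python
# Bytes as an editor would save them under the "ANSI" option on a
# Windows-1252 system: n with tilde is byte 241, bullet is byte 149.
ansi_bytes = "Espa\u00f1a \u2022 item".encode("cp1252")

def read_as_utf8(data: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for bytes that are not valid UTF-8."""
    return data.decode("utf-8", errors="replace")

print(ansi_bytes)                        # the raw single-byte text
print(ascii(read_as_utf8(ansi_bytes)))   # high bytes replaced by \ufffd

# A strict UTF-8 reader refuses the file outright.
try:
    ansi_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    print("strict UTF-8 decode failed:", err.reason)
```

Decoding the same bytes with "cp1252" gives the original text back, so the data itself is intact; only the interpretation is wrong.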

