A Windows XP help forum. PCbanter


Problem displaying Unicode characters in CMD



 
 
#16
August 6th 17, 04:45 PM, posted to alt.windows7.general
Paul[_32_]

JJ wrote:


None of the mentioned fonts is accepted by the console, unfortunately.


There is Courier New, but the file isn't big enough.

How many versions of the Courier New font are there ?

There is Droid Sans Mono, but it's a smaller font file
than Courier New.

And the guys here provide some numbers, for just how
dire the situation is.

https://graphicdesign.stackexchange....nicode-support

Paul
#17
August 6th 17, 06:06 PM, posted to alt.windows7.general
Mayayana

"JJ" wrote

......

Following Paul's link, I found this:

http://unifoundry.com/unifont.html (a font)

https://upload.wikimedia.org/wikiped...3.20131006.png
(a picture of the characters in that font)

I don't know if windows will load it.
There are also interesting notes on console windows
here:

https://stackoverflow.com/questions/...mmand-line-how

With the interesting idea of setting the display to UTF-8.

There's also this, from Michael Kaplan, who is, or
at least was, pretty much the language programming
expert at MS:

http://archives.miloush.net/michkap/...8/8306597.html

He shows how to programmatically jump through
hoops to show rectangles in the console window that
function as the characters they're supposed to be.
Whoopee. It doesn't sound promising. But maybe you'll
be the first.

| Interesting. Maybe that's coming across in dropdown text
| window as unicode but being interpreted as DBCS.
|
| That's impossible. The "Gothic" text can't possibly be "????" regardless
| of what it was originally encoded with.
|

No, I wouldn't think so. But some kind of
fluke in the dropdown window is the only
explanation I can think of.

| Did you actually see the katakana characters in the news message from your
| news client? That (and this) message was encoded using Big5, BTW.
|

I see them in the window. If I look at the message source
it shows with the English code page, as a line of oddball
characters. If I save the post and open it in Notepad I see
rectangles. If I then paste that into an ANSI text window
as part of a webpage I get ??????... But if I replace those
with the rectangles from Notepad and save it as UTF-8,
IE will show the characters.
So... yes and no.



#18
August 7th 17, 07:25 PM, posted to alt.windows7.general
JJ[_11_]

On Sun, 6 Aug 2017 13:06:00 -0400, Mayayana wrote:
Following Paul's link, I found this:

http://unifoundry.com/unifont.html (a font)

https://upload.wikimedia.org/wikiped...3.20131006.png
(a picture of the characters in that font)

I don't know if windows will load it.


It won't, unfortunately.

There are also interesting notes on console windows
here:

https://stackoverflow.com/questions/...mmand-line-how

With the interesting idea of setting the display to UTF-8.


Well, that SO question is about working with Unicode as data, not about
displaying it. I don't have any problem with that either.

There's also this, from Michael Kaplan, who is, or
at least was, pretty much the language programming
expert at MS:

http://archives.miloush.net/michkap/...8/8306597.html

He shows how to programmatically jump through
hoops to show rectangles in the console window that
function as the characters they're supposed to be.
Whoopee. It doesn't sound promising. But maybe you'll
be the first.


That actually shows the problem. The Windows console's design, in terms of
displaying characters, is not natively UCS-2/UTF-16. It's more like native
ANSI/OEM.
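
As a rough illustration of that binding (not something posted in the thread), the code pages a console session is working with can be queried from kernel32. This is a minimal Python sketch, assuming Python with ctypes on Windows; `console_code_pages` is a made-up helper name, while `GetACP`, `GetOEMCP` and `GetConsoleOutputCP` are the Win32 calls it wraps.

```python
import sys


def console_code_pages():
    """Return the ANSI, OEM and console output code pages, or None off Windows."""
    if sys.platform != "win32":
        return None  # these kernel32 calls only exist on Windows
    import ctypes

    kernel32 = ctypes.windll.kernel32
    return {
        "ansi": kernel32.GetACP(),                       # system "ANSI" code page
        "oem": kernel32.GetOEMCP(),                      # system OEM code page
        "console_output": kernel32.GetConsoleOutputCP(), # what this console writes in
    }


print(console_code_pages())
```

On a Western-locale system the three values are typically single-byte code pages, which fits the observation that the console is tied to the system code page rather than to Unicode.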

No, I wouldn't think so. But some kind of
fluke in the dropdown window is the only
explanation I can think of.


FYI, most cross platform applications use their own font rendering engine.
They don't rely on Windows' built in font rendering engine. Moreover,
Thunderbird, Firefox and other Gecko based applications use the Gecko
browser engine for their main application GUI (as GUI framework).

I see them in the window. If I look at the message source
it shows with the English code page, as a line of oddball
characters.


That would be the Big5 encoded text shown using ANSI character set.
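
The effect is easy to reproduce. A minimal Python sketch, assuming Windows-1252 as the "English" ANSI code page and using a stand-in string (the characters from the original post are not reproduced here):

```python
# Hypothetical sample text, chosen only because Big5 can encode it.
text = "中文"
big5_bytes = text.encode("big5")

# Reading the same bytes with the Western ANSI code page yields one
# unrelated Latin-range character per byte: the "oddball characters".
as_ansi = big5_bytes.decode("cp1252", errors="replace")

print(big5_bytes)                 # the raw two-bytes-per-character data
print(as_ansi)                    # what an ANSI viewer shows
print(big5_bytes.decode("big5"))  # what a Big5-aware viewer shows
```

The bytes never change; only the code page used to interpret them does.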

If I save the post and open it in Notepad I see rectangles.


That's when the font used for Notepad doesn't have the glyphs for those
characters.

If I then paste that into an ANSI text window
as part of a webpage I get ??????...


What application is that? An ANSI character set is roughly the same thing as
a code page. If the system code page is not CJK, Windows won't show the
correct characters, even assuming that the font used for the display has the
glyphs for those characters.
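
The row of question marks is what conversion to a code page that lacks the characters produces. A small Python sketch of the idea, assuming Windows-1252 as the non-CJK code page; `to_ansi` is a hypothetical helper, and Python's `errors="replace"` stands in for the default "?" substitution that Windows applies in such conversions:

```python
def to_ansi(text, code_page="cp1252"):
    """Encode text for one code page, substituting '?' for anything it cannot hold."""
    return text.encode(code_page, errors="replace")


print(to_ansi("abc カタカナ"))        # Western code page: katakana become '?'
print(to_ansi("カタカナ", "cp932"))   # Japanese code page: real double-byte data
```

Once the text has been converted this way the original characters are gone; no font choice can bring them back.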

But if I replace those
with the rectangles from Notepad and save it as UTF-8,
IE will show the characters.
So... yes and no.


Well, IE has better internationalization support. Much better than the
console, apparently.

And if you take a look at the screenshot again, you'll notice that the
console removes both the "Courier New" and "Lucida Console" fonts from the
list when the system locale is set to CJK. So it seems that the Windows
console's design (in terms of display) is bound to the system code page. I
think that's the main problem.

OK, I do believe now that there's no solution for this.
Thanks for your support.
#19
August 7th 17, 07:25 PM, posted to alt.windows7.general
JJ[_11_]

On Sun, 06 Aug 2017 11:45:29 -0400, Paul wrote:

There is Courier New, but the file isn't big enough.

How many versions of the Courier New font are there ?

There is Droid Sans Mono, but it's a smaller font file
than Courier New.

And the guys here provide some numbers, for just how
dire the situation is.

https://graphicdesign.stackexchange....nicode-support


In my collection, there are:
- Courier (Raster, TrueType, PostScript)
- Courier New KOI-8 (PostScript; KOI-8 character set)
- Courier Std (OpenType)
- Courier10 BT (TrueType, PostScript)
- CourierMCY (TrueType, PostScript)

AFAIK, all Courier fonts are monospaced, but I haven't seen any that have an
adequate Unicode subrange (one which includes CJK).

FYI, Windows' built in PostScript fonts support can only handle ANSI/OEM
character set.

I have a font information tool I wrote years ago. Here are the lists of the
Unicode subranges that some of the mentioned fonts have.

Courier New:
https://pastebin.com/6GqRtHK7

Droid Sans Mono:
https://pastebin.com/aP52cu6x

FreeMono:
https://pastebin.com/4prhSNsZ

GNU UniFont: (mentioned by Mayayana)
https://pastebin.com/3V2XMiyQ
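
For readers without such a tool, the subranges a font declares can be read from the `ulUnicodeRange` fields of its OS/2 table. The following is only a sketch under assumptions, not the tool mentioned above: it handles a single TrueType/OpenType file (no .ttc collections), the function names are invented, and the three bit numbers are taken from the OpenType OS/2 specification. Whether the console consults these bits when filtering its font list is not established anywhere in this thread.

```python
import struct

# A few ulUnicodeRange bit numbers from the OpenType OS/2 table specification.
CJK_BITS = {49: "Hiragana", 50: "Katakana", 59: "CJK Unified Ideographs"}


def unicode_range_bits(font_bytes):
    """Return the set of ulUnicodeRange bit numbers declared in the OS/2 table."""
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    for i in range(num_tables):
        # Each 16-byte table record: tag, checksum, offset, length (big endian).
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", font_bytes, 12 + 16 * i
        )
        if tag == b"OS/2":
            # ulUnicodeRange1..4 are four uint32 fields, 42 bytes into the table.
            words = struct.unpack_from(">4I", font_bytes, offset + 42)
            return {
                32 * index + bit
                for index, value in enumerate(words)
                for bit in range(32)
                if (value >> bit) & 1
            }
    return set()


def cjk_subranges(font_bytes):
    """List which of the sampled CJK subranges the font claims to cover."""
    bits = unicode_range_bits(font_bytes)
    return [name for bit, name in sorted(CJK_BITS.items()) if bit in bits]
```

Usage would be along the lines of `cjk_subranges(open(path, "rb").read())`, where `path` points at a .ttf or .otf file. Note that these bits are only what the font claims; the actual glyph coverage lives in its cmap table.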

I have most of the fonts that have a CJK Unicode subrange, from many sources.
I even have the excellent "Osaka" TrueType font from Mac OS X, converted to
a Windows version (Mac TTF files are not binary compatible with
Windows because they use big endian format). Yet none of the CJK fonts in
my collection is accepted by the console's settings dialog unless I set
the system locale to CJK.

I don't think this problem has any solution.
So, thanks for your time.
#20
August 7th 17, 11:23 PM, posted to alt.windows7.general
Mayayana

"JJ" wrote

| If I then paste that into an ANSI text window
| as part of a webpage I get ??????...
|
| What application is that?

That's actually my own code editor. I made it with a
RichEdit window and included a toggle option for
ANSI or UTF-8. When set to ANSI I get ?s. When set
to UTF-8 I get rectangles. There seems to be some
kind of "sniffing" built in. In ANSI I should get ANSI
characters, but Windows apparently picks up that it's
UTF-8 and just doesn't try to render it. Yet if I load
a UTF-8 webpage I don't get ?s for single UTF-8
characters. I get characters above 128 in English ANSI.

This has been an interesting exploration. The encoding
options are so complicated. But I guess it makes sense
that the console window would be ANSI. Most programming
is English. I imagine CD or DEL don't change. So the only
reason to support other languages would be for local
differences in file/folder names.


#21
August 9th 17, 04:24 PM, posted to alt.windows7.general
JJ[_11_]

On Mon, 7 Aug 2017 18:23:51 -0400, Mayayana wrote:

That's actually my own code editor. I made it with a
RichEdit window and included a toggle option for
ANSI or UTF-8. When set to ANSI I get ?s. When set
to UTF-8 I get rectangles. There seems to be some
kind of "sniffing" built in. In ANSI I should get ANSI
characters, but Windows apparently picks up that it's
UTF-8 and just doesn't try to render it. Yet if I load
a UTF-8 webpage I don't get ?s for single UTF-8
characters. I get characters above 128 in English ANSI.


The Windows RichEdit control is Unicode aware, even if the host application
uses an ANSI GUI, e.g. WordPad in Windows 9x.

You can test it by using the RichEdit's built in ALT+X shortcut when your
application is set to ANSI mode. Press the shortcut when the input cursor is
placed after a character. Try that using two different characters where both
show as "?" or square characters.
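
The point of the test is that two characters can look identical on screen (both "?" or both boxes) while still being different code points underneath. The same check can be sketched outside RichEdit in Python; `code_points` is a hypothetical helper, and the exact format ALT+X displays may differ from the `U+XXXX` form used here:

```python
def code_points(text):
    """Return the Unicode code point of each character in U+XXXX form."""
    return [f"U+{ord(ch):04X}" for ch in text]


# A literal question mark versus a katakana character that a font may
# fail to draw: they are distinct values even if both render as "?".
print(code_points("?"))
print(code_points("カ"))
```

If both characters report U+003F, the text really was converted to question marks; if they report different values, the data is intact and only the rendering failed.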
#22
August 9th 17, 07:53 PM, posted to alt.windows7.general
Mayayana

"JJ" wrote

|
| The Windows' RichEdit control is Unicode aware, even if the host
| application uses an ANSI GUI. e.g. Wordpad in Windows 9x.

Interesting. I just pasted your file name into
Wordpad and it got sniffed out as Japanese,
then rendered in a Japanese font that I didn't
know I had.

But my program is an editor for HTML and script.
I want it to be locked into either ANSI or UTF-8,
so I have a menu toggle, which changes the 3rd
parameter when I send an EM_STREAMIN message
to load a file.
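
For anyone trying the same toggle, here is a guess at what that parameter can look like, written as a Python sketch rather than taken from the editor itself. The flag values are the ones defined in the Windows SDK header richedit.h; `stream_in_flags` is an invented name, and the values the poster's program actually sends are not stated in the thread.

```python
# Constants as defined in the Windows SDK header richedit.h.
SF_TEXT = 0x0001         # stream plain text rather than RTF
SF_USECODEPAGE = 0x0020  # the high word of wParam carries a code page
CP_UTF8 = 65001          # Windows code page identifier for UTF-8


def stream_in_flags(utf8):
    """Build a wParam for EM_STREAMIN: plain text, optionally decoded as UTF-8."""
    if utf8:
        return (CP_UTF8 << 16) | SF_USECODEPAGE | SF_TEXT
    return SF_TEXT  # plain text in the default (ANSI) code page


print(hex(stream_in_flags(True)))
print(hex(stream_in_flags(False)))
```

The resulting value would be passed as the wParam of the EM_STREAMIN message, alongside the EDITSTREAM structure that supplies the file data.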

More accurately, I want ANSI, but sometimes there
are UTF-8 webpages that are loaded and I want
to be able to handle those. It's kind of a shame,
really. English webpages don't need to be UTF-8.
Plain ASCII is already valid UTF-8. But companies like Microsoft
often use things like curly quotes in UTF-8 which
then corrupt the text if they're rendered as ANSI.
They're using just enough to create a problem for
ANSI rendering.
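
The corruption is mechanical and easy to demonstrate. A minimal Python sketch, assuming Windows-1252 as the ANSI code page; `utf8_read_as_ansi` is a hypothetical helper name:

```python
def utf8_read_as_ansi(text):
    """Encode text as UTF-8, then misread those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252", errors="replace")


# Pure ASCII survives untouched, because ASCII bytes are the same in both.
print(utf8_read_as_ansi("plain ASCII text"))

# A curly quote is three bytes in UTF-8, so it turns into three ANSI characters.
print(utf8_read_as_ansi("It’s a “quoted” word"))
```

Each curly quote or apostrophe becomes a three-character run starting with "â€", which is the familiar garbage seen when a UTF-8 page is shown as ANSI.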



 










All times are GMT +1.

