PCbanter

T July 10th 19 07:10 AM

Append to UTF 16
 
Hi All,

Any of you guys know how to append to a UTF16 text file
from the command line?

echo abc >> SomeUTF16File.txt

All I get is goofy characters.

Many thanks,
-T

Andy Burns[_6_] July 10th 19 09:01 AM

Append to UTF 16
 
T wrote:

Any of you guys know how to append to a UTF16 text file
from the command line?

echo abc >> SomeUTF16File.txt

All I get is goofy characters.


Might not help, but does the file start with a BOM?

tried "chcp 1200" ?
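One way to answer the BOM question without a hex editor is to look at the first two bytes of the file. A minimal Python sketch, assuming the file is small enough to open normally; the helper name `detect_utf16_bom` and the demo file are made up for illustration:

```python
import codecs
import os
import tempfile

def detect_utf16_bom(path):
    """Return 'utf-16-le', 'utf-16-be', or None based on the first two bytes."""
    with open(path, "rb") as f:
        head = f.read(2)
    if head == codecs.BOM_UTF16_LE:   # bytes FF FE
        return "utf-16-le"
    if head == codecs.BOM_UTF16_BE:   # bytes FE FF
        return "utf-16-be"
    return None

# Demo on a throwaway file; Python's "utf-16" codec writes a BOM first,
# in the byte order of the machine it runs on.
demo = os.path.join(tempfile.mkdtemp(), "SomeUTF16File.txt")
with open(demo, "w", encoding="utf-16") as f:
    f.write("abc\n")

bom_kind = detect_utf16_bom(demo)
print(bom_kind)
```

A result of None does not prove the file is not UTF-16; it only means no BOM is present.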



Ralph Fox July 10th 19 09:12 AM

Append to UTF 16
 
On Tue, 9 Jul 2019 23:10:23 -0700, T wrote:

Hi All,

Any of you guys know how to append to a UTF16 text file
from the command line?

echo abc >> SomeUTF16File.txt

All I get is goofy characters.

Many thanks,
-T



CMD /U /C echo abc >> SomeUTF16File.txt


REFERENCE

Cmd | Microsoft Docs
https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/cmd

Parameters
...
/u Formats internal command output to a pipe or a file as Unicode.
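If the append has to happen from a script instead of CMD, the same idea can be sketched in Python. This assumes the existing file is UTF-16LE with a BOM (the usual Windows "Unicode" flavour); the helper name `append_utf16le` and the demo file are made up for illustration:

```python
import os
import tempfile

def append_utf16le(path, text):
    """Append text to a UTF-16LE file without writing a second BOM."""
    # The "utf-16-le" codec never emits a BOM, so it is safe for append mode.
    # newline="" keeps Python from translating line endings on its own.
    with open(path, "a", encoding="utf-16-le", newline="") as f:
        f.write(text)

# Build a throwaway UTF-16LE file with a BOM written once, at the start.
demo = os.path.join(tempfile.mkdtemp(), "SomeUTF16File.txt")
with open(demo, "w", encoding="utf-16-le", newline="") as f:
    f.write("\ufeffexisting line\r\n")

append_utf16le(demo, "abc\r\n")

# Reading back with the BOM-aware "utf-16" codec shows clean text.
with open(demo, encoding="utf-16", newline="") as f:
    content = f.read()
print(repr(content))
```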



--
Kind regards
Ralph

T July 10th 19 06:10 PM

Append to UTF 16
 
On 7/10/19 1:12 AM, Ralph Fox wrote:
On Tue, 9 Jul 2019 23:10:23 -0700, T wrote:

Hi All,

Any of you guys know how to append to a UTF16 text file
from the command line?

echo abc >> SomeUTF16File.txt

All I get is goofy characters.

Many thanks,
-T



CMD /U /C echo abc >> SomeUTF16File.txt


REFERENCE

Cmd | Microsoft Docs
https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/cmd

Parameters
...
/u Formats internal command output to a pipe or a file as Unicode.




Thank you!

T July 10th 19 06:11 PM

Append to UTF 16
 
On 7/10/19 1:01 AM, Andy Burns wrote:
T wrote:

Any of you guys know how to append to a UTF16 text file
from the command line?

echo abc >> SomeUTF16File.txt

All I get is goofy characters.


Might not help, but does the file start with a BOM?

tried "chcp 1200" ?



I just got garble, which told me it was UTF 16 and I was
appending utf8u
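The garble is easy to reproduce. A small Python sketch, assuming the file holds UTF-16LE text and the appended bytes are plain single-byte characters (roughly what an ordinary echo redirect writes); the sample strings are made up:

```python
# What the file already holds: two bytes per character.
existing = "line one\r\n".encode("utf-16-le")

# What gets appended: one byte per character.
appended = "abc \r\n".encode("ascii")

# A UTF-16 reader pairs the appended bytes up two at a time.
whole = (existing + appended).decode("utf-16-le")
garbled = whole[len("line one\r\n"):]

print(len(appended), "bytes appended, read back as", len(garbled), "characters")
print([hex(ord(ch)) for ch in garbled])
```

Each pair of appended bytes is read as a single 16-bit code unit, so "ab" turns into one CJK character and the line ending into something else entirely.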


T July 11th 19 06:13 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
On 7/10/19 12:43 PM, "Jeff-Relf.Me@."@ wrote:
UTF-8/ASCII, not UTF-16, is the standard for:

-- FileNames ( all operating systems ).
-- CommandLine Utilities ( e.g. "YouTube-DL.EXE" ).
-- The Internet ( e.g. ".HTML" files ).

Hence ( http://Jeff-Relf.Me/Win10.REG.TXT ):

[HKEY_CURRENT_USER\Software\Microsoft\Command Processor]
"autorun"="C:\\__\\_Source\\Init-CMD.BAT"

"Init-CMD.BAT":

@chcp 65001
@prompt $P$_$g

Also, I wrote my own console ( http://Jeff-Relf.Me/X.HTM ).
Mostly it's used for my Visual C++ routines,
but it also runs "YouTube-DL.EXE" directly, without "CMD.EXE".

From "X.CPP" in "http://Jeff-Relf.Me/X.ZIP":

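// Create a pipe whose handles can be inherited by a child process.
SECURITY_ATTRIBUTES Inherit = { sizeof( SECURITY_ATTRIBUTES ) };
Inherit.bInheritHandle = TRUE ;
static HANDLE r, w ;
CreatePipe( &r, &w, &Inherit, 0 );
// Keep the read end in the parent only; the child must not inherit it.
SetHandleInformation( r, HANDLE_FLAG_INHERIT, 0 );

// Start the child hidden, with stdout and stderr redirected into the pipe.
STARTUPINFO SU = { sizeof( STARTUPINFO ) };
SU.dwFlags = STARTF_USESTDHANDLES | STARTF_USESHOWWINDOW;
SU.wShowWindow = SW_HIDE;
SU.hStdOutput = SU.hStdError = w;
SU.hStdInput = GetStdHandle( STD_INPUT_HANDLE );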


I wrote and maintain a program that checks the Internet for new
revisions of the software I use at customer sites and downloads
any new revisions it finds. Some of the TRASH I have to
deal with on downloaded web pages is a thing to behold.

I gave up trying to get the UTF8 and UTF8U converters
to solve the issue. I eventually read one site as a binary
buffer of bytes, lopped off the high bit, and then converted
to a string. Problem solved.
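The "lop off the high bit" trick can be sketched in Python as below. This is only an illustration of the idea described above, not the actual program; the helper name and sample bytes are made up. Note that it is lossy: every byte from 80 to FF is folded onto some unrelated ASCII character.

```python
def strip_high_bit(raw: bytes) -> str:
    """Clear bit 7 of every byte, then decode the result as ASCII."""
    return bytes(b & 0x7F for b in raw).decode("ascii")

# Hypothetical page fragment with two stray high-bit bytes (A0 and AE).
page = b"Version 1.2\xa0released\xae\r\n"
cleaned = strip_high_bit(page)
print(repr(cleaned))
```

Here A0 happens to fold to a space and AE to a full stop, which is why the result looks tidy; other bytes will not be so lucky.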

Why the heck would you use the high bit on an ASCII web page?
But there is no explaining some of the things I found. I
find all kinds of weird characters at the end of lines too.
I have gotten really good at regexes and can lop that
trash off with alacrity.
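A rough Python sketch of that kind of end-of-line cleanup. It assumes "trash" means anything other than printable, non-space ASCII at the end of a line; the pattern and helper name are made up for illustration, not taken from the actual program:

```python
import re

# One or more characters outside '!'..'~' sitting at the very end of a line.
TRAILING_TRASH = re.compile(r"[^\x21-\x7E]+\Z")

def strip_line_trash(text: str) -> str:
    """Remove trailing junk from every line, keeping the line breaks."""
    return "\n".join(TRAILING_TRASH.sub("", line) for line in text.split("\n"))

sample = "download.exe\x00\xa0 \r\nnext line\x7f\n"
print(repr(strip_line_trash(sample)))
```

Working line by line keeps blank lines intact; a single multi-line pattern would be tempted to swallow the newlines as well.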

One thing that always amuses me is folks that use 1000+
character long lines in their HTML code. Must be a total
nightmare to maintain. Regex to the rescue.




nospam July 11th 19 06:19 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
T wrote:

Why the heck would you use the high bit on an ASCII web page,


because the desired characters aren't available otherwise.

Siri Cruise July 11th 19 07:59 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
nospam wrote:

T wrote:

Why the heck would you use the high bit on an ASCII web page,


because the desired characters aren't available otherwise.


The HTML source can and should be ASCII only. You can use entity references for
anything outside of ASCII. That lets the browser know what character is intended
and present it as the browser pleases rather than make the browser guess what
octets 80 to FF mean.
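For what it's worth, producing that kind of ASCII-only source is a one-liner in Python using numeric character references. A small sketch; the helper name and sample text are made up:

```python
import html

def to_ascii_html(text: str) -> bytes:
    """Encode text as pure ASCII, replacing everything else with &#NNN; references."""
    return text.encode("ascii", errors="xmlcharrefreplace")

original = "Café: 5 €"
encoded = to_ascii_html(original)
print(encoded)

# A browser (or html.unescape) turns the references back into the characters.
round_trip = html.unescape(encoded.decode("ascii"))
print(round_trip)
```

This only handles the character encoding; markup characters such as & and < in the text would still need escaping separately.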

--
:- Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted. @
'I desire mercy, not sacrifice.' /|\
The first law of discordiamism: The more energy This post / \
to make order is nore energy made into entropy. insults Islam. Mohammed

T July 11th 19 08:28 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
On 7/11/19 11:59 AM, Siri Cruise wrote:
nospam wrote:

T wrote:

Why the heck would you use the high bit on an ASCII web page,


because the desired characters aren't available otherwise.


The HTML source can and should be ASCII only. You can use entity references for
anything outside of ASCII. That lets the browser know what character is intended
and present it as the browser pleases rather than make the browser guess what
octets 80 to FF mean.


Ya, you think? "Supposed to" and "does" are two different things.
And good luck complaining to the web site about it.


I see a lot of
<a href=/downloads>downloads[trash][trash][trash]</a>


But I have gotten pretty good at zapping the trash. Regexes
will drive you insane until you learn how to use them. Then
they are kind of fun.

Now exactly what is s/\/\/\\/\/\\/\//g anyway?

:-)

T July 11th 19 08:54 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
On 7/11/19 12:41 PM, "Jeff-Relf.Me@."@ wrote:
Siri_Cruise:
The HTML source can and should be ASCII only.


Yes, and all puppies should go to heaven.


Chuckle.


nospam July 11th 19 09:42 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
Siri Cruise wrote:

Why the heck would you use the high bit on an ASCII web page,


because the desired characters aren't available otherwise.


The HTML source can and should be ASCII only. You can use entity references for
anything outside of ASCII. That lets the browser know what character is intended
and present it as the browser pleases rather than make the browser guess what
octets 80 to FF mean.


no need to guess when done correctly.
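"Done correctly" presumably means the page declares its encoding, for example in the Content-Type header. A Python sketch of decoding by the declared charset instead of guessing; the helper name and sample header are made up for illustration:

```python
from email.message import Message

def decode_with_declared_charset(body: bytes, content_type: str) -> str:
    """Decode body using the charset named in a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()   # lower-cased name, or None if absent
    if charset is None:
        raise ValueError("no charset declared, so we would have to guess")
    return body.decode(charset)

body = "Café".encode("utf-8")             # contains octets above 7F
text = decode_with_declared_charset(body, "text/html; charset=UTF-8")
print(text)
```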

Siri Cruise July 11th 19 10:19 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
nospam wrote:

Siri Cruise wrote:

Why the heck would you use the high bit on an ASCII web page,

because the desired characters aren't available otherwise.


The HTML source can and should be ASCII only. You can use entity references for
anything outside of ASCII. That lets the browser know what character is intended
and present it as the browser pleases rather than make the browser guess what
octets 80 to FF mean.


no need to guess when done correctly.


Just ASCII with entity references in ASCII is always correct. And it's easy to
do.

Siri Cruise July 11th 19 10:19 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
Jeff-Relf.Me @.@ wrote:

Siri_Cruise:
The HTML source can and should be ASCII only.


Yes, and all puppies should go to heaven.


They do.

T July 11th 19 10:21 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
On 7/11/19 2:19 PM, Siri Cruise wrote:
Jeff-Relf.Me @.@ wrote:

Siri_Cruise:
The HTML source can and should be ASCII only.


Yes, and all puppies should go to heaven.


They do.


How about old dogs?

Siri Cruise July 11th 19 10:21 PM

UTF-8/ASCII, not UTF-16, is the standard.
 
T wrote:

On 7/11/19 11:59 AM, Siri Cruise wrote:
nospam wrote:

T wrote:

Why the heck would you use the high bit on an ASCII web page,

because the desired characters aren't available otherwise.


The HTML source can and should be ASCII only. You can use entity references for
anything outside of ASCII. That lets the browser know what character is intended
and present it as the browser pleases rather than make the browser guess what
octets 80 to FF mean.


Ya, you think? "Supposed to" and "does" are two different things.
And good luck complaining to the web site about it.


I see a lot of
<a href=/downloads>downloads[trash][trash][trash]</a>


On any sites not using microsoft?


All times are GMT +1.