#106
Convert those dastardly curly quotes to straight quotes on Windows?
"harry newton" wrote
| What did I do wrongly?

You pasted into Notepad and saved as UTF-8, but then you opened it in VIM, which is rendering ANSI. Try opening the same file in Notepad.

I know you don't want to hear this, but it's complicated. If you save text as UTF-8 you solve the problem, but then you need to read/edit as UTF-8 and need a program that recognizes it. Suddenly you're dealing with a number of plain text formats.

Notice the first 3 weird marks in your VIM screenshot? Those are the UTF-8 marker characters, bytes EF BB BF, that tell the editor it's UTF-8. Vim doesn't see them. It's just rendering them as characters.

But your sample provides a good example of what you're up against. There are more UTF-8 characters in that text than just curly quotes. That's why I was saying that it's not as simple as a basic replacement. The number and variety of UTF-8 characters in copied text is an unknown. If you just need to do a replace on curly quotes that's doable, but any more than that may take time.

You can try my scripts, which I linked to yesterday:
http://www.jsware.net/jsware/scrfiles.php5#u2a

Even that's not a totally simple solution because UTF-8 does not entirely translate to ANSI. I explained it in the readme file. One script translates UTF-8 characters that match English ANSI characters. It's basically what you want. Another converts the text to Unicode and then converts that to ANSI. The 3rd script copies webpage text and converts the encoding to English codepage using ADODB.

In many cases UTF-8 - ANSI works fine. But it's like translating between dialects. It's not just a simple case of correcting byte values. I recently read that one can now depict a pile of **** emoji in Unicode. It's been assigned a Unicode code point! But there's no pile of **** character in the English ANSI codepage. So it won't translate.

..... And that gets into a whole other kettle of fish that I mentioned yesterday: Your translations will be limited not only by your codepage but also by fonts. Most fonts don't have a pile of **** character. Most fonts don't cover much if any of the Unicode range. That's why you get the boxes. You're using a primitive console font in Vim that's very limited in its coverage. Fine if you're just hanging around coding with your Babylonian friends, but inadequate to handle text from a "modern" webpage.
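For anyone who wants to see the marker bytes and the "won't translate" case for themselves, here is a minimal Python sketch. It is only an illustration, not one of the scripts linked above: it assumes "ANSI" means Windows-1252 (cp1252), and the helper name `to_ansi` is made up for this example.

```python
import codecs

# Bytes roughly as an editor would write them when saving "UTF-8 with BOM":
# the three marker bytes first, then the text itself.
raw = codecs.BOM_UTF8 + "“quoted”".encode("utf-8")

# The marker is exactly the three bytes EF BB BF.
bom_hex = raw[:3].hex(" ").upper()

# Read as Windows-1252, the marker shows up as three ordinary characters.
bom_as_ansi = raw[:3].decode("cp1252")

# Read as UTF-8 with BOM handling, the marker disappears and the quotes survive.
text = raw.decode("utf-8-sig")


def to_ansi(s: str) -> bytes:
    """Encode to cp1252, substituting '?' where no equivalent exists (hypothetical helper)."""
    return s.encode("cp1252", errors="replace")


if __name__ == "__main__":
    print(bom_hex)
    print(repr(bom_as_ansi))
    print(text)
    # U+1F4A9 has no cp1252 equivalent, so it comes back as a question mark.
    print(to_ansi("\U0001F4A9"))
```

The same bytes give different results depending only on which decoder reads them, which is the point being made about an editor that "doesn't see" the marker.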
#107
"harry newton" wrote
| I've installed Notepad++ but I don't know anything about it yet.

It's a basic coding editor. Like Vim, it's primarily designed to provide garden variety coding features, like line numbers and syntax highlighting. But it's solidly built and includes most extras you might want. There's a dedicated menu just for encoding options.
#108
He who is J. P. Gilliver (John) said on Mon, 9 Oct 2017 13:41:49 +0100:
Does it accept the Alt-plus-NumPad method of entering characters (in Windows)?

Check this screenshot out I just made for you!
http://wetakepic.com/images/2017/10/09/digraphs.jpg

I know more now than I did yesterday, so I can say that this works in vi:
control + q + 147 == adds an opening curly quote unprintable character
control + q + 148 == adds a closing curly quote unprintable character

See also: http://vim.wikia.com/wiki/Entering_special_characters

Try this in Notepad first, to see how it works:
0. Ensure the Num Lock light is on.
1. Press _and hold down_ the Alt key.
2. Type some digits - on the numeric pad, _not_ the top row of keys.
3. Release the Alt key.
(It's a lot smoother than it sounds.)

In vi, this process of [Numlock]Alt + {0,1,2,...9} didn't seem to do anything differently than if I had used the top row of number keys. That is, it typed in the numbers.

Sad, I know, but I had actually memorised the codes for several characters I used a lot: 0177 IIRR for plus-or-minus ±, 0215 for multiply ×, 0176 and 0186 for degrees and raised o ° º (though I can never remember which is which), 230 for micro μ. (Note that a lot of those start with a zero, which _is_ needed.)

Look at this, which I didn't know VIM could do until now!
http://wetakepic.com/images/2017/10/09/digraphs.jpg

Typing the following digraphs all worked just fine in VIM:
control + q + 177 == resulted in a single +/- (±) character
control + q + 215 == resulted in a multiply (×) character
control + q + 176 == resulted in a degree (°) character
control + q + 186 == resulted in an underscore-degree (º) character
control + q + 230 == resulted in a single ae (æ) character
:digraphs (will list all of them in vim)

The zero wasn't needed as is explained here in section 12:32

[This is fine if you _have_ a numeric pad.
Since my default machine is this netbook which doesn't, I now use Allchars which generates them by two-character sequences that are fairly intuitive - +- for the plus-or-minus, "o for o umlaut ö, and so on.]

Now that I know something about the digraphs, I have no problem entering them into VI (using control+q or control+k and then the decimal value). That means I now have (almost) no problem *removing* and/or *replacing* them in vi, simply because the digraphs (what a funny name) are easy to type into the search and replace command. So, with the use of "digraphs", the problem has been instantly solved!

Anyway, this might be a way to get these characters into your vi s/x/y/ string, since you now know their codes; try it. (Try with and without the leading zero.)

I have no problem now entering the digraphs [147] and [148] for the curly quotes. I also have no problem *identifying* the digraphs for anything else since simply sidling up to the character and pressing "ga" will inform me of the digraph's decimal, hex, and octal value in vim. Once I have the digraph value, I can enter it into the search command.

"Everyone has an opinion; everyone has a favorite, a certain one they absolutely swear by."

Including you of course (-:

The beauty of choosing the most well known of all command-line text editors is that my choice can't be wrong since everyone has to compare to it. That generally means two things in canonical freeware:
1. The canonical freeware generally does almost *everything*
2. Anyone copying it has to compare to the canonical freeware
In general, since #1 exists, they copy in #2 by eliminating the faults (e.g., a better GUI or a faster GUI, or whatever) but not in functionality for if they had better functionality, they'd be the canonical freeware and not the copy.

See previous post, that just _perhaps_ you've got "muscle memory" for some of the _in_efficiencies of vi?
Not that I'm suggesting you change - I was the same over Windows 98 (and still use XP in preference to 7).

Well, to support your point, I admit it's not intuitive that a search and replace of a to b is the set of instructions:
:s/a/b
which means:
: = begin a command
s/ = search for
a = a
/b = and replace it with b
But those commands are run ten thousand times a year, so, it's not hard for the muscles to remember them.

However, the great news is that a search and replace of the curly quotes uses the *same* method - so all I needed to learn was what digraphs were!
https://www.merriam-webster.com/dictionary/digraph
"a group of two successive letters whose phonetic value is a single sound (such as ea in bread or ng in sing)"

I don't know everything there is to know about freeware. I'm always learning.

But less open than you once were. (I know; I am too.)

The thing about freeware, which I probably know as well as anyone here does, is that the *price* of freeware is in the immense time and energy to find and use the best freeware.

For example, I've spent entire nights trying to get Shotcut to do exactly what I want it to do. It's not that Shotcut freeware can't do what I want it to do - it's that I don't know the exact steps yet to make Shotcut do what I want it to do.

Once you go through that process, when someone suggests another freeware, e.g., VirtualDub, you look at it for just a few seconds and see the huge faults, and you drop it based on that "hunch".

With freeware, to cut the immense cost of studying each one (we don't get paid to evaluate freeware - because if I did - I would LOVE that job), we have to go on hunches just like an HR rep does when evaluating a stack of incoming resumes.

So maybe Notepad is far more efficient than vi is.

Now, now - nobody ever claimed that.

OK. You're the second person to chastise me for that; but I only belatedly realized people were talking about Notepad++ which is nothing like Notepad. I've installed Notepad++ to test.
But then why can't I find Notepad yet on any list of the most efficient text editors (at least not in my cursory search just now)?

Probably (a) because it clearly isn't (b) the sort of people who vote on such things may include a lot of vi/emacs enthusiasts. (Actually, I doubt there's anything as democratic as a _vote_ anyway: virtually all such lists will be one person's view.)

I disagree. There will almost always be "canonical" software that does a certain job where *most* people will agree and *all* people will have heard of it such that *every* competitor *must* compare against the canonical software.

My humble opinion is that if vi isn't in any given list of "canonical" text editors, then it's not a list of canonical text editors.

Back on topic ... since the problem is solved, I'll write up a summary so that others can jump to the chase, which increases our tribal knowledge overall.
http://wetakepic.com/images/2017/10/09/digraphs.jpg
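A side note on why the leading zero mattered in Notepad but not in Vim: on Windows, Alt+0nnn is looked up in the ANSI code page, while Alt+nnn without the zero is looked up in the older OEM code page. The sketch below models that in Python. It is a simplification that assumes a US system (cp437 as the OEM page, cp1252 as the ANSI page); `alt_code` is a made-up name, not a Windows or Vim function.

```python
def alt_code(digits: str, oem: str = "cp437", ansi: str = "cp1252") -> str:
    """Return the character an Alt+NumPad code would produce (simplified model).

    A leading zero selects the ANSI code page; otherwise the OEM page is used.
    Both code page names are assumptions about a US Windows setup.
    """
    value = int(digits)
    if not 0 < value < 256:
        raise ValueError("only single-byte codes are modelled here")
    codepage = ansi if digits.startswith("0") else oem
    return bytes([value]).decode(codepage)


if __name__ == "__main__":
    for code in ("0177", "0215", "0176", "0186", "230", "0230"):
        print(code, alt_code(code))
```

Under those assumptions, 230 gives the micro sign from the OEM page while 0230 gives æ from the ANSI page, which would match Vim showing "ae" for control + q + 230 if it was treating the value as a cp1252 byte.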
#109
He who is Mayayana said on Mon, 9 Oct 2017 10:13:17 -0400:
And that works? Your link is talking about ANSI replacements, replacing 147 and 148 with 34. It's not common for people to be using those in ANSI text because they'll be corrupted in a non-English codepage. Since you're copying from webpages I'm assuming that you're dealing with UTF-8. But if your method works then....

Everything seems to be working now in VIM once you guys helped me figure out how to determine the special character hex/octal/decimal value, which was to simply type "ga" with the character selected.

Then it was an easy matter to search and replace using standard syntax:
:s/a/b = which means search for "a" and replace it with "b"

I had trouble with the multiple search & replace syntax though:
:s/[a,b]/c = which means search for a and/or b & replace it with c.

But I just switched to a different syntax and it worked:
%d147 & %d148 = this is one way to use the digraph in a command
control+q+147 & control+q+148 = this is another way to use digraphs

The first syntax didn't work in the multiple search and replace command, but the second syntax did work.

FAILED:
:s/[x,y]/z/g == works fine to replace x and/or y with z throughout the file
:s/[\%d147,\%d148]/"/g == fails to replace for every line of the file
Likewise:
:s/x\|y/z/g == works fine to replace x and/or y with z throughout the file
:s/\%d147\|\%d148/"/g == fails to replace for every line of the file

WORKED:
:%s/[control+q+147,control+q+148]/"/g == replaces both with straight quotes
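One caveat about the bracket patterns in this post: inside square brackets a comma is not a separator, it is just one more character to match, so [x,y] also matches commas. The sketch below shows the effect with Python's `re` module rather than Vim's own regex engine; that substitution is an assumption made purely for illustration, although character classes treat the comma the same way in both.

```python
import re

# Sample text with curly double quotes and an ordinary comma.
sample = "\u201cHello, world\u201d"

# The comma inside the class is a literal member, so it gets replaced too.
with_comma = re.sub("[\u201c,\u201d]", '"', sample)

# Listing only the two quote characters leaves the comma alone.
without_comma = re.sub("[\u201c\u201d]", '"', sample)

if __name__ == "__main__":
    print(with_comma)
    print(without_comma)
```

In a file that contains commas, the version without the comma in the brackets is probably the one you want.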
#110
Mayayana wrote:
"harry newton" wrote | Yikes. Is my name in the MS Word document I already uploaded? | http://www.filedropper.com/fixquote | Beats me. The site won't let me download without an account. Are you really serious? You need to get a less obnoxious place to store files. If you don't have anything like your own website or private space, you can probably use dropbox. I find that links from there work fine for direct download if I just change the parameter dl=0 to dl=1. | Is my name in my Word file? | If it's a DOCX rename it to .ZIP and open it. Extract all the files, then use that famous VIM jack-of-all-trades to search for your name. You might also do a binary search in a hex editor. But be warned: DOCX is a bloated mess. I managed to find a DOCX on my system and looked at it just now. No less than 10 XML files are contained in it. In a DOC file the whole thing is stored as semi-binary. The actual ext may be ANSI or unicode. They set it up like a binary file, with sections. (It's a general type that Microsoft refer to as a compound storage file.) I think the personal data comes near the end and may include local paths. I just opened an old Microsoft EULA DOC in HxD hex editor and found this that doesn't appear in the doc itself: Christina Olson... Microsoft Word 10. I've never used MS Office and only use Libre Office as necessary for business contracts and such. So I'm not familiar with how one cleans tracks. I didn't have a problem downloading from Filedropper. You probably needed to have scripts turned on :-) ******* One difference between .docx and .doc, is the .doc format used to have a problem with "tips&tails". When writing out a .doc, the very end of the document might be rounded to the nearest sector, and the parts of the sector not needed, were filled with crap. And that crap might have personally identifiable materials (leakage). If you sent the .doc to someone as an attachment, the non-valid end of the file might be included. 
The difference with .docx, is you might not notice. Since it's a ZIP format, you might not notice binary placed after the end of file. I doubt DOCX has this problem, but it would be pretty hard to tell with just a hex editor. The above .docx download, seems to have no encrypted content, so I have to assume the document was actually empty. Microsoft likes to apply a default encryption to the content, even when a password is not set on a document. The "password" on the default encryption is "VelvetSweatshop". You can use that keyword in a Google search, to get more info. This is why naively dropping a Microsoft document onto your hex editor and "looking for the owners name", isn't going to work. Lots of stuff you might want to look at, has had 50,000 or 100,000 passes of an encryption method applied to it. That's part of the reason a document might "look like binary". Hardly any useful text to be seen. I tried my hand at tracing the handling of such, in the LibreOffice source code (which should have an implementation) but gave up after a while. My shovel wasn't big enough. There is a handling path for virtually every version of Word that ever existed. There are a couple 500 page specs to look at. And so on. Need a big shovel... One of the funny bits about the 500 page spec, is the word "VelvetSweatshop" cannot be found by doing a document search. To hide their embarrassment, Microsoft shows the password as a series of individual characters, as in "the first letter in the password is 0x56". I thought that was pretty clever. You would not want that to be easy to find, now would you. But since the Libreoffice designers figured it out (with the help of the available MS documentation), a Black Hat could figure it out too. That's if there actually was some info to be had. In something other than the Properties the author sets in the document, claiming ownership. I was hoping, by being inside the LibreOffice engine, I could find a buffer with plaintext in it. 
That was my basic plan.

Paul
#111
"Paul" wrote
| Beats me. The site won't let me download without an account.
| I didn't have a problem downloading from Filedropper.
| You probably needed to have scripts turned on :-)

I assumed that much. I'm not about to allow a site to run javascript just to download a file. But in the page code, the script that makes the download option appear after clicking the captcha button is this:

<input type="button" onClick="window.location='http://www.filedropper.com/createaccount.php'">

It didn't do anything for me because it was specifically designed to malfunction without script. At any rate, his later link was fine, to sendspace.com.

| This is why naively dropping a Microsoft document onto
| your hex editor and "looking for the owners name", isn't
| going to work.

It might not work in docx but it works fine in doc. I was only able to find 2 docx on my system. Both were from MS docs downloads. I opened one and it's all in plain text, if you can call XML plain text. So I don't know what you mean about encryption. But I didn't find much revealing info. Core.xml looks like it can contain such info, though. Here's the closest thing to private data I found:

<dc:creator>mon11</dc:creator>
<cp:keywords></cp:keywords>
<dc:description></dc:description>
<cp:lastModifiedBy>mon11</cp:lastModifiedBy>
<cp:revision>44</cp:revision>
<dcterms:created xsi:type="dcterms:W3CDTF">2009-05-27T09:25:00Z</dcterms:created>
<dcterms:modified xsi:type="dcterms:W3CDTF">2009-05-29T08:18:00Z</dcterms:modified>
</cp:coreProperties>

Not very racy.

| One of the funny bits about the 500 page spec, is the
| word "VelvetSweatshop" cannot be found by doing a
| document search.

Are you sure you're getting out enough, Paul? There's something about looking for "VelvetSweatshop" in a Word doc that seems a bit, well, unseemly.
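If renaming to .ZIP and hunting through the XML by hand gets tedious, the same check can be scripted. The following is a rough Python sketch, assuming an unencrypted DOCX whose author fields live in docProps/core.xml as in the snippet above. The function names and the tiny in-memory sample archive are invented for the example; a real file would be read from disk instead.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespaces used by the core properties part of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_author_fields(docx_bytes: bytes) -> dict:
    """Return creator and lastModifiedBy from docProps/core.xml (empty if absent)."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {
        "creator": root.findtext("dc:creator", default="", namespaces=NS),
        "lastModifiedBy": root.findtext("cp:lastModifiedBy", default="", namespaces=NS),
    }


def make_sample_docx() -> bytes:
    """Build a stand-in archive containing only a core.xml part (not a full DOCX)."""
    core = (
        f'<cp:coreProperties xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}">'
        "<dc:creator>mon11</dc:creator>"
        "<cp:lastModifiedBy>mon11</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core)
    return buffer.getvalue()


if __name__ == "__main__":
    print(read_author_fields(make_sample_docx()))
```

This only looks at the declared properties. It says nothing about names that might be buried elsewhere in the document body or in an old binary .doc.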
#112
He who is harry newton said on Sat, 7 Oct 2017 21:38:03 +0000 (UTC):
How can we convert those dastardly curly quotes to straight quotes on Windows?
http://i67.tinypic.com/2h5mjbr.jpg

SOLVED!
http://wetakepic.com/images/2017/10/09/samplede854.jpg

The problem statement (to summarize the thread):
Cut-and-pasted text from web pages has unprintable characters in vi.

The solution (to summarize the tribal knowledge gained & leveraged):
a. Learn how to identify the non-printable digraphs encountered
b. Learn how to replace them with desired printable characters
http://wetakepic.com/images/2017/10/09/fixquote3de984.jpg

SUMMARY:
I learned a new word today, which is "digraphs", defined in VIM as:
:digraphs
http://wetakepic.com/images/2017/10/09/digraphs.jpg

Whenever I cut-and-paste from web pages such as this one:
https://practicaltypography.com/straight-and-curly-quotes.html
Certain digraphs appeared as black boxes in my vim pure text editor.

We tried to fix the problem using Microsoft Word & Notepad but they failed!
http://wetakepic.com/images/2017/10/09/fix4deee6.jpg

We first went off on the wrong track trying to use "smart quotes" in Word:
http://wetakepic.com/images/2017/10/09/smartquotesff622.jpg

Then we went on the wrong track of saving as "pure text" in Word & Notepad:
http://wetakepic.com/images/2017/10/09/fix1.jpg

Then we went on another wrong track of character encoding:
http://wetakepic.com/images/2017/10/09/fix3aa9bb.jpg

But the right track was hiding in plain sight, all along, in gvim.
http://wetakepic.com/images/2017/10/09/fixquote32fc33.jpg

I knew how to do a search & replace in gvim (as does everyone); but I had entered in the backslash thinking that the backslash delimiter was needed to delimit the digraph's leading percent sign which I also had thought was needed!
In short, this is a normal search and replace for multiple items in gvim:
:%s/[x,y]/z/g = search for x and/or y, and replace all with z

The "ga" command tells me the following decimal values in vim:
'66' smartquote is decimal 147 (aka %d147)
'99' smartquote is decimal 148 (aka %d148)
http://wetakepic.com/images/2017/10/09/hex.jpg

Therefore, this worked for entering the curly quote digraphs in vi:
:%s/\%d147/"/g == replaces all '66' curly quotes with straight quotes
:%s/\%d148/"/g == replaces all '99' curly quotes with straight quotes

And this worked for entering the curly quote digraphs in pairs in vi:
:%s/[control+q+147,control+q+148]/"/g == for Windows
:%s/[control+v+147,control+v+148]/"/g == for Linux

But this did not work to enter the digraphs in pairs in vi:
:%s/[\%d147,\%d148]/"/g == fails to replace for every line of the file

I didn't realize the digraph percent sign was not always needed in vi:
:%s/[\d147,\d148]/"/g

So the syntax problem of replacing curly quotes in gvim is now solved!
The same method can be used to replace any non-printable character.

Thanks!
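For readers who would rather do the same replacement outside an editor, here is a minimal Python sketch of the idea. It is not the gvim method summarized above, just the same substitution expressed with a translation table. The single-quote entries, the function names, and the cp1252 default are assumptions added for the example.

```python
# Map each curly quote to its straight counterpart.
# The thread only dealt with the double quotes; the single quotes are an
# extra assumption added here for completeness.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',   # opening double quote (decimal 147 in cp1252)
    "\u201d": '"',   # closing double quote (decimal 148 in cp1252)
    "\u2018": "'",   # opening single quote
    "\u2019": "'",   # closing single quote / apostrophe
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes in already-decoded text."""
    return text.translate(CURLY_TO_STRAIGHT)


def straighten_file_bytes(raw: bytes, encoding: str = "cp1252") -> bytes:
    """Decode, straighten, and re-encode; the encoding is the caller's guess."""
    return straighten_quotes(raw.decode(encoding)).encode(encoding)


if __name__ == "__main__":
    print(straighten_quotes("\u201cHi,\u201d she said, \u2018ok\u2019"))
    print(straighten_file_bytes(bytes([147]) + b"x" + bytes([148])))
```

Because the table lists each character explicitly, ordinary commas and other punctuation are left untouched.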
#113
On Mon, 9 Oct 2017 12:51:20 +0100, "J. P. Gilliver (John)"
wrote: In message , harry newton writes: He who is Char Jackson said on Mon, 09 Oct 2017 00:22:09 -0500: The reason I'm using GVIM on Windows is that vi allows for quick edits. I tried to use Google to find "gvim quick edits" and came up empty. What does GVIM do that other text editors don't do? What are quick edits? I assume those two words don't have their obvious meaning, because every text editor does that, so what is it? I apologize. My words were not clear. Hence I caused unnecessary confusion. I appreciate that you followed the implied meaning of my words, which makes it my fault for not explaining that "vi" is all I ever wanted. GVIM just came with the package when I searched for the best "vi" editor on Windows. I never use the GUI in GVIM. I never use the mouse. Both hands are on the keyboard, because all I do with GVIM is use the "vi" part of VIM. So what I meant by "quick edits" was my own concoction of words to describe that fact that I can cut and paste and copy and rename and delete and reorder, etc., all using keyboard presses. For example, I can delete a line by typing "dd" and I can yank a line by typing "yy" and I can paste those deleted or yanked lines ten lines lower by typing "p" after jumping down ten lines with "10j" such that the self-concocted phrase "quick edits" means something like this sequence: yy10jp == this is a "quick edit" yanking a line to place it 10 lines lower Just out of curiosity: do you _ever_ use Ctrl-C, X, and V? Or highlighting to select (which can be done from the keyboard - I usually do - rather than the mouse)? Or alternatively, select with either the keyboard or the mouse, then either press delete or drag it where you want it. He persists in thinking that the text editor he uses is his only choice, when there are many others that will do at least as good a job as his, and at least as easily. |
#114
He who is Mayayana said on Mon, 9 Oct 2017 10:44:38 -0400:
And then what will you do when you need to type a Euro sign, copyright, or other such symbol?

It's easy now that I know what a digraph is & how to handle them:
http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg

I didn't know this simple answer at the start of this thread because I knew nothing of digraphs (I didn't even know that the word existed).
http://vimdoc.sourceforge.net/htmldoc/digraph.html

Knowing a little bit now about digraphs, the answer to your question happens to be very simple using the gvim editor of choice.
http://vim.1045645.n5.nabble.com/Solved-Display-Euro-character-in-gvim-7-2-td1176874.html

That is, to find all the digraphs in gvim, I simply enter the command:
:digraphs

Here is what results for me, on my old 32-bit gvim freeware:
http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg

It seems that the Euro sign is digraph decimal 128, hex 80, octal 200.
https://en.wikipedia.org/wiki/European_Currency_Unit
https://en.wikipedia.org/wiki/Euro_sign

And that the Copyright sign is the digraph decimal [169].
https://en.wikipedia.org/wiki/Copyright_symbol

To answer your question, I need only type the decimal [169] digraph to show the copyright symbol, and type the decimal [128] digraph and choose a font (e.g., Courier New) that knows about that digraph to show the Euro symbol.
http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg

Notice that solving the Euro problem you proposed just now, also caused me to accidentally solve the smartquote digraph problem by mistake!
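The decimal, hex, and octal values reported here can be cross-checked outside Vim. The sketch below is a small Python illustration that assumes the values are Windows-1252 byte values, which is consistent with 128 showing up as the Euro sign; `describe` is a made-up helper name.

```python
def describe(byte_value: int) -> dict:
    """Show one byte value in several bases and under two single-byte encodings."""
    raw = bytes([byte_value])
    as_cp1252 = raw.decode("cp1252")
    return {
        "decimal": byte_value,
        "hex": f"{byte_value:02X}",
        "octal": f"{byte_value:o}",
        "cp1252": as_cp1252,                       # what Windows "ANSI" shows
        "latin-1": raw.decode("latin-1"),          # same byte under ISO-8859-1
        "code point": f"U+{ord(as_cp1252):04X}",   # the Unicode character it maps to
    }


if __name__ == "__main__":
    for value in (128, 169):
        print(describe(value))
```

Byte 128 is the Euro sign only under cp1252. Under Latin-1 the same byte is an invisible control character, and in Unicode the Euro sign is U+20AC. Byte 169 is the copyright sign in both single-byte encodings.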
#115
He who is harry newton said on Mon, 9 Oct 2017 17:01:05 +0000 (UTC):
SOLVED!
http://wetakepic.com/images/2017/10/09/samplede854.jpg

SOLVED ANOTHER WAY!
http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg

Mayayana asked me how to show the Euro symbol in gvim, and when I figured it out (using my newly found keyword "digraph"), I found yet another and far easier way to solve the smartquote digraphs being non-viewable characters!
http://vimdoc.sourceforge.net/htmldoc/digraph.html

All I needed to do, in gvim, was change the font from the default to something that recognized the Euro (and smartquote) digraphs, such as "Courier New".
http://vim.1045645.n5.nabble.com/Solved-Display-Euro-character-in-gvim-7-2-td1176874.html

Our combined tribal knowledge now records TWO answers to the OP question!
#116
He who is Mayayana said on Mon, 9 Oct 2017 10:02:35 -0400:
It's easy enough to look these things up. In fact I already mentioned them.

I agree with you that I didn't have the knowledge that you have to solve the problem when I first asked the question. The problem with "looking these things up" is that I was looking up these things in the wrong direction.

I first thought that MS Word "smart quotes" would work - but they failed. Then I thought, as you did, that character encoding would work - but it too was never the problem.

So the problem with 'looking things up' is that you have to pretty much know what the answer is before you can find the answer, in such cases. It turns out the answer is all about searching and replacing what is a new word to me, which is digraphs!

Here's a sample that I found listed third at DDG searching for curly quotes in UTF-8. It's not a set of characters for typing curly quotes. It's 3 bytes that define a curly quote in UTF-8 but read as 3 characters in ANSI. And you can't type them.

I think the whole UTF-8 direction is a bust. It's not about encoding (as far as I can tell), especially since GVIM seems to handle any encoding simply by typing the encoding:
:set encoding=cp1252 -- tells vim to use Windows-1252 character encoding
:set encoding=utf-8 -- tells vim to return to UTF-8 encoding.

I think the entire encoding issue was a red herring. (I admit though that I don't understand encoding; all I know is the empirical evidence which is that *every* encoding suggestion I tried failed - and - the empirical evidence that the problem was solved by digraphs, no matter what the encoding).

You don't seem to really be reading the information people are giving you. Why not look into the encoding? Why not switch to a better editor? (Global search and replace is not a rare animal.)

I read everything everyone suggested. I installed all the programs everyone suggested. In a way, that was the problem which took me all night to figure out, which is that most of the suggestions were red herrings.
I'm pretty sure that encoding is a red herring. However, I readily admit that I don't understand encoding so I base that assumption only on the purely empirical evidence of:
1. I set the encoding in every way suggested - and all failed.
2. I solved the problem without dealing with encoding.
https://groups.google.com/d/msg/alt.usage.english/H5lxAMNh5Kw/zbIkHZ_xAQAJ

But I expect you'll also find other issues if you want to keep that webpage text as ANSI. That's why it may not be the simple solution that you insist on finding. Sometimes spaces are done with UTF-8 character combinations. Things like a copyright sign will be different in UTF-8 vs ANSI. And so on.

Thank you for that advice Mayayana, where I openly admit I really don't understand this encoding stuff (because it has never mattered in what I do).

Since I aim to stay in pure text, the only encoding I care about is the KISS super simple most compatible US American Keyboard encoding. That's it.

I didn't know this when I first asked, but since the answer is readily solved with only a simple search and replace, I don't think the encoding was ever the problem.

Also, the answer was solved a second way, by accident, when I tried to answer your question about how to display the Euro, which was to simply change my font from the default to something that recognizes the Euro (and smartquote) digraphs!
http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg

But I could be wrong that encoding isn't the underlying issue since I don't understand why I even have to deal with encoding, since I'm perfectly happy with the characters on the US American keyboard in any standard font.
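Since the "3 bytes read as 3 characters" point keeps coming up in this exchange, here is a small Python sketch of what it means for the opening curly quote. It is only an illustration and assumes "ANSI" is Windows-1252. It also shows where the decimal 147 reported by "ga" would come from under that assumption.

```python
opening = "\u201c"  # the opening curly double quote

# One character, but a different number of bytes depending on the encoding.
utf8_bytes = opening.encode("utf-8")     # three bytes: E2 80 9C
ansi_bytes = opening.encode("cp1252")    # one byte: 0x93, i.e. decimal 147

# Reading the UTF-8 bytes as if they were cp1252 yields three characters.
misread = utf8_bytes.decode("cp1252")

# The damage is reversible as long as nothing else has altered the text.
repaired = misread.encode("cp1252").decode("utf-8")

if __name__ == "__main__":
    print(list(utf8_bytes), list(ansi_bytes))
    print(repr(misread), repr(repaired))
```

So whether a search for decimal 147 finds anything depends on which of the two byte forms is actually in the file, which is one way the encoding can matter even when a plain search and replace appears to be all that was needed.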
#117
He who is J. P. Gilliver (John) said on Mon, 9 Oct 2017 12:46:40 +0100:
Note: You can find and replace all instances of single or double curly quotes with straight quotes in your document. To do this, clear the "Straight quotes" with "smart quotes" check box on the AutoFormat As You Type tab. On the Edit menu, click Replace. In both the Find what and Replace with boxes, type ' or ", and then click Find Next or Replace All.

I haven't tried this. (Note it says put ' or " in _both_ boxes, i.e. tell it to replace " with " - I grant that would _not_ have occurred to me!)

Now that the problem is solved two different ways, both written up to leverage the tribal knowledge gained...
https://groups.google.com/d/msg/alt.usage.english/H5lxAMNh5Kw/zbIkHZ_xAQAJ
I'm catching up on all the helpful responses.

Your suggestion above turned out to be *perfect* which was that the second simplest solution of all was to figure out what the digraphs were (a new word for me) for the non-printable characters and then to run a global search and replace.

The digraphs were found by the gvim "ga" command as decimal 147 and decimal 148 respectively for the opening & closing curly quotes. So the standard gvim search and replace command replaces them with straight quotes.

The only problem was that I had to learn a few basics about text editors:
a. I had to realize that non-printable digraphs were the problem.
b. I had to realize that encoding to "pure text" was not the problem.
c. I had to learn how to find a digraph's decimal equivalent.
d. I had to syntactically learn how to search-and-replace digraphs.

Once those hurdles were solved, the great news is that replacing digraphs is the same as replacing anything else - only with some added syntactically required characters.

Hence, your approach of learning the search-and-replace was spot on perfect!
It turns out that Mayayana's recent suggestion of displaying the Euro caused me to accidentally find the *simplest* solution, which was simply to change the font in gvim to Courier New (or any font that knows about the smartquote (and euro) digraphs!).
http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg
#118
He who is J. P. Gilliver (John) said on Mon, 9 Oct 2017 12:51:20 +0100:
Just out of curiosity: do you _ever_ use Ctrl-C, X, and V? Or highlighting to select (which can be done from the keyboard - I usually do - rather than the mouse)? In vi, to copy and paste and to change a word or to change a sentence is so easy that you "often" *don't need* the control+x, control+v, and control+c keyboard shortcuts. But I use them all the time in *all* my programs, as they are embedded into muscle memory just as much as the vi commands are. So, I use them in vi if/when my muscle memory knee-jerks them into place! (Also: the above would move an individual line. How easy is it to move, say, a sentence, that starts in the middle of a line and ends in the middle of the next line - to a position in the middle of another line?) The gvim editor isn't perfect, but the long answer to your question is that a regular expression can find *anything* and then the rest of the commands just move around what the regular expression found. I almost never need to use the gvim GUI, but if I needed to do something that was easier done with the GUI (aka the mouse) than by regular expressions, my "lazy" attribute (inherent in all computer knowledgeable people) would take over. In that case, I'd just select the line and cut and paste it just like you would. Having the option to do whatever you want is what the best freeware is all about. GVIM isn't the only pure text editor, as I'm sure they *all* can do what you just asked ... can't they? |
#119
Convert those dastardly curly quotes to straight quotes on Windows?
He who is Ken Blake said on Mon, 09 Oct 2017 10:59:18 -0700:
Or alternatively, select with either the keyboard or the mouse, then either press delete or drag it where you want it.

Exactly. I don't think there is a text editor alive that can't cut and paste with the mouse.

He persists in thinking that the text editor he uses is his only choice, when there are many others that will do at least as good a job as his, and at least as easily.

I don't want to argue with you, especially as you're helping me, and as, together, we just increased the tribal knowledge of the ng as a whole because the answer to the original question turned out to be super simple, as summarized here: http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg https://groups.google.com/d/msg/alt.usage.english/H5lxAMNh5Kw/pSNETZP2AQAJ

The only reason this thread is long is simply because nobody knew the answer until now (one day later).

However, having said I don't want to argue, I will state that I'll use the best of the freeware that is out there, where the *cost* of the freeware is in learning which is the best and in learning how to use it. For example, I am just now learning how to use Shotcut freeware, which, like GIMP freeware, does *everything* you need to do - but in a super secret way. Why is it the best then? Because it does *everything* you need it to do. The cost is in *learning* how to do it. Once that cost is expended, then trying out dozens of other video editors isn't fruitful anymore. So you want to stick with the canonical software that can do everything. Gvim is such software. Notepad++ might be such software - I don't know - it's too early to tell.
#120
Convert those dastardly curly quotes to straight quotes on Windows?
He who is Mayayana said on Mon, 9 Oct 2017 11:13:29 -0400:
You pasted into Notepad and saved as UTF-8, but then you opened it in VIM, which is rendering ANSI. Try opening the same file in Notepad. I know you don't want to hear this, but it's complicated.

I think the super simple simplest answer is here: https://groups.google.com/d/msg/alt.usage.english/H5lxAMNh5Kw/pSNETZP2AQAJ

If you save text as UTF-8 you solve the problem, but then you need to read/edit as UTF-8 and need a program that recognizes it. Suddenly you're dealing with a number of plain text formats.

I can't explain why it worked, but simply changing the GVIM font from the default to "Courier New" solved the problem of me not seeing the smartquote digraphs! http://wetakepic.com/images/2017/10/09/digraphs98b40.jpg

Notice the first 3 weird marks in your VIM screenshot? Those are the UTF-8 marker characters, bytes EF BB BF, that tell the editor it's UTF-8. Vim doesn't see them. It's just rendering them as characters.

How does that explain that it works when I change the font?

But your sample provides a good example of what you're up against. There are more UTF-8 characters in that text than just curly quotes.

I agree. The problem seems to also show itself with the Euro.

That's why I was saying that it's not as simple as a basic replacement. The number and variety of UTF-8 characters in copied text is an unknown. If you just need to do a replace on curly quotes that's doable, but any more than that may take time.

The problem seems to be solved by changing the font, so that's about as simple as things get (since I don't care what font I use for text editing). Can anyone explain why simply changing the font worked?
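To make the EF BB BF point above concrete, here is a rough Python sketch that checks for that UTF-8 marker before deciding how to decode a file, then swaps curly quotes for straight ones. It is only an illustration under stated assumptions: text without the marker is treated as Windows-1252 ("ANSI"), which is a guess rather than real detection, and the names decode_and_straighten and CURLY_TO_STRAIGHT are made up for this example.

```python
import codecs

# Unicode curly quotes mapped to their straight ASCII equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single curly quote
    "\u2019": "'",   # right single curly quote
    "\u201c": '"',   # left double curly quote
    "\u201d": '"',   # right double curly quote
})


def decode_and_straighten(data: bytes) -> str:
    """Decode file bytes and replace curly quotes with straight quotes.

    Hypothetical helper. If the data starts with the UTF-8 marker
    (bytes EF BB BF) it is decoded as UTF-8 with the marker dropped.
    Otherwise it is ASSUMED to be Windows-1252, which may be wrong
    for UTF-8 files saved without a marker.
    """
    if data.startswith(codecs.BOM_UTF8):
        text = data[len(codecs.BOM_UTF8):].decode("utf-8")
    else:
        # errors="replace" avoids a crash on the few byte values
        # that Windows-1252 leaves undefined.
        text = data.decode("cp1252", errors="replace")
    # Other non-ASCII characters (the Euro, for example) are left alone.
    return text.translate(CURLY_TO_STRAIGHT)


if __name__ == "__main__":
    utf8_sample = b"\xef\xbb\xbf\xe2\x80\x9chello\xe2\x80\x9d"
    ansi_sample = b"\x93hello\x94"
    print(decode_and_straighten(utf8_sample))
    print(decode_and_straighten(ansi_sample))
```

Only the four quote characters are touched, so anything else in the pasted text still depends on the editor's encoding setting and font.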