#1
Scanning text
I need advice on the correct way to scan a text document in order to add it to an email. I have an Epson scanner which allows me to select the type from JPEG, Bitmap, TIFF, PDF etc. I chose PDF, selected Text, and scanned the newspaper cutting. I saved it to my scanning file and it transpires that it is now an Adobe Acrobat Document. This I can open, but can my recipients open it if they do not have the Adobe Reader? Alternatively, how should I send it to ensure that my friends can read it?

Blair
#2
Send it in an image file format rather than a document file format. JPG, BMP and TIFF are all image files (i.e. pictures): JPG keeps the size down, BMP is the original (uncompressed), and both should open without problems unless the PC itself has problems; TIFF may be a problem. IE or even Paint can open them, whereas PDF is a document file format for Adobe Reader, so the recipient needs that (or a similar program) installed to view it.
#3
You're discussing optical character recognition (OCR) as an option for handling content. Here are a few examples of sending a newspaper article to a friend.

*******

Scan ------ bitmap ----- bitmap stored in .PDF file

In that case, the resulting newspaper.PDF is neatly and accurately preserved. It looks almost as good as the original piece of newspaper. The recipient's eyeballs will do an excellent job of looking at any marginally captured printed letters and figuring out what the article says. (Humans are better at that than any computer program. OCR sucks.) In addition, if the newspaper article includes pictures, they'll be preserved. This requires that the recipient own a PDF reader program.

The intermediate case works like this.

Scan ------ bitmap ------+------ bitmap plus words, stored in .PDF file
                         |         ^
                         |         | Text
                         +-- OCR --+

Some optical character recognition software (it may come with the scanner, or be purchased separately) has the ability to find letters in the image and lay a text string over the top of the scanned bitmap image. Any text which is not recognized is not converted. In some cases, this gives an enhanced appearance to the text. But the recognition is at best perhaps 99%, so there will be errors. And the error-filled letters will hide the part of the bitmap underneath, which holds the originally captured image. In that case, you send newspaper.PDF to your friend, and the OCR may make a few distracting mistakes.

Now, in the third case, the operation looks like this.

Scan ------ bitmap
              |
              |
              +-- OCR --- words only, stored in a text file

Again, OCR is used to convert what looks like text in the bitmap scan into letters. The letters are then stored in a text file, with poor formatting. If the original newspaper article was three columns of text, I can't guarantee all the words are in neat columns. It may take significant text editing after the OCR step to make a presentable document.

That third option, of using a text file, means that more tools at the recipient's end can be used. You can even copy the text right into the email and not bother making an attachment from it. But any transcription errors in the text have to be fixed at your end, to prevent annoying the recipient with a multitude of errors.

I think the first option, scanning as a bitmap, saving as PDF and sending that, is the best compromise. It requires no labor on your end, and there is no possibility of OCR errors (because you're not using OCR). The only disadvantage is that the recipient must have a PDF reader program. If you could find a small enough PDF reader program, you could send that by email as well. At least one non-Adobe free one is near the end of the list (Foxit Reader). Perhaps that is smaller than the multi-megabytes of some Adobe download.

http://en.wikipedia.org/wiki/PDF_reader

There is a fourth conversion option. It would look like this.

Scan ------ bitmap ----- bitmap stored in JPEG picture file

In that case, the recipient needs an image viewer program. And the computer may already have one of those, even if it doesn't have a PDF viewer.

Choose an image format which you know the OS of the recipient can handle. There are formats other than JPEG, for example. For highest compression, I might select CCITT compression in a TIFF format, with thresholding used to convert the scan into black and white. By adjusting the threshold, you can send a page of text in about 50 KB as an image file. And that is easier for someone who is on dialup to download. That is the computer equivalent of using a FAX machine, since the CCITT compression algorithm is the same one as is used for faxes.

http://en.wikipedia.org/wiki/Fax

"The transferred image formats are called ITU-T (formerly CCITT) fax group 3 or 4."

The TIFF image format has an option for that kind of compression. But on your end, you need to use an image editing program to reduce the image to black and white from the original color scan. "Setting the threshold" determines whether a dot is black or white. You adjust the threshold until you can read the newspaper print in the image. And you'd only do the extra work if you knew the recipient was on dialup. If your friend is on broadband, then just send the JPEG image file as an attachment to the email.

The best options don't require excessive work on your part. PDF or JPEG, and you could be done in five minutes.

I'm a big fan of

1) Knowing the recipient's skills and the capabilities of their computer.
2) Only sending something they can open.

I consider it rude to send something they don't have a chance of opening. (Like if someone sends me a Christmas morning movie they shot, in a format I can't even identify.) But, that's just me :-)

*******

When you're using the scanner, you don't need to scan at extremely high dots per inch. Newspapers and fine art magazines are printed with "dots" of ink. To scan them, select the "de-screen" option in your scanner software, and tell the software what the screen print frequency is. For example, on a fine art magazine, there might be about 150 dots of ink per inch. You tell the scanner what that screen frequency is. You can scan somewhere between 1x and 2x the screen frequency for adequate results. If the newspaper uses 75 dots of ink per inch, then scanning at no more than 150 dots per inch, with de-screen turned on, should suffice. (That's just going from memory. I haven't used my scanner in years :-) It's all dusty.)

You can do a practice scan at high resolution (1000 DPI) on a tiny section of newspaper first. Then open the image in your image editor, and count how many dots of ink there are per inch. On your final scan, you can use the information gathered to set up the de-screen.

http://en.wikipedia.org/wiki/Color_printing

"Screens with a "frequency" of 60 to 120 lines per inch (lpi) reproduce color photographs in newspapers. The coarser the screen (lower frequency), the lower the quality of the printed image. Highly absorbent newsprint requires a lower screen frequency than less-absorbent coated paper stock used in magazines and books, where screen frequencies of 133 to 200 lpi and higher are used."

This article gives more practical advice.

"Why we use descreening"
http://www.scanhelp.com/288int/scontent/descreen.html

HTH,
Paul
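The thresholding and de-screen arithmetic described in the post above can be sketched in a few lines of Python. This is only an illustration, not what any scanner driver or image editor actually runs: the function names, the tiny sample patch, the default threshold of 128, and the 1240 x 1754 pixel page (roughly A4 at 150 DPI) are all assumptions made for the example. The 1x to 2x range is the rule of thumb the post quotes from memory.

```python
def to_black_and_white(gray_rows, threshold=128):
    """Turn 8-bit grayscale pixels (0 = black, 255 = white) into 1-bit pixels.

    A pixel darker than the threshold becomes 1 (ink), otherwise 0 (paper).
    Raising the threshold turns more pixels black; lowering it turns more white.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]


def packed_size_bytes(width_px, height_px):
    """Size of a 1-bit image before compression, with rows padded to whole bytes."""
    bytes_per_row = (width_px + 7) // 8
    return bytes_per_row * height_px


def descreen_scan_range(screen_frequency):
    """Rule of thumb from the post: scan at 1x to 2x the print screen frequency."""
    return screen_frequency, 2 * screen_frequency


# Hypothetical 2x4 patch of a scan: dark print on lighter newsprint.
patch = [
    [30, 200, 90, 180],
    [210, 60, 140, 220],
]

print(to_black_and_white(patch))      # [[1, 0, 1, 0], [0, 1, 0, 0]]
print(packed_size_bytes(1240, 1754))  # 271870 bytes, before any fax-style compression
print(descreen_scan_range(75))        # (75, 150)
```

The packed size shows why thresholding helps even before compression is applied: one bit per pixel instead of eight or twenty-four. Getting from there to the roughly 50 KB mentioned in the post is the job of the CCITT compression step, which this sketch does not attempt.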
#4
But, sir, Adobe Reader is a ubiquitous program that is absolutely free and freely downloadable. The whole idea of .pdf files is that (once the absolutely free Adobe Reader is downloaded and installed) they can be opened on any computer. And there is no excuse for such a useful free program not being on any computer, especially when some versions of .pdf files are also searchable. Surely this facility makes it far more desirable to send a .pdf file as opposed to a .jpg or some such image file.

choro
*****

In reply to pjp, 10/12/2010 15:29.
#5
It is very doubtful that there is anyone in the world who doesn't have a PDF reader on their computer. PDF is an extremely common format and it is used all the time.
#6
You asked for the "best" way to scan, save and send your text document...

The best way, if you have Microsoft Office installed or your Epson scanner came with OCR (Optical Character Recognition) software, is to convert the scan to text. If you have M$ Office, you can use OCR with Microsoft Office Document Imaging, under Office Tools in the Office Start Menu.

The point is that converting your text document with OCR software into a digital text file, such as a plain [.TXT] or Microsoft [.DOC] file, is going to save on size dramatically: the result is in the order of 1% of the size of an image file.

Your PDF solution seems a good idea, and it is about the best of the imaging choices. I say imaging because, although the PDF format uses a mixture of compressed images *and* formatted text in the way it stores its presentations, if you choose not to employ OCR, the PDF will contain just a compressed image, measurably larger than a text-based PDF file. There's a big difference in size between a PDF document that has used OCR and converted the actual text from the scanned document, and one that has simply made a compressed bitmap out of the scanned image.

By using Microsoft Office Document Imaging, or the OCR software bundled with your Epson scanner, you can save your file (once converted to text) as a character-based file such as [.TXT], [.RTF], [.WRI] or [.DOC], which are by far the most popular text formats.

A typical A4 letter that has been scanned and left as an image, or converted to .PDF [NOT USING CHARACTER RECOGNITION], will turn out at around:

  200 KB for a PDF without OCR (.pdf)
  200 KB for a JPEG compressed image (.jpg)
  1.5 MB for an uncompressed bitmap (.bmp)

However, a scanned document converted to text using OCR software will turn out at around:

  13 KB for a PDF *using* OCR (.pdf)
  3 KB for an M$ Word document (.doc)
  1 KB for a plain text file (.txt)

So you can see that getting your letters into text (character) format will be immensely beneficial to the final size of the file you are trying to attach and email.

==
Cheers, Tim Meddick, Peckham, London. :-)
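The gap between image and text sizes in the figures above follows from simple arithmetic. The sketch below is only an illustration: the A4 dimensions are standard, but the scan resolution, bit depth, and character count are assumed values chosen for the example, not settings taken from the scans in the post.

```python
A4_WIDTH_MM = 210
A4_HEIGHT_MM = 297
MM_PER_INCH = 25.4


def uncompressed_bytes(dpi, bits_per_pixel):
    """Raw pixel data for an A4 scan, ignoring file headers and row padding."""
    width_px = round(A4_WIDTH_MM / MM_PER_INCH * dpi)
    height_px = round(A4_HEIGHT_MM / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8


def plain_text_bytes(char_count):
    """Plain ASCII text needs one byte per character."""
    return char_count


# Assumed example: a 100 DPI, 8-bit grayscale scan of a short letter.
image = uncompressed_bytes(dpi=100, bits_per_pixel=8)
text = plain_text_bytes(1000)  # assumed: about 1000 characters on the page

print(image)          # 966763
print(image // text)  # 966
```

Every pixel costs storage whether or not it carries ink, while text costs one byte per character, which is why the OCR route shrinks the attachment so sharply. Compressed formats such as JPEG or PDF narrow the gap but do not close it.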
#7
If you doubt the recipients can read Adobe PDF files (see other replies), you must ask what formats they can read, or at least what operating systems their PCs use. We know WinXP PCs are configured to display standard graphic formats (JPG, BMP, TIF etc.)

--
Don Phillipson
Carlsbad Springs (Ottawa, Canada)
#8
I am very grateful to you and others who have replied. I have learned a lot which will stand me in good stead in future. It looks as though my decision to use a PDF file was the best one. I sent the first trial to a friend and he tells me he read it OK, but the image was very small and he couldn't increase the size.

Blair
#9
The Acrobat Reader has a zoom function. Depending on the version, it can probably magnify at least 16 times. So that could be used to fix it at the recipient's end. (Try looking in the "View" menu.)

You should review your documents in your own copy of Acrobat Reader before sending them. That will allow you to anticipate problems before they happen.

I realize the latest version of Acrobat Reader is not very friendly. One of the reasons I haven't upgraded is that Acrobat Reader 9 has a dreadful interface. I continue to use version 6, as it is easier to use and I'm more productive. (The search function works better.) I don't know "what Adobe was smoking" when they wrote version 9. It's a step backwards. Sometimes I'm forced to use version 9, because the document I get won't open in 6. And I really hate having to use another computing environment to do that.

I'm surprised your dedicated scanning program, with PDF output, didn't do a better job for you. There isn't much point in such a program unless it is easy to use and produces perfect results. I've converted lots of documents using half-baked free tools, and it can take many tries to get all the scale, DPI, and other issues sorted out. When you pay for a program to do it, it's supposed to work :-)

Paul
#10
Again, the best way to create an Acrobat PDF file (if you haven't got £700 to spend on the full Acrobat Suite) is to create your document in Word and then use one of the third-party PDF virtual printers that are currently being distributed as freeware (Cute PDF is one).

That way, you can get a very clear idea of what your PDF file is going to look like as a finished article, pictures and text. Otherwise, you may find that your pictures are at too small a resolution to gain any benefit from Acrobat Reader's zoom facility.

The "Virtual Printer" is installed like any software and appears as an installed printer. When you select the Virtual Printer instead of your "real" one, a "Save As..." dialog appears for you to choose the file to save the PDF document to. This is not the same as the "Print to file" check-box option, which does not produce any coherent results.

Cute PDF can be obtained from:
http://www.cutepdf.com/download/CuteWriter.exe

Cute PDF Writer also requires the installation of Ghostscript:
http://www.cutepdf.com/download/converter.exe

==
Cheers, Tim Meddick, Peckham, London. :-)
#11
Or download and install the free OpenOffice.org, where you can open any MS Office user file and save it as a PDF file. Simple and Free.... Freeee....!

--
choro
*****

In reply to Tim Meddick, 13/12/2010 10:55.
#12
In reply to bm, 12/10/2010 7:48 AM.

Many good suggestions have been offered. I would like to offer a different way to decide what format to use. Over several years I have sent many e-mail messages to a group of about 500 people. I have tried many different formats and have received several complaints. My experience tells me the following:

TEXT: Everyone can read text.
RTF: Few complaints.
MS Word: A few complaints.
MS Excel: Quite a few more complaints.
PDF: Very few complaints.
JPG: Several more complaints than PDF.
TIF: A big mistake to send it this way.
PUB: Huge problems, it was a mistake on my part to use this.

Wilby