PCbanter - View Single Post - Acrobat failed to connect to a DDE server

**Arlen Holder** · June 22nd 18, 07:04 PM posted to alt.comp.os.windows-10,alt.comp.freeware

On Fri, 22 Jun 2018 16:13:32 +0100, wasbit wrote:

This is what I understand by a clickable link (hyperlink).

Ah. We're using the same words, sort of, but we assign different meanings
to them. No worries. I'm an old hand at creating archives of web sites
using PDF, where it seems only Paul understood the problem set.

At the moment, there are only two known tools that will create the type of
archive Paul and I are speaking about, only one of which is free:
* The Adobe Acrobat payware (control+shift+o)
* The https://wkhtmltopdf.org command line tool

It's funny but I've used the Adobe Acrobat payware so many times to create
a self-contained archive of an entire web site that my "muscle memory"
still remembers, after all these years, the keyboard command sequence!

I go to a web site & in my browser select File/Print, then PDF Creator as
the printer, the options that I want, & save. I have never selected the
Print to File option. In fact for the first time ever, I opened the PDF
Creator settings today.
I open the saved PDF & if there is a link within, I click on it & it takes
me to the relevant web page using my browser.

Thanks for explaining. That's not what we (Paul and I) were talking about.
Almost every PDF creator does that, which is why I knew, sort of, that we
had a disconnect when you mentioned CutePDF and the others, as I know they
don't do what Paul's freeware wk-html-to-pdf command-line tool does.

The resulting PDF has to be *completely self contained*.
That is, EVERY link takes you to a page within the PDF in all cases.

Obviously there are dangers, which Paul directly noted when he stated how
*huge* the Microsoft PDF he referenced was. The one PDF document could
potentially contain the entire Internet.

That is, every single link that is referenced at the web site is "crawled"
and included, and then every link that is on every web site is crawled and
included, ad infinitum.

Clearly this gets huge real fast, so, often, you set the switches to limit
the crawl to just the home domain, as Paul and I were discussing, or you set
them to stop at the second, third, fourth, fifth, whatever level deep.
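
To make the "home domain only" and "N levels deep" limits concrete, here is a
minimal Python sketch of a breadth-first crawl with both limits. It is an
illustration, not how Acrobat or any other tool does it internally. The
`links` dictionary is a hypothetical in-memory link graph standing in for
actually fetching and parsing pages, and `crawl_order` is a made-up name.

```python
from collections import deque
from urllib.parse import urlparse


def crawl_order(start, links, max_depth):
    """Return the URLs reachable from `start` in breadth-first order,
    staying on the start URL's host and at most `max_depth` clicks away.

    `links` is a hypothetical dict mapping each URL to the URLs it links to.
    """
    home = urlparse(start).netloc
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(url, []):
            if urlparse(target).netloc != home:
                continue  # off-site link: skip to avoid crawling the Internet
            if target in seen:
                continue  # already scheduled
            seen.add(target)
            order.append(target)
            queue.append((target, depth + 1))
    return order
```

With `max_depth` set to 1 you get the start page plus the same-site pages it
links to directly; raise it and the page count can grow very quickly.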

The result though is *always* a single PDF file, which is self-contained.
Every link is active but it goes to a page that is inside the PDF file.

The one PDF file, as Paul noted, is a self-contained mirror of the entire
web site (if you go that deeply).

This is immensely useful for a variety of reasons (otherwise it wouldn't
exist). But it's hard to find in freeware. As I noted, for years, it was
the only reason I paid for Adobe Acrobat payware.

Unless there is something that I've forgotten, this is, for me, the standard
procedure. I don't need to alter any settings.

Now you know that we were talking about different things.
I was talking about mirroring, for example, the entire Microsoft web site
into a single PDF, where every link that you click on in that PDF goes to a
page that is in that PDF.

There are only two known tools that do that.
1. Adobe Acrobat payware (control + shift + o)
2. Paul's suggestion of wk-html-to-pdf

HINT: I don't generally ask easy questions.
Paul knew, instantly, that it was a difficult task.

I don't blame you in the least for trying to help, as I *appreciate* your
help. I just provided my log file of PDF Creator, and I had forgotten that
I already had it (on another machine which had been bricked by Cortana).

I've tested *all* the known PDF writers over time. You wouldn't know this,
but I was one of the first people to get on the PDF bandwagon, oh, I don't
know, so very many years ago, around the time Apple did too. I thought it
was great stuff at the time for portability but my company constantly told
me that they never heard of it and that they didn't want to support yet
another file format. How dumb they were ... as everyone supports PDF
nowadays. But (if you believe me), that proves I was early on the
bandwagon.

What happens occasionally is that the PDF will only save as the first page
or a link doesn't work. I have always put this down to (but don't know) the
permissions set by the author of the web page rather than a failure of the
printer.

Yeah. There are *plenty* of places for such tools to fail, especially as
some pages only exist in code, others require flash, some only work with
certain browsers, others have password needs, etc.

Ok, I see where you are coming from now. I printed the search page from my
browser & none of the links were clickable.

Right. Thanks for testing. Not only must the links be clickable, but they
have to reference pages within the PDF itself. In essence, the PDF becomes
a single copy of the entire web site, and, potentially, the entire
Internet.

That's why Paul mentioned that the Microsoft PDF he referred to was huge.

To my mind, you need a web site copier like HTTrack rather than a PDF
printer
- http://www.httrack.com/

Thanks for that reference, which seems to simply mirror the HTML web site
locally (which wget already does, does it not?).

Once you mirror it, you still need to print it to a single PDF which has
clickable links which resolve to other pages in that clickable PDF.
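
One way to picture that last step is to merge the mirrored pages into a single
HTML document and retarget every link to a mirrored page so that it points at
an anchor inside that document. The Python sketch below does only that. It
assumes double-quoted, absolute `href` values and uses a regular expression
rather than a real HTML parser; `merge_pages` and the `page-N` anchor names
are made up for illustration. Whether a given PDF converter then keeps those
fragment links as in-document links is something you would have to test.

```python
import re


def merge_pages(pages):
    """Merge pages into one HTML string with internal links.

    `pages` is a hypothetical dict mapping each page URL to its HTML body
    fragment. Links to URLs in `pages` become "#page-N" anchors; all other
    links are left untouched.
    """
    anchors = {url: f"page-{i}" for i, url in enumerate(pages, start=1)}

    def retarget(match):
        url = match.group(1)
        if url in anchors:
            return f'href="#{anchors[url]}"'
        return match.group(0)  # not a mirrored page: keep the external link

    sections = []
    for url, body in pages.items():
        body = re.sub(r'href="([^"]*)"', retarget, body)
        sections.append(f'<div id="{anchors[url]}">{body}</div>')
    return "<html><body>" + "\n".join(sections) + "</body></html>"
```

A real site would also need relative URLs resolved and images, stylesheets
and scripts carried along, which this sketch ignores.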

I appreciate the HTTrack suggestion - but if I wanted to mirror a web site
in HTML (millions of files, of course) I'd just use wget, I think.
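
For reference, a hedged sketch of what that wget run might look like, written
as a small Python helper that only builds the argument list. The long options
used are standard GNU wget switches, but the helper name and the particular
combination are just one reasonable choice, not a recommendation from anyone
in this thread.

```python
def wget_mirror_command(url, domain, depth):
    """Build (but do not run) a wget command line for a local HTML mirror."""
    return [
        "wget",
        "--recursive",          # follow links
        f"--level={depth}",     # stop this many levels deep
        f"--domains={domain}",  # stay on the home domain
        "--convert-links",      # rewrite links to point at the local copies
        "--page-requisites",    # also fetch images, CSS, etc.
        "--no-parent",          # never climb above the starting directory
        url,
    ]


# To actually run it, something like:
#   import subprocess
#   subprocess.run(wget_mirror_command(url, domain, depth), check=True)
```

The result is still a directory tree of many files, not a single document.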

To mirror a site in a single PDF, that's what Paul and I were talking about.

You'll say - I asked for a PDF option
I'll say - what difference does the type of saved file make as long as it
achieves the end result?

Well, a PDF is a single file. The HTML mirror you speak about would be a
billion files. They're completely different things, like motorcycles and
cars are completely different things.

If you need to haul a refrigerator in a single trip, then a pickup is what
you need, not a zillion motorcycles.

Anyway, Paul answered the question, which is that there are two known apps
which will create a single self-contained clickable PDF of an entire web
site, one of which is free, the other of which is the Adobe payware.

Really, this discussion should be in alt.comp.freeware but there is very
little relevant activity in that group now.

I'm an old hand at a.c.f and I agree that it has gone to ruin.