Mayayana wrote:
"J. P. Gilliver (John)" wrote
| (Unfortunately, TP stores it with strong encryption, that can't be
| turned off.)
According to this...
https://www.mailxaminer.com/blog/inv...client-emails/
...it uses Berkeley Mailbox format, or at least can
export in that format as a .mbox file. That's also
the format that TBird uses, or rather a variation
of it. (I wasn't aware of the name or of its being
a standard. I just had to figure out what the
pattern of storage was in TBird and it turns out
to be like BMF. It's not a very sensible format
because it allows for ambiguity, but I suppose

What ambiguity?

the Mozillians have a higher regard for tech
"historicity" than for efficiency. Most geeks do.
They're a surprisingly conservative bunch.)
At any rate, it's a simple format to parse. They
use an empty line followed by "From -", which is
then followed by 3 lines of superfluous officiality.
That pattern separates each email. I say it's
poorly designed because "From -" is not a unique
string, and empty lines are common in emails.
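
A minimal sketch of that splitting logic in Python, assuming the whole file has already been read into a string and that a message starts at any line beginning with "From " that follows an empty line (or opens the file). The name split_mbox and the sample text are made up for illustration; this is not TBird's own code.

```python
def split_mbox(text):
    """Split mbox text into messages, each beginning with its 'From ' line."""
    messages = []
    current = None
    prev_blank = True  # the start of the file counts as a boundary
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") and prev_blank:
            # A separator line: close the previous message, start a new one.
            if current is not None:
                messages.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
        prev_blank = line.strip() == ""
    if current is not None:
        messages.append("".join(current))
    return messages


# Hypothetical two-message sample.
sample = (
    "From - Mon Jan  1 00:00:00 2024\n"
    "Subject: first\n"
    "\n"
    "body one\n"
    "\n"
    "From - Tue Jan  2 00:00:00 2024\n"
    "Subject: second\n"
    "\n"
    "body two\n"
)
print(len(split_mbox(sample)))
```

For real mailboxes, Python's standard mailbox module (mailbox.mbox) is probably the safer choice than a hand-rolled splitter.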
In mbox format, any leading "From " in the body of an email message is
escaped with a '>', i.e. "From " becomes ">From ". This is done
because - as you've found - a "From " line signifies the start of the
next message.
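
A rough Python sketch of that escaping, assuming the simplest variant where only body lines that start with "From " get a '>' prepended. The function names are made up for illustration. Note that this simple scheme can't tell an escaped line from one that really began with ">From " in the original; the "mboxrd" variant avoids that by also quoting lines that already start with ">From ".

```python
def escape_from_lines(body):
    """Prepend '>' to every body line that starts with 'From '."""
    out = []
    for line in body.splitlines(keepends=True):
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "".join(out)


def unescape_from_lines(body):
    """Strip one leading '>' from every body line that starts with '>From '."""
    out = []
    for line in body.splitlines(keepends=True):
        if line.startswith(">From "):
            line = line[1:]
        out.append(line)
    return "".join(out)


body = "Hello,\nFrom the archive:\nBye\n"
escaped = escape_from_lines(body)
print(escaped)
```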
An example from :-) my News/posted mbox file:
>From the 2006 The Washington Post's Style Invitational [1]:
I.e. this is not a *quoted* line, but an *escaped* unquoted line,
[...]