Posted to alt.windows7.general, March 16th 19, 10:59 PM
Paul[_32_]
Subject: is "Everything" doing some mining?
J. P. Gilliver (John) wrote:
> In message , Mayayana writes:
> [Everything]
>> I'm surprised it needs to index if it doesn't do
>> content search. Can it actually be faster to search
>> its own list than to search the file system?
>
> I think maybe it _does_ search the file system; certainly, when you are
> typing in the partial filename you are looking for, it seems to amend the
> list it's presenting to you with each character you type, even at normal
> typing speed.
>
> Go on, give it a try - you can always uninstall it. (And I agree,
> Everything is a confusing name!)


"Everything.exe" indexes in two stages, if it is starting
from scratch.

It reads the $MFT and parses it. This gives a list
of file names, but not their size or creation date.
This might take two seconds. The voidtools designers
really should have stopped at this point.

It's the second phase which is more expensive. If they
want to add file details (dates, sizes, and so on),
that requires "walking the tree".

Finally, at some point, the "list" needs sorting,
perhaps as a means of reducing search time on actual
searches (binary probing?).

None of these activities should leave
the CPU railed forever... Sorting a list takes
time, but not infinite time.
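
A sorted list makes lookups cheap: you can binary-probe to the
first candidate and scan forward from there. A small Python
sketch of that idea (my guess at how a name index might be
probed, not Everything's actual scheme):

    import bisect

    names = sorted(["calc.exe", "cmd.exe", "notepad.exe", "ntoskrnl.exe"])

    def prefix_search(sorted_names, prefix):
        # Binary search to the first possible match, then walk forward
        # while the prefix still matches.
        i = bisect.bisect_left(sorted_names, prefix)
        hits = []
        while i < len(sorted_names) and sorted_names[i].startswith(prefix):
            hits.append(sorted_names[i])
            i += 1
        return hits

    print(prefix_search(names, "c"))    # ['calc.exe', 'cmd.exe']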

*******

Later, if a new file is created, it is added to the
NTFS USN Journal. "Everything.exe" hears of this; the item
is added to the list (an insertion sort, maybe), and the size
and date are noted. Adding a single file to the existing
index should not be particularly expensive. Only
the first scan of the day should be expensive (assuming
it regenerates the list for fun, once a day).
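
Merging one new name into an already-sorted index is cheap. A
Python sketch of just that step (the USN Journal plumbing itself
is Windows-specific and omitted here; the function is hypothetical):

    import bisect, os

    def add_one(sorted_names, details, path):
        # Insert the new name at its sorted position (binary probe plus
        # a list shift), then note its size and date. No rescan needed.
        bisect.insort(sorted_names, path)
        try:
            st = os.stat(path)
            details[path] = (st.st_size, st.st_mtime)
        except OSError:
            pass    # the file may already be gone again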

*******

How does Everything.exe handle Junction Points ?

Does it use a timer to "shut down" aberrant behavior ?

On newer OSes and the C: drive, a lot more care must
be exercised to avoid the "usual traps". A Junction Point
can cause looping recursion until the attempt hits a "path too long"
error and the software in question moves on to the next file
or directory. This could take somewhat longer on systems where
long paths have been enabled (a Windows 10 opt-in, via the
LongPathsEnabled setting): the limit rises to roughly 32,767
characters, i.e. about 64K of UTF-16.
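
One defence, sketched in Python for Windows, is to prune anything
carrying the reparse-point attribute before descending into it.
This is my assumption about a reasonable guard, not a claim about
what Everything actually does:

    import os, stat

    def is_reparse_point(path):
        # Junctions and symlinks both carry FILE_ATTRIBUTE_REPARSE_POINT.
        try:
            st = os.lstat(path)
            return bool(getattr(st, "st_file_attributes", 0)
                        & stat.FILE_ATTRIBUTE_REPARSE_POINT)
        except OSError:
            return False

    def walk_no_reparse(root):
        # Prune reparse points so a junction can't lead the walk in circles.
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames[:] = [d for d in dirnames
                           if not is_reparse_point(os.path.join(dirpath, d))]
            yield dirpath, filenames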

Another "trap" for file traversing software, is to step
into a certain crypto directory, attempt to read a
certain file... and discover it's a named pipe and it
blocks on input and the visiting program stops in its
tracks. Everything.exe should not fall for that, because
Everything.exe does not "read" files, it only stats() them.
Programs like hashdeep have to be aware of things like
that, and the hashdeep command line is filled with various
letter options for all the "traps" Windows can throw at it.
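
The "stat, don't open" rule fits in a few lines. A hedged Python
sketch (S_ISFIFO/S_ISREG are the POSIX-flavoured tests; a Windows
named pipe may need its own handling, and the real pitfall
directory is deliberately not named here):

    import os, stat

    def safe_to_open(path):
        # Index by metadata alone; only open things that look like
        # ordinary files, never pipes, devices or other specials.
        try:
            st = os.lstat(path)
        except OSError:
            return False
        return stat.S_ISREG(st.st_mode)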

*******

Even ProcMon can't help us in cases like this.

If Everything.exe were "scanning" the file system,
ProcMon would log all the file system calls. That activity
would be visible.

But if someone puts a "list" in memory and does QuickSort
on it, that's a CPU/memory-bound activity, creating
zero entries in ProcMon. So we can't know what it is doing.
When an activity carries out no system calls, we are left
in the dark. At that point, it's time for the logic analyzer and
"logging every address".

And that stopped being particularly feasible, a long time ago.
I used to have a $1500 sampling head for a $50K logic analyzer
to make this possible, and those *were* the good old days.

The interface, if still present on modern processors, is likely
to slow down execution so much as to "disturb" whatever it is
you're trying to study. (At one time, we simply traced every address
and data pin, and "back-filled" using disassembly and software
trace analysis. Modern processors have a "thin" debug interface
which, while it works, would take a lot of cycles to
trace a single instruction. And if you had a 28-core processor,
you'd have 28 times as much of that to do.)

Paul