March 18th 19, 12:20 AM posted to alt.windows7.general
Paul[_32_]
is "Everything" doing some mining?

Bill in Co wrote:
Char Jackson wrote:
On Sat, 16 Mar 2019 20:27:39 -0600, "Bill in Co"
surly_curmudgeon@earthlink wrote:

Mayayana wrote:
"Bill in Co" surly_curmudgeon@earthlink wrote

I use an older version of FileLocator Pro, the big brother to Agent
Ransack, which has a lot more options, like excluding directories from
searches, which I find very advantageous. But it's not free, and
unfortunately, has gotten a bit pricey over the years. But the option
to exclude directories from searches greatly speeds up finding things,
especially since I'm not using any indexing, by choice.

AR lets you start at any level/location. Maybe
you mean doing something like excluding System32
when searching Windows?
Exactly. And even more than that!

Search Everything lets you exclude specific files, specific folders,
hidden files and folders, and/or system files and folders.


But I'm assuming requires indexing. Thanks, but no thanks, for reasons
previously enumerated. :-)


Turning off indexing doesn't speed anything up.

Indexing works fine on NTFS for these reasons:

1) A once-daily enumeration by an indexer builds a
reliable list of items for that day. Leaving too long
between "full" indexes means an index could become corrupted
or out-of-sync. On Windows 10, Microsoft believes this
interval to be 3 months (90 days). That's how often the
Indexing feature on Win10 re-does the entire index. On
a moderate-sized volume, a content index can take 3 hours.

2) The NTFS USN Journal notes both the creation and
deletion of files. I could have eight programs running
at the same time, each one creates a file... None of the
"events" get lost. The USN Journal accurately records all
eight of them. The various search programs (more than one
program if you want) each keep a read pointer to the USN Journal,
and like a laundry list, they read the list and make note
of the current "point-in-time", then add those eight files
to their list (by Insertion Sort perhaps). *Nothing*
gets lost on NTFS. You can even read the USN Journal
yourself, and in plain text, see what it logs.
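
To make the "read pointer" idea concrete, here is a minimal sketch
in Python. It is a toy model only: the Journal and Indexer classes,
the "create"/"delete" reason strings and the file names are all
made up for illustration, and none of it is the real USN Journal
record format or API. It just shows several readers each keeping
their own position in one shared log, and inserting new names into
a sorted list.

```python
import bisect


class Journal:
    """Toy append-only change log (hypothetical, not the real USN format)."""

    def __init__(self):
        self.records = []

    def append(self, reason, name):
        # Each record gets a sequence number, like a position in the log.
        self.records.append((len(self.records), reason, name))


class Indexer:
    """One search program with its own read pointer into the journal."""

    def __init__(self, journal):
        self.journal = journal
        self.cursor = 0   # everything before this position is already handled
        self.names = []   # kept sorted at all times

    def catch_up(self):
        new_records = self.journal.records[self.cursor:]
        for _seq, reason, name in new_records:
            if reason == "create":
                bisect.insort(self.names, name)      # sorted insertion
            elif reason == "delete":
                i = bisect.bisect_left(self.names, name)
                if i < len(self.names) and self.names[i] == name:
                    del self.names[i]
        self.cursor = len(self.journal.records)      # note the "point-in-time"
        return len(new_records)


journal = Journal()
first = Indexer(journal)
second = Indexer(journal)

# Eight programs each create a file; the journal records all eight.
for n in range(8):
    journal.append("create", f"file{n}.tmp")

first_seen = first.catch_up()
second_seen = second.catch_up()
```

Each indexer picks up all eight events on its own schedule, and
neither one disturbs the other's read pointer.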

The Everything.exe search of the index file uses
binary probing of a sorted list. It splits the list
in two. It splits the list in four. And so on,
until the exact "fraction" of the list, with only
one entry, is located. If there are 1048576 files,
it takes 20 probes. If there are 16777216 files, it
takes 24 probes. It's log base 2. The probes themselves
are file reads. Even pessimistically, I could probably
do 40-50 of those a second, so a search could take
half a second on a largish index file. With read ahead
and track caching, the probes are likely to be significantly
faster than that, on a modern HDD with large DRAM cache.
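
The probe arithmetic can be checked with a short Python sketch.
This is only an illustration of binary probing over a sorted list,
not Everything.exe's actual code. The function names are made up,
and the 45 probes per second figure is simply the middle of the
40-50 guess above.

```python
def probes_needed(n):
    """Number of halvings needed to narrow n sorted entries down to one."""
    return (n - 1).bit_length() if n > 1 else 0


def binary_probe(sorted_names, target):
    """Binary search that also counts how many probes it made."""
    lo, hi = 0, len(sorted_names)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1                    # on disk, each probe would be a file read
        if sorted_names[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_names) and sorted_names[lo] == target
    return found, probes


def estimate_seconds(n, probes_per_second=45):
    """Pessimistic time estimate, assuming every probe is a slow disk read."""
    return probes_needed(n) / probes_per_second


small = probes_needed(1048576)     # 2**20 files
large = probes_needed(16777216)    # 2**24 files

names = [f"{i:05d}.txt" for i in range(1024)]
found, probes = binary_probe(names, "00777.txt")
```

Here probes_needed gives 20 and 24 for the two file counts
mentioned, and at 45 probes a second both land at roughly
half a second.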

The sequential searching done by programs such as Agent Ransack
(which uses no index) is not even remotely close to
the half second on a bad day that it's going to take
Everything.exe with one of its prepared lists.

The thing Everything.exe cannot help with is FAT32 volumes.
FAT32 has no USN Journal. The FAT32 volumes need more frequent
indexing if you don't want to "lose anything" concerning
the thoroughness of the indexing process. But NTFS is
well set up to aid this sort of indexing activity.

You could make an argument that the content indexing
Windows 10 does adds "noise" to the search list output,
and I could agree with that. But with a little care
in use of the search language, that is easily
controlled. For example "ext:jpg" ensures you
see only JPG files. And you can also say stuff
like "filename:fluffy.jpg", and prevent content
with the word "fluffy" from accidentally appearing
in the list. On the old Windows XP search, the usage
of separate boxes means "less language" need be
learned with regard to trimming that sort of noise.
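
As a rough illustration of how such qualifiers trim the list, here
is a toy filter in Python. It is an assumption-laden sketch, not the
real Windows search parser: the matches function, the sample paths,
and the rule that "filename:" is a case-insensitive substring match
on the file name are all made up for the example. The content
indexing that causes the noise in the first place is not modelled.

```python
from pathlib import PureWindowsPath


def matches(path, query):
    """Return True if path satisfies every space-separated term in query."""
    p = PureWindowsPath(path)
    for term in query.split():
        key, sep, value = term.partition(":")
        if sep and key == "ext":
            # ext:jpg keeps only files whose extension is .jpg
            if p.suffix.lower() != "." + value.lower():
                return False
        elif sep and key == "filename":
            # filename:fluffy.jpg looks at the file name only
            if value.lower() not in p.name.lower():
                return False
        else:
            # a bare word is matched against the whole path
            if term.lower() not in str(p).lower():
                return False
    return True


paths = [
    r"C:\pics\fluffy.jpg",
    r"C:\docs\fluffy_the_cat.txt",
    r"C:\pics\rex.jpg",
]

only_jpg = [p for p in paths if matches(p, "ext:jpg")]
only_fluffy_jpg = [p for p in paths if matches(p, "filename:fluffy.jpg")]
```

With these sample paths, "ext:jpg" keeps the two JPG files and
"filename:fluffy.jpg" keeps only the first one.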

Paul