A Windows XP help forum. PCbanter



Why is search so brain dead these days?



 
 
  #136  
Old June 26th 20, 11:46 PM posted to alt.comp.os.windows-10,alt.windows7.general
Yousuf Khan[_2_]
external usenet poster
 
Posts: 2,447
Default Why is search so brain dead these days?

On 6/26/2020 6:49 AM, Frank Slootweg wrote:
I assume you mean *extra* context menu stuff, i.e. tools which are
added to the default (right-click) context menu.

If so, are you sure these extra tools do not work in the context menu
of a file dialog box?

I do not have many such tools, but I did a quick test and if I do a
'Save as...' in Chrome, the context menu in the 'Save as' popup offers
'Share with Skype' (probably not a good example), '7-zip' and 'Convert
to PDF in Foxit PhantomPDF'. For other items, it offers things like VLC
and Google Drive.

So if the context menu of a file dialog box shows *these* third-party
tools, why shouldn't it show the mentioned third-party *search* tools?


The context menus may or may not work, but that's irrelevant if you can't
make use of the findings. If you're searching for something from the file
open/save dialog box with Agent Ransack or a similar tool, its findings
can't be handed back to the dialog box, so the program trying to open or
save a file can't use them.

Yousuf Khan
  #137  
Old June 26th 20, 11:52 PM posted to alt.comp.os.windows-10,alt.windows7.general
Yousuf Khan[_2_]
external usenet poster
 
Posts: 2,447
Default Why is search so brain dead these days?

On 6/23/2020 8:04 PM, Stan Brown wrote:
On Tue, 23 Jun 2020 13:09:57 -0400, Yousuf Khan wrote:
Remember there used to be
a time when if you wanted to delete entire groups of files or folders in
DOS, and you used a "del *.*" command, and the whole thing would be done
in under 1 second? But then later in Windows, doing the same thing would
take minutes, just because the Explorer is doing it in a braindead way,
where it deletes each file individually?


Maybe I'm having a lapse of memory, but I can't remember ever seeing
that happen -- unless files were in use, of course, but then they
couldn't be deleted at all.


It was actually a well-known feature, which I think was a leftover from
CP/M days (the predecessor to DOS). DOS retained the old call but
deprecated it in favour of the newer DOS calls. Microsoft itself kept
using the older CP/M-style call within command.com, so it never got rid
of it, since it was so much faster than DOS's own newer version, which
did things one file at a time.

Yousuf Khan
  #138  
Old June 28th 20, 01:26 PM posted to alt.comp.os.windows-10,alt.windows7.general
philo
external usenet poster
 
Posts: 4,807
Default Why is search so brain dead these days?

On 6/23/20 12:09 PM, Yousuf Khan wrote:
On 6/23/2020 10:48 AM, Ken Blake wrote:
On 6/22/2020 12:45 PM, Alan Baker wrote:
All the hits on my sister-in-law's name on the Mac: 3 seconds for
1405 hits.

Just sayin'

:-)



That's very slow compared to Search Everything on Windows.


Another area where Microsoft has ****ed something up, even though it was
working simply beforehand, is file deletions. Remember there used to be
a time when if you wanted to delete entire groups of files or folders in
DOS, and you used a "del *.*" command, and the whole thing would be done
in under 1 second? But then later in Windows, doing the same thing would
take minutes, just because the Explorer is doing it in a braindead way,
where it deletes each file individually? Then it took 3rd party utils to
bring back the 1 second deletes?

Yousuf Khan




I think that, starting with Vista, deleting a file involved a long wait
while Windows calculated disk space.

Pretty annoying and absurd.
  #139  
Old June 28th 20, 06:50 PM posted to alt.comp.os.windows-10,alt.windows7.general
Paul[_32_]
external usenet poster
 
Posts: 11,873
Default Why is search so brain dead these days?

Alan Baker wrote:
On 2020-06-21 7:42 a.m., philo wrote:
On 6/21/2020 8:38 AM, philo wrote:
On 6/20/2020 6:06 PM, Rene Lamontagne wrote:
On 2020-06-20 5:54 p.m., Yousuf Khan wrote:
I'm referring mainly to Windows search, but this applies to a lot
of other search algorithms all over the place and on the Internet
too. In the olden days, search was very efficient and somewhat
intuitive. For example, let's say you try to do a search for
"virtual" and expect you might find something like VirtualBox,
VirtualPC, whatever. But for some reason, the current Windows
search cannot find these. If you do a search for the full name,
then it may find them (hit and miss). In the old days, these
searches would find all instances where the string would occur,
even as part of a substring. It was very easy to do searches, and
you could even do multiple words to narrow down the searches. What
has gone wrong with search algorithms now?
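
The difference described above can be shown with a small sketch. It contrasts substring matching with whole-word matching on a made-up list of names. Whole-word matching is only one possible explanation for the behaviour; nothing here is a confirmed description of how Windows Search works, and the function names and sample list are hypothetical.

```python
# Hypothetical sample names, for illustration only.
names = ["VirtualBox", "VirtualPC", "Oracle VM VirtualBox Manager", "notes.txt"]


def substring_search(names, term):
    """Match the term anywhere in the name, ignoring case."""
    term = term.lower()
    return [n for n in names if term in n.lower()]


def whole_word_search(names, term):
    """Match only when the term equals a complete space-separated word."""
    term = term.lower()
    return [n for n in names if term in n.lower().split()]


print(substring_search(names, "virtual"))   # three hits
print(whole_word_search(names, "virtual"))  # no hits: "virtual" is never a whole word
```

With substring matching, "virtual" finds all three Virtual* entries; with whole-word matching it finds none, which is the kind of miss the post complains about.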

Yousuf Khan

I really can't help you here because I never use Windows search.
I use "Search Everything" and "Agent Ransack" exclusively. sorry

Rene



Thanks for the info.
As one who recently did a search that found close to nothing, I am
happy with the much improved results using the free version of Agent
Ransack.




Ransack : 52 hits in ten minutes

From Explorer, after one hour, four hits... search nowhere near complete


Spotlight: all the hits on the entire drive in 15 seconds.


The Microsoft search is able to do this in under a second.

*But*, there are caveats.

The Windows Search is accessible as an SQL operation (requiring
scripting), or through File Explorer (the normal user path).

I was able to create a test partition with a million files on it,
and the search can correctly figure out how many text files
are present, in less than a second.

But if we do the SQL benchmark instead, why are the results
different ? Well, it's a hint.

*******

I prepared a "tree" of folders, deep enough to hold a billion
files, but only partially filled it (16 files per folder at the
bottom level).

The test pattern is pathological, in that an inverted
index will not be able to deal with the pattern all that well.
It means the dictionary has a vocabulary of a million words,
not an English dictionary of fifty thousand.

filename 00012345.txt contains "00012345"
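
Why this pattern is hard on an inverted index can be sketched with a toy index. This is only an illustration of the general idea, not of how the Windows indexer is built; `build_index`, `prefix_search` and both sample corpora are made up for the example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each token to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for token in text.lower().split():
            index[token].add(name)
    return index


def prefix_search(index, prefix):
    """Return all documents holding a token that starts with `prefix`."""
    hits = set()
    for token, names in index.items():
        if token.startswith(prefix):
            hits |= names
    return hits


# Pathological corpus: every file holds one token that no other file has.
unique_docs = {f"{n:08d}.txt": f"{n:08d}" for n in range(1000)}
# Ordinary corpus: many files share a small vocabulary.
words = ["alpha", "beta", "gamma", "delta"]
shared_docs = {f"doc{n}.txt": " ".join(words[: 1 + n % 4]) for n in range(1000)}

unique_index = build_index(unique_docs)
shared_index = build_index(shared_docs)
print(len(unique_index), len(shared_index))  # vocabulary sizes: 1000 4
```

Both corpora have 1000 files, but the first needs 1000 dictionary entries and the second only 4. Scale the first case up and the dictionary grows one entry per file, which is the "vocabulary of a million words" problem.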

I didn't start with a million files. I started with 32 million
files, and the indexer could not index more than around 2 million
of them. It looked like the merge step failed silently
(inverted indexer merges main index with the small index
it creates as it's scanning).

As a result, I wiped the test partition and loaded up a million
files. With "real user files", this limitation might never
be evident. It's the pathology ("incompressibility" if you will),
of the content, which eventually prevents indexing. The index
file stopped in this case, at around 8GB. In the following
test, the index file is around 3.5GB or so, for 1048576 txt files.
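
A scaled-down tree of this kind could be generated with something like the sketch below. Only the "16 files per bottom folder" rule and the "file NNNNNNNN.txt contains NNNNNNNN" rule come from the description above; the two-level folder layout, the decimal numbering and the names `path_for` and `make_tree` are assumptions for illustration, not the script actually used.

```python
import os
import tempfile

FILES_PER_DIR = 16  # from the post: 16 files per bottom-level folder


def path_for(root, n):
    """Return the path of test file number n (hypothetical folder layout)."""
    folder = n // FILES_PER_DIR
    # Assumption: spread the bottom folders over two levels to keep directories small.
    return os.path.join(root, f"{folder // 256:04d}", f"{folder % 256:03d}", f"{n:08d}.txt")


def make_tree(root, count):
    """Create `count` files; each file's content is its own zero-padded number."""
    for n in range(count):
        path = path_for(root, n)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as fh:
            fh.write(f"{n:08d}")
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_tree(root, 64)
        print(sum(len(files) for _, _, files in os.walk(root)))
```

Point `make_tree` at a scratch partition and raise `count` to reproduce the test at larger sizes; expect it to take a while at a million files on a hard disk.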

String             files-returned   time(sec)

0000000*           none             0.042
000000*            27               0.054
00000*             4080             0.184    === reasonably good
0000*              65520            2.528
000*               FAIL             30.01    ("permission error" ???)

Filename != 'A'    1114100          44.4 sec

Rather than the SQL being incapable of returning more
than 65520 file references, it's the nature of the
query that seems to break it. A query intended to
"light up" the index, works as expected. It takes
44.4 seconds to produce a 45MB file listing, with all
of the hits in it.
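
For anyone who has not scripted Windows Search, a query of the kind benchmarked above might look like the sketch below. It is not the script used for these timings. The `SystemIndex` table, the `System.*` property names and the `Search.CollatorDSO` provider string are the documented way into the index as far as I know, but the scope `file:E:/`, the function names and the use of pywin32 are assumptions, and `run_query` has not been run against the test partition.

```python
def build_query(prefix, scope="file:E:/"):
    """Build a Windows Search SQL statement for file names starting with `prefix`.

    The scope (drive E:) is a placeholder, not the drive used in the test.
    """
    safe = prefix.replace("'", "''")  # double any single quote inside the SQL literal
    return (
        "SELECT System.ItemPathDisplay FROM SystemIndex "
        f"WHERE SCOPE='{scope}' AND System.FileName LIKE '{safe}%'"
    )


CONNECTION = "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"


def run_query(sql):
    """Run the query through ADO. Windows only; assumes the pywin32 package."""
    import win32com.client  # imported lazily so build_query works on any system

    conn = win32com.client.Dispatch("ADODB.Connection")
    conn.Open(CONNECTION)
    rs = win32com.client.Dispatch("ADODB.Recordset")
    rs.Open(sql, conn)
    paths = []
    while not rs.EOF:
        paths.append(rs.Fields.Item("System.ItemPathDisplay").Value)
        rs.MoveNext()
    rs.Close()
    conn.Close()
    return paths


if __name__ == "__main__":
    print(build_query("0000"))
```

The point of going this way is that the result is an ordinary list of paths that can be written to a file or post-processed.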

So now I switch over to File Explorer.

Any of the above searches completes in under a second.

How is this possible ?

Well, the "results display" is obviously not completely
computed. When you use the scroll bar, as you scroll down,
additional searches take place. Just enough to populate the
screen. In the SQL results, you can see for small numbers
of screen objects (like 27 items), the results can come back
reasonably quickly.
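
The effect can be imitated with a lazy generator. This is a sketch of the general "compute only what is on screen" idea, not of File Explorer's actual implementation; the 30-row page size, the 100,000-name list and the function names are made up.

```python
import itertools


def search(names, prefix):
    """Lazily yield matching names; nothing is scanned until a caller asks."""
    for name in names:
        if name.startswith(prefix):
            yield name


def first_page(results, rows=30):
    """Take only as many hits as fit on one screen."""
    return list(itertools.islice(results, rows))


names = [f"{n:08d}.txt" for n in range(100_000)]

page = first_page(search(names, "0000"))  # stops after the first 30 hits
everything = list(search(names, "0000"))  # has to walk the whole list
print(len(page), len(everything))         # 30 10000
```

Filling one screen touches a tiny part of the data, which is why it looks instant. Producing the complete listing is the expensive part, and that is the part the SQL timings measure.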

OK, so why am I not crowing about this result ?

Well, you can't save the output in the File Explorer window.

You cannot *print* the output in the File Explorer window.

You can do a copy/paste, but the copy/paste only has
the "path" column. Not all columns are copied if you
do copy paste.

If you use Nirsoft sysexp.exe (a tool to copy a recalcitrant
window like this one), there is wheel spin for an *hour*,
and no result.

Like an expensive hooker, Windows Search is "pretty to look
at, but can't boil an egg".

Who needs fast results exactly, if you can't do anything
with them ?

*******

Summary:

The "easy to obtain" results, in under a second, are to
me, mostly useless. I may want to post-process a result,
which seems just about impossible with the GUI.

I *can* get *a* result from Windows Search, but it takes a script
calling into an SQL engine.

The runtime of using SQL, depends on the pathology of what's
in the index.

The results hint that the technology does not scale well. Even with
an easier-to-index regular file mix, it's eventually going
to have trouble. Just at some slightly higher number of files.

The index was stored on rotating rust. The test was
not done using an SSD.

Users have described having 50GB Windows.edb files, but I don't
know if that's still the case on the current release of Win10.

Paul
  #140  
Old June 28th 20, 09:27 PM posted to alt.comp.os.windows-10,alt.windows7.general
tesla sTinker[_2_]
external usenet poster
 
Posts: 9
Default Why is search so brain dead these days?



On 6/20/2020 3:54 PM, Yousuf Khan scribbled:
I'm referring mainly to Windows search, but this applies to a lot of
other search algorithms all over the place and on the Internet too. In
the olden days, search was very efficient and somewhat intuitive. For
example, let's say you try to do a search for "virtual" and expect you
might find something like VirtualBox, VirtualPC, whatever. But for some
reason, the current Windows search cannot find these. If you do a search
for the full name, then it may find them (hit and miss). In the old
days, these searches would find all instances where the string would
occur, even as part of a substring. It was very easy to do searches, and
you could even do multiple words to narrow down the searches. What has
gone wrong with search algorithms now?

Yousuf Khan


Explorer is not made right; it is a liar.
The true layout of the directory tree shows you that it is.
And that is why no one can figure it out.
Neither can its search engine.

Everyone should know Bill Gates is a massive thief, and
a criminal. It's too bad the 666 does not see it this way.
You can get a file viewer that is not from Bill Gates, and the file
search may work better and faster. I mean, without getting lost.
  #141  
Old June 28th 20, 10:07 PM posted to alt.comp.os.windows-10,alt.windows7.general
Alan Baker[_3_]
external usenet poster
 
Posts: 145
Default Why is search so brain dead these days?

On 2020-06-28 10:50 a.m., Paul wrote:
Alan Baker wrote:
On 2020-06-21 7:42 a.m., philo wrote:
On 6/21/2020 8:38 AM, philo wrote:
On 6/20/2020 6:06 PM, Rene Lamontagne wrote:
On 2020-06-20 5:54 p.m., Yousuf Khan wrote:
I'm referring mainly to Windows search, but this applies to a lot
of other search algorithms all over the place and on the Internet
too. In the olden days, search was very efficient and somewhat
intuitive. For example, let's say you try to do a search for
"virtual" and expect you might find something like VirtualBox,
VirtualPC, whatever. But for some reason, the current Windows
search cannot find these. If you do a search for the full name,
then it may find them (hit and miss). In the old days, these
searches would find all instances where the string would occur,
even as part of a substring. It was very easy to do searches, and
you could even do multiple words to narrow down the searches. What
has gone wrong with search algorithms now?

Yousuf Khan

I really can't help you here because I never use Windows search.
I use "Search Everything" and "Agent Ransack" exclusively. sorry

Rene



Thanks for the info.
As one who recently did a search that found close to nothing, I am
happy with the much improved results using the free version of Agent
Ransack.



Ransack : 52 hits in ten minutes

From Explorer, after one hour, four hits... search nowhere near
complete


Spotlight: all the hits on the entire drive in 15 seconds.


The Microsoft search is able to do this in under a second.


Ummm... ...it's not faster than Spotlight.

Sorry.


*But*, there are caveats.


Right!


The Windows Search is accessible as an SQL operation (requiring
scripting), or through File Explorer (the normal user path).

I was able to create a test partition with a million files on it,
and the search can correctly figure out how many text files
are present, in less than a second.


With an SQL query... ...or in an Explorer window?


But if we do the SQL benchmark instead, why are the results
different ? Well, it's a hint.

*******

I prepared a "tree" of folders, deep enough to hold a billion
files, but only partially filled it (16 files per folder at the
bottom level).

The test pattern is pathological, in that an inverted
index will not be able to deal with the pattern all that well.
It means the dictionary has a vocabulary of a million words,
not an English dictionary of fifty thousand.

filename 00012345.txt contains "00012345"

I didn't start with a million files. I started with 32 million
files, and the indexer could not index more than around 2 million
of them. It looked like the merge step failed silently
(inverted indexer merges main index with the small index
it creates as it's scanning).

As a result, I wiped the test partition and loaded up a million
files. With "real user files", this limitation might never
be evident. It's the pathology ("incompressibility" if you will),
of the content, which eventually prevents indexing. The index
file stopped in this case, at around 8GB. In the following
test, the index file is around 3.5GB or so, for 1048576 txt files.

String             files-returned   time(sec)

0000000*           none             0.042
000000*            27               0.054
00000*             4080             0.184    === reasonably good
0000*              65520            2.528
000*               FAIL             30.01    ("permission error" ???)

Filename != 'A'    1114100          44.4 sec

Rather than the SQL being incapable of returning more
than 65520 file references, it's the nature of the
query that seems to break it. A query intended to
"light up" the index, works as expected. It takes
44.4 seconds to produce a 45MB file listing, with all
of the hits in it.

So now I switch over to File Explorer.

Any of the above searches completes in under a second.

How is this possible ?

Well, the "results display" is obviously not completely
computed. When you use the scroll bar, as you scroll down,
additional searches take place. Just enough to populate the
screen. In the SQL results, you can see for small numbers
of screen objects (like 27 items), the results can come back
reasonably quickly.

OK, so why am I not crowing about this result ?

Well, you can't save the output in the File Explorer window.

You cannot *print* the output in the File Explorer window.

You can do a copy/paste, but the copy/paste only has
the "path" column. Not all columns are copied if you
do copy paste.

If you use Nirsoft sysexp.exe (a tool to copy a recalcitrant
window like this one), there is wheel spin for an *hour*,
and no result.

Like an expensive hooker, Windows Search is "pretty to look
at, but can't boil an egg".

Who needs fast results exactly, if you can't do anything
with them ?

*******

Summary:

The "easy to obtain" results, in under a second, are to
me, mostly useless. I may want to post-process a result,
which seems just about impossible with the GUI.

I *can* get *a* result from Windows Search, but it takes a script
calling into an SQL engine.

The runtime of using SQL, depends on the pathology of what's
in the index.

The results hint that the technology does not scale well. Even with
an easier-to-index regular file mix, it's eventually going
to have trouble. Just at some slightly higher number of files.

The index was stored on rotating rust. The test was
not done using an SSD.

Users have described having 50GB Windows.edb files, but I don't
know if that's still the case on the current release of Win10.


And Spotlight works from the command line.

man mdfind

mdfind -- finds files matching a given query
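
Because mdfind writes plain lines to standard output, its results are easy to post-process from a script. A minimal sketch, assuming a Mac with mdfind on the PATH: the `-count`, `-onlyin` and `-name` options are from the mdfind man page, while the function names and the sample query are hypothetical.

```python
import shutil
import subprocess


def mdfind_command(query, folder=None, name_only=False, count=False):
    """Build an mdfind argument list (macOS Spotlight command-line tool)."""
    cmd = ["mdfind"]
    if count:
        cmd.append("-count")        # print only the number of matches
    if folder:
        cmd += ["-onlyin", folder]  # restrict the search to one directory
    if name_only:
        cmd += ["-name", query]     # match on file name rather than content
    else:
        cmd.append(query)
    return cmd


def run_mdfind(query, **options):
    """Run mdfind and return its output lines; raises if mdfind is missing."""
    if shutil.which("mdfind") is None:
        raise RuntimeError("mdfind not found; this only works on macOS")
    done = subprocess.run(mdfind_command(query, **options),
                          capture_output=True, text=True, check=True)
    return done.stdout.splitlines()


if __name__ == "__main__":
    print(mdfind_command("virtual", name_only=True))
```

On a Mac, `run_mdfind("virtual", name_only=True)` should return the matching paths as a list that can be saved, sorted or filtered.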


 




All times are GMT +1. The time now is 02:16 PM.

