A Windows XP help forum. PCbanter


Sort files by aspect ratio?



 
 
  #61  
August 19th 18, 02:27 PM, posted to alt.comp.freeware,alt.comp.os.windows-10
Paul[_32_]

Terry Pinnell wrote:

> I baulked at 50,000 but I doubled up my 100 successively to 3,200. That
> 5 GB folder had a wide range of ARs, mostly JPGs, a few BMPs.
>
> The first attempt failed because AWK apparently dislikes filenames
> containing spaces, and FE created lots of those, like
> 20020302-125739-Ashdown6 - Copy - Copy - Copy - Copy - Copy.JPG
>
> But after renaming them simply in Bulk Renamer Utility (0001 to 3200),
> the elapsed time of Reinhard's AWK/BAT combo was 18 secs, by stop watch.
> If the relationship is roughly linear that would imply 4:40 for 50,000;
> close to five minutes.
>
> So, ProcMon, not something to leave running while you have a coffee
> then!
>
> Terry, East Grinstead, UK
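
As a quick check on the linear estimate quoted above, here is a small Python sketch. It only assumes that run time scales in proportion to the file count, which is the "roughly linear" assumption in the quote. The function names are made up for illustration.

```python
def extrapolate_seconds(measured_files, measured_seconds, target_files):
    """Scale a measured run time linearly to a larger file count."""
    return measured_seconds * target_files / measured_files


def as_min_sec(seconds):
    """Format a duration in seconds as M:SS, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


# 3,200 files took 18 seconds by stopwatch; scale that up to 50,000 files.
estimate = extrapolate_seconds(3200, 18, 50000)
print(estimate, as_min_sec(estimate))
```

That comes out a little over 280 seconds, so "close to five minutes" is a fair summary if the scaling really is linear.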


Here's a new version. query.ps1 shouldn't have
changed. There are now three files in the kit,
and you also need to get a copy of gawk.exe version 3.

skimmer.awk   gawk script
query.ps1     powershell query of Windows Search database
copy.ps1      powershell used to copy output files

gawk.exe      gnuwin32 GAWK version 3 for Windows

**************** Helper script "query.ps1" ********************
# powershell -file query.ps1 -TREEDIR "'C:\'"

param([string]$TREEDIR="'C:\'")

$sql = "SELECT System.ItemFolderPathDisplay, `
System.ItemName, `
System.Image.HorizontalSize, `
System.Image.VerticalSize FROM SYSTEMINDEX `
WHERE System.Image.HorizontalSize > 0 AND `
System.Image.VerticalSize > 0 AND `
SCOPE=$TREEDIR"

$provider = "provider=search.collatordso;extended properties='application=windows';"

$connector = new-object system.data.oledb.oledbdataadapter -argument $sql, $provider

$dataset = new-object system.data.dataset

if ($connector.fill($dataset)) { $dataset.tables[0] | Export-CSV query.csv }
**************** end of Helper script "query.ps1" **************
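
For anyone who wants to inspect query.csv outside of gawk, this is a rough Python sketch of reading it. It assumes the layout described in the skimmer.awk comments: a #TYPE line, a header line, then four quoted fields per row. The sample rows come from those comments, except the "a, b.jpg" row, which is invented to show a comma inside a filename. The helper name read_rows is hypothetical.

```python
import csv
import io

# Sample in the Export-CSV layout; the last row is an invented example.
SAMPLE = r'''#TYPE System.Data.DataRow
"SYSTEM.ITEMFOLDERPATHDISPLAY","SYSTEM.ITEMNAME","SYSTEM.IMAGE.HORIZONTALSIZE","SYSTEM.IMAGE.VERTICALSIZE"
"C:\Users\user name\Downloads\JPG2","0000014994_1.jpg","669","600"
"C:\Users\user name\Downloads\JPG2","04.jpg","500","375"
"C:\Users\user name\Downloads\JPG2","a, b.jpg","1920","1080"
'''


def read_rows(stream):
    """Return (folder, name, width, height) tuples from Export-CSV output."""
    # Drop the "#TYPE ..." line, then let the csv module handle the quoting,
    # so commas inside a quoted filename do not split the field.
    reader = csv.reader(line for line in stream if not line.startswith("#TYPE"))
    next(reader, None)  # skip the column-name header
    return [(folder, name, int(w), int(h)) for folder, name, w, h in reader]


rows = read_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

A proper CSV parser sidesteps the comma problem that skimmer.awk works around by splitting on the double quote character.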

**************** "skimmer.awk" ********************
# gawk -f skimmer.awk width height percent scan_path out_dir
#
# gawk -f skimmer.awk 16 9 1 "C:\\" "C:\users\user name\downloads\outdir" < NUL
#
# 0                   1  2 3 4      5                                     (no input file)
#
# ARGC = 6 ARGV[0] .. ARGV[5]
#
# query.csv looks like this, skip the first two lines. There can be commas in the filename!
#
# #TYPE System.Data.DataRow
# "SYSTEM.ITEMFOLDERPATHDISPLAY","SYSTEM.ITEMNAME","SYSTEM.IMAGE.HORIZONTALSIZE","SYSTEM.IMAGE.VERTICALSIZE"
# "C:\Users\user name\Downloads\JPG2","0000014994_1.jpg","669","600"
# "C:\Users\user name\Downloads\JPG2","04.jpg","500","375"
#
# Powershell copy loop
# Get-Content .\abspathnfile.txt | Foreach-Object { copy-item -Path $_ -Destination "X:\out\"}
#
###########################################################################
# This is a first cut script, with no error handling or disaster proofing!
# No warranty expressed or implied. Paul.
#
# Aug19,2018 Switch to Powershell for file copying. About 1000 files per second.

BEGIN {
if (ARGC != 6) {
print "Usage: width height percent_tol source_tree destdir"
print "gawk -f skimmer.awk 16 9 1 " "\"C:\\\\\" " "\"C:\\users\\user name\\downloads\\outdir\" < NUL"
print ""

print "The program needs five arguments."
print "In some cases, two backslashes may be required on the end of a path, to work."
print "This proof print will then show one backslash as having made it through."
print ""
for (i = 1; i < ARGC; i++) print ARGV[i]
exit 0
} else {
print "Called with"
print ""
for (i = 1; i < ARGC; i++) print ARGV[i]
print ""
width = ARGV[1]+0
height = ARGV[2]+0
percent = ARGV[3]+0
outdir = ARGV[5]
}

# houseclean before run - no collision protection, run one copy only!

cmd = "\"del query.csv copyme.txt\""
system( cmd )

cmd = "\"powershell -executionpolicy bypass -file query.ps1 -TREEDIR \"'" ARGV[4] "'\"\""
print "Query: " cmd
print ""
system( cmd )

# You can redirect stderr output to clean the output a bit.
# Here, I'm hiding the warning that the directory already exists.

cmd = "\"md \"" outdir "\" 2>NUL\""
print "Cmd: " cmd
print ""
system( cmd )

# Checking whether any files in the output folder, will conflict.
# Assumes outdir is one big flat folder, and not a tree!
# The "/a-d" removes directory names from the listing.

cmd = "dir /b /a-d \"" outdir "\" 2>NUL"
print "Cmd: " cmd
print ""

dest[ "no file by this name" ] = 0
src[ "no file by this name" ] = 0

while ((cmd | getline) > 0) {
dest[ $0 ] = 0 # holds current outdir filename list
}
close( cmd )

high = width/height * (1 + percent/100)
low = width/height * (1 - percent/100)

if ( (high < 0) || (low < 0) ) exit 0

i=0
j=0
trouble=0

# No FPAT in Gawk3

FS="\""

while ( (getline < "query.csv") > 0 ) { # scan for collisions, make copyme.txt
if ( i >= 2 ) {

# "C:\Users\xxxx yyyyyyy\Downloads\JPG2","04.jpg","500","375"
# 2 3 4 5 6 7 8
aspect = $6/$8
if ( (high >= aspect) && (low <= aspect) ) {
if ( $4 in dest ) {
trouble++
if (trouble <= 20) {
print $2 "\\" $4 " already exists in output directory"
}
}
if ( $4 in src ) {
trouble++
src[ $4 ]++
if (trouble <= 20) {
print $4 " exists " src[ $4 ] " times on copyme.txt list"
}
} else {
src[ $4 ] = 1
}
# do the simplified copy scheme here and make a file list
print $2 "\\" $4 > "copyme.txt"
j++
}
}
i++
}
close( "query.csv" )
close( "copyme.txt" )

if (trouble > 0) { # collision detection on file names...
print ""
print "Trouble detected, " trouble " problems, exiting run before copying anything"
print " Only the first 20 problems are printed to screen"
exit 0
}

# Only copy files if there is no trouble.

cmd = "\"powershell -executionpolicy bypass -file copy.ps1 -TREEDIR \"" outdir "\"\""
print "Cmd: " cmd
print ""
system( cmd )

print i " images in tree, of which " j " got copied"
}
************* end of "skimmer.awk" ****************
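
The selection and collision logic in skimmer.awk can be summarised with the Python sketch below. It is not a port of the script, just an illustration of the same two ideas: keep images whose width/height falls inside a percent band around the target ratio, and flag file names that would clash in a flat output folder. The function names and the two "wide.jpg" rows are invented for the example.

```python
def aspect_band(width, height, percent):
    """Return (low, high) bounds around width/height, widened by percent."""
    target = width / height
    return target * (1 - percent / 100), target * (1 + percent / 100)


def select_matches(rows, width, height, percent):
    """Keep (folder, name) for rows whose aspect ratio lies inside the band."""
    low, high = aspect_band(width, height, percent)
    return [(folder, name) for folder, name, w, h in rows if low <= w / h <= high]


def find_collisions(matches, existing_names):
    """Report names already in the output folder or listed more than once.

    Names are compared case-sensitively, as plain awk array keys would be.
    """
    seen = set()
    clashes = []
    for _folder, name in matches:
        if name in existing_names:
            clashes.append((name, "already in output directory"))
        if name in seen:
            clashes.append((name, "listed more than once"))
        seen.add(name)
    return clashes


rows = [
    (r"C:\Users\user name\Downloads\JPG2", "0000014994_1.jpg", 669, 600),
    (r"C:\Users\user name\Downloads\JPG2", "04.jpg", 500, 375),
    (r"C:\pics\a", "wide.jpg", 1920, 1080),  # invented row
    (r"C:\pics\b", "wide.jpg", 1280, 720),   # invented row, same file name
]

matches = select_matches(rows, 16, 9, 1)
print(matches)
print(find_collisions(matches, existing_names={"old.jpg"}))
```

As in the script, any reported clash would be a reason to stop before copying, since a flat output folder can only hold one file per name.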

**************** Helper script "copy.ps1" ********************
# Source filelist is hardwired to "copyme.txt".
# Accepts a text file of absolute_path_filenames to copy to TREEDIR
# powershell -executionpolicy bypass -file copy.ps1 -TREEDIR 'F:\outdir test'

param([string]$TREEDIR="'X:\does_not_exist'")

Get-Content .\copyme.txt | Foreach-Object { copy-item -Path $_ -Destination "$TREEDIR" }
**************** end of Helper script "copy.ps1" **************
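
If PowerShell is not available, the same copy loop could be sketched in Python as below. This assumes copyme.txt holds one absolute path per line and is readable as UTF-8, which may not hold for a list written by gawk on Windows, so adjust the encoding as needed. The function name copy_listed_files is hypothetical, and the demo at the bottom only uses throwaway temporary files.

```python
import shutil
import tempfile
from pathlib import Path


def copy_listed_files(list_file, dest_dir):
    """Copy every path named in list_file into dest_dir; return the count."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # ignore blank lines in the list
        # copy2 keeps the file name and timestamps when dest is a directory
        shutil.copy2(line, dest)
        copied += 1
    return copied


# Small self-contained demo using a temporary folder.
with tempfile.TemporaryDirectory() as work:
    src = Path(work) / "src"
    src.mkdir()
    (src / "04.jpg").write_bytes(b"fake image data")
    listing = Path(work) / "copyme.txt"
    listing.write_text(str(src / "04.jpg") + "\n", encoding="utf-8")
    print(copy_listed_files(listing, Path(work) / "outdir"), "file(s) copied")
```

Like copy.ps1, this overwrites silently if a name already exists in the destination, so it relies on the collision check having been done first.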

Time for 47 files copied from a scan of 50100 files = 3 seconds
Time for 50000 files copied from a scan of 50100 files = 55 seconds

Paul