Windows vim script to sort lines & remove duplicates & concatenate data based on a given field
Save this script if you ever need to sort a text file based on a field.
In a recent thread, the question below was asked. To my knowledge it has ZERO workable solutions on Windows that stay wholly inside the freeware Vim (vi/gvim) text editor, even though it has been asked many times on the net over the years. It is trivial to do on Linux, but VERY hard to do inside of Vim on Windows.

The domain is the first space-separated field on each line. The question was how to create a file of unique domains from data autogenerated for multiple domains, in this generic format:

http://domain1.com (various data fields)
http://domain1.com (more data fields)
http://domain1.com (stuff)
http://domain2.net (random data fields)
http://domain3.xyz (more data fields aplenty)

The original file was 10,000 lines long, so a program was needed not only to eliminate duplicate domains but also to concatenate the domain-specific fields on a single line, resulting in a file of this format:

http://domain1.com (various data fields) (more data fields) (stuff)
http://domain2.net (random data fields)
http://domain3.xyz (more data fields aplenty)

This is the alt.os.linux thread on the subject:
o Finding first field duplicate lines in a sorted text file without uniq or awk or col using only vim - is it possible?
  https://groups.google.com/forum/#!topic/alt.os.linux/aZlsGxn_nEE

And this is the solution, which, I repeat, has NEVER been done on Windows to my knowledge without adding Linux commands (via Cygwin, for example) or without using Access or Excel or some other non-Vim database tool to eliminate duplicates based on a given field.

[START HERE]
" C(blah): count the lines that start with a:blah, then run macro q
" (count - 1) times to pull those following lines up onto the current one.
function! C(blah)
  redir => cnt
  silent exe "%s#^" . a:blah . "##gn"
  redir END
  " cnt looks like "\n3 matches on 3 lines"; keep the leading number
  let res = strpart(cnt, 1, stridx(cnt, " "))
  let i = 0
  while i < res - 1
    normal! @q
    let i += 1
  endwhile
endfunction

" All(): walk down the file, merging duplicates of each line's first field.
function! All()
  let i = 0
  while i < g:dc
    " yank the first field (plus its trailing space) into register a
    normal! "ayW
    call C(getreg("a"))
    normal! j
    let i += 1
  endwhile
endfunction

let dc = 10000
" macro q: go down, delete the first field, go back up, join the two lines
let @q='0jdWkJ0'
sort u
normal! gg
call All()
[END HERE]

NOTE: Change the "dc" variable to fit your file size in number of lines.
NOTE: Source the file within Vim (e.g., :source doitall.vim).
NOTE: The first field is used as a search pattern, so characters such as "." are treated as regular-expression wildcards.

Note: The problem set sounds simple until you actually _try_ to solve it.
https://stackoverflow.com/questions/1915636/is-there-a-way-to-uniq-by-column
https://stackoverflow.com/questions/22849757/how-to-delete-duplicated-rows-based-in-a-column-value
https://unix.stackexchange.com/questions/104525/sort-based-on-the-third-column
https://stackoverflow.com/questions/17847799/sort-and-remove-duplicates-based-on-column
https://superuser.com/questions/416134/how-can-i-sort-objects-by-third-column-in-powershell
https://www.biostars.org/p/271720/
https://unix.stackexchange.com/questions/77406/sort-only-on-the-second-column
https://unix.stackexchange.com/questions/171091/remove-lines-based-on-duplicates-within-one-column-without-sort
https://www.unix.com/shell-programming-and-scripting/179059-remove-duplicate-lines-based-field-sort.html
https://stackoverflow.com/questions/33265934/duplicates-in-an-unix-text-file-based-on-multiple-fields
https://superuser.com/questions/332268/windows-7-sorting-contents-of-a-file/332269
https://stackoverflow.com/questions/22306490/how-to-sort-csv-by-columns-using-batch-scripting
https://stackoverflow.com/questions/12850909/how-to-remove-duplicates-in-a-csv-file-based-on-two-columns
https://stackoverflow.com/questions/1450085/list-only-duplicate-lines-based-on-one-column-from-a-semi-colon-delimited-file
https://stackoverflow.com/questions/6438896/sorting-data-based-on-second-column-of-a-file
https://www.mathworks.com/matlabcentral/answers/278956-sort-matrix-based-on-unique-values-in-one-column
https://www.ultraedit.com/support/tutorials-power-tips/ultraedit/advanced-column-based-sort.html
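For comparison, here is a minimal sketch of a different approach to the same problem. It is not the macro-based script above: it groups lines with a dictionary and Vim's built-in list functions. The function name MergeByFirstField is hypothetical, and the sketch assumes the key is the first whitespace-separated field, that lines have no leading whitespace, and that first-seen order is acceptable in place of sorted order. It needs no "dc" line count, and it treats the first field as literal text instead of as a pattern.

```vim
" Sketch only: merge all lines that share the same first field.
" MergeByFirstField is a hypothetical name, not part of the original post.
function! MergeByFirstField() abort
  let merged = {}
  let order = []
  for line in getline(1, '$')
    " first whitespace-separated field is the key
    let key = matchstr(line, '^\S\+')
    if key ==# ''
      " skip blank lines (and lines with leading whitespace)
      continue
    endif
    " everything after the key and its separating whitespace
    let rest = substitute(line, '^\S\+\s*', '', '')
    if !has_key(merged, key)
      let merged[key] = []
      call add(order, key)
    endif
    " keep each distinct data string once per key
    if rest !=# '' && index(merged[key], rest) < 0
      call add(merged[key], rest)
    endif
  endfor
  " rebuild the buffer, one line per key, in first-seen order
  let out = []
  for key in order
    call add(out, join([key] + merged[key], ' '))
  endfor
  silent %delete _
  call setline(1, out)
endfunction
```

To try it, save the function to a file (merge.vim is an assumed name), run :source merge.vim, then :call MergeByFirstField() in the buffer holding the data. Follow it with :sort if sorted output is wanted.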