bash - How to sort array of strings by function in shell script

asked Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have the following list of strings in shell script:

something-7-5-2020.dump
another-7-5-2020.dump
anoter2-6-5-2020.dump
another-4-5-2020.dump
another2-4-5-2020.dump
something-2-5-2020.dump
another-2-5-2020.dump
8-1-2021
26-1-2021
20-1-2021
19-1-2021
3-9-2020
29-9-2020
28-9-2020
24-9-2020
1-9-2020
6-8-2020
20-8-2020
18-8-2020
12-8-2020
10-8-2020
7-7-2020
5-7-2020
27-7-2020
7-6-2020
5-6-2020
23-6-2020
18-6-2020
28-5-2020
26-5-2020
9-12-2020
28-12-2020
15-12-2020
1-12-2020
27-11-2020
20-11-2020
19-11-2020
18-11-2020
1-11-2020
11-11-2020
31-10-2020
29-10-2020
27-10-2020
23-10-2020
21-10-2020
15-10-2020
23-09-2020

My goal is to sort them by date, but the dates come in both dd-mm-yyyy and d-m-yyyy form, and sometimes a word precedes the date, as in word-dd-mm-yyyy. I would like to write a comparison function, as in other languages, that ignores the leading word, converts the date to a common format, and compares the result. In JavaScript it would be something like:

arrayOfStrings.sort((a, b) => functionToOrderStrings(a, b))

My code to obtain the array is the following:

dumps=$(gsutil ls gs://organization-dumps/ambient | sed "s:gs://organization-dumps/ambient/::" | sed '/^$/d' | sed 's:/$::' | sort --reverse --key=3 --key=2 --key=1 --field-separator=-)
echo "$dumps"

I have already searched Stack Overflow, and none of the answers helped: all of them assume the dates are already in a sortable format, which is not my case.

question from:https://stackoverflow.com/questions/66046178/how-to-sort-array-of-strings-by-function-in-shell-script

1 Answer

深蓝 · Answer 1 · 2021-10-06T03:16:33+0000

If you have the results in a pipeline, involving an array seems completely superfluous here.

You can apply a technique called a Schwartzian transform: add a prefix to each line with a normalized version of the data so it can be easily sorted, then sort, then discard the prefix.

I'm guessing something like the following:

gsutil ls gs://organization-dumps/ambient |
awk '{ sub("gs://organization-dumps/ambient/", "");
    if (! $0) next;
    sub("/$", "");
    d = $0;
    sub(/^[^0-9][^-]*-/, "", d);
    sub(/[^0-9]*$/, "", d);
    split(d, w, "-");
    printf "%04i-%02i-%02i\t%s\n", w[3], w[2], w[1], $0 }' |
sort -n | cut -f2-

In other words, we add a tab-delimited field in front of every line, sort on that, and then discard the first field with cut -f2-. The field extraction makes some assumptions which seem to hold for your test data, but it may need tweaking if your real data has corner cases, for example a label before the date which itself contains a number with dashes around it.
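
To try the decorate, sort, undecorate steps without access to the bucket, here is a minimal sketch that feeds a few of the sample names from the question through the same Awk logic. The function name sort_by_date is made up for this illustration, and plain sort under LC_ALL=C is used because the zero-padded yyyy-mm-dd keys already order correctly as text.

```shell
# sort_by_date is a hypothetical helper name; it reads one name per line on stdin.
sort_by_date() {
    awk '{
        d = $0
        sub(/^[^0-9][^-]*-/, "", d)   # drop a leading word such as "another2-"
        sub(/[^0-9]*$/, "", d)        # drop a trailing suffix such as ".dump"
        split(d, w, "-")              # w[1] = day, w[2] = month, w[3] = year
        # Decorate: zero-padded yyyy-mm-dd key, a tab, then the original line.
        printf "%04d-%02d-%02d\t%s\n", w[3], w[2], w[1], $0
    }' |
    LC_ALL=C sort |                   # sort on the normalized key
    cut -f2-                          # undecorate: keep everything after the first tab
}

# Sample names taken from the question, deliberately out of order.
printf '%s\n' 8-1-2021 something-7-5-2020.dump 23-09-2020 \
    another-2-5-2020.dump 3-9-2020 |
    sort_by_date
```

This lists the names oldest first. If you want newest first, as the --reverse in the original command suggests, sort -r in place of sort should do it.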

If you want to capture the result in a variable, like in your original code, that's easy to do; but usually, you should just run everything in a pipeline.
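
As a rough sketch of the capture variant, the following uses a stand-in function in place of the real gsutil pipeline; list_sorted_dumps is a hypothetical name and its output is hard-coded from the sample data, already in sorted order.

```shell
# Stand-in for `gsutil ls ... | awk ... | sort | cut -f2-`.
# list_sorted_dumps is a hypothetical name; the names come from the question.
list_sorted_dumps() {
    printf '%s\n' another-2-5-2020.dump something-7-5-2020.dump 3-9-2020
}

# Capture the whole sorted listing in one variable, as the original code did.
dumps=$(list_sorted_dumps)

# Pick the first (oldest) and last (newest) entries out of the variable.
oldest=$(printf '%s\n' "$dumps" | head -n 1)
newest=$(printf '%s\n' "$dumps" | tail -n 1)
echo "oldest: $oldest"
echo "newest: $newest"
```

If you do need a real Bash array, mapfile -t names <<< "$dumps" fills one in Bash 4 and later, but for a single pass over the names, piping straight into the next command avoids the intermediate variable.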

Notice that I factored your multiple sed scripts into the Awk script, too, some of that with a fair amount of guessing as to what the input looks like and what the sed scripts were supposed to accomplish. (Perhaps also note that sed, like Awk, is a scripting language; to run several sed commands on the same input, just put them after each other in the same sed script.)
