How about this:
hdfs dfs -ls /tmp | tr -s " " | cut -d' ' -f6-8 | grep "^[0-9]" | awk 'BEGIN{ MIN=5; LAST=60*MIN; "date +%s" | getline NOW } { cmd="date -d'\''"$1" "$2"'\'' +%s"; cmd | getline WHEN; DIFF=NOW-WHEN; if(DIFF < LAST){ print $3 }}'
Explanation:
List all the files:
hdfs dfs -ls /tmp
Squeeze runs of spaces into a single space:
tr -s " "
Keep only the date, time, and path columns (fields 6 to 8):
cut -d' ' -f6-8
Drop rows that do not start with a digit (such as the "Found N items" header):
grep "^[0-9]"
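To see what these three stages hand to awk, here is a small sketch you can run without a cluster. The listing is made-up sample text in the general shape of `hdfs dfs -ls` output, and the names `sample_listing` and `preprocess` are only for illustration.

```shell
# Made-up sample in the general shape of `hdfs dfs -ls` output.
sample_listing() {
  cat <<'EOF'
Found 2 items
-rw-r--r--   3 hdfs supergroup       1024 2024-01-15 10:30 /tmp/data.csv
drwxr-xr-x   - hdfs supergroup          0 2024-01-15 10:31 /tmp/logs
EOF
}

# The same three stages as in the one-liner, stopping just before awk.
preprocess() {
  tr -s " " | cut -d' ' -f6-8 | grep "^[0-9]"
}

sample_listing | preprocess
# 2024-01-15 10:30 /tmp/data.csv
# 2024-01-15 10:31 /tmp/logs
```

Note that a path containing spaces would be cut short here, since only fields 6 to 8 are kept.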
Process the remaining rows with awk:
awk
Set the time window in seconds (LAST) and read the current epoch time (NOW):
MIN=5; LAST=60*MIN; "date +%s" | getline NOW
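Build a command that converts the file's HDFS timestamp to an epoch value (the '\'' sequence is how a literal single quote is embedded inside the single-quoted awk program):
cmd="date -d'\''"$1" "$2"'\'' +%s";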
Run the command and read the file's epoch value into WHEN:
cmd | getline WHEN;
Compute the file's age in seconds:
DIFF=NOW-WHEN;
Print the path if the file is newer than the window:
if(DIFF < LAST){ print $3 }
To use a different window, just change the value of MIN
(here it's 5 minutes).
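If you change the window often, one option is to wrap the pipeline in a function and pass the minutes in with `awk -v`. This is only a sketch: the function name `recent_hdfs_paths` and the variable `min` are made up for illustration, it assumes GNU date (for `-d`), and it adds a `close(cmd)` that the one-liner does not have, so each date pipe is released before the next row.

```shell
# recent_hdfs_paths MINUTES   (hypothetical helper, not part of Hadoop)
# Reads `hdfs dfs -ls`-style lines on stdin and prints the paths whose
# timestamp is less than MINUTES minutes old (default 5).
# Assumes GNU date, since it relies on `date -d`.
recent_hdfs_paths() {
  tr -s " " | cut -d' ' -f6-8 | grep "^[0-9]" |
  awk -v min="${1:-5}" '
    BEGIN {
      last = 60 * min                # window length in seconds
      "date +%s" | getline now       # current time as epoch seconds
      close("date +%s")
    }
    {
      # $1 = date, $2 = time, $3 = path
      when = 0                       # stays 0 if date cannot parse the row
      cmd = "date -d \"" $1 " " $2 "\" +%s"
      cmd | getline when
      close(cmd)                     # release the pipe before the next row
      if (now - when < last) print $3
    }'
}

# Usage (needs a Hadoop client on PATH):
#   hdfs dfs -ls /tmp | recent_hdfs_paths 10
```

Reading the listing from stdin keeps the filter separate from the `hdfs` call, so it can be tried on saved or hand-written listings first.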
HTH