Tuesday, October 22, 2019

Delete Old log files from HDFS - Older than 10 days

Script: ( Tested ) :


today=`date +'%s'`
hdfs dfs -ls /file/Path/ | grep "^d" | while read line ; do
dir_date=$(echo ${line} | awk '{print $6}')
difference=$(( ( ${today} - $(date -d ${dir_date} +%s) ) / ( 24*60*60 ) ))
filePath=$(echo ${line} | awk '{print $8}')

if [ ${difference} -gt 10 ]; then
    hdfs dfs -rm -r $filePath
fi
done



Note: Change file path and It is not skipping the trash.

No comments:

Post a Comment

Hive Architecture

Hive Architecture in One Image