How to delete all but the most recent X subfolders in a folder?

Problem :

I’ve seen some one liners using ls, but I’d like to avoid that.

I’m writing a script that copies code to my server in a timestamped folder. The ‘current’ version is always symlinked to a constant path. That way, I can roll back if anything goes wrong. Now, every deployment makes a new folder, but I’d like to keep only the 3 latest.

Here’s my current version:

ls -tp | grep '/$' | grep -v 'current\|shared' | tail -n +4 | xargs -d '\n' rm -rf --

This is executed in the containing folder, excludes files, and the two folders ‘current’ and ‘shared’ that I want to keep (current actually being the aforementioned symlink), and deletes all but the 3 newest as sorted by ls -t.

Is there a way I can do this without ls, and only using bash and the gnu toolchain?

All folders meant for pruning are named following this format:

$timestamp.$branch.$sha1

Where SHA1 and branch are info from git about what exactly was deployed.

The server runs Ubuntu Xenial.

Edit: Provide some examples of what the folders look like

Here’s a listing from the deployment folder, how it looks right now:

drwxr-x--- 13 app www-data 4096 Mar 29 00:10 1490738956.develop.b806/
drwxr-x--- 13 app www-data 4096 Mar 29 00:19 1490739485.develop.ae01/
drwxr-x--- 14 app www-data 4096 Mar 29 03:33 1490751118.develop.f5b0/
lrwxrwxrwx  1 app www-data   40 Mar 29 03:33 current -> /home/app/deploy/1490751118.develop.f5b0/
drwx------  5 app root     4096 Mar 10 04:12 shared/

Solution :

How to delete all but the most recent 3 subfolders in a folder?

You can make use of find, sort, awk, xargs and finally rm:

find * -maxdepth 0 -type d -not -path "current" -not -path "shared" -printf "%T@ %p\n" | sort -nr | awk 'NR > 3 {print $2}' | xargs rm -rf

Breakdown:

find versatile tool to look for files and directories and possibly execute commands on them

* take into consideration elements in the current folder

-maxdepth 0 don’t look in subfolders

-type d look for directories

-not -path "current" exclude the directory named “current”

-not -path "shared" exclude the directory named “shared”

-printf "%T@ %p\n" print each result on its own line, with the modification timestamp prepended to the directory name. This could be omitted in your case, since the directories are already timestamped.

| sort -nr sort the list according to the timestamp, in reverse order

| awk 'NR > 3 {print $2}' print all but the first three results, omitting the timestamp that was added earlier. This is where you choose how many to keep, just substitute the number 3

| xargs rm -rf delete those directories and all their contents
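
The pipeline above splits on whitespace, so a directory name containing a space would be cut apart by awk and xargs. Here is a minimal sketch of a NUL-delimited variant, assuming GNU find and GNU coreutils 8.25 or newer (for tail -z and cut -z). The function name prune_keep_newest and its two arguments are made up for illustration and are not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are assumptions, not from the answer):
# keep the N newest subdirectories of DIR by modification time,
# never touching 'current' or 'shared'.
prune_keep_newest() {
    local dir=$1 keep=${2:-3}
    # Emit "mtime path" records separated by NUL bytes instead of newlines.
    find "$dir" -mindepth 1 -maxdepth 1 -type d \
         ! -name current ! -name shared -printf '%T@ %p\0' |
        sort -z -nr |                      # newest first
        tail -z -n +"$((keep + 1))" |      # skip the N newest records
        cut -z -d ' ' -f 2- |              # drop the timestamp, keep the full path
        xargs -0 -r rm -rf --              # -r: do nothing if the list is empty
}
```

Calling prune_keep_newest /home/app/deploy 3 would then correspond to the one-liner above. Replacing rm -rf -- with echo gives a harmless dry run first.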

Edit: in your case, since the directories’ names already start with a timestamp, there’s no need to add the timestamp again for sort -nr to be effective. Note that the awk syntax needs to be edited accordingly.

find * -maxdepth 0 -type d -not -path "current" -not -path "shared" | sort -nr | awk 'NR > 3 {print $0}' | xargs rm -rf
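
Since the names themselves sort chronologically, the same idea can also be sketched with bash globbing alone, with no find, sort or awk. This assumes every timestamp has the same number of digits (true for epoch seconds such as 1490738956), so that the lexical order of the glob expansion matches the chronological order. The function name prune_by_name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not the answer's own method):
# prune by name only, relying on bash's sorted glob expansion.
# The body runs in a subshell "( ... )" so nullglob does not leak to the caller.
prune_by_name() (
    dir=$1
    keep=${2:-3}
    shopt -s nullglob
    # The trailing slash restricts matches to directories (or symlinks to them);
    # 'current' and 'shared' contain no dots, so the pattern skips them.
    dirs=("$dir"/[0-9]*.*.*/)
    excess=$(( ${#dirs[@]} - keep ))
    if (( excess > 0 )); then
        # Globs expand in ascending order, so the oldest names come first.
        rm -rf -- "${dirs[@]:0:excess}"
    fi
)
```

Run as prune_by_name /home/app/deploy 3. If fewer than four matching folders exist, nothing is deleted.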

Reference: see these two useful/similar Q&A.

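In its most basic form

touch -t 201003160120 some_file
find . -mindepth 1 -maxdepth 1 -type d ! -newer some_file -name '*.develop.*' -exec rm -rf {} +

As taken from https://serverfault.com/questions/122824/linux-using-find-to-locate-files-older-than-date

! -newer some_file tells it to match anything not newer than ‘some_file’

-name '*.develop.*' tells it to only match names matching this pattern (*.develop.*); the quotes stop the shell from expanding the pattern before find sees it

-exec rm -rf {} + – I hope this requires no explanation, be careful! (Use -print when testing!) find’s own -delete only removes empty directories, so rm -rf is used here.

-type d – only directories

-mindepth 1 -maxdepth 1 – only look at the entries directly inside the current folder and don’t recurse into subfolders. (With -maxdepth 0, find would test only the starting point . itself.)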
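
Following the advice to test with -print first, a dry run could look roughly like the sketch below. It only lists the folders that are not newer than a cutoff date and deletes nothing. The function name list_older_than and its arguments are assumptions for illustration; touch -d and mktemp are the GNU versions.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper (name and arguments are assumptions):
# print the *.develop.* folders directly inside DIR whose modification time
# is not newer than CUTOFF, without deleting anything.
list_older_than() {
    local dir=$1 cutoff=$2 ref
    ref=$(mktemp)                 # temporary reference file
    touch -d "$cutoff" "$ref"     # give it the cutoff as its mtime
    find "$dir" -mindepth 1 -maxdepth 1 -type d \
         ! -newer "$ref" -name '*.develop.*' -print
    rm -f -- "$ref"
}
```

For example, list_older_than /home/app/deploy '2017-03-29 00:15' prints the candidates. Once the list looks right, -print can be swapped for a deleting action.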
