Sunday, June 26, 2016

Find & Copy Files While Adding The MD5 Hash To The Filename

Today I thought I'd show you a small script I'm using to decommission an old computer. To make sure important files aren't lost, I wrote a Bash script that finds all files on a drive with certain extensions and copies them to an external drive. This works well, but sometimes two different files have the same name. To make sure one doesn't overwrite the other, each file's MD5 hash is prepended to its name. I know MD5 has security issues, but it's fine for telling files apart here.

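#!/bin/bash

# Find PDF and SVG files (case-insensitive) under /mnt/sda3.
# -print0 with read -d '' keeps names with spaces or backslashes intact.
find /mnt/sda3 -type f \( -iname '*.pdf' -o -iname '*.svg' \) -print0 |
while IFS= read -r -d '' line; do
    fname=$(basename "$line")
    # Hash the contents via stdin so an odd file name can't alter the output.
    md5=$(md5sum < "$line" | awk '{print $1}')
    echo "'$line'"
    echo "'$md5'"
    # -n: never overwrite a file that already exists on the destination.
    cp -n "$line" /mnt/sdb1/"$md5""$fname"
done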


The find command specifies where to look (/mnt/sda3 in this case) and which file types to look for. A while loop then processes each result, copying the file to an external drive (/mnt/sdb1) and changing its name by adding the MD5 hash to the front. cp is run with -n, so a file that already exists on the destination is never overwritten. That's fine: two files with the same name and the same MD5 hash are, for practical purposes, identical, so you don't need both copies.
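To make the naming scheme concrete, here is a minimal sketch that builds the destination name for three test files. The helper name hashed_name and the throwaway directory layout are my own assumptions for illustration, not part of the script above. Two files called report.pdf with different contents should get different names, while an exact duplicate should get the same name as the original.

```bash
#!/bin/bash
# Hypothetical helper: print <md5><basename> for the file given as $1.
hashed_name() {
    local src=$1 md5
    # Read the file via stdin so the file name never affects md5sum's output.
    md5=$(md5sum < "$src" | awk '{print $1}')
    printf '%s%s\n' "$md5" "$(basename "$src")"
}

# Throwaway test files: a and c have identical contents, b differs.
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b" "$demo/c"
printf 'first\n'  > "$demo/a/report.pdf"
printf 'second\n' > "$demo/b/report.pdf"
printf 'first\n'  > "$demo/c/report.pdf"

hashed_name "$demo/a/report.pdf"
hashed_name "$demo/b/report.pdf"
hashed_name "$demo/c/report.pdf"
```

The first and third lines of output should match, which is exactly the case where cp -n skips the second copy.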

The script is written in Bash because I'm using a live version of Linux to recover the files on the computer. It's an old Windows computer, but I'd rather do this with Puppy Linux than Vista. Anyway, if you plan on using the script, do some small tests first.
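One way to run such a small test is a dry run that only prints what would be copied. The sketch below is an assumption on my part rather than something I ran during the recovery: plan_copies is a made-up function name, and the test tree under a temporary directory stands in for the real drive.

```bash
#!/bin/bash
# Dry run: print "source -> destination" for each match, copying nothing.
# plan_copies and its two arguments are hypothetical names for this sketch.
plan_copies() {
    local src_root=$1 dest_root=$2 path md5
    # -print0 with read -d '' keeps names with spaces or backslashes intact.
    find "$src_root" -type f \( -iname '*.pdf' -o -iname '*.svg' \) -print0 |
    while IFS= read -r -d '' path; do
        md5=$(md5sum < "$path" | awk '{print $1}')
        printf '%s -> %s/%s%s\n' "$path" "$dest_root" "$md5" "$(basename "$path")"
    done
}

# Throwaway test tree: two same-named PDFs, one SVG, one file to ignore.
src=$(mktemp -d)
mkdir "$src/one" "$src/two"
printf 'alpha\n' > "$src/one/my notes.pdf"
printf 'beta\n'  > "$src/two/my notes.pdf"
printf 'gamma\n' > "$src/two/logo.SVG"
printf 'skip\n'  > "$src/two/readme.txt"

plan_copies "$src" /mnt/sdb1
```

If the printed list looks right (three lines here, with readme.txt absent and the two PDFs getting different prefixes), swapping the printf for the cp -n line turns the dry run back into the real copy.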
