Using find, sed and xargs to strip files from one folder that exist in another

Handy bash commands to remove copies of files that exist in two places when you only want unique files in each folder:

Scenario:
Folder A contains 20000 files
Folder B contains 50000 files

… and you have accidentally copied all of folder A into folder B so you now have thousands of duplicate files in folder B.

You can easily strip from Folder B any files that also exist in Folder A, using 3 bash commands piped together like so:

find foldera/* | sed s@foldera/@folderb/@ | xargs rm -rf

  • find – locate all files in foldera (you can use any combination of find parameters here, of course)
  • sed – replace foldera with folderb in each line of the results
  • xargs – run the rm -rf command on the output from sed
This just saved me a lot of time. Hope it helps someone else.
Credits to CiaranG for the assist with piping find results into sed, nice work!
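Before deleting anything, it can help to preview what the pipeline would remove. The following is a minimal sketch, assuming a throwaway sandbox made with mktemp and a few made-up file names (one.txt, two.txt, keep.txt); it keeps the same find and sed stages but has xargs echo the rm command instead of running it:

```shell
# Throwaway sandbox so nothing real is touched (all names here are illustrative).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/foldera" "$sandbox/folderb"
touch "$sandbox/foldera/one.txt" "$sandbox/foldera/two.txt"
touch "$sandbox/folderb/one.txt" "$sandbox/folderb/two.txt" "$sandbox/folderb/keep.txt"

cd "$sandbox" || exit 1

# Same find | sed stages as above, but xargs only echoes the rm command
# rather than executing it, so the targets can be inspected first.
preview=$(find foldera/* | sed s@foldera/@folderb/@ | xargs echo rm -rf)
printf '%s\n' "$preview"
```

Once the printed command lists only the paths you expect, drop the echo to run the real deletion.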

3 responses to “Using find, sed and xargs to strip files from one folder that exist in another”

  1. It does not work if one of the files is called:

    My brother's 12" records.txt

    Consider using GNU Parallel instead:

    find foldera -type f | sed s@foldera/@folderb/@ | parallel rm

    Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ
    Also if you have the files:

    foldera/foo/bar
    folderb/foo/bar
    folderb/foo/fubar

    then your solution will remove folderb/foo (including folderb/foo/fubar).
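Both problems raised in this comment can also be avoided without GNU Parallel. The following is a sketch, not a definitive fix: it assumes GNU sed (for the -z option) and a find and xargs that support -print0 and -0. It limits find to regular files and passes NUL-delimited names, so quote characters in file names are harmless and whole directories are never handed to rm. The sandbox and its file names are made up for the demonstration:

```shell
# Throwaway sandbox reproducing the layout from the comment (names are illustrative).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/foldera/foo" "$sandbox/folderb/foo"
touch "$sandbox/foldera/foo/bar" "$sandbox/folderb/foo/bar" "$sandbox/folderb/foo/fubar"
touch "$sandbox/foldera/My brother's 12\" records.txt"
touch "$sandbox/folderb/My brother's 12\" records.txt"

cd "$sandbox" || exit 1

# -type f   : list regular files only, so a directory is never passed to rm
# -print0   : separate names with NUL bytes instead of newlines
# sed -z    : treat NUL as the record separator (GNU sed) and rewrite the leading folder
# xargs -0  : read NUL-separated names, so quotes and spaces need no escaping
# rm -f --  : plain rm (no -r); "--" protects against names starting with "-"
find foldera -type f -print0 |
  sed -z 's@^foldera/@folderb/@' |
  xargs -0 rm -f --

# Show what is left in folderb.
ls -R folderb
```

Using rm -f without -r means that only files with a counterpart in foldera are deleted; anything unique to folderb, such as folderb/foo/fubar, stays in place.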
