Convert hardlinks to reflinks

I have nested folders with a bunch of files inside that are hardlinked to each other. I would like to break the hardlinks (convert them into separate files), but then immediately convert each pair into a reflink (so they have different inodes but use the same section of disk).

find -type f -links +1

will find all the hardlinks, while a command like

cp --reflink=always my_file.bin my_file_copy.bin

will copy a file without using any more disk space, creating it as a reflink.

How do I combine these to go through a whole set of nested folders and convert each hardlink into a reflink, replacing them with the same filename?

2 Answers

You tagged ubuntu, so I assume you are not limited to strictly POSIX tools and their POSIX options.

find . -type f -links +1 -execdir sh -c '
   tmp="$(TMPDIR=. mktemp)" &&
   cp -p --reflink=always -- "$1" "$tmp" &&
   mv -f -- "$tmp" "$1"
' find-sh {} \; -print

Notes:

This converts hardlinks to reflinks, i.e. my_file.bin hardlink becomes my_file.bin reflink. There will be no my_file_copy.bin. (This note is in case you want to create my_file_copy.bin reflink while leaving my_file.bin hardlink intact. The question is not crystal clear in this matter, it introduces my_file_copy.bin for some reason.)
If mktemp or cp fails then mv will not be performed. In any case you shouldn't lose the original content, unless some other process modifies the temporary file.
Because find tests files one by one, it will never convert all the hardlinks to a given inode. Even if find processes every hardlink, -links +1 will be false for the last one, so the original inode survives. This means that if the original file is open and being modified in place (without changing the inode number), the modification will survive somewhere, although it is hard to tell in advance which hardlink will be processed last and keep its inode number. A situation where an open file gets completely unlinked, is modified in that state and disappears from the filesystem as soon as it is closed should not happen.
If cp or mv fails then the temporary file will survive. You may want to capture stderr to a file (2>some_file) and investigate later.
-print will act if the shell code succeeds. It's only there so you can see something happens.
find-sh is explained here: What is the second sh in sh -c 'some shell code' sh?
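
To rehearse the conversion on a scratch directory before touching real data, the command can be wrapped in a small function. This is only a sketch: convert_hardlinks, still_hardlinked and the MODE argument are names made up for this example, not part of find or cp. MODE is handed to cp --reflink= and defaults to always; auto is there only so the rehearsal can run on a filesystem without reflink support, where cp falls back to a full copy.

```shell
# convert_hardlinks DIR [MODE]
# Hypothetical wrapper around the find command above.
# MODE is passed to cp --reflink= (default: always). "auto" is an assumption
# for rehearsals only: without reflink support cp then makes a full copy.
convert_hardlinks() {
  ch_dir=$1
  ch_mode=${2:-always}
  find "$ch_dir" -type f -links +1 -execdir sh -c '
    tmp="$(TMPDIR=. mktemp)" &&
    cp -p --reflink="$2" -- "$1" "$tmp" &&
    mv -f -- "$tmp" "$1"
  ' find-sh {} "$ch_mode" \; -print
}

# still_hardlinked DIR
# Prints inode, link count and name of every regular file under DIR
# that still has more than one link (GNU stat syntax).
still_hardlinked() {
  find "$1" -type f -links +1 -exec stat -c '%i %h %n' {} +
}
```

Run convert_hardlinks some_dir (or convert_hardlinks some_dir auto on a test filesystem), then still_hardlinked some_dir. The second command should print nothing, because the last name of each group is left with a link count of 1.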

Edit: As pointed out by Kamil, don't do the for x in $(find ...). Using the find -execdir sh -c format is the proper way to use find output. I'll leave my answer here however.

You can write a small Bash script or directly write a for loop in your bash shell:

$ for filename in $(find -type f -links +1); do echo "I found this file: ${filename}"; done

This example takes each line of output from the find command and places it in the ${filename} variable, which you can then use. Here we just print I found this file: ${filename} for each one, but you can replace that with your copy command, which would probably look something like this:

$ for filename in $(find -type f -links +1); do echo "Copying ${filename} to ${filename}_copy.bin"; cp --reflink=always "${filename}" "${filename}_copy.bin"; done

Or, if you want to put this in a Bash script instead so it's easier to read and work with, create a file copy_script.sh with these contents:

#!/bin/bash
for filename in $(find -type f -links +1); do
    echo "Copying ${filename} to ${filename}_copy.bin"
    cp --reflink=always "${filename}" "${filename}_copy.bin"
done

Then save and run with $ bash ./copy_script.sh
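
For comparison, here is a sketch of the same reporting loop with the for moved inside sh -c, so find passes the names as separate arguments and file names containing spaces or newlines are not split. report_hardlinked is a name invented for this example; the body only prints and copies nothing.

```shell
# report_hardlinked DIR
# Hypothetical helper: prints one line per regular file under DIR that has
# more than one link. "for filename do" iterates over the positional
# parameters that find supplies through {} +.
report_hardlinked() {
  find "$1" -type f -links +1 -exec sh -c '
    for filename do
      printf "I found this file: %s\n" "$filename"
    done
  ' find-sh {} +
}
```

Calling report_hardlinked . lists the candidates. The printf line is the place where a copy command would go.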
