Tuesday, January 29, 2013

Duplicity Backup Restore process

I used the duplicity backup script to back up user data, and I encrypted the backups with GPG using a public/private key pair.

To restore a backup, I first copied the duplicity volume .gpg files from the remote location to the local computer.
The commands below then decrypt and extract the backup volumes. After extraction, two folders appear in the restore location: “multivol_snapshot” and “snapshot”.

If you encrypted your backup, first you must decrypt the volume by using your private key. Say you have duplicity-full.20110127T131352Z.vol1.difftar.gpg:
gpg --output duplicity-full.20110127T131352Z.vol1.difftar --decrypt duplicity-full.20110127T131352Z.vol1.difftar.gpg
Or to do all at once (This is the easiest way to do ...):
gpg --multifile --decrypt duplicity-*.*.*.difftar.gpg
Now you have either a .difftar or a .difftar.gz volume (depending on whether you had to decrypt it or not). Use tar on whichever one you have to extract the individual patch files:
tar xvf duplicity-full.20110127T131352Z.vol1.difftar
Or again, to do all at once:
for t in duplicity-*.*.*.difftar; do tar xf "$t"; done

If your file is in snapshot/ then you're done. Otherwise find the directory in multivol_snapshot/ at the path where your file used to be: you need to join together all the files in this directory to recreate the original file. The files are numbered, and can be joined together using the cat command. Depending on how large the original was, there may be many parts.

cat * > rescued-file
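
To see which files were split and how many parts each one has, the directories under multivol_snapshot/ can be counted. The following is only a sketch: it builds a small scratch tree with hypothetical paths (big.iso, video.avi) so it can be run safely, it relies on GNU find's -printf, and it assumes no file names contain newlines.

```shell
#!/bin/sh
# Build a scratch tree that mimics an extracted backup (hypothetical paths).
start_dir=$PWD
demo_root=$(mktemp -d)
mkdir -p "$demo_root/multivol_snapshot/home/user/big.iso" \
         "$demo_root/multivol_snapshot/home/user/video.avi"
for n in 1 2 3; do : > "$demo_root/multivol_snapshot/home/user/big.iso/$n"; done
for n in 1 2; do : > "$demo_root/multivol_snapshot/home/user/video.avi/$n"; done

cd "$demo_root" || exit 1
# Print the containing directory of every part, then count parts per directory.
# Each output line is "<number of parts> <path of the split file>".
split_report=$(find multivol_snapshot/ -type f -printf '%h\n' | sort | uniq -c | sed 's/^ *//')
printf '%s\n' "$split_report"

# Clean up the scratch tree.
cd "$start_dir" && rm -rf "$demo_root"
```

On a real restore, only the find pipeline is needed, run from the folder that contains multivol_snapshot/.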


Problem with original instructions
The directions linked at the end of this post suggest using cat * > rescued-file. Unfortunately this simple approach fails if you have more than 9 parts. Since * expands in dictionary order, not numeric order, 10 would be listed before 2, and the file would be reconstructed in the wrong order.
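
The ordering problem is easy to reproduce. This small sketch creates twelve empty, numbered files in a scratch directory (made-up data, not a real backup) and compares the order produced by * with a numeric sort:

```shell
#!/bin/sh
# Twelve empty parts named 1..12 in a scratch directory (hypothetical data).
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
for n in 1 2 3 4 5 6 7 8 9 10 11 12; do : > "$n"; done

# The glob sorts names as strings, so 10, 11 and 12 come before 2.
glob_order=$(echo *)
echo "glob order:    $glob_order"

# Sorting the same names numerically gives the order the parts must be joined in.
numeric_order=$(ls | sort -n | xargs echo)
echo "numeric order: $numeric_order"

# Clean up the scratch directory.
cd "$start_dir" && rm -rf "$demo_dir"
```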
Workaround
One simple approach is to remember that dictionary order does work when numbers are the same length, and that ? matches a single character. So if your largest file has three digits, you can manually enter:
cat ? ?? ??? > rescued-file
Add or remove ? patterns as necessary, depending on the largest file number.
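
As a quick check of this workaround, the sketch below writes each part's own number into it, so the join order is visible in the result. The twelve parts and the scratch directory are made up for the demonstration.

```shell
#!/bin/sh
# Twelve hypothetical parts; part N contains the text "N,".
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
for n in 1 2 3 4 5 6 7 8 9 10 11 12; do printf '%s,' "$n" > "$n"; done

# Plain * joins the parts in dictionary order.
wrong_order=$(cat *)
echo "with *:    $wrong_order"

# One-character names first, then two-character names.
cat ? ?? > rescued-file
right_order=$(cat rescued-file)
echo "with ? ??: $right_order"

# Clean up the scratch directory.
cd "$start_dir" && rm -rf "$demo_dir"
```

Note that a pattern which matches nothing (for example ??? when there are fewer than 100 parts) is passed to cat unchanged and produces a "No such file" error, so only list patterns that actually match.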

If there are more than 9 parts, it is easier to script the recovery. You can use either the shell script or the Java code included below.
Both have one major limitation: they do not work with incremental backups.

If you have a lot of files to recover and don't fancy typing that for all of them, you might prefer to use a script such as this. It lists the containing directory for every file, removes duplicates from the list, then goes to each directory and creates a file named content from the fragments there. (spacer is just there to make $1 work: sh -c assigns the first argument after the command string to $0, so the directory name ends up in $1.)


find multivol_snapshot/ -type f -printf '%h\0' | \
  sort -uz | \
  xargs -0 -n 1 sh -c 'cd "$1" && cat $(ls | sort -n) > content' spacer

Now you just have to add /content to the end of any filename you were looking for, and you should find it.
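
If the script is run a second time, the content file from the first run is still sitting next to the fragments and gets picked up by ls. A slightly more defensive variant, sketched here under the assumption that every fragment has a purely numeric name, joins only those files. The function name join_parts and the demo data are made up for illustration.

```shell
#!/bin/sh
# join_parts DIR OUT: concatenate the numerically named files in DIR,
# in numeric order, into OUT. Any other file in DIR is ignored.
join_parts() {
    parts_dir=$1
    out_file=$2
    : > "$out_file" || return 1
    ls "$parts_dir" | grep -E '^[0-9]+$' | sort -n | while read -r part; do
        cat "$parts_dir/$part" >> "$out_file" || exit 1
    done
}

# Demo on scratch data: twelve parts plus a stale "content" file from an earlier run.
demo_dir=$(mktemp -d)
for n in 1 2 3 4 5 6 7 8 9 10 11 12; do printf '%s,' "$n" > "$demo_dir/$n"; done
echo "stale" > "$demo_dir/content"

join_parts "$demo_dir" "$demo_dir/rescued-file"
joined=$(cat "$demo_dir/rescued-file")
echo "$joined"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

Writing the result to a name that is not purely numeric (rescued-file here) keeps it from being treated as a fragment on a later run.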


Or you can use the Java code below to restore the backup. (There is no need to edit the code; it worked very well for me.)



import java.io.File;
import java.io.FileFilter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Vector;

import org.apache.commons.io.filefilter.DirectoryFileFilter;

public class DeFrankensteiner {

    // Fallback root; in practice it is always replaced by the first argument.
    static String untaredRoot = "/media/big5wf/untared2test";
    static Vector<File> resultDirs = new Vector<File>();

    public static void main(String[] args) {
        if (args.length > 0) {
            untaredRoot = args[0];
            if (!new File(untaredRoot).exists()) {
                System.err.println("Directory does not exist");
                System.exit(1);
            }
        } else {
            System.out.println("Program takes one argument: the root folder of the backup");
            System.out.println("Please rerun and specify the root of your untared duplicity backup");
            System.out.println("The directory contains two folders, 'snapshot' and 'multivol_snapshot'");
            System.exit(0);
        }
        getLeafDirectories(new File(untaredRoot
                + System.getProperty("file.separator") + "multivol_snapshot"));

        for (File sourceDir : resultDirs) {
            File[] the64KbBlocks = sourceDir.listFiles();
            if (the64KbBlocks == null) {
                continue; // directory could not be read
            }
            // Sort numerically (1, 2, ..., 10), not alphabetically (1, 10, 2)
            Arrays.sort(the64KbBlocks, new IntValueComparator());
            // The merged file is written under snapshot/ at the same relative path
            String targetFileName = sourceDir.getAbsolutePath().replace(
                    "multivol_snapshot", "snapshot");
            System.out.println("Will save file to " + targetFileName
                    + " after merging " + the64KbBlocks.length + " blocks.");
            // instead of /bin/bash try to use java onboard methods
            try {
                File targe = new File(targetFileName);
                if (targe.exists()) {
                    targe.delete();
                }
                // The parent folder may not exist under snapshot/ yet
                File parent = targe.getParentFile();
                if (parent != null) {
                    parent.mkdirs();
                }

                FileOutputStream fos = new FileOutputStream(targetFileName, true);
                try {
                    for (File file : the64KbBlocks) {
                        fos.write(getBytesFromFile(file));
                    }
                } finally {
                    fos.close();
                }
                System.err.println("Written file file://" + targetFileName);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    public static byte[] getBytesFromFile(File file) throws IOException {
        InputStream is = new FileInputStream(file);
        try {
            // Create the byte array to hold the whole file
            byte[] bytes = new byte[(int) file.length()];

            // Read in the bytes
            int offset = 0;
            int numRead = 0;
            while (offset < bytes.length
                    && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
                offset += numRead;
            }

            // Ensure all the bytes have been read in
            if (offset < bytes.length) {
                throw new IOException("Could not completely read file "
                        + file.getName());
            }
            return bytes;
        } finally {
            // Close the input stream even if reading failed
            is.close();
        }
    }

    public static class IntValueComparator implements Comparator<File> {

        public int compare(final File file1, final File file2) {
            int res = 0;
            try {
                res = Integer.compare(Integer.parseInt(file1.getName()),
                        Integer.parseInt(file2.getName()));
            } catch (NumberFormatException e) {
                System.err.println("Something is here that shouldnt be here: "
                        + e.getMessage());
                System.err.println(file1.getAbsolutePath());
                System.err.println(file2.getAbsolutePath());
                System.exit(-1);
            }
            return res;
        }

    }

    // Collects every directory below dir that has no subdirectories;
    // in multivol_snapshot these are the folders that hold the numbered blocks.
    public static void getLeafDirectories(File dir) {
        File[] listFile = dir.listFiles();
        if (listFile != null) {
            for (int i = 0; i < listFile.length; i++) {
                if (listFile[i].isDirectory()) {
                    File[] subdirs = listFile[i]
                            .listFiles((FileFilter) DirectoryFileFilter.DIRECTORY);
                    if (subdirs != null && subdirs.length == 0) {
                        resultDirs.add(listFile[i]);
                    }
                    getLeafDirectories(listFile[i]);
                }
            }
        }
    }

}

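You need a Java runtime and a JDK or IDE to compile the Java code. I installed the NetBeans IDE, created a public class called “DeFrankensteiner”, added the code above to that class and compiled it. The code imports DirectoryFileFilter from Apache Commons IO, so that library must be added to the project as well.
Take the “<name>.jar” file from the “dist” folder of the compiled project, then issue the commands below to recover the “multivol_snapshot” folder.

  • Go into the dist folder
    • Ex: cd /user/Desktop/java/backuprestore/dist
  • Run the jar file with the root directory of the extracted backup (the folder that contains “snapshot” and “multivol_snapshot”) as its argument. The code as listed reads only this first argument; a second (destination) argument is ignored, and the merged files are written under the “snapshot” folder at the same relative paths.
    • Ex: java -jar Backuprestore.jar /root/Documents/Test/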


Limitations
This doesn't restore any of the original file permissions or ownership. It also doesn't deal with incremental backups, but then the linked instructions also hit a bit of a dead end on this point: they just suggest using rdiff to 'stitch the files together' and refer the reader to man rdiff.


https://answers.launchpad.net/ubuntu/+source/duplicity/+question/186098

6 comments:

  1. Hi Sashika,

    I followed your instructions till extracting the data from difftar. Now I have over few hundred files split in few thousand files. I am trying to use the Java code as is. I have a couple of issues with that.
    1) Can you clarify if the path in the line -- static String untaredRoot = "/media/big5wf/untared2test" -- should be changed to the location where the multivol_snapshot and snapshot folders are stored?
    2) Either way I tried after changing the path. The project builds without errors. I tried executing with the source path and target path. It seems to be running but no (stitched) files appear in the target folder nor in the source folder. I am not a programmer and simply following instructions. FYI I am using NetBeans IDE to build the application. Can you suggest what could be wrong?

    Thank you.

    Shardul

    Replies
    1. But I have run this java code and it restored data without issue. If you didn't get any error when executing the java code it means check the target directory it might be some where in the directory.

  2. i always get the message: package org.apache.... does not exist

    Replies
    1. you may import commons-io library (which it's a simple jar) to the project... download it and import under libraries in IDE with rightclick

  3. This comment has been removed by the author.

  4. Hi, thanks for your valueable support! I could restore some of the backup-files in multivol, but most of the files are still in multivol_snapshot directory. Using the "find multivol_snapshot/ -type f -printf '%h\0' | \
    sort -uz | \
    xargs -0 -n 1 sh -c 'cd "$1" ; cat $(ls | sort -n) > content' spacer" creates errors.WHich Interpreter do you use herefore, python3? Has there been any changes so far? I use Ubuntu 19.1 and python 3.7.
