Safe copying of files and partitions.

Idea:

The main problem with damaged storage hardware is that once you get an
unrecoverable IO error, further reading from the file / device often fails
until the file has been closed and re-opened.

The normal copy tools like cat, cp or dd do not allow creation of an image file
from a disk or CD-ROM once reading a sector has failed.

Safecopy tries to get as much data from the source as possible without device
dependent special tricks.

(For example, to get an ISO image from a copy-protected or otherwise damaged
 CD-ROM, cdrdao and bin2iso might do a better or faster job.)

This is achieved by multiple reads of smaller sections at the beginning of
every area that causes IO errors (i.e. every damaged area), skipping that area
while padding the destination file with zeroes, and continuing where readable
data starts again, using a similar algorithm to find the true end of the
damaged area.
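
The skip-and-pad idea can be illustrated with a much simplified shell sketch.
This is not how safecopy is implemented: it uses one fixed block size, makes
no retries and does not search for the exact edges of a bad area. The function
name pad_copy and its arguments are made up for this example. Every dd call
opens the source anew, which sidesteps reads failing until the file has been
re-opened.

```shell
# Simplified illustration only, not safecopy's algorithm.
# pad_copy <source> <dest> <blocksize> <number of blocks>
# Copies block by block; a block that dd cannot read is zero-padded.
# Prints the number of padded blocks.
pad_copy() {
	src=$1 dst=$2 bs=$3 blocks=$4
	i=0
	bad=0
	: >"$dst"	# start with an empty destination file
	while [ "$i" -lt "$blocks" ]; do
		# each dd invocation re-opens the source and reads one block
		if ! dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" \
			count=1 conv=notrunc 2>/dev/null; then
			# read error: write zeroes at the same offset instead
			dd if=/dev/zero of="$dst" bs="$bs" seek="$i" \
				count=1 conv=notrunc 2>/dev/null
			bad=$((bad + 1))
		fi
		i=$((i + 1))
	done
	echo "$bad"
}
```

Called as "pad_copy /dev/device image 2048 <blocks>", this would copy the
given number of 2048 byte blocks and report how many had to be padded.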

For this to work, the source device or file has to be seekable.
For unseekable devices (like tapes) you can use an external script to execute a
controlled skip over the damaged part for you.
(For example by using "mt seek" and "mt tell" on a SCSI tape device.)
See the "-S <seekscript>" parameter for details.
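
A seek script for a tape could look roughly like the sketch below. It is
untested on real hardware and rests on several assumptions: that "return
value" means the exit status, that the tape block size equals the blocksize
passed by safecopy, that "mt seek" takes an absolute block number, and that
the drive is the non-rewinding device named in $TAPE (here defaulting to
/dev/nst0). The function name seek_skip is made up for this example.

```shell
# Hypothetical seek helper, arguments as described for "-S":
#   $1 = number of blocks to skip, $2 = blocksize in bytes,
#   $3 = current position in bytes
seek_skip() {
	blocks=$1 bsize=$2 pos=$3
	# absolute block to continue reading at (assumes tape block
	# size equals the safecopy blocksize)
	target=$((pos / bsize + blocks))
	if mt -f "${TAPE:-/dev/nst0}" seek "$target"; then
		return "$blocks"	# number of blocks skipped
	fi
	return 0			# 0 signals that seeking failed
}
```

A standalone script would end by calling seek_skip "$@" and exiting with its
return status.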

Performance and success of this tool depend heavily on the device driver,
firmware and underlying hardware.

Some DVD drives are known to cause the ATAPI bus to crash on errors, causing
the device driver to freeze for up to a minute or longer per error. Try to
avoid using such drives for media recovery.

Some drives can read bad media better than others. Be sure to attempt data
recovery of CDs and DVDs on several different drives and computers.
You can use safecopy's incremental recovery feature to read previously
unreadable sectors only.

Different use cases:

	How do I...

- resurrect a file from mounted but damaged media that a normal copy fails on:
	safecopy /path/to/problemfile ~/saved-file

- create a filesystem image of a damaged disk/cdrom:
	safecopy /dev/device ~/diskimage

- resurrect data as thoroughly as possible?
	safecopy source dest -r 1 -R 4 -Z 2

- resurrect data as fast as possible, or
- resurrect data with low risk of damaging the media further:
	safecopy source dest -b 16384 -r 16384 -R 1 -Z 0

- resurrect some data fast, then read more data thoroughly later:

	safecopy source dest -b 16384 -r 16384 -R 1 -Z 0 -o badblockfile-16384
	safecopy source dest -r 1 -R 4 -Z 2 -I badblockfile-16384 -i 16384

- utilize some friends' CD-ROM drives to complete the data from my damaged CD:
	safecopy /dev/mydrive imagefile <someoptions> -b <myblocksize> \
		-o myblockfile;
	safecopy /dev/otherdrive imagefile <someoptions> -b <otherblocksize> \
		-I myblockfile -i <myblocksize> -o otherblockfile;
	safecopy /dev/anotherdrive imagefile <someoptions> \
		-b <anotherblocksize> -I otherblockfile -i <otherblocksize>

- interrupt and later resume a data rescue operation:
	safecopy source dest
	<CTRL+C> (safecopy aborts)
	safecopy source dest -I /dev/null

- interrupt and later resume a data rescue operation with correct
  badblocks output:
	safecopy source dest <options> -o badblockfile
	<CTRL+C> (safecopy aborts)
	mv badblockfile savedbadblockfile
	safecopy source dest -I /dev/null -o badblockfile
	cat badblockfile >>savedbadblockfile

- find the corrupted files on a partially rescued file system:
	safecopy /dev/filesystem image -M CoRrUpTeD
	fsck image
	mount -o loop image /mnt/mountpoint
	grep -R "CoRrUpTeD" /mnt/mountpoint
  (hint: this might not find all affected files if the unreadable
   parts are smaller in size than your marker string)
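
To get just the names of the affected files, grep can be asked to list
matching files only. A small sketch, with a made-up function name
(list_marked_files); it assumes a grep that supports -r, as GNU and BSD grep
do:

```shell
# list_marked_files <directory> <marker string>
# Prints the path of every file below <directory> that contains the marker.
list_marked_files() {
	# -r: recurse, -l: print file names only, -F: marker is a fixed string
	grep -rlF -- "$2" "$1"
}
```

For the example above this would be called as
"list_marked_files /mnt/mountpoint CoRrUpTeD".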

- exclude the previously known badblocks list of a filesystem from
  filesystem image creation:
	dumpe2fs -b /dev/filesystem >badblocklist
	safecopy /dev/filesystem image \
		-X badblocklist -x <blocksize of your fs>

- create an image of a device that starts at X and is Y in size:
	safecopy /dev/filesystem image -b <bsize> -s <X/bsize> -l <Y/bsize>
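
The block arithmetic can be left to the shell. The sketch below uses a
made-up helper name (region_args) and takes X and Y in bytes. It rounds the
start down and the end up to whole blocks, so if X is not a multiple of the
blocksize, the image begins slightly before X.

```shell
# region_args <start in bytes> <size in bytes> <blocksize>
# Prints the matching -b, -s and -l options for safecopy.
region_args() {
	start=$1 size=$2 bsize=$3
	# first block of the region (rounded down)
	first=$((start / bsize))
	# block after the last one, rounded up to include a partial block
	last=$(((start + size + bsize - 1) / bsize))
	echo "-b $bsize -s $first -l $((last - first))"
}
```

For a region starting at byte 1048576 with a size of 10485760 bytes,
"region_args 1048576 10485760 512" prints "-b 512 -s 2048 -l 20480", which
can be passed on as
"safecopy /dev/filesystem image $(region_args 1048576 10485760 512)".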

- rescue data of a tape device:
	If the tape device driver supports lseek(), treat it like any file.
	Otherwise use the "-S" option of safecopy with a script you write
	yourself to skip over the bad blocks
	(for example using "mt seek").
	Make sure your tape device doesn't auto-rewind on close.
	Send me feedback if you have any luck doing so, so I can update
	this documentation.

FAQ:
    Q:	Why create this tool if there already is something like dd-rescue and
	other tools for that purpose?
    A:  Because I didn't know of dd-rescue when I started, and I felt like it.

    Q:	What exactly does the Z option do?
    A:	Remember back in MS-DOS times when a floppy would make a "neek nark"
	sound 3 times whenever it ran into a read error?
	This happened when the BIOS or DOS disk driver moved the IO head
	to its boundaries to possibly correct small cylinder misalignment
	before it tried again.
	Linux doesn't do that by default, and neither do common CD-ROM drives
	or drivers. Nevertheless, forcing this behaviour can increase your
	chance of reading bad sectors from a CD __BIG__ time.
	(Unlike floppies, where it usually has little effect.)

    Q:	What's my best chance to resurrect a CD that has become unreadable?
    A:	Try making a backup image copy on many different computers and drives.
	The ability to read bad media varies widely between drives.
	I have a 6 year old Lite On CDRW drive, that even reads deeply
	and purposely scratched CDs (as in with my key, to make it unreadable)
	flawlessly. A CDRW drive of the same age at work doesn't read any data
	from that part of the CD at all, while most DVD and combo drives have
	bad blocks every couple hundred bytes.
	As a general guideline:
	   -CDRW drives usually do better than read-only CD drives.
	   -CD only drives sometimes do better on CDs than DVD drives.
	   -PC drives are sometimes better than laptop ones.
	   -A drive with a clean lens does better than a dirtball.
	   -Cleaning up CDs helps.
	   -Unless you use chemicals.

    Q:	What's my best chance to resurrect a floppy that became unreadable?
    A:	Again, try different floppy drives. Keep in mind that it's easier
	to further damage data on a bad floppy than on a CD.
	(Don't overdo read attempts.)

    Q:	What about Blu-ray/HD DVD disks?
    A:	Hell if I knew, but generally they should be similar to DVDs.
	It probably depends on how the drive's firmware acts up.

    Q:  My hard drive suddenly has many bad sectors, what should I do?
    A:  Avoid accessing bad areas as much as possible to prevent further
	damage, while rescuing the still good data.
	Accessing bad sectors makes the drive perform lots of error
	recovery on its own, leading to lots of physical movement
	and potentially to lockdown of more disk areas by the firmware.
	You could use smartmontools to check the drive's error statistics,
	internal error logs and other details about what's wrong.
	If you have a list of affected blocks/sectors, write a badblocks file
	manually and use the -X option to keep safecopy from accessing them
	altogether at first. (Syslog may list them, too.)
	Then slowly do incremental recovery in steps:
	   -Decrease the resolution (-r). (If your driver aligns reads to
	    sector boundaries, don't go below the physical sector size.)
	   -Decrease the blocksize (-b) down to the physical block size.
	   -Increase the retry factor (-R).
	   -As a last step, try adding the -Z factor.
	    (It probably won't help much on hard disks, but it's worth a try.)
	If your drive stops responding, reboot, and let it cool down for a
	while if necessary.
	(I heard from people who used ice-packs successfully as a last resort)

	!!! If the data is really important, go to a professional data recovery
	!!! specialist right away, before doing further damage to the drive
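
Writing the badblocks file by hand, as suggested in the answer above, can be
scripted. The sketch below assumes the file format is one block number per
line in ascending order, as the badblocks tool writes it, and that seq is
available. The function name badblock_list and the input format (one
"first last" range or one single block number per line) are made up for this
example.

```shell
# badblock_list: reads "first last" ranges or single block numbers
# from stdin and prints every block number once, sorted ascending.
badblock_list() {
	while read -r first last; do
		# a line with one number is a range of length one
		seq "$first" "${last:-$first}"
	done | sort -n -u
}
```

With sector numbers taken from syslog, something like
"badblock_list <sectors.txt >badblocklist" followed by
"safecopy source dest -X badblocklist -x 512" would exclude them, provided
the numbers really are in units of 512 bytes.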

Safecopy 1.0 by CorvusCorax
Usage: safecopy [options] <source> <target>
Options:
	-b <bytes> : Blocksize in bytes, also used as skipping offset
	             when searching for the end of a bad area.
	             Set this to the physical sectorsize of your media.
	             Default: Blocksize of input device, if determinable,
	                      otherwise 512
	-r <bytes> : Resolution in bytes when searching for the exact
	             beginning or end of a bad area.
	             Smaller values lead to very thorough attempts to read
	             data at the edge of damaged areas,
	             but increase the strain on the damaged media.
	             Default: 1
	-R <number> : At least that many read attempts are made on the first
	              bad block of a damaged area with minimum resolution.
	              Higher values can sometimes recover a weak sector,
	              but at the cost of additional strain.
	              Default: 3
	-Z <number> : On each error, force seek the read head from start to
	              end of the source device as often as specified.
	              That takes time, creates additional strain and might
	              not be supported by all devices or drivers.
	              Default: 1
	-s <blocks> : Start position where to start reading.
	              Will correspond to position 0 in the destination file.
	              Default: block 0
	-l <blocks> : Maximum length of data to be read.
	              Default: Entire size of input file
	-I <badblockfile> : Incremental mode. Assume the target file already
	                    exists and has holes specified in a badblockfile.
	                    It will be attempted to retrieve more data from
	                    the missing areas only.
	                    Default: none
	-i <bytes> : Blocksize to interpret the badblockfile given with -I.
	             Default: Blocksize as specified by -b
	-X <badblockfile> : Exclusion mode. Do not attempt to read blocks in
	                    badblockfile. If used together with -I,
	                    excluded blocks override included blocks.
	                    Default: none
	-x <bytes> : Blocksize to interpret the badblockfile given with -X.
	             Default: Blocksize as specified by -b
	-o <badblockfile> : Write a badblocks/e2fsck compatible bad block file.
	                    Default: none
	-S <seekscript> : Use external script for seeking in input file.
	                  (Might be useful for tape devices and similar).
	                  Seekscript must be an executable that takes the
	                  number of blocks to be skipped as argv1 (1-64)
	                  the blocksize in bytes as argv2
	                  and the current position (in bytes) as argv3.
	                  Return value needs to be the number of blocks
	                  successfully skipped, or 0 to indicate seek failure.
	                  The external seekscript will only be used
	                  if lseek() fails and we need to skip over data.
	                  Default: none
	-M <string> : Mark unrecovered data with this string instead of
	              skipping / zero-padding it. This helps in later
	              finding affected files on file system images
	              that couldn't be rescued completely.
	              Default: none
	-h | --help : Show this text

Description of output:
	. : Between 1 and 1024 blocks successfully read.
	_ : Read of block was incomplete. (possibly end of file)
	    The blocksize is now reduced to read the rest.
	|/| : Seek failed, source can only be read sequentially.
	> : Read failed, reducing blocksize to read partial data.
	! : A low level error on read attempt of smallest allowed size
	    leads to a retry attempt.
	[xx](+yy){ : Current block and number of bytes continuously
	             read successfully up to this point.
	X : Read failed on a block with minimum blocksize and is skipped.
	    Unrecoverable error, destination file is padded with zeros.
	    Data is now skipped until end of the unreadable area is reached.
	< : Successful read after the end of a bad area causes
	    backtracking with smaller blocksizes to search for the first
	    readable data.
	}[xx](+yy) : current block and number of bytes of recent
	             continuous unreadable data.

Copyright 2009, distributed under terms of the GPL

