I've had quite a few failing hard drives in for recovery recently, so I wrote some scripts to help me with that.
The first one simply parses a DDrescue logfile and shows you how much data has already been recovered and how much is still bad/unrecoverable. The second one also parses a DDrescue log and determines which files (on NTFS) are affected.
The third and last one keeps re-reading a single sector, in the hope of eventually getting the data out of it. DDrescue doesn't provide an integrated routine to do this.
With these little helpers you can see which files are defective/damaged, and if something is really important, you can let the script run in an endless loop to try to recover that data.
DDrescue Log Parsing
This is not really a script, but a simple one-liner for DDrescue logfiles. It prints the statistics of a disk image (or, more precisely, of its logfile), which gives you a quick overview of the status of that image.
It doesn't depend on the original software either, only on grep and GNU awk (for --non-decimal-data).
It has been tested against logfiles written by ddrescue version 1.11.
Code
grep -E "^0x[0-9A-Fa-f]+ +0x[0-9A-Fa-f]+ +[+-]" log.txt | awk --non-decimal-data ' $3 == "+" { ok+=$2 } $3 == "-" { bad+=$2 } END { total=ok+bad ; printf "TOTAL: %d MB \nRecovered: %d MB\nBad: %d KB\n\nPercentage: %3.3f\n" ,total/1048576,ok/1048576,bad/1024,100/total*ok } '
Sample Output
|| user@workstation ~ ||$ grep -E "^0x[0-9A-Fa-f]+ +0x[0-9A-Fa-f]+ +[+-]" log.txt | awk --non-decimal-data ' $3 == "+" { ok+=$2 } $3 == "-" { bad+=$2 } END { total=ok+bad ; printf "TOTAL: %d MB \nRecovered: %d MB\nBad: %d KB\n\nPercentage: %3.3f\n" ,total/1048576,ok/1048576,bad/1024,100/total*ok } '
TOTAL: 152627 MB
Recovered: 152626 MB
Bad: 996 KB
Percentage: 99.999
Notes
You might notice a slight difference between this output and the figures printed by ddrescue itself. This is because the one-liner uses binary units (1 KB = 1024 bytes, 1 MB = 1048576 bytes), whereas ddrescue uses decimal units (1 kB = 1000 bytes).
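If GNU awk is not at hand, the same totals can be computed with bash arithmetic alone, since bash understands the 0x prefix. The following is only a minimal sketch: the function name ddrescue_summary is made up for this example, and it assumes that the data lines look like "<pos> <size> <status>" with hexadecimal values. It prints both unit conventions side by side, which makes the difference described in the note visible.

```shell
#!/bin/bash
# Sketch: summarise a ddrescue logfile with plain bash arithmetic.
# ddrescue_summary is a hypothetical helper name, not part of ddrescue.
ddrescue_summary() {
    local logfile=$1 pos size status ok=0 bad=0
    local hex='^0x[0-9A-Fa-f]+$'
    while read -r pos size status _; do
        # Keep only "<pos> <size> <status>" lines; this skips comments
        # and the current-position line, whose second field is not hex.
        [[ $pos =~ $hex && $size =~ $hex ]] || continue
        case $status in
            +) ok=$(( ok + size )) ;;    # successfully recovered area
            -) bad=$(( bad + size )) ;;  # bad/unrecoverable area
        esac
    done < "$logfile"
    printf 'Recovered: %d bytes = %d MiB (1024-based) = %d MB (1000-based)\n' \
        "$ok" $(( ok / 1048576 )) $(( ok / 1000000 ))
    printf 'Bad: %d bytes = %d KiB (1024-based) = %d kB (1000-based)\n' \
        "$bad" $(( bad / 1024 )) $(( bad / 1000 ))
}

# Usage (log.txt being the ddrescue logfile):
#   ddrescue_summary log.txt
```

The integer divisions round down, so small differences to the awk one-liner in the last digit are to be expected.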
Find files occupying badblocks (NTFS)
This script parses a ddrescue logfile and prints a list of the files that occupy some of the bad sectors. That makes it easy to identify which areas cause problems, and you can also modify the log manually (a nice description is available here), so that ddrescue can concentrate on a specific area.
Code
#!/bin/bash
#########################################################
# Author: Raphael Hoegger
# Source: http://pfuender.net/?p=80
# License: This file is licensed under the GPL v2.
# Latest change: 2010.06.24 17:40:32 CEST
# Version: 1.1
#########################################################
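FSoffset=32256 # byte offset of the filesystem, equal to the value used as the offset in 'losetup'
ClusterSize=4096 # NTFS cluster size in bytes (adjust if your filesystem differs)
DEVICE=/dev/loop1
LOGFILE=log.txt ## the one from ddrescue
OUTPUT=results.txt ## where you want your results stored
# take the start position (a byte offset) of every area marked bad ('-') in the log
for failingSector in $(awk ' $3 == "-" { print $1 } ' "$LOGFILE") ; do
    # convert the byte offset in the image to a cluster number inside the filesystem
    NTFSsector=$(( (failingSector-FSoffset)/ClusterSize ))
    echo "Sector $NTFSsector:" >>"$OUTPUT"
    ntfscluster -f -c "$NTFSsector" "$DEVICE" 2>/dev/null >>"$OUTPUT"
done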
Output
|| user@workstation ~ ||$ sudo losetup -o 32256 -r /dev/loop1 image.dd
|| user@workstation ~ ||$ ./find_damaged_files-ntfs.sh
...
Sector 1262075:
Searching for cluster 1262075
Inode 17166 /Windows/system32/$INDEX_ALLOCATION($I30)
Sector 1263743:
Searching for cluster 1263743
Inode 92016 /System Volume Information/_restore{B2482471-DF35-4094-86D3-41D285BA1DE9}/RP1060/$INDEX_ALLOCATION($I30)
Sector 1263744:
Searching for cluster 1263744
Inode 9872 /Dokumente und Einstellungen/USER/Lokale Einstellungen/Anwendungsdaten/Microsoft/Windows Live Contacts/$INDEX_ALLOCATION($I30)
Sector 1263771:
Searching for cluster 1263771
Inode 109395 /Dokumente und Einstellungen/USER/Lokale Einstellungen/Anwendungsdaten/Microsoft/Windows Live Contacts/{19fddb48-6b7f-40b7-b4d2-f40b59677fea}/DBStore/$INDEX_ALLOCATION($I30)
Sector 1278455:
Searching for cluster 1278455
Inode 5697 /Dokumente und Einstellungen/USER/ntuser.dat/$DATA
...
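The script above only looks up the cluster at the start position of each bad area. When an area is larger than one cluster, the clusters behind it are affected as well. The following is a minimal sketch of how all of them could be listed; clusters_for_area is a hypothetical helper name, and the defaults (offset 32256, cluster size 4096) are simply the values used in this post.

```shell
#!/bin/bash
# Sketch: list every cluster number touched by one bad area of a ddrescue log.
# clusters_for_area is a hypothetical helper; pos and size are byte values
# (hex with a 0x prefix is fine, bash arithmetic understands it).
clusters_for_area() {
    local pos=$1 size=$2 fs_offset=${3:-32256} cluster_size=${4:-4096}
    local start=$(( pos )) end=$(( pos + size - 1 ))   # first and last bad byte
    # Ignore empty areas and areas that lie completely before the filesystem
    if (( end < start || end < fs_offset )); then
        return 0
    fi
    # Clamp areas that begin before the filesystem to its first byte
    if (( start < fs_offset )); then
        start=$fs_offset
    fi
    local first=$(( (start - fs_offset) / cluster_size ))
    local last=$(( (end - fs_offset) / cluster_size ))
    seq "$first" "$last"
}

# Usage: an 8 KiB bad area starting 100 bytes into cluster 10
#   clusters_for_area 0x11E64 0x2000    # prints 10, 11 and 12, one per line
```

Each number printed could then be passed to 'ntfscluster -f -c' in the same way as in the loop above.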
Re-read single sector until success
As you can see above, you can easily generate a list of the files that are corrupted. You can also see which part of each file is affected, like the $INDEX_ALLOCATION etc. To me, the last entry (ntuser.dat) looks like the most important one, since it contains the HKCU part of the registry. So we'll now simply adjust the line below to point at the right sector, using the calculations done previously but in reverse order:
- As per the log above, our cluster is 1278455.
- Let's convert it from clusters back to a byte offset inside the filesystem: 1278455*4096=5236551680
- Add back the original filesystem offset: 5236551680+32256=5236583936
- dd counts 'skip' in blocks of 'bs' bytes, so divide by the sector size of 512: 5236583936/512=10227703
Note that this is the first of the eight 512-byte sectors in that cluster; the exact failing position is the one listed in the ddrescue log.
So now below you can see the code with the right sector number:
Code
while [ $? -eq 1 ] ; do dd bs=512 skip=10227703 if=/dev/sda of=sector_10227703 count=1 ; done
Notes
The while loop above will not start on its own: you need to run the dd command once by itself first (just 'dd bs=... '), because the loop condition checks the exit status of the previous command. This is by design and even has a reason: you can quickly verify that you're targeting the right sector. If that first read fails, you know that you're messing with the right one. Now you can start your while loop and let it run, forever if need be. It can take quite a while to gather any data.
Now before anybody throws in 'spinrite' as a keyword: I have tried it out, but for me it was really buggy and unresponsive. That's only my experience, and it might well be because of my buggy BIOS. If you have had a different experience or know other tools, just let me know in the comments!
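The reverse calculation and the retry loop can also be wrapped into two small functions. This is only a sketch under the assumptions of this post (4096-byte clusters, 512-byte sectors, filesystem offset 32256); the names cluster_to_sector and reread_sector are invented for this example. Because dd counts 'skip' in blocks of 'bs' bytes, the byte offset is divided by 512 before it is handed to dd. Unlike the endless loop above, the sketch gives up after a maximum number of attempts and does not need a first manual dd run.

```shell
#!/bin/bash
# Sketch: convert an NTFS cluster number to a 512-byte sector on the disk.
# cluster_to_sector and reread_sector are hypothetical helper names.
cluster_to_sector() {
    local cluster=$1 fs_offset=${2:-32256} cluster_size=${3:-4096}
    # cluster -> byte offset in filesystem -> byte offset on disk -> sector
    echo $(( (cluster * cluster_size + fs_offset) / 512 ))
}

# Re-read one sector until dd succeeds or max_tries is reached.
# Returns 0 on success, 1 if every attempt failed.
reread_sector() {
    local device=$1 sector=$2 outfile=$3 max_tries=${4:-100} try
    for (( try = 1; try <= max_tries; try++ )); do
        if dd bs=512 skip="$sector" count=1 if="$device" of="$outfile" 2>/dev/null; then
            echo "sector $sector read after $try attempt(s)"
            return 0
        fi
    done
    echo "sector $sector still unreadable after $max_tries attempts" >&2
    return 1
}

# Usage with the cluster from the log above:
#   sector=$(cluster_to_sector 1278455)
#   reread_sector /dev/sda "$sector" "sector_$sector" 1000
```

A bounded number of attempts is a design choice of this sketch; a very large value for the fourth argument gets close to the original "run forever" behaviour.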
Further resources
- DDrescue manual - This is actually my favorite imager under Linux. If you follow the link, you can read about the algorithm it uses to obtain the data in a fast and effective way. Definitely worth reading!
- Forensicswiki - A nice wiki which contains a bunch of useful articles about forensics and data recovery.
That's it for the moment! As usual, questions can be asked in the comments, I'll answer them as time permits ;-)
Cheers,
Raphi