Sunday, February 21, 2010

RAID Hard Drive Recovery

It might be imagined that with all of this fault tolerance data recovery would not be required, but things still go wrong.

With all RAID levels, logical corruption (damage to the file system) has just as devastating an effect as with a single hard disk. You might have a robustly stored file system, but it is a robustly stored and corrupted file system.

With RAID 0 the failure of one disk is terminal for the RAID. If data cannot be recovered from the failed disk then a percentage of the data is lost for good, and since RAID 0 uses data striping, this could be like losing 1 MB of data out of every 4 MB, and the chances of that leaving any major files intact are slim. Smaller files, those that fit within one strip from each of the working drives, may fortunately be intact. For larger files (e.g. Exchange or SQL databases) there will be significant data loss and structural damage, and low-level work will be required to get any useful data from them.
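To make the striping arithmetic concrete, here is a minimal sketch of which files survive when one disk of a RAID 0 set is lost. The 64 KiB strip size, the four-disk set, the failed disk index and the function names are all assumptions chosen for illustration; real controllers vary.

```python
# Illustrative RAID 0 model; the strip size, disk count and function
# names are assumptions, not taken from any particular controller.
STRIP_SIZE = 64 * 1024   # assumed 64 KiB strip
NUM_DISKS = 4            # assumed four-disk set
FAILED_DISK = 2          # assumed index of the disk that cannot be read


def disk_for_offset(offset):
    """Disk index holding a logical byte offset in a plain round-robin stripe."""
    return (offset // STRIP_SIZE) % NUM_DISKS


def file_survives(start, length):
    """True if no strip touched by the file lives on the failed disk."""
    first_strip = start // STRIP_SIZE
    last_strip = (start + length - 1) // STRIP_SIZE
    return all(strip % NUM_DISKS != FAILED_DISK
               for strip in range(first_strip, last_strip + 1))


# A 100 KiB file at the start of the volume touches strips 0 and 1 only.
small_file_ok = file_survives(start=0, length=100 * 1024)

# A 192 KiB file starting on strip 3 touches disks 3, 0 and 1.
lucky_file_ok = file_survives(start=3 * STRIP_SIZE, length=3 * STRIP_SIZE)

# A 10 MiB database spans 160 strips, so it must cross the failed disk.
database_ok = file_survives(start=0, length=10 * 1024 * 1024)

print(small_file_ok, lucky_file_ok, database_ok)
```

Any contiguous file longer than three strips in this layout is certain to touch the failed disk, which is why large databases fare so badly.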

For RAID levels where there is parity and the chance to recover from a single disk failure, the most common problems we see are:

Degraded running

A single disk fails and is ignored, or there is no spare available and so one is ordered. Either way the RAID unit stays in operation but with a disk missing, so there is no longer any redundancy.

Usually the hard disks in a RAID are part of the same manufacturing batch and have been stored and run in the same environment; if the unit has been mishandled then each disk in the RAID has been mishandled. So there is quite a good chance that another drive will fail sometime soon, if not for any of the reasons just given then because bad things don't happen singly.

Multiple failure

A striped RAID with parity is fault tolerant if a single drive fails nice and cleanly. If multiple drives fail then the RAID is lost, but the same can happen if one drive fails and destabilises the SCSI bus. This can result in multiple drives appearing to fail; the RAID unit believes that they have failed, and so the RAID will not operate.

Configuration loss

When a RAID is configured, information is stored about the order of the disks, the size of a strip of data and so on. If there is a failure within the RAID controller and this information is lost then the RAID will not operate, and it is not always possible to reinstate it.

Some RAID controllers will treat re-programming the RAID configuration as a rebuild request and re-write to each of the disks, destroying the data.

People making it worse

One of the worst sounds we hear with RAID problems is that of human panic, and frantic attempts to fix the problem. "We're just going to try one more thing" is often the sound that signals the end of the data, as a RAID is repaired with the disks in the wrong slots, or rebuilt and reset to its as-new state.

What to do when a RAID fails

STOP
THINK
Make sure that anything you do is going to be non-destructive.
Get advice

Do not let anyone push you into precipitous action. They might have a deadline and be applying pressure, but they will quickly forget their part in driving proceedings when the RAID is fatally damaged by a hurried repair attempt.

How can data be recovered from a RAID?

Much of RAID recovery is the same as for a single disk recovery: data must be secured and backed up to ensure that the situation will not be made worse. For logical problems the hard work is all in the analysis of the file system; that it is from a RAID makes no major difference once the RAID scheme has been identified and the right way to access it worked out.

For mirrored RAID, data can be "mixed and matched" from the good sectors of two drives to rebuild a valid drive. With striped RAID schemes that use parity, data can be rebuilt at the stripe level rather than on a per-drive basis, so if there are bad sectors spread across more than one drive these can be corrected individually.
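The "mix and match" idea for a mirror can be sketched in a few lines. This is only an illustration under simple assumptions: each drive image is modelled as a list of sectors, an unreadable sector is represented by None, and the function name merge_mirrors is hypothetical.

```python
def merge_mirrors(copy_a, copy_b):
    """Build one image from two mirror copies, sector by sector.

    Each copy is a list of sectors; None marks a sector that could not
    be read (an assumed convention for this sketch). Returns the merged
    image and the indexes that were unreadable on both drives.
    """
    merged = []
    unrecovered = []
    for index, (a, b) in enumerate(zip(copy_a, copy_b)):
        sector = a if a is not None else b   # take whichever copy is good
        if sector is None:
            unrecovered.append(index)        # bad on both drives
        merged.append(sector)
    return merged, unrecovered


# Hypothetical four-sector drives with bad sectors in different places.
drive_a = [b"AAAA", None, b"CCCC", None]
drive_b = [b"AAAA", b"BBBB", None, None]

merged, unrecovered = merge_mirrors(drive_a, drive_b)
print(merged, unrecovered)
```

Only a sector that is bad at the same position on both drives stays lost; everything else comes from whichever copy could be read.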

With non-redundant RAID schemes each sector that cannot be read from a disk results in data loss from the RAID set. For redundant RAID schemes, however, there is much that can be done to rebuild when data is missing. Whilst a RAID controller will take a disk off-line when it fails and operate in degraded mode, rebuilding the data from the missing disk on request, a data recovery process can be somewhat more sophisticated. With properly written recovery software the level of granularity can be one sector rather than one disk, so for each sector that fails the data can be rebuilt so long as the corresponding sectors can be recovered from the remainder of the disks. Even if the next failed sector is on a different drive in the set, so long as the same sector can be read from the other disks a complete rebuild can be made.
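A minimal sketch of sector-level rebuilding with single XOR parity follows. It assumes each stripe is modelled as a list holding one sector per disk (data and parity together), with None for a sector that could not be read; the helper names are hypothetical and this is not any particular product's method. Because the XOR of every sector in a stripe is zero, it does not matter which position holds the parity.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(data_sectors):
    """Append an XOR parity sector to a list of data sectors."""
    return list(data_sectors) + [reduce(xor_bytes, data_sectors)]


def rebuild_stripe(stripe):
    """Repair one stripe; None marks an unreadable sector (assumed convention).

    Returns the repaired stripe, or None if more than one sector in the
    stripe is missing, which single parity cannot fix.
    """
    missing = [i for i, sector in enumerate(stripe) if sector is None]
    if not missing:
        return list(stripe)
    if len(missing) > 1:
        return None
    survivors = [sector for sector in stripe if sector is not None]
    repaired = list(stripe)
    repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


# Two stripes, each with a bad sector on a different disk.
stripe0 = make_stripe([b"\x01\x02", b"\x10\x20", b"\xaa\xbb"])
stripe1 = make_stripe([b"\x0f\x0e", b"\x33\x44", b"\x55\x66"])

damaged0 = [stripe0[0], None, stripe0[2], stripe0[3]]   # disk 1 unreadable
damaged1 = [stripe1[0], stripe1[1], stripe1[2], None]   # disk 3 unreadable

print(rebuild_stripe(damaged0) == stripe0)
print(rebuild_stripe(damaged1) == stripe1)
```

A controller that had taken disk 1 off-line after the first error would then have no way to cover the later failure on disk 3; working stripe by stripe, both are repaired.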

For levels of RAID that have greater redundancy, the number of failed sectors across a set of disks can be even greater without data loss.

Even as data recovery specialists we are, however, still bound by the rules of mathematics. If sector 99 is missing from both disks 0 and 4 in a RAID 5 set then rebuilding the missing data is not a possibility.

Once the RAID/disk issues have been resolved, the data recovery process can continue just as it would for a single disk.
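The arithmetic behind this limit can be written out. Assume, purely for illustration, a five-disk set with single XOR parity, and let $S_0, \dots, S_4$ be the contents of sector 99 on disks 0 to 4 (the symbols are ours, not part of any RAID specification):

```latex
% Single XOR parity: every sector position XORs to zero across the set.
S_0 \oplus S_1 \oplus S_2 \oplus S_3 \oplus S_4 = 0

% One unknown (only disk 0 unreadable): a unique solution exists.
S_0 = S_1 \oplus S_2 \oplus S_3 \oplus S_4

% Two unknowns (disks 0 and 4 unreadable): one equation, two unknowns.
S_0 \oplus S_4 = S_1 \oplus S_2 \oplus S_3
```

In the last case any guess for $S_0$ can be paired with a matching $S_4$ that satisfies the equation, so the parity alone cannot tell us which pair was actually written.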