We asked data recovery professionals with a 95.2% recovery rate about RAID failures



'RAID' combines multiple HDDs so that they can be operated virtually as a single drive. RAID is used in file servers and the like for large-scale data sharing at companies, and the GIGAZINE editorial department also uses RAID in the servers it operates itself. However, GIGAZINE has had trouble with RAID several times in the past, and just the other day a RAID error occurred while we were building a new array. Fortunately no harm was done, because it happened before data migration, but had the error occurred after migration, all the data on the server might have been lost. While we were dwelling on that frightening possibility, we heard a claim from Digital Data Recovery that 'in-house RAID and servers are prone to failure after long holidays.' Digital Data Recovery says it has held the No. 1 share of sales in the data recovery industry for 11 consecutive years and boasts a data recovery rate of 95.2%, which it describes as the highest level in Japan (number of recoveries / number of recovery requests; the highest monthly rate between December 2017 and November 2018). So we took the opportunity to ask data recovery professionals all about RAID and RAID failures.

Data Recovery.com [Digital Data Recovery] | Data Recovery Service with 95.2% Recovery Rate
https://www.ino-inc.com/


So we visited Digital Data Recovery's Tokyo headquarters in Ginza, Tokyo.


At the general reception desk on the 6th floor, a privacy mark registration certificate is displayed.


We wanted to start the interview right away, but first we were shown around the floor of the Tokyo head office where data recovery is actually performed.


Data recovery takes place in one corner of the office building, and entry is carefully screened with metal detectors.


The reason for such strict security is said to be to make it impossible to take the important data entrusted by customers outside. In addition, countless surveillance cameras are installed in the office.


The first thing that catches the eye on entering is the storage area. Here, HDDs sent in by customers are received, and HDDs whose data recovery is complete are shipped out.


Besides packing and unpacking, staff here also remove storage media from the equipment that has been sent in. During our tour, storage was being removed from an iMac.


The HDDs in the storage area ...


... have barcodes attached. This tracks which stage of the recovery work each HDD is at, and prevents loss, unauthorized removal, and so on.


This is the area where logical failures of HDDs are recovered. Engineers analyze data corruption with a binary editor and repair the corrupted values in damaged sectors or bytes back to normal ones.


On the monitors mounted on the wall, the green areas indicate data that has been read from the HDD properly, and the red areas indicate damaged parts, so anyone can check the progress of data recovery at a glance.
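As a rough illustration of the kind of display described above, the following Python sketch renders a per-sector read map. The function names, the '#' and 'x' characters standing in for green and red, and the sample scan result are all assumptions made for illustration; this is not the tool used at Digital Data Recovery.

```python
def render_read_map(sector_ok, width=16):
    """Render one character per sector: '#' if it was read, 'x' if unreadable."""
    rows = []
    for start in range(0, len(sector_ok), width):
        row = sector_ok[start:start + width]
        rows.append("".join("#" if ok else "x" for ok in row))
    return "\n".join(rows)


def read_progress(sector_ok):
    """Return the fraction of sectors read successfully (0.0 for an empty scan)."""
    if not sector_ok:
        return 0.0
    return sum(sector_ok) / len(sector_ok)


# Hypothetical scan result: 32 sectors, two of them unreadable.
status = [True] * 32
status[5] = False
status[20] = False

print(render_read_map(status))
print(f"{read_progress(status):.1%} of sectors read")
```

A real imaging tool would update such a map continuously while the drive is being read; here the scan result is simply hard-coded.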


There is also a clean room with equipment on a par with an operating room, where wearing dustproof, anti-static clothing is mandatory. Here, parts such as magnetic heads and motors are replaced to repair physical failures of HDDs.


With the tour over, we began the interview right away. We spoke with Yoshiya Izushi, head of the engineer group at Digital Data Recovery.


GIGAZINE (hereinafter 'G'):
Thank you for your time today. Before this interview I heard that 'RAID failures tend to occur right after a long vacation.' Do arrays really break over such long holidays?

Yoshiya Izushi, General Manager, Engineer Group (hereinafter 'Izushi'):
They often break right after a long vacation. When they do, we get many inquiries about data recovery.

G:
Why does the number of inquiries increase?

Izushi:
Some companies shut their systems down for a long vacation, and the systems are often left off for the whole holiday. In many cases they then fail when someone tries to power them on after the holiday.

G:
Why would a system that has been shut down for a while fail to start when the power is turned on again?

Izushi:
In many cases the power does come on and the machine runs, but the system does not come up.

G:
Why does that happen? It would help if you could explain using examples from your own experience.

Izushi:
I think the HDD was already on the verge of failing from age-related deterioration. Servers and RAID arrays basically run continuously for long periods, so people often do not notice that a drive is about to fail. Once the system is shut down, it does not start up properly, and that is when the failure is noticed. Had it kept running, it would have carried on as it was.

G:
Do you mean that the spindle at the center of the HDD, or the motor, has failed?


Izushi:
That happens too, but I think there are often many bad sectors to begin with, and the drive gets stuck on them and cannot read when it tries to read them again. Even on a RAID array or a server, the data that is used frequently is limited, and in normal operation only that data is read, so while the system is running there is little chance of getting stuck on a bad sector.

G:
I see!

Izushi:
Also, in the case of RAID, writes and reads are distributed across multiple HDDs, so once the power is turned off, the RAID controller starts again from working out which disks to read. For example, suppose there are Disk 1, Disk 2, Disk 3 and Disk 4, and before the failure the array was running on the basis that 'Disk 2 is already broken, so only Disk 1, Disk 3 and Disk 4 are read.' If the RAID is shut down and restarted in that state, it may try to read the broken Disk 2 and crash. This is only a hypothesis, because we only ever see the equipment after it has broken.
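The degraded state Izushi describes, where the array keeps working without Disk 2, can be illustrated with XOR parity, the mechanism RAID 5 uses to serve data while one disk is missing. The sketch below rests on simplifying assumptions: Disks 1 to 3 hold data and Disk 4 holds parity in a fixed position with made-up contents, whereas real RAID 5 rotates parity across the disks, and the interview does not say which RAID level was involved.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of one stripe: Disk 1, Disk 2, Disk 3 hold data.
disk1, disk2, disk3 = b"AAAA", b"BBBB", b"CCCC"

# Disk 4 holds the parity of the three data blocks (assumed fixed position).
disk4_parity = xor_blocks([disk1, disk2, disk3])

# Disk 2 has failed: its block is recomputed from the three surviving disks.
rebuilt_disk2 = xor_blocks([disk1, disk3, disk4_parity])

print(rebuilt_disk2 == disk2)
```

As long as the controller knows that Disk 2 is the missing member, every read can be answered this way; the trouble described in the interview starts when that knowledge is lost or the disk is read anyway.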


G:
About the RAID controller you just mentioned: GIGAZINE also runs its own server, and it really did break after the power was cycled once, just as you described. When a RAID controller is turned off and on again and then fails to start, or fails to write, why does that happen?

Izushi:
The RAID controller itself stores logs and information about which parts are broken, so if it is working normally it should start up in the same state at the next boot. In other words, when a failure like that occurs, I think the only explanation is that the RAID controller itself was broken.

G:
I see. I can understand that a hardware RAID controller is a single point of failure because there is only one of it, but software RAID has become common recently. With software RAID, I would think that even if the software breaks, things will work out once the software is repaired, but perhaps that is not the case?

Izushi:
We do not get many such cases, but we can repair software RAID as well. In most cases all of the RAID information is written on the HDDs, and it is often that information that is damaged. I rewrite the values of the RAID information there little by little with a binary editor and reassemble the array.


G:
I see. RAID also comes in various levels, such as 1, 5, 6 and 10. When you actually repair arrays, is there a RAID level that is particularly difficult to fix?

Izushi:
In terms of difficulty of repair, I would say RAID-Z.

G:
What is RAID-Z ...?

Izushi:
The file system itself is called ZFS. RAID types with numbers, like RAID 1 or RAID 2, can be fixed fairly easily, because the data is distributed in units of disks. However, RAID-Z, Drobo's RAID and the like ...

G:
So the proprietary RAID schemes that individual manufacturers came up with are the ones that spell trouble.

Izushi:
Yes. There are no recovery tools for them on the market, so they cannot be fixed without tools developed by a recovery company itself.

G:
Conversely, can you repair such systems if they are brought in?

Izushi:
Yes, we can. In ZFS, the file system behind RAID-Z, data is not distributed disk by disk; it is spread out more like a tree, rather than simply side by side across the disks. With numbered RAID levels, the data is lined up in disk order, like 'disk 1, disk 2, disk 3, disk 4 ...'. With ZFS and the like, it is more a case of 'this piece of data is on disk 1 and the next piece is on disk 4.' What is special about this file system is that the data is distributed further after the RAID has been created.
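The 'disk 1, disk 2, disk 3, disk 4 ...' ordering of the numbered RAID levels can be sketched as a simple calculation. The example below assumes plain round-robin striping without parity (as in RAID 0); the function name and the four-disk layout are illustrative assumptions. The point of Izushi's remark is that for ZFS no such fixed formula exists, because placement is decided by the file system itself.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block index to (disk number, row on that disk).

    Assumes plain round-robin striping with disks numbered from 1.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    disk = logical_block % num_disks + 1
    row = logical_block // num_disks
    return disk, row


# Consecutive blocks land on disks 1, 2, 3, 4, then wrap around to disk 1.
for block in range(6):
    disk, row = locate_block(block, num_disks=4)
    print(f"block {block} -> disk {disk}, row {row}")
```

Because the position of every block follows from its index alone, an engineer who knows the disk order and stripe size can reassemble such an array; that is exactly what cannot be done when the next piece of data may sit on any disk.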

G:
It certainly sounds very unusual.

Izushi:
That is why recovery is difficult. Simply lining the disks up in order is not enough.


G:
I see. In preparation for this interview I read all the pages on the Digital Data Solutions site on which you appear, Mr. Izushi. On one of them you say that if you find a possibility of recovering something the industry currently says cannot be restored, you will go and learn the method, even to Africa or Alaska if necessary. What exactly does 'something the industry says cannot be recovered now' refer to? I understand that restoration is impossible if, say, a drive is completely burned to ash in a fire.

Izushi:
It concerns physical damage, and there are always cases like that.

G:
Where is the line at the moment for 'it looks bad, but you do not have to give up yet'?

Izushi:
If the platter is scratched, recovery is physically difficult because the head cannot read it in the first place, and depending on the type of HDD, recovery may be considered impossible even with a slight scratch. Otherwise, it is basically recoverable. Helium-filled HDDs, which have appeared recently, are difficult. They are large-capacity drives filled with helium instead of air, so opening one lets air in and the helium has to be replaced. So far we have not received many such cases, and I am handling them while doing research.

Filling HDDs with helium increases capacity by 40%; will structural change come to the HDD industry? - GIGAZINE


G:
On the same page there was something that made me think 'so that kind of work exists.' It says that in your previous job you were a sales engineer for communication systems: 'I visited customers for maintenance and also took orders for new projects, and part of the job was to contact a data recovery company on behalf of customers who had suffered a data failure.' When a failure occurs and a data recovery company is contacted, I imagine it is a situation like the 'RAID will not start' case we have been discussing.

Izushi:
Yes, that kind of situation.

G:
In your previous job, how did you decide at what point a data recovery company had to be called?

Izushi:
At that time, as long as something could be fixed on site I worked hard to repair it, and if it still would not start, I would ask a data recovery company.

G:
Now, I imagine the people who call the data recovery company are people in the same position you used to be in. When you see such cases, do you ever think 'I wish you had contacted us sooner!'?

Izushi:
I do (laughs). They are exactly that kind of case.

G:
I suspect that people in the field, like you in your former job, often cannot judge where that 'should have contacted you sooner' borderline lies. At what point should they call?

Izushi:
At the very first stage.

G:
The first stage?!

Izushi:
To be honest, I want people to call before they try anything themselves. But I understand that things do not always go that way. When the customer is watching right in front of you, it cannot be helped if you feel you must fix it somehow. Still, since a worsening failure lowers the chance of recovery, I would like people to call Digital Data Recovery at the very beginning.

G:
Are there patterns where doing a particular thing actually makes matters worse?

Izushi:
Rather than any particular action, I think it is more a matter of luck. It also depends on how old the media is. If the PC has been in use for three or four years, you should not push it. Then there are also cases where the customer calls us after having tried various things.

G:
What specifically did the customer do?

Izushi:
They try to fix it themselves.

G:
Do they try to open the drive?

Izushi:
That is an extreme case, though.

G:
Do they install software and try to restore the data?

Izushi:
That's right. And then the data ends up being overwritten many times.

G:
Have you ever received a request where you felt the customer had gone too far?

Izushi:
Yes. There are times when I think 'oh ...' (laughs).


G:
The Data Recovery.com page lists '4 things you must not do when a RAID failure occurs.' The first is 'Rebuilding,' with the note 'A rebuild fails with high probability!' In what situations does a rebuild fail?

Izushi:
If a rebuild stops partway, it has almost certainly failed. It stops because of bad sectors, an HDD that is not working properly, or a drive that has been forced to keep working. A rebuild forces the HDDs to work hard, so a single HDD gets worse and the rebuild stops partway. When that happens, the other normal disks have been rebuilt only up to the point where it stopped, so the data after that point is left out of step.

G:
I see.

Izushi:
Even if you then remove the HDDs and bring them all in as they are, the part up to where the rebuild got to is fine, but beyond that point the mismatch remains and consistency between the data before and after it is lost.
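The effect of a rebuild that stops partway can be shown with a small simulation. This is a minimal sketch under stated assumptions: a two-data-disk array with a fixed parity disk, made-up block contents, and a hypothetical `fail_at` parameter standing in for the stripe at which a surviving disk hits a bad sector. Real controllers differ in how they handle and record an aborted rebuild.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_disks, fail_at=None):
    """Rebuild the missing disk stripe by stripe from the surviving disks.

    If fail_at is given, the rebuild aborts when it reaches that stripe
    index, simulating a surviving disk that becomes unreadable there.
    """
    replacement = []
    for index, stripe in enumerate(zip(*surviving_disks)):
        if fail_at is not None and index == fail_at:
            break  # the rebuild stops here; later stripes are never written
        replacement.append(xor_blocks(stripe))
    return replacement


# Hypothetical array: two data disks and one parity disk, four stripes each.
disk1 = [b"a1", b"a2", b"a3", b"a4"]
disk2 = [b"b1", b"b2", b"b3", b"b4"]
parity = [xor_blocks(pair) for pair in zip(disk1, disk2)]

complete = rebuild([disk1, parity])             # disk2 lost, rebuild finishes
partial = rebuild([disk1, parity], fail_at=2)   # rebuild aborts at stripe 2

print(complete == disk2)
print(f"{len(partial)} of {len(disk2)} stripes rebuilt before the stop")
```

After the aborted run, the replacement disk matches the array only for the first two stripes; everything after that point is missing or stale, which is the mismatch between 'before' and 'after' that Izushi describes.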


G:
I see. The second of the '4 things you must not do when a RAID failure occurs' is 'Replacing HDDs or changing their order,' and it says 'A rebuild will start automatically.' What does this mean?

Izushi:
This does not apply to every product, but as with the RAID controller we discussed earlier, the controller records which HDD is in which slot, from the top: disk 1, disk 2, disk 3, disk 4. Information such as 'an HDD with this model number is in slot 1' and 'an HDD with that model number is in slot 2' is all recorded in the RAID controller. When an HDD is swapped, the firmware of the RAID controller or NAS decides, 'Slot 1 originally held a Western Digital HDD, but now it holds a Seagate HDD. A new HDD has been inserted,' and initialization starts.

G:
I see. The third is 'Removing an HDD and powering it on by itself!', and it says that powering on a removed HDD carries a high risk of formatting it by mistake. What kind of situation is that?

Izushi:
I do not think many people do this, but it refers to starting up just one HDD on its own.

G:
Is the idea to check whether the drive is recognized properly?

Izushi:
Probably. I have not seen it lately, but in the past some people did end up formatting. There are still customers who connect a single HDD from a four-drive RAID set to a PC to check whether it is recognized on its own. The PC shows a prompt asking 'Do you want to format it?', and if you answer 'Yes' the drive is formatted. You need to answer 'No' to that prompt and then check the HDD.

G:
The last point is 'Replacing the RAID card,' with the note 'There is a high possibility that the data structure will fall apart!'

Izushi:
Yes. It means exactly what it says: people conclude that the RAID card is the problem and assume that simply swapping in a new board will make the data visible again.

G:
Everyone does fairly aggressive things, it seems. Is there a memorable case where you wondered, 'Why did this customer do such a thing ...'?

Izushi:
Let me think ... I think it really comes down to what is written here.

G:
So the things listed here really do happen.

Izushi:
They do happen.

G:
In short, the message is 'Please do not do any of these things, and bring it in right away.'

Izushi:
If the drive has been thrown away there is nothing we can do, but there are also cases where it has been collected by a network vendor or telecom company acting as intermediary, or returned to the manufacturer on the assumption that it was broken. As is common with HP, there is a rule that when an HDD is replaced, the old HDD must be collected. But if a deadline is agreed, for example 'I will return it within one month,' you can keep the old HDD alongside the new one as long as you keep that promise. It is a disaster when the customer is not advised properly and simply says, 'Understood, I will return it immediately.'

G:
In other words, the initial response is important for manufacturer-made RAID products as well.

Izushi:
It is important, because that is what happens before I get involved.


G:
If things are brought in the right way, does the recovery rate go up and the fee come out relatively cheap? In what order should one go about isolating the problem?

Izushi:
The ideal is 'if it breaks, call a data recovery company.' That gives the best chance of recovering the data.

G:
So there really are customers like that.

Izushi:
Of course. When people do not know what to do and call us right away, the data is recovered quickly.

G:
Because they do not know what to do, they call immediately without acting on half-knowledge, and as a result the problem is easier to fix.

Izushi:
And that way is cheaper too. People who are familiar with this field or have studied it ought to do fine, but there are many cases where things become difficult once their attempt fails to fix the problem.

G:
Ah, so the data ends up damaged while they are trying things out.

Izushi:
What they are doing is not wrong, and it ought to fix the problem, but if a deeper part is broken, it simply does not work and things get worse.

G:
GIGAZINE has also been building RAID-based NAS units recently, and we have grown uneasy as the capacities get larger. HDDs used to be 2 to 4 TB, but now they are 14 TB, which is a huge capacity. Yet the physical size of the HDD itself has hardly changed.

Izushi:
The density inside has changed.

G:
Are such large-capacity HDDs hard to repair?

Izushi:
Up to 8 TB the inside is ordinary air, but at 14 TB and so on the drives are filled with helium. Those are still difficult to repair. On the other hand, with helium inside the platters spin stably, so the drives are less likely to break. In exchange, the gaps between platters are narrower, so it is more troublesome when they do break.

G:
Are helium-filled HDDs like the 14 TB models actually brought in for repair?

Izushi:
A small number are. However, the HDDs that break are generally ones from about three years ago, so I think it will be two to three years from now before helium HDDs really start coming in as data recovery cases.

G:
Looking at statistics on HDD failure rates by manufacturer and model, there seem to be cases where 'this firmware on this model from this manufacturer is bad.' Why does that happen?

Izushi:
In my view, when one company acquires another and the people making the HDDs change, the HDDs released soon afterwards tend to be prone to failure.

G:
So that can be a factor. HDD firmware is often said to be at fault, but what is HDD firmware in the first place?

Izushi:
Firmware is a program that describes how the parts should move, and some of it is written on the inner or outer part of the platter. Since that program is the firmware, 'the program is bad' means 'the firmware is bad.'

G:
So the program is literally bad?

Izushi:
It is not that the program is defective from the start, but rather that it was designed in a way that makes it prone to malfunction.

G:
In that case, is the manufacturer the cause?

Izushi:
That's right. In fact, programs are sometimes written not only on the platters but also on the PCB, and some are written so that only the manufacturer can understand them. They are very intricate, and when something goes wrong, a program that is supposed to behave one way may behave differently ... I think there are various causes.

G:
There really are all sorts of causes. Earlier, on the tour, we saw ordinary HDDs being repaired in the clean room. For a high-capacity helium HDD, even a clean room is filled with ordinary air, so does that mean you cannot work on it there?


Izushi:
No, from a recovery point of view, losing the helium does not have much effect. It would matter if the drive were used for a year or two, but we complete 80% of cases within 48 hours, so there is no problem as long as we can get the data out within that time.

G:
Is the idea that you literally replenish the helium?

Izushi:
I suspect that no company does that.

G:
I see. When GIGAZINE was looking at models from Synology and others for building RAID, we noticed more and more units built from SSDs instead of HDDs, or with SSDs placed in front as a cache. Judging from the cases actually brought in, how reliable is the combination of SSD and HDD?

Izushi:
SSDs are not very well suited to storing data. They are faster in terms of speed, though.

G:
So the performance is good.

Izushi:
It is hard to generalize, but online game companies, for example, mostly use SSDs for their storage servers. They seem to adopt SSDs because games demand speed, but if there is too much access, the reads and writes exceed the limit and the drive basically crashes. SSDs have a read/write limit, and in such environments usage can be more than double that limit. The drive will eventually crash, so data migration and maintenance should be carried out regularly, moving the data to new SSDs. HDDs are a little slower, but the data seems to be more stable on them.

G:
I see. I imagine you get inquiries from many companies saying 'our RAID broke' or 'our NAS broke.' What is the largest case so far?

Izushi:
Many of the large cases come from hospitals. In large hospital systems and servers, the number of HDDs can exceed 90. They hold a huge number of past X-ray images, which are high-resolution, so each image is very large. They have to set up substantial servers, but in some cases all of the data is lost.

G:
Were all 90 HDDs brought here?

Izushi:
Yes, they were. Beyond that, there are many cases with 60 or 40 drives. That said, the 90 drives did not form a single RAID; they were split into several arrays internally, so it was a matter of combining the distributed RAIDs again.

G:
In such a case, do you bring in the whole RAID hardware, or pull out the HDDs inside?

Izushi:
The whole system would not physically fit, so only the HDDs were brought in. I cloned the HDDs to make things compact and then extracted the data. Each of the customer's HDDs was about this size, around 2 TB, and there were 90 of them. When I work, I always take a clone first, and the clone can be captured as an image rather than onto a physical disk, so it takes up no space.
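The 'clone first, then work on the image' approach can be sketched in a few lines. The code below is an illustrative assumption rather than the company's actual tooling: `clone_to_image` copies a source chunk by chunk, zero-fills any chunk that cannot be read, and records its offset so it can be revisited later. `FlakyDisk` is a hypothetical in-memory stand-in for a failing drive; a real workflow would read from a block device with dedicated imaging hardware or software.

```python
import io


def clone_to_image(source, image, total_size, chunk_size=4096):
    """Copy source to image in chunks, zero-filling chunks that fail to read.

    Returns the list of offsets whose chunks could not be read.
    """
    bad_offsets = []
    for offset in range(0, total_size, chunk_size):
        length = min(chunk_size, total_size - offset)
        try:
            source.seek(offset)
            data = source.read(length)
            if len(data) != length:
                raise OSError("short read")
        except OSError:
            data = b"\x00" * length  # placeholder so later offsets stay aligned
            bad_offsets.append(offset)
        image.seek(offset)
        image.write(data)
    return bad_offsets


class FlakyDisk(io.BytesIO):
    """Hypothetical failing drive: reads starting at a bad offset raise OSError."""

    def __init__(self, data, bad_offsets):
        super().__init__(data)
        self.bad_offsets = set(bad_offsets)

    def read(self, size=-1):
        if self.tell() in self.bad_offsets:
            raise OSError("unreadable sector")
        return super().read(size)


disk = FlakyDisk(bytes(range(16)), bad_offsets={8})
image = io.BytesIO()
bad = clone_to_image(disk, image, total_size=16, chunk_size=4)
print("unreadable chunk offsets:", bad)
```

Keeping every chunk at its original offset in the image matters: all later analysis, such as reassembling a RAID from 90 images, depends on positions in the image matching positions on the original drive.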


G:
I see. During the office tour earlier, we saw that you also recover data from smartphones. A smartphone does not contain an HDD, though.


Izushi:
In a smartphone, it is memory rather than an HDD.

G:
I cannot really picture who asks for smartphone data recovery. Why do people request it?

Izushi:
Either they have deleted data, or the phone has been submerged. I think it is more common in summer: the phone was dropped in the sea.

G:
Can you recover the data in cases like that too?

Izushi:
We can. Beyond that, there are infidelity investigations ...

G:
So there are cases like that.

Izushi:
There are. The rest are belongings of the deceased. People bring in the smartphone of a child or a parent and just want the photos taken out.

G:
I hear there are also many requests from the police. What specifically do they ask for?

Izushi:
They are items connected with incidents. The police have various tools and facilities of their own, but the items they could not handle come to us as requests.

G:
Do items suddenly arrive labeled 'this is a police item'? It sounds rather frightening to hear that 'a police case has come in.'

Izushi:
There are many kinds. Some are folded in half, for instance.

G:
Even if a smartphone is bent in half, can the data be recovered?

Izushi:
It depends on where it is broken. The place where the data is stored is fixed, so if the damage is away from that spot, there is a chance. I cannot tell without looking at the condition of that area.

G:
So there is still a chance of saving it. Then if there is data you do not want anyone to see, simply breaking the phone is not enough; you would have to smash it to pieces.

Izushi:
It would be impossible to recover if you smash it to pieces.

G:
So when someone stomps on a smartphone and breaks it, as you often see in TV dramas, the data may still be recoverable.

Izushi:
That's right.

G:
By the way, why did you decide to move into data recovery after a job that involved visiting various companies for sales?

Izushi:
I worked at a telecommunications company and enjoyed it, but I also wanted to master one thing. There were also requests related to data recovery, and so I decided to head in the direction of data recovery.

G:
So it is not as if you could do recovery work from the moment you joined this company?

Izushi:
At first I could not build or repair a RAID at all. So at first it was all study.


G:
The official page mentions 'gathering cutting-edge technologies from around the world.' What specifically do you do?

Izushi:
There are training sessions and networking events, where we study data recovery technology. Globally, the countries strong in data recovery are Russia, China, and Israel.

G:
Why are those three countries strong in data recovery?

Izushi:
Historically, Russia and China are good at analyzing existing things, as in 'repairing' and 'imitating' them. There are many people like that, so there are plenty of opportunities to learn data recovery techniques there.

G:
I do not know how much you can say, but is there anything you learned recently at such a study session that struck you as new?

Izushi:
The latest topics are NAND-based: repair of USB memory and SSDs. There was also material on how to repair iPhones and other smartphones. I specialize in RAID, so I am not involved in that area, but it made me realize that data can be retrieved from such devices. I cannot go into much detail.

G:
Lastly, you mentioned earlier that RAID-Z and each manufacturer's proprietary RAID are hard to repair. When it comes to requesting data recovery, is there a RAID level that is 'easy to fix'?

Izushi:
The RAID level doesn't matter very much.

G:
So you choose the RAID level based on performance or capacity, and whichever you choose, it breaks when it breaks.

Izushi:
That's right.

G:
So there is no level that is particularly easy or hard to recover.

Izushi:
In the end, what matters is whether backups were taken properly and, with RAID 5 for example, whether the alert was acted on properly when it appeared. HDDs manufactured at the same time also tend to be in much the same condition, so if a broken drive is left with its red lamp lit for a long time, the other disks may be affected and break as well. There is not much to say about the RAID level. RAID 6 can survive two broken disks, so I think it is easy to operate, but from the repairer's point of view every level is the same.
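The trade-off between capacity and fault tolerance mentioned here can be put into numbers. The sketch below covers only RAID 5 (one disk's worth of parity) and RAID 6 (two disks' worth); the function name and the four-disk, 14 TB example are illustrative assumptions, with 14 TB taken from the drive size discussed earlier in the interview.

```python
def raid_summary(level, num_disks, disk_tb):
    """Return (usable capacity in TB, number of disk failures tolerated).

    Supports RAID 5 and RAID 6 with equally sized disks.
    """
    parity_disks = {5: 1, 6: 2}.get(level)
    if parity_disks is None:
        raise ValueError("only RAID 5 and RAID 6 are covered by this sketch")
    minimum = parity_disks + 2  # RAID 5 needs 3 disks, RAID 6 needs 4
    if num_disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    usable_tb = (num_disks - parity_disks) * disk_tb
    return usable_tb, parity_disks


for level in (5, 6):
    usable, tolerated = raid_summary(level, num_disks=4, disk_tb=14)
    print(f"RAID {level}: {usable} TB usable, survives {tolerated} failed disk(s)")
```

With the same four drives, RAID 6 gives up one more drive of capacity in exchange for surviving a second failure, which is why acting on the first alert matters so much more on RAID 5.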

G:
In any case, it sounds like it is best to bring it in as quickly as possible.

Izushi:
It just comes down to how much damage there is on that one disk.

G:
I see. Thank you for your time today.

Data Recovery.com [Digital Data Recovery] | Data Recovery Service with 95.2% Recovery Rate
https://www.ino-inc.com/


in Coverage,   Interview,   Hardware,   Advertisement, Posted by logu_ii