"Sakura's cloud" Frequent disorder due to frequent occurrence, due to not reaching the quality that can be charged



"As we are aware of the current situation that customers can use with confidence, I am aiming for normalization as soon as possible, but now I have determined that I am not in a situation where I can charge you. For the moment, we will report that we will make free of "Sakura's cloud usage fee" for the time being, going back to March 1, "What is"Sakura's cloud"Was made free of charge.

Report on the current status of Sakura's Cloud and our response regarding billing | IaaS public cloud "Sakura's Cloud"
http://cloud.sakura.ad.jp/news/sakurainfo/newsentry.php?id=622


The reason the company had no choice but to make the service free of charge is explained as follows.

As already reported, the storage that holds customer data has suffered performance degradation,
and periods of elevated load continue to occur frequently.
We therefore carried out a firmware update aimed at improving the situation,
but a long outage occurred after the maintenance, and the improvement
was not sufficient.
We are working with the storage manufacturer toward a fundamental solution,
but regrettably we cannot at present give you a prospect for a complete resolution.

The above announcement alone does not make clear what happened, but the background can be grasped from the failure report page, which has been updated continuously for nearly four months, from the end of December last year until March 16 this year. It shows how agonizing the decision to make the service free was, and how deep a crisis "Sakura's Cloud" is in.

Report on the "Sakura's Cloud" storage network failure (updated March 16) | IaaS public cloud "Sakura's Cloud"
http://cloud.sakura.ad.jp/news/sakurainfo/newsentry.php?id=603


It all began on December 9 last year.

This event concerns the storage system used by "Sakura's Cloud".
Host servers in use by customers went down, making the servers housed on them
inaccessible, and read/write operations to disk
failed irregularly, with the symptoms appearing in the server's error log
and on the console screen during use.

The cause was that a specific pattern of communication affected the interfaces
for the storage network, resulting in communication failures.

In response, the following measures were taken:

· Introduction of configurations and settings to detect patterns causing communication failure and to prevent them from occurring

· Improvement of the storage system's fault tolerance by tuning related kernel parameters

The service somehow recovered on December 25, and usage fees for December 1 through December 31 were waived.

However, further failures occurred in January, after the New Year.

■ Details of the failure
When disk access between the host servers and the storage increased,
the response of customer servers deteriorated and servers went down.

■ Cause
In Sakura's Cloud, customer server data is stored in centralized storage.
We have confirmed that when a large amount of disk access to this storage occurs,
major communication disruption arises between the host servers and the storage.
We have also confirmed cases in which this led to temporary stoppages of the storage network
and to host servers going down.
Regarding the host server failures, analysis of the kernel dump confirmed
that they are caused by a bug in the virtualization platform we adopted (KVM).

An upper limit on access was put in place, but this in turn caused disk access on servers to become extremely slow, and a fundamental solution was not reached.

Then, in February, trouble struck again.

■ Status of and countermeasures for host server failures
We are continuing to analyze the kernel and, to reduce occurrences of the symptom,
are reviewing the storage configuration settings. As a result, the symptom has become less likely to appear.

■ About the failure that occurred on February 22
When operations such as duplication and deletion were concentrated in the system program
that manages the storage devices, a failure occurred that disrupted storage access.
We have since reviewed how operations on the file system are processed and made changes
so that the service is not affected.

■ Storage status and measures
We have decided to reinforce the storage to cope with the increase in disk I/O
that accompanies growth in the number of customer servers. As a first step,
we will add storage devices and distribute the processing across them.

And as the finishing blow, from early March performance began to deteriorate intermittently.

■ Status of and countermeasures for host server failures
We are continuing to analyze the kernel. For now, we are suppressing occurrences
by reviewing the storage configuration settings.

■ Status of and countermeasures for storage performance degradation
Since early March we have received inquiries about symptoms of intermittent storage performance degradation,
and we are also aware of the situation ourselves.
The symptoms have been confirmed when the access load exceeds a certain level,
and in those cases performance drops considerably more than we anticipated.
We have confirmed that the cause of the performance degradation
lies in the storage system.

As for when the free period will end, the company says, "We will end the free-of-charge measure once we judge that the service has reached a quality that can adequately be charged for." As for which manufacturer's storage has been in use from the start, it is the "Sun ZFS Storage Appliance".

Oracle Asia Pacific & Japan Media Center - Oracle's "Sun ZFS Storage Appliance" goes into operation as the storage for Sakura Internet's cloud service
http://japanmediacentre.oracle.com/content/detail.aspx?ReleaseID=1501&NewsAreaId=2


This is a photo of the actual unit.


· For the storage platform underpinning "Sakura's Cloud", Sakura Internet
placed importance on a high-speed storage interface and on expandability that allows
multiple devices to be connected to handle large increases in traffic, and in May 2011
decided to adopt the "Sun ZFS Storage 7320 Appliance", which already had a track record as InfiniBand-compatible storage.
After a verification period of about six months, the "Sun ZFS Storage 7320 Appliance"
went into operation with today's service launch.

The adoption is said to be the result of highly rating the following features.

· Supports InfiniBand, which costs about the same as 10 Gbps Ethernet while delivering 40 Gbps of bandwidth

· Achieves high I/O transfer rates and low power consumption by making effective use of disk and flash memory, with the aim of reducing power costs

· Using functions such as clones and snapshots, which the "Sun ZFS Storage Appliance" provides for deploying multiple virtual machines, a virtual server can be created within 10 to 15 seconds

· Supports the file sharing system "NFS version 4", which consolidates files and systems on the storage side and makes connection from many servers easy

· The storage monitoring software "DTrace Analytics" makes it possible to grasp the state of the storage intuitively through a web browser and to respond immediately when trouble occurs


In addition, a lecture titled "The Sun ZFS Storage Appliance supporting Sakura Internet's cloud: five reasons it was chosen" is scheduled for April 5. Its content is as follows.

In November 2011, Sakura Internet Co., Ltd. released "Sakura's Cloud", a developer-oriented IaaS public cloud service. Based on the concept of "providing an unmistakable cloud with overwhelming cost performance", it offers high performance, expandability, and a simple service menu with an easy-to-understand pricing structure.
We will explain how the Sun ZFS Storage 7320 Appliance came to be adopted as the storage supporting "Sakura's Cloud", focusing on five points.

In reality, things have taken a terrible turn, so what will become of "Sakura's Cloud" from here on...?

in Web Service, Posted by darkhorse