Hu Yoshida's official blogs

Hitachi Data Systems
Hu Yoshida, VP and CTO of Hitachi Data Systems, provides his insight into industry issues, discusses in his own words storage best practices, and provides realistic solutions to real storage problems of current and next generation storage environments.

Capacity Efficiencies: Allocation vs Utilization

Fri, 02/03/2012 - 18:13

As noted in previous posts, capacity efficiency has two dimensions: allocation efficiency and utilization efficiency.

Allocation Efficiency

Allocation efficiency is what most people think of first: eliminating the waste of over-allocation. In open systems this has been a major problem since we may not know ahead of time how much capacity an application requires, but we don’t want to run out of capacity and we know the operational difficulties of expanding it. So, the usual practice is to over-allocate by a wide margin.

After all, disk is cheap, isn’t it?

The problem with over-allocation is that we don’t just make one copy. As Claus Mikkelsen noted in his blog, there may be 10 to 15 copies of that allocation for many valid requirements, like data analysis, data sharing, development test, or point-in-time snapshots. The most efficient way to eliminate this over-allocation is to use thin provisioning, where you provide virtual space for the requested allocation and only provision the capacity that is actually being used. It also helps to support the APIs of file systems, like VMFS and the Symantec file systems, that can notify the storage system when files are deleted so that the allocation for those files can be reclaimed by the storage system.
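To make the thin provisioning idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the names `Pool`, `ThinVolume`, and `unmap`, and the page-level granularity, are hypothetical and do not describe the HDP implementation or any file system's actual API.

```python
class Pool:
    """Shared pool of physical pages (hypothetical, for illustration only)."""

    def __init__(self, free_pages):
        self.free = free_pages


class ThinVolume:
    """Toy thin-provisioned volume: a page consumes pool capacity only on first write."""

    def __init__(self, virtual_pages, pool):
        self.virtual_pages = virtual_pages  # size promised to the application
        self.pool = pool                    # physical capacity shared with other volumes
        self.mapped = {}                    # virtual page number -> data actually stored

    def write(self, page, data):
        if not 0 <= page < self.virtual_pages:
            raise IndexError("page is outside the virtual volume")
        if page not in self.mapped:
            # First write to this page: take one physical page from the pool.
            if self.pool.free == 0:
                raise RuntimeError("pool exhausted")
            self.pool.free -= 1
        self.mapped[page] = data

    def unmap(self, page):
        """Assumed hook: the file system reports that a page no longer holds live data."""
        if page in self.mapped:
            del self.mapped[page]
            self.pool.free += 1  # capacity goes back to the shared pool


pool = Pool(free_pages=100)
volume = ThinVolume(virtual_pages=1000, pool=pool)  # 1000 pages promised, 100 available
volume.write(0, b"first")
volume.write(7, b"second")
print(pool.free)  # 98: only the two written pages consume capacity
volume.unmap(7)
print(pool.free)  # 99: the deleted page has been reclaimed
```

The point of the sketch is that the application sees the full 1000 pages it asked for, while the pool is only charged for pages that hold data, and gets capacity back when the file system reports a deletion.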

The capacity and time needed to make copies are also reduced by the elimination of allocated but unused space. The capacity consumed by copies can be reduced further with copy-on-write, so that only the new changes are replicated. HDS storage supports these functions, and can map them to legacy storage systems through virtualization.

The thin provisioning software that Hitachi provides in Hitachi Dynamic Provisioning (HDP) also increases allocation efficiency by providing a pool of preformatted pages. This eliminates the need for the storage administrator to format the drives, carve out the LUNs, and concatenate LUNs for striping performance. HDP will automatically stripe a LUN across the width of the HDP pool. Allocation software in our Hitachi Command Suite will enable a user to allocate storage with five clicks of a mouse.

Utilization Efficiency

Utilization efficiency is about using the capacity in an efficient manner so as to reduce costs and increase performance and availability. The primary Hitachi tool for doing this is Hitachi Dynamic Tiering, where we can dynamically tier pages within a volume across multiple tiers of cost/performance storage capacity. When most people think of tiering they think of volume level tiering, where a volume is moved between tiers of cost/performance storage. While this can match volumes with the right performance tier of storage, it can actually use more storage since you need to have space for the whole volume in all the tiers that are involved. With page level tiering, only the hot pages need to reside on the higher performance tiers. Since only a small number of pages are hot at any time, you will only need enough high performance capacity for 5% to 10% of your volume rather than for the full 100% of your volume. That is utilization efficiency.
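The difference between volume level and page level tiering can be sketched in a few lines of Python. This is an illustration only: the function names, the access counts, and the simple "most accessed pages win" policy are assumptions, not a description of how Hitachi Dynamic Tiering actually selects pages.

```python
def pick_hot_pages(access_counts, fast_tier_pages):
    """Return the page ids to keep on the fast tier: the most frequently accessed ones."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    return set(ranked[:fast_tier_pages])


def fast_capacity_needed_gb(volume_gb, hot_percent, page_level):
    """High performance capacity needed for one volume under each tiering style."""
    if page_level:
        # Only the hot pages have to live on the fast tier.
        return volume_gb * hot_percent / 100
    # Volume level tiering needs room for the whole volume on the fast tier.
    return volume_gb


# Hypothetical access counts for a ten page volume.
counts = {0: 12, 1: 950, 2: 3, 3: 700, 4: 0, 5: 8, 6: 1, 7: 15, 8: 2, 9: 4}
print(pick_hot_pages(counts, fast_tier_pages=2))                     # {1, 3}
print(fast_capacity_needed_gb(1000, hot_percent=10, page_level=True))   # 100.0
print(fast_capacity_needed_gb(1000, hot_percent=10, page_level=False))  # 1000
```

With 10% of the pages hot, a 1000 GB volume needs about 100 GB of high performance capacity under page level tiering, against the full 1000 GB when the whole volume has to move.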

Utilization efficiency also depends on the efficiency of the paging process. Paging is the most efficient method of dynamic tiering since it is calculated on a page basis. Chunk/Chunklet methods for paging require the definition of a chunk and then an index into the chunklet. Dynamic tiering requires the handling of more metadata and more processing power within the storage system. VSP was designed with a separate control store for the metadata and a separate pool of Intel quad core processors to offload this processing from the I/O processors.

This function can also be mapped to external storage through storage virtualization with VSP.

These are the primary tools for storage capacity and utilization efficiencies in Hitachi storage systems. I would be interested in hearing about other functions that can be used to enhance capacity efficiencies.

Categories: HDS Blogs

Storage Efficiencies Redefined

Wed, 02/01/2012 - 21:25

If you Google storage efficiencies, eventually you will get a Wikipedia definition, which describes storage efficiency as “the ability to store and manage data that consumes the least amount of space with little or no impact on performance, resulting in a lower total operational cost.”  Wikipedia also references the SNIA definition, which notes:

storage efficiency = (effective capacity + free capacity)/raw capacity
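As a quick worked example of the formula above, here is a small Python sketch. The function name and the capacity figures are made up for illustration; only the formula itself comes from the definition quoted here.

```python
def storage_efficiency(effective_capacity, free_capacity, raw_capacity):
    """storage efficiency = (effective capacity + free capacity) / raw capacity."""
    if raw_capacity <= 0:
        raise ValueError("raw capacity must be positive")
    return (effective_capacity + free_capacity) / raw_capacity


# Hypothetical system: 100 TB raw, 60 TB of effective capacity in use, 20 TB free.
print(storage_efficiency(60, 20, 100))  # 0.8
```

All three values must be in the same unit. A result below 1 reflects capacity lost to overhead such as RAID protection and spares, while data reduction techniques can push the effective capacity, and so the ratio, higher.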

However, as I noted in my last post, the definition of storage efficiency has expanded beyond capacity.

Since that last post on storage efficiency being more than capacity, I have discovered more tweets and blogs around this topic. Randy Kerns posted this week at IT Knowledge Exchange on storage efficiencies and data center optimization.

Randy identifies storage efficiencies around data reduction, allocation of capacity reduction, performance efficiency, data protection efficiency, scalability efficiency, and increasing automation for administrative efficiency. This matches very well to the list of efficiencies that Jon Toigo provided in the blog that I recently referenced.  One new twist may be the scalability efficiency, in which scaling of capacity and performance is done in equal proportion to support greater consolidation and growth. I think he makes a very good point, so I am adding it to my list of storage efficiencies.

Expanding on Jon Toigo’s base list of efficiencies, I would add Randy’s contribution along with my “storage management efficiency”:

  • Capacity Allocation Efficiency
  • Capacity Utilization Efficiency
  • Performance Efficiency
  • Data Protection Efficiency
  • Energy Efficiency
  • Storage Management Efficiency
  • Scalability Efficiency

Does anyone have other efficiency considerations for storage? In my next few blogs I plan to expand on each of these bullets.

Categories: HDS Blogs

More to Storage Efficiency than Capacity

Fri, 01/27/2012 - 18:45

In response to my last blog, Jon Toigo was kind enough to post a training piece that he wrote last year, reminding us that capacity is only one part of storage efficiency.

In addition to capacity allocation efficiency, which most of us are addressing with thin provisioning, Jon points out the need for capacity utilization efficiency, storage performance efficiency, data protection efficiency, and storage energy efficiency. Expanding on his thoughts, I have added storage management efficiency.

Capacity utilization efficiency is about the placement of data on the appropriate tier of storage, based on frequency of access, business value and cost of the storage. This could be addressed by automated tiering based on policies that are triggered by time or events.

Storage performance efficiency could be addressed by automated wide striping or page level tiering, where only the hot pages of a volume, rather than the whole volume, are moved to high performance tiers of storage. Ray Lucchesi has a great take on storage performance efficiency in his IOPs vs Drive Counts chart of the month, which he posted last week.

Data protection efficiency, which is measured in terms of Recovery Time and Recovery Point Objectives (RTO/RPO), is a major area for improving efficiency. This has to do with replication, backup, recovery, archive, etc. If most of the data is static and is not being updated, as is the case for most unstructured data, you only need two copies for redundancy. You can eliminate making the many snapshots and backups of the same unchanged data over and over again. Brad Clarke commented on my post that the most important storage efficiencies to him were the ones which make replication less bandwidth hungry. He makes the point that when data volumes increase, the cost of the disks to contain that capacity is relatively low compared to the cost of the additional bandwidth that is required to replicate it.

Energy efficiency will be a big focus this year based on the record increase in carbon emissions in 2011 (5.9% increase) that was reported by The New York Times. Another factor is the nuclear problems at Fukushima, which have shifted demand from nuclear power to carbon fuels and are raising the cost of energy, as well as the possibility of carbon taxes on top of the energy bill. Since storage is becoming a greater percentage of the power consumption in the data center, storage energy efficiency is becoming a key consideration for buying decisions. Storage energy efficiency benefits from the other efficiencies cited above, but there are also efficiencies of 40% or more with storage systems like VSP, which use Small Form Factor SAS drives, dense drawers, front to back cooling, and the replacement of batteries with flash for protection of the cache.

Another efficiency that comes to mind is Storage management efficiency: the ability to manage heterogeneous storage arrays as a pool of common resources with a common set of tools, so that resources like capacity and performance can be shared rather than isolated in silos.

Are there other areas of storage efficiency that we should be considering?

Categories: HDS Blogs

Storage Efficiency: Switch It On III

Thu, 01/26/2012 - 20:46

The greatest tool for storage efficiency is storage virtualization, which enables the extension of other storage efficiency tools like tiering and thin provisioning to existing storage systems that do not have that capability. It also reduces operational costs by providing a common pool of dynamic shared resources under a common set of management tools.

In view of Gartner’s recent prediction of a 5% to 20% increase in disk prices and shortages due to last year’s floods in Thailand, Hitachi Data Systems will extend and enhance the Switch It On III promotion that reduces the cost of virtualizing and managing heterogeneous storage on a Virtual Storage Platform. This promotion is targeted at helping customers increase utilization and reclaim capacity on existing third party storage in the face of rising disk prices and shortages. Customers who take advantage of this promotion will be able to reduce storage CAPEX and OPEX and increase their return on assets.

Photo by Kevin Pelletier

The Switch It On III promotion, which was due to expire at the end of March, has been extended to June 30, 2012. The promotion reduces the licensing and maintenance costs for virtualizing heterogeneous storage systems. It also includes price reductions on thin provisioning, dynamic tiering, disaster recovery, and in-system replication software, as well as on management tools for tuning and on Command Director, which provides an end-to-end, application-to-infrastructure view of utilization and service level objectives.

This promotion is available on new purchases of VSP and existing VSP systems that have not already virtualized external storage. Hitachi storage virtualization enables customers to fully utilize the resources that they already have in their multivendor environment. It creates a single pool of heterogeneous storage capacity that they can control and optimize with a powerful suite of management tools. It lets them take advantage of cost saving features like thin provisioning and data mobility with their existing storage systems and provides them with a sustainable storage architecture that can protect their investment and grow into the future.

For specific details on this promotion contact your Hitachi Data Systems representative or reseller. Also, check out more details on Switch It On III.

Categories: HDS Blogs

A Consensus on Storage Efficiencies

Tue, 01/24/2012 - 15:45

Since I posted my trends for 2012, I have been looking at what other bloggers have been predicting.

The most common theme is the explosion of data, and the need for storage efficiencies. Jon Toigo says that 70% of the capacity on every disk today doesn’t need to be there–40% should be archived and the other 30% should be reviewed and probably deleted.

Joe Kovar of CRN predicts that growth in storage capacities will decline as users implement more storage efficiency technologies like thin provisioning, deduplication, and cloud. An optimistic prediction is that the impact of the Thailand floods on disk shortages will wane sooner than expected, and the expectation is that disk shortages will wane by the second half of this year. I agree that the impact may wane sooner as users implement storage efficiencies. However, I believe there will be a fundamental change in the pricing of storage capacity, as I posted last week.

David Chapa believes storage is becoming more and more affordable to the masses, through the adoption of small business cloud services. He recognizes that home office users now have several terabytes of data stored locally and the increasing costs of managing that data, similar to the enterprise. While we have seen consumer prices of disks more than double in the last two quarters—$79/TB to $190/TB was quoted at a recent Gartner conference—the total cost of managing storage far exceeds the cost of acquisition. Cloud services can reduce the cost of this management and make the total cost of storage more affordable, even though the cost of the disk may increase. See what David Merrill says about procurement costs.

What do you think about storage efficiencies and the impact of disk prices on storage costs?

Categories: HDS Blogs

HDS Places in FORTUNE’s 100 Best Places to Work with Innovation and Trust

Mon, 01/23/2012 - 13:52

I have been working for HDS for almost 15 years, so, needless to say, I am very proud that we have been recognized as one of FORTUNE magazine’s 100 best companies to work for in the United States in 2012. This recognition helps to validate one of our stated company goals, which is to be the employer and partner of choice.

FORTUNE worked with Great Place to Work Institute to conduct the most extensive employee survey in Corporate America. While this survey applies to the United States, we have similar results across Hitachi Data Systems globally. The Great Place to Work Institute ranked HDS #5 in Poland and #13 in France.  In Silicon Valley—one of the most exciting places to work in technology—we ranked #3 and placed in the top 10 for the last three consecutive years!

How do you define a great workplace? Great Place to Work Institute, who has been doing research on this for over 25 years, has this to say:

“Trust is the defining principle of great workplaces – created through management’s credibility, the respect with which employees feel they are treated, and the extent to which employees expect to be treated fairly. The degree of pride and levels of authentic connection and camaraderie employees feel with one another are additional essential components.”

HDS has an entrepreneurial culture where we are constantly improving and innovating. At the foundation of our innovation is trust, an integral part of the Hitachi culture (Hitachi Spirit) that we have inherited from our parent Company Hitachi, Ltd:

Hitachi Spirit is what distinguishes us as the employer and partner of choice. It’s more than our foundation, more than a poster on the wall. It’s how we operate every day, how we get things done. It’s who we are.

See what our employees are saying on the top workplaces website.

Categories: HDS Blogs

The Tipping Point for Hard Disk Prices?

Wed, 01/18/2012 - 19:13

Q1 2012 marks a major turning point in the storage industry. After 50 years of price declines in the magnetic disk industry, we are seeing what most analysts predict to be a 5% to 20% increase in disk prices due to the catastrophic floods in Thailand, which have had a major impact on the disk supply chain. While the manufacturers are hoping to get their capacity back on line to ease the supply shortages by the second half of this year, the cost of rebuilding their manufacturing capabilities will impact prices for some time. This additional cost will also impact the investments required to deliver next generation higher capacity disk technologies, like Bit Patterned Media or Heat Assisted Magnetic Recording, which are on the roadmap for disk drives.

Above is a chart showing disk drive capacity increases going back to 1980, which is truly phenomenal. Due to this rate of capacity increase, the industry has enjoyed a price erosion of about 20% to 30% per year on disk media. However, you can see that the density curve is starting to slow down as we approach the limits of current perpendicular recording technologies.
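To see how quickly a 20% to 30% yearly erosion compounds, here is a small Python sketch. The function name and the starting price of $100 per TB are hypothetical; only the erosion range comes from the discussion above, and 25% is used simply as a midpoint.

```python
def price_after(start_price, annual_erosion_percent, years):
    """Price after compounding a fixed yearly price erosion."""
    return start_price * (1 - annual_erosion_percent / 100) ** years


# Hypothetical starting point of $100 per TB with 25% erosion per year.
for years in (3, 5, 7):
    print(years, round(price_after(100, 25, years), 2))
# 3 42.19
# 5 23.73
# 7 13.35
```

At that rate the same capacity costs well under half as much after three years and under a quarter after five, which is why it has historically been cheaper to buy new capacity than to keep maintaining the old.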

With this historic price erosion, most data centers depreciate their enterprise storage over three years, while midrange storage is typically depreciated over five years, since it has been cheaper to buy new than to maintain the old after that time frame. If this price erosion starts to slow down, data centers may need to extend their depreciation to seven years. By this time, the disks will be in the five to ten TB range, so keeping the media longer may not be a bad idea.

However, there is a lot more technology that goes into storage systems than the disk technology, and the rate of change in that technology has been increasing rapidly with thin provisioning, data mobility, tiering, replication, and closer integration with the application layer through APIs, plugins, client/providers, adapters, and snap-ins. That means that a five to seven year life cycle for storage systems will make your storage system non-competitive. Enterprise storage is capitalized over a shorter period than lower cost modular storage because of its faster technology cycle.

An approach to solving this is to separate the disk capacity from the enterprise storage system controllers, so that you can keep the storage system controllers current with systems technology on a three year cycle, while you refresh the disk capacity on a five to seven year cycle. Since the storage media is still the bulk of the cost of a storage system, the longer depreciation cycle will help to reduce the capital costs. You can do this with an enterprise storage control unit, which also has the capability to virtualize external storage systems. This is what we provide with VSP.

What are your thoughts? Are the price increases that we expect this quarter just an anomaly and will we go back to enjoying the price erosion that we have enjoyed for the last 50 years? Or has this changed forever?

Is this a tipping point in the way we capitalize storage assets?

Categories: HDS Blogs

Buying Disks or Buying Storage Efficiencies

Thu, 01/12/2012 - 23:24

At the top of my list of trends to watch in 2012 was an increased focus on storage efficiency due to economic uncertainty and hard disk supply shortages stemming from last year’s floods in Thailand. Yesterday IDC and Gartner both reported declines in fourth quarter PC shipments of between 0.2% and 1.4% compared to 2010, partly due to disk drive shortages. (My colleague David Merrill also covered this in a recent post.)

The shortages have been felt the most in the consumer markets. At the Gartner Data Center Conference in Las Vegas, a speaker said that the cost of a TB disk at the U.S. retailer Fry’s had gone from $79 to $190. Last November, the Storage Architect reported that a 2TB SATA drive that he had bought before for £65 was then listed at £150 on Amazon. Consumer markets run on very low margins, so the price can increase dramatically in response to any shortages.

In the enterprise space, the shortages have been real but the prices have been more stable. Some drive types have increased on the order of 5-15%. Consult your vendor to see what the current status is. Hopefully we will see the supply situation return to normal by the end of 2Q.

The Storage Architect supports my view of focusing on storage efficiency during this period by using thin provisioning and other efficiency methods in his post Drive Prices Increase – Who Will Suffer Most?

“If your vendor doesn’t offer it, then there are plenty out there who do.  As prices rise, it may be time to look again at implementing these features and fixing the processes that stop you using them today.”

Investing in storage virtualization through VSP adds another dimension to storage efficiency by extending these new capabilities to existing storage systems. If your current storage system does not provide thin provisioning, which can reclaim 40% or more of allocated but unused capacity, you don’t need to rip and replace it. Just by attaching it behind VSP, VSP can see your existing LUNs, and move them into a dynamic provisioning pool where the unused pages in the LUN can be reclaimed (Zero Page Reclaim) while your application is running.
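The idea behind Zero Page Reclaim can be sketched in Python as a scan for pages that contain nothing but zeros. This is a toy model: the function name, the tiny page size, and the dictionary layout are assumptions for illustration and do not reflect how VSP implements the feature.

```python
PAGE_SIZE = 8  # hypothetical, tiny page size so the example stays readable


def zero_page_reclaim(pages):
    """Return (kept_pages, reclaimed_count), dropping pages that hold only zero bytes."""
    kept = {}
    reclaimed = 0
    for page_id, data in pages.items():
        if any(data):          # at least one non-zero byte: the page holds real data
            kept[page_id] = data
        else:                  # all zeros: the page can go back to the pool
            reclaimed += 1
    return kept, reclaimed


# A fully allocated five page LUN in which only two pages were ever written.
lun = {
    0: b"database",
    1: bytes(PAGE_SIZE),
    2: bytes(PAGE_SIZE),
    3: b"journal1",
    4: bytes(PAGE_SIZE),
}
kept, reclaimed = zero_page_reclaim(lun)
print(sorted(kept), reclaimed)  # [0, 3] 3
```

In this made-up LUN three of the five allocated pages were never written, so 60% of the allocation is returned to the pool without touching the application's data.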

So instead of buying additional disks during this period of shortages, invest in storage virtualization with VSP, which not only frees up the capacity you need today from your existing storage assets, but positions you for sustainable growth into the future. See what Claus Mikkelsen and David Merrill have to say about storage efficiencies.

Categories: HDS Blogs

Why RAID and Erasure Codes Need to be Considered in Disk Purchases

Wed, 01/11/2012 - 18:47

Recently, I spent a few days with Garth Gibson, a computer scientist at Carnegie Mellon University and the founder of Panasas, an enterprise server and storage company. Garth and I were in Singapore for a review with the Data Storage Institute.

Garth is best known for the research paper that he authored with David Patterson and Randy Katz in 1988, “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, which was the catalyst for the RAID storage industry. Once this paper was published and presented at conferences, it took only a few years for all the major storage vendors to deliver RAID storage systems and for customers to adopt this new technology. The rate of this adoption was phenomenal.

When I asked Garth about this, he said the reason RAID was adopted so quickly was that this was delivered as a paper and not as a patent. It was freely available to the industry. Garth and Randy also included a taxonomy that defined RAID levels 1 to 5 and mathematical calculations to determine Mean Time To Failure (MTTF), a key factor in fault tolerance.

Adoption was also helped by the availability of relatively inexpensive 5.25-inch disk drives and the premise that a RAID array of inexpensive disk drives could replace more expensive enterprise storage systems with the same reliability and performance. The industry was quick to drop the term “Inexpensive” in favor of “Independent” and RAID was redefined as Redundant Array of Independent Disks.

Tracing RAID’s Origins

Actually, the concept of RAID was introduced much earlier. Garth does not claim that he invented RAID. The earliest patent on RAID was filed by Norman “Ken” Ouchi of IBM, who was issued U.S. Patent 4,092,732 titled, “System for recovering data stored in a failed memory unit” in 1978. (Claus Mikkelsen and I worked with Ken at IBM.)

This patent described what Garth and his colleagues later defined as RAID 5. Ken’s patent also mentioned mirroring (RAID 1) and dedicated parity (RAID 4) as prior art at that time. So RAID has been around for some time but was not adopted until the RAID paper in 1988 which gave it a name, taxonomy, and a financial justification.

RAID as a fault tolerance mechanism for storage is running out of gas as the densities of disk media increase and the probability of multi drive failures increases. RAID levels up to RAID 5 only protect against a single drive failure in a RAID group. As densities increase, the probability of a drive failure increases and the RAID rebuild time also increases, which affects performance due to drive contention and raises the probability of another drive failing during the rebuild. There is also an increasing problem with uncorrectable read errors as densities increase.

Getting Familiar with Erasure Codes

While everyone is familiar with RAID as a method of protecting against a drive failure, many are not familiar with the term “erasure code”. The image on the left comes from IEEE to illustrate the concept.

In information theory, an erasure code is a Forward Error Correction (FEC) code for a binary channel (where data is transmitted as one of two symbols, usually a 0 or 1) that can reconstruct symbols that are erased. It can be used for networks as well as storage.

RAID is really a simple form of an erasure code where a parity or checksum is appended to a number of records, so that if one record is lost, it can be reconstructed by summing or XORing the remaining records and the parity.
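The XOR reconstruction described above can be shown in a few lines of Python. This is a minimal sketch of single parity only; the function name and the sample records are made up for illustration, and real RAID implementations work on disk stripes rather than short byte strings.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


records = [b"disk", b"data", b"raid"]   # three data records of equal length
parity = xor_blocks(records)            # the redundancy record that gets appended

# Pretend the second record is lost: XOR the survivors with the parity.
rebuilt = xor_blocks([records[0], records[2], parity])
print(rebuilt)  # b'data'
```

Because XOR is its own inverse, any single missing record can be recovered this way. Losing two records at once leaves one equation with two unknowns, which is why protecting against more failures needs additional redundancy.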

If we want to correct more than one error, additional redundancy must be added and the calculation now becomes a polynomial. This is where you will begin to hear more about erasure codes. RAID 6 is a polynomial erasure code that was introduced with large capacity drives in the last decade to protect against dual drive failures. RAID 6 has two redundancy records, so it requires more overhead than RAID 5 in capacity and processing.  As a result it has not been widely adopted until recently.

RAID 6 also helps with uncorrectable read errors. Today, we strongly recommend the use of RAID 6 with RAID pools where data is striped across many RAID groups, since a dual drive failure in one RAID group would create data loss in all the applications that are using this pool. The cost of an additional parity drive in each RAID group is relatively low compared to the application down time and the cost of recovering an entire provisioning pool.

However, RAID 6 is not a long-term panacea since it only protects against dual drive failures. With the increasing rate of drive densities, it won’t be too long before we get concerned over three or more drive failures in a RAID group. Storage vendors are working to address these long term requirements.

While storage systems vendors source their disks from the same disk vendors, the reliability of the disk in a storage system will vary depending on how well the system vendors scrub the drives for errors, the effectiveness of their proprietary error detection and recovery software, their maintenance practices, and their proprietary implementation of erasure codes. Users will need to consider the track record of the vendor’s disk availability and then consider the cost and performance trade-offs of different erasure codes.

Categories: HDS Blogs

Hitachi VSP Kicks off the New Year with Another Award

Fri, 01/06/2012 - 15:54

Happy 2012! While this year is starting with a lot of uncertainty around the world economy and supply/demand questions, there are still areas of assurance. One is that you can still do more with less to meet your storage needs.

Today, Nikkei, the leading business magazine in Japan, announced that Hitachi VSP was the winner of the 2011 Nikkei Superior Products and Services Awards. This year, over 20,000 products were submitted and only 44 were chosen for the award. Nikkei chose VSP for its usability and high performance through virtualization to reduce CAPEX and OPEX.

This is the latest addition to the list of awards VSP has received, which collectively recognize its innovation. Among these awards are the Information Age Innovation Award for 2011 and the TechTarget Best Enterprise Storage Platform Award for 2010.

The best awards are those that come from our customers, who have made VSP the most successful product in Hitachi Data Systems history. Most customers who take advantage of the full value of VSP with virtualization of their existing assets can see a payback in nine months or less, and a 40% reduction in capacity demand.

So if you are looking to increase your storage resources while struggling with budget constraints, contact your HDS account representative or HDS Partner to see what VSP can do for you: http://www.hds.com/corporate/contacts/

Categories: HDS Blogs

Looking Back on the Future: Top 10 Storage Trends for 2012

Thu, 12/29/2011 - 16:52

As we close out 2011, the storage industry has seen significant growth based on budgets, which were established in the beginning of the year. However, over the course of 2011, we saw natural disasters, political upheaval, and heightened economic turmoil. Companies are now looking ahead to 2012 with a great deal of uncertainty around their budgets. However, there is absolute certainty that the growth of data will continue to explode.

In order to meet these challenges, IT will focus on technologies that will enable them to grow through better use of their existing assets. During the last few months I have been blogging about ten storage trends that will develop out of this focus.

As my last post of 2011, I am summarizing them here with links to the extended posts.

  1. Storage Efficiency: Global economic uncertainty and supply shortages in the first half of 2012 will require IT professionals to achieve better returns from their existing assets rather than buying new ones. There will be a greater focus on storage efficiency technologies such as dynamic or thin provisioning, dynamic tiering, archiving, and the extension of these technologies to existing assets with storage virtualization. http://blogs.hds.com/hu/2011/11/2012-a-focus-on-increasing-storage-utilization.html
  2. Consolidation to Convergence: Consolidation will give way to convergence. Over the past few years IT has focused on consolidation, with much of the low-hanging fruit already executed on. In order to gain further cost savings, the focus will be on convergence of server, storage, networks, and applications. Application programming interfaces (APIs), which offload workload to storage, can make servers and memory more efficient. Orchestration software will help to converge the management, and automate the provisioning and reporting across local, remote, and cloud based server, storage, and network infrastructures. http://blogs.hds.com/hu/2011/11/2012-trend-consolidation-to-convergence.html
  3. Transparency: Applications and infrastructure will be more transparent with each other in order to facilitate convergence through open interfaces like APIs, client/providers, and plugins. HDS provides Hitachi Command Director software, which gives applications a view into the service level, utilization, and health of the storage infrastructure behind the virtual storage that they are using.  http://blogs.hds.com/hu/2011/11/2012-trend-greater-transparency-for-end-to-end-management.html
  4. Storage Computers: Storage systems will need to become storage computers as more functions are being driven down to the storage level.  Old storage architectures with general purpose controllers which service all these new functions along with the normal I/O workload will not be able to scale. New storage architectures with separate pools of processors will be required to handle these additional functions. http://blogs.hds.com/hu/2011/11/2012-trend-rise-of-the-storage-computer.html
  5. Energy Efficiency: Power, cooling and carbon footprints will become even more critical as energy demand increases and countries begin to impose carbon taxes. IT will be asked to shoulder their share of the energy burden and will need to consider the energy savings of small form factor drives and front-to-back cooling. http://blogs.hds.com/hu/2011/12/2012-trend-energy-efficiency.html
  6. Closing the Consumption Gap: The consumption gap between technology and IT operations will be an area of focus as businesses drive IT to implement technologies at a faster pace in order to realize the benefits that are already available with current technology. There will be a greater need for services to offload over-committed IT staff and accelerate the adoption of new technologies. http://blogs.hds.com/hu/2011/11/the-technology-consumption-gap.html
  7. Storage Scaling: Server and desktop virtualization will increase the need for enterprises to scale up storage systems non-disruptively as virtual machine demands increase. With more virtual eggs in one basket, modular storage systems will need to be replaced by high availability enterprise storage to service the tier one demands of virtual servers. Scale-out storage architectures will not be able to meet the scale-up demands of server and desktop virtualization.  http://blogs.hds.com/hu/2011/12/2012-trend-an-increasing-need-for-scale-up-storage.html
  8. Virtualized Migration: Disruptive device migrations will be replaced by new virtualization capabilities that will eliminate the need to reboot. http://blogs.hds.com/hu/2011/12/2012-trend-virtualized-migration-of-storage.html
  9. Cloud Acquisition: Cloud acquisition, based on self-service, pay per use, and on demand will begin to replace the current three-to-five year acquisition cycle of products, as convergence begins to create blended pools of resources. http://blogs.hds.com/hu/2011/12/2012-trend-cloud-acquisition.html#more-5098
  10. Big Data: The hype for 2012 will continue to be around Big Data. The explosion of unstructured data and mobile applications will generate a huge opportunity for the creation of business value, competitive advantage, and decision support if this data can be managed and accessed efficiently. The massive size of Big Data will make it difficult to use traditional, relational databases, or desktop visualization products. Object based content platforms and large-scale file servers will be required to store this data. There will be greater adoption of content platforms in preparation for Big Data analytics. http://blogs.hds.com/hu/2011/12/2012-trend-big-data.html

These are just a few of my personal thoughts regarding 2012. Please let me know what you think will be the major focus areas in storage next year.

Categories: HDS Blogs

2012 Trend: Big Data

Tue, 12/27/2011 - 15:29

The big hype in 2012 will be around Big Data. The explosion of unstructured data and mobile applications will generate a huge opportunity for the creation of business value, competitive advantage, and decision support if this data can be captured, stored, managed, accessed, analyzed, and visualized. Companies that provide these capabilities for Big Data will be targets for acquisition, much like we saw in past years with thin provisioning technology companies.

“Big” is a relative term. What may be big for one company may not be big for another. Data is big when it becomes difficult to work with using relational databases or desktop visualization packages. The size of Big Data makes it impractical to replicate, back up, and mine through traditional means. Instead of moving or replicating big data for use by analysis programs, analysis programs will have to work directly with the original data using massively parallel or map reduce software. Big data may be more about the intersection of many data stores.

Big Data is best stored in an object store built on a virtualized storage infrastructure so that it is not affected by changes in the infrastructure. An object store is a container, which holds data along with the meta data that describes the data, and the policies that govern it. An object store disaggregates the data from the application or instrumentation which created it, and essentially virtualizes the data so that it can be accessed and repurposed by other applications.
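To make the container idea concrete, here is a minimal Python sketch of an object that carries its data, its metadata, and its policy together. The class names, field names, and sample values (StoredObject, ObjectStore, retention_days, and so on) are hypothetical and chosen only for this illustration; this is not the HCP interface.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Data bundled with the metadata that describes it and the policy that governs it."""
    data: bytes
    metadata: dict
    policy: dict


@dataclass
class ObjectStore:
    """Toy in-memory object store keyed by content hash (hypothetical, not a product API)."""
    objects: dict = field(default_factory=dict)

    def ingest(self, data: bytes, metadata: dict, policy: dict) -> str:
        # The object ID comes from the content, not from the application that wrote it.
        object_id = hashlib.sha256(data).hexdigest()
        self.objects[object_id] = StoredObject(data, dict(metadata), dict(policy))
        return object_id

    def find(self, **criteria) -> list:
        # Any application can locate objects through metadata alone.
        return [
            oid for oid, obj in self.objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
scan_id = store.ingest(
    b"raw scanner output",
    metadata={"source": "mri-scanner", "patient": "12345"},
    policy={"retention_days": 3650, "replica_count": 1},
)
# A different application finds the same object without knowing who created it.
matches = store.find(patient="12345")
```

Because the lookup goes through metadata rather than through the originating application, a second program can repurpose the data without any knowledge of the instrument that produced it.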

Since Big Data is predominantly unstructured data that is not updated, once it is ingested into an object store, it only needs to be replicated once for protection. Some data sources may be acquired over NFS, which will require a high performance filer that can scale to multiple PBs and millions of objects, like the HNAS 3200.  HNAS also provides the ability to stub out to the HCP content platform as a seamless extension of the file system into an object store.

The interest in Big Data will drive greater adoption of object stores like HCP and large scale file servers like HNAS from Hitachi Data Systems. It will also open up a huge market for analytics. HDS has partnered with SAP with certification of a reference architecture for their in-memory transaction analysis system, HANA, using our Hitachi Blade servers and AMS storage.

For Hu’s other 2012 trends, visit this bit.ly bundle: http://bitly.com/vXGP2T

Categories: HDS Blogs

2012 Trend: Cloud Acquisition

Thu, 12/22/2011 - 13:41

By now it is clear that cloud is a reality with many successful implementations of cloud services. One of the most valuable benefits is the way that cloud services can be acquired—namely on demand, pay as you go, and self-service.

This acquisition model will be extended back into the data center to reduce the traditional costs of acquiring storage infrastructure.

The traditional way that storage is acquired is through the purchase of assets, which are capitalized over three to five years. Three years is often used in the case of feature rich, enterprise storage, and five years in the case of commodity modular storage. With the traditional price erosion of storage due to increasing disk capacity per spindle, it is cheaper to buy new than to extend the life of storage systems after three to five years.

Most data centers will buy all their storage for the next three to five years at today’s price, even though the cost may be eroding at 25% to 30% per year. The reason they do this is because it is too disruptive to add incremental upgrades once the system is in production, and they lose the capital life of assets that they install toward the end of the cycle. When they install the replacement technology, they rip and replace all their previous technologies, even if some of it has only been capitalized for a portion of the full three to five year cap rate.

When they install the new technology, they will need to overlap it with the current technology to allow for migration of the data, and this could take six months or more if they do not virtualize the migration. By the time they finish the current migration, they only have a short time before they must prepare for the next.

The chart below illustrates the waste associated with this type of acquisition. All the capacity above the demand line is capacity that is paid for but not utilized, while it still is consuming power, cooling, and space. The overlap areas represent the waste associated with non-virtualized migration.

The following chart illustrates how this waste could be eliminated with a cloud acquisition model, based on pay per use, on demand.
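The difference between the two acquisition models can also be approximated with a few lines of arithmetic. The Python sketch below is only an illustration: the demand curve, the starting price of $1,000 per TB, and the four-year cycle are assumptions made up for the example, and the 25% erosion rate is the low end of the range mentioned above. It ignores financing, maintenance, and migration overlap.

```python
def upfront_cost(peak_tb, price_per_tb):
    """Buy the whole cycle's capacity on day one at today's price."""
    return peak_tb * price_per_tb


def incremental_cost(demand_by_year, price_per_tb, erosion):
    """Buy only each year's growth, at a price that erodes every year."""
    total = 0.0
    owned = 0.0
    for year, demand in enumerate(demand_by_year):
        growth = max(demand - owned, 0.0)
        total += growth * price_per_tb * (1 - erosion) ** year
        owned += growth
    return total


def unused_tb_years(peak_tb, demand_by_year):
    """Capacity above the demand line, summed over the cycle."""
    return sum(peak_tb - demand for demand in demand_by_year)


# Illustrative assumptions, not measured data.
demand = [100, 200, 300, 400]   # TB needed in each year of a four-year cycle
price = 1000.0                  # dollars per TB today
erosion = 0.25                  # low end of the 25% to 30% range

buy_all = upfront_cost(max(demand), price)
buy_as_needed = incremental_cost(demand, price, erosion)
idle = unused_tb_years(max(demand), demand)

print(f"Upfront purchase:       ${buy_all:,.0f}")
print(f"Incremental purchase:   ${buy_as_needed:,.0f}")
print(f"Paid-for idle capacity: {idle} TB-years")
```

Under these assumed inputs the incremental path costs roughly $273,000 against $400,000 upfront, and the upfront purchase carries 600 TB-years of capacity that is powered and cooled but not used.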

Aside from using a cloud provider, how can you change your current acquisition model to a cloud acquisition model? Two things have to happen.

First, you need to use storage virtualization so that you can separate the intelligent controller from the media, so that you can add media on demand without disrupting the applications, and you can realize the full capitalization of that asset since the functionality will be kept current in the storage virtualization engine. Virtualization with thin provisioning will enable you to leverage the resources across many users and normalize peak demand. Virtualization will also reduce the overlap associated with data migration.

Second, you have to find some way to cover the capital cost, either by spreading the cost across many users, as a cloud provider would, or through leasing, managed services, or other financial services.

Realizing on demand, pay as you go acquisition will depend on a foundation of storage virtualization. Many data centers will see the advantage of this and will begin to replace their traditional acquisition model with a cloud acquisition model.

For Hu’s other 2012 trends, visit this bit.ly bundle: http://bitly.com/vXGP2T

Categories: HDS Blogs

2012 Trend: Virtualized Migration of Storage

Tue, 12/20/2011 - 16:39

The number of applications that are consolidated onto a SAN attached storage frame has increased dramatically with the adoption of virtual servers. This is making it increasingly difficult to migrate the data when the storage system needs to be refreshed. The quantity of data is in the tens or even hundreds of TBs, which will take days or weeks to physically move from one storage frame to another. Even more impactful is the number of applications that have to be stopped while the data is moved and then restarted when that move is complete.

There may be more than 100 applications on a large storage system, which makes it nearly impossible to stop them all for an extended period of time to complete the migration. Typically this is done by scheduling a few applications for migration each weekend until all the applications are migrated over a period of months. During this time there are the power and cooling costs for two storage frames, one of which probably carries expensive maintenance, as well as software license and other operational costs. David Merrill has done studies which show that the burdened cost of migration can be $7,000 to $10,000 per TB, which is about the same as the cost of acquisition.
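For a sense of scale, the per-TB figures above can be turned into a quick estimate. In this Python sketch the 100 TB capacity and the rate of four applications per weekend are assumptions picked for illustration; only the $7,000 to $10,000 per TB range comes from the studies cited above.

```python
import math


def migration_cost_range(capacity_tb, low_per_tb=7_000, high_per_tb=10_000):
    """Burdened migration cost range in dollars, using the per-TB figures cited above."""
    return capacity_tb * low_per_tb, capacity_tb * high_per_tb


def weekends_needed(applications, per_weekend):
    """Weekends required when only a few applications can be moved at a time."""
    return math.ceil(applications / per_weekend)


# Assumed inputs for illustration only.
capacity_tb = 100
low, high = migration_cost_range(capacity_tb)
weekends = weekends_needed(applications=100, per_weekend=4)

print(f"{capacity_tb} TB: ${low:,} to ${high:,} burdened migration cost")
print(f"{weekends} weekends of overlapping frames")
```

With these inputs, a 100 TB frame costs $700,000 to $1,000,000 to migrate, and the two frames overlap for 25 weekends, or about half a year.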

The time to complete a migration can be greatly reduced through the use of storage virtualization. During a scheduled down time you can stop the applications for a short time while you insert a storage virtualization platform between the host servers and the old storage system. Once that is done you can reassign the LUNs, rezone the SAN and restart the applications while the movement of the data takes place in the background. Many storage virtualization users have done this type of migration in a weekend.

While this type of storage migration does take an outage in order to insert the storage virtualization platform, the storage that sits behind it can be migrated to the next generation storage system without an outage.

But what happens when the storage virtualization system has to be refreshed? Today you can do it in the same manner that you virtualized your other storage. You either virtualize the new virtualization system behind the old one and move the data or virtualize the old one behind the new virtualization system. This will still take an outage. Many of our customers have used this technique to migrate from USP to USPV, and then ultimately to VSP.

In 2012, you will see storage virtualization vendors enable the upgrade of storage virtualization systems without an outage. This will further reduce the cost of storage migration by eliminating the need to disrupt the application. But you can’t take advantage of this unless you install the initial storage virtualization platform.

For Hu’s other 2012 trends, visit this bit.ly bundle: http://bitly.com/vXGP2T

Categories: HDS Blogs

2012 Trend: An Increasing Need for Scale Up Storage

Fri, 12/16/2011 - 16:58

Server and desktop virtualization will increase the need for enterprises to scale up storage systems as physical server demands increase. Initial installations of server virtualization were done to consolidate non-critical application servers, which were installed on lower cost modular, dual controller, storage systems. As multiple servers were virtualized and consolidated onto a single physical server, their storage was also consolidated onto a single modular storage system connected behind the physical server. The I/O workload that used to be distributed on file systems across multiple modular storage systems is now consolidated onto one virtual file system and one modular storage system.

Initially this worked since the servers and storage that were consolidated in the beginning were running at low utilization, but as more and more servers are consolidated onto the same virtual platform, modular storage systems will begin to break. Dual controller, modular storage systems were designed to support direct attached workstations, and were not designed to support large scale, virtualization servers. Virtualization servers require enterprise storage systems, which were designed from the beginning to support the scale up demands of mainframe virtualization. Enterprise storage systems are designed to integrate the power of multiple controllers through a global cache as the virtual servers demand more storage resources.

Today’s multi-core Unix and x86 servers are as powerful as mainframes were just a few years ago, and are running virtualization through multiple partitions or a hypervisor, which dramatically increases the load on storage systems. In addition, FC SANs are increasing the consolidation of more and more servers onto a storage platform with higher transmission speeds of eight Gbps. Applications are beginning to offload more workload onto the storage systems through APIs, client providers, and plugins. All this additional workload will demand that storage can scale up beyond the capabilities of a dual controller, modular storage system.

As we increase the number of virtual machines or partitions in a virtual server, we are also “putting more eggs in one basket”. This means there is a higher demand for availability from the storage system. A dual controller system is not a high availability system. When one controller is down for scheduled or unscheduled maintenance, the other controller is usually stopped in order to avoid data loss, or performance suffers as one controller tries to do the work of two. An enterprise storage system with multiple controllers and a global cache can continue to operate in the event that one or even two controllers are taken offline.
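The effect of losing a controller can be put in rough numbers. The Python sketch below assumes that work is spread evenly across controllers and that the survivors can pick up the full load of the ones that are offline, which is a simplification. The eight-controller configuration is a hypothetical example, not the specification of any product.

```python
def surviving_capacity(total_controllers, offline):
    """Fraction of controller capacity left when some controllers are offline."""
    if offline >= total_controllers:
        return 0.0
    return (total_controllers - offline) / total_controllers


# Dual controller system with one controller down.
dual_one_down = surviving_capacity(2, 1)

# Hypothetical eight-controller enterprise system.
eight_one_down = surviving_capacity(8, 1)
eight_two_down = surviving_capacity(8, 2)

print(f"Dual controller, one down:  {dual_one_down:.1%} of capacity remains")
print(f"Eight controllers, one down: {eight_one_down:.1%} of capacity remains")
print(f"Eight controllers, two down: {eight_two_down:.1%} of capacity remains")
```

Read another way, a dual controller system has to run below 50% of its controller capacity to ride through a failure without slowing down, while the assumed eight-controller system can run at up to 75% and still absorb two failures.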

If you have already invested in modular storage for your virtual server environment and are experiencing outages and slowdowns, you don’t need to replace your investment in modular storage. You can keep your modular storage and front end it with an enterprise storage virtualization platform like VSP in order to get the scale up performance and high availability of an enterprise storage system.

For Hu’s other 2012 trends, visit this bit.ly bundle: http://bitly.com/vXGP2T

Categories: HDS Blogs

HDI 3.0: Edge-to-Core with Content Sharing

Thu, 12/15/2011 - 23:03

Hitachi Data Ingestor (HDI) combines with Hitachi Content Platform (HCP) to provide an “edge-to-core” data solution that eliminates the need for backup and delivers a seemingly bottomless filer on the edge of a cloud, or for remote office/branch offices.

This week, we announced version three of the HDI product, which enables the following new capabilities:

1. An end user to discover and retrieve previous versions of a file or files that were deleted;

2. An HDI system that allows other HDI systems to access and read content that it stored in the same HCP;

3. Transparent migration of data from a production NAS (CIFS Windows Storage Server or NetApp FAS) to HDI and HCP.

Content sharing has many practical applications. A good example is healthcare, where a patient might move to another city and transfer from one hospital to another. In this case, the new hospital could be granted access to the patient’s data that was stored in HCP by the previous healthcare facility.

And speaking of healthcare, did you see Dave Wilson’s post: Google Health Dies – What Next?

Also, check out Miki Sandorfi’s take on HDI: Enhancements to Hitachi Data Ingestor

For more information on HCP and the new HDI features, see here.

Categories: HDS Blogs

2012 Trend: Closing the Consumption Gap Between IT Technology and Operations

Mon, 12/12/2011 - 15:31

Many analysts are recognizing a growing gap between technology and the ability of IT organizations to consume the product features and value that new technology can enable.

During the past few years there have been a lot of new technologies that have become available to IT operations. We have seen wide adoption of server virtualization, but even here there is a widening consumption gap. Server virtualization technology is advancing rapidly with VAAI in ESX 4 and the introduction of vSphere 5 this year. Many server virtualization users are still on ESX 3 and are not able to enjoy the benefits of VAAI or vSphere 5, either because they have not had the time to upgrade their hypervisor or upgrade their storage to support these new features.

Storage vendors have introduced thin provisioning with zero page reclaim, dynamic tiering, archiving, and de-duplication or single instance store, but many storage administrators have not implemented these capabilities even though they may be able to reduce their capital and operational costs by at least 40%.  Even worse, many users have bought the capability but have not found time or resources to implement it.

The pace of business is also increasing and organizations are driving IT to implement new technologies faster in order to be more responsive. However, this is often difficult to do when IT has been doing more with less for so many years and the operations staff is down to bare bones and reacting by jumping from one task to the next. It takes time to learn the new technologies, plan the implementation and integration with existing systems, and execute the plan. If the proper planning is not done, it can create more work and delays for IT.

Now more than ever IT must look to third party services to fill the gap and unload some of the grunt work so that their operations staffs can be properly trained to plan and execute to close the consumption gap between technology and operations. Even if you decide to outsource or offshore, you need time to become educated and do the proper planning.

Technology alone will not close the consumption gap. The planning must include people, process, services, business or financials, and governance. Here is how Hitachi Data Systems provides services to close this gap.

For Hu’s other 2012 trends, visit this bit.ly bundle: http://bitly.com/vXGP2T

Categories: HDS Blogs

How Thin Provisioning Contributes to Storage Efficiency

Fri, 12/09/2011 - 22:57

As you may have read already, I led off my 2012 trends blog series with a post on a “Focus on increasing storage utilization.”

I have talked with many customers who have seen utilization of storage assets increase from 20%-30% to 50%-60% using efficiency tools such as thin provisioning, dynamic tiering, deduplication, and active archive. A comment from John Nicholson indicates that the problem of efficiency may be even greater than the problem of utilization, as he ponders “how 100TB of raw disk capacity turns into 15 TB of actual data with layers of thick provisioning, virtualization, and wasteful snapshots.”

Layers of thick provisioning can be eliminated by a combination of thin provisioning in the storage systems and APIs from file vendors. Storage systems that support thin provisioning can provision user requests for storage with virtual capacity until they actually write to the storage, then they will provide the physical storage as the data is written. The storage system will not be able to reclaim space that is deleted unless the file system informs the storage through an API or SCSI command. Storage systems that support thin provisioning can “thin” existing thick volumes, by moving them into a thin provisioned pool. As they move the pages or chunks, they can determine which pages or chunks have zero data and thin them from the volume. If the storage system is also capable of storage virtualization, then it can thin provision external storage that would otherwise need to be replaced to get this capability.
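The mechanics described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular storage system implements it: the four-byte page size is deliberately tiny, and the names ThinVolume, unmap, and thin_existing_volume are hypothetical. The unmap method stands in for whatever API or SCSI command the file system uses to report freed space.

```python
PAGE_SIZE = 4  # bytes per page; tiny on purpose so the example is easy to follow


class ThinVolume:
    """Volume that advertises virtual capacity but allocates pages only on write."""

    def __init__(self, virtual_pages):
        self.virtual_pages = virtual_pages
        self.pages = {}  # page index -> bytes; only written pages live here

    def write(self, page_index, data):
        if not 0 <= page_index < self.virtual_pages:
            raise IndexError("write beyond virtual capacity")
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self.pages[page_index] = data  # physical allocation happens here

    def read(self, page_index):
        # Unallocated pages read back as zeros.
        return self.pages.get(page_index, bytes(PAGE_SIZE))

    def unmap(self, page_index):
        # Stand-in for the file system telling the storage that space was freed.
        self.pages.pop(page_index, None)

    @property
    def allocated_pages(self):
        return len(self.pages)


def thin_existing_volume(thick_pages):
    """Move a thick volume into a thin one, skipping pages that are all zeros."""
    thin = ThinVolume(virtual_pages=len(thick_pages))
    for index, page in enumerate(thick_pages):
        if any(page):  # zero page reclaim: all-zero pages are never allocated
            thin.write(index, page)
    return thin


thick = [b"data", bytes(PAGE_SIZE), bytes(PAGE_SIZE), b"more"]
volume = thin_existing_volume(thick)
allocated_after_thinning = volume.allocated_pages  # only the non-zero pages
volume.unmap(0)                                    # file deleted, space returned
allocated_after_unmap = volume.allocated_pages
```

Without the unmap call the storage system would keep page 0 allocated indefinitely, which is why reclaiming deleted space depends on the file system notifying the storage.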

Once a volume is thin provisioned, all the snapshots and moves of that volume become more efficient since all the allocated unused capacity has been eliminated.  If the volume is composed of static data (data that may be active but is not being updated), it could be moved into an active archive where only one copy is needed for redundancy and all those additional snapshots and backups can be eliminated.

I was a little stumped by how virtualization could be wasteful, until I considered server virtualization. Server virtualization creates many virtual machines, each with its own VMDisk. A VMDisk, just like a physical disk for a physical machine, needs to be allocated and formatted. VMware will do this in software, but this can be eliminated if the storage system supports VAAI primitives, which enable the storage to thin provision the VMDisk, that is, to provision it with virtual capacity from a thin provisioned pool of preformatted capacity. The ease of creating virtual machines also leads to virtual server sprawl, where many machines are created that are not utilized while they reserve storage capacity. Support of VAAI requires controller upgrades or controller replacement in the storage. However, a storage virtualization solution, like Hitachi VSP, can provide this support for any storage that is attached.

John also asked if there were any favorite tools for hunting down the layers of unused capacity, which could be reclaimed by Dynamic Provisioning. A favorite tool that we use is our own Storage Capacity Reporter and Virtual Server Reporter, powered by Aptare.

Are there any other tools that users recommend?

Categories: HDS Blogs

Can you live without email?

Mon, 12/05/2011 - 23:42

An ABC news blog caught my eye recently. It was entitled Tech Firm implements employee “Zero Email” policy.

Source: http://www.funnytimes.com/cartoons.php?cartoon_id=19950517#.Tt1KfnMwKjU

It quotes Thierry Breton, the CEO of ATOS, a global French information technology company, who says that only 10% of the 200 messages that each employee receives per day are useful. He plans to eradicate internal emails in 18 months, forcing the company’s 74,000 employees to communicate with each other via instant messaging and a Facebook style interface. A spokesperson clarified that this was for internal emails rather than external emails to clients and partners. ATOS is trying to reverse the trend toward email pollution, which is increasingly encroaching on our personal lives as well as increasing the percentage of our work hours spent searching for information.

All of us who use email can relate to this. The first task I do every morning is check my calendar and delete emails which are not relevant to my business activities. Those that are relevant are often chained to several messages, which could be deleted after I read the latest in the chain.

At Hitachi Data Systems we use Exchange for our email system, but we also use an internal social network called the LOOP where Hitachi Data Systems employees can create forums and invite other employees with similar interests to join. In this way we communicate with interested parties without the need for long email chains and multiple copies of attachments, which get out of sync.  For instance, we have a LOOP forum for internal bloggers.

What do you think about the “Zero Email” Policy? What other ways can you communicate internally? Externally?

Categories: HDS Blogs

2012 Trend: Energy Efficiency

Mon, 12/05/2011 - 18:18

In 2012, power, cooling and carbon footprints will become even more critical as energy demand increases and countries begin to impose carbon taxes. IT will be asked to shoulder their share of the energy burden.

Last week the 17th session of the Conference of the Parties to the Climate Change Convention, in conjunction with the 7th meeting of the Parties to the Kyoto Protocol, met in Durban, South Africa to get commitments from 37 industrial countries for further reductions in carbon emissions.  An assessment of the state of CO2 emissions, which was prepared for this conference by the International Energy Agency can be downloaded here: http://www.iea.org/co2highlights/co2highlights.pdf

This report shows that in 2009, 41% of CO2 emissions, the largest share, was generated by electricity and heating from carbon fuels. With the recent disasters which have shut down many nuclear energy plants, there will be more dependence on carbon fuels, which will drive up the cost of electricity and the emission of CO2.

Despite the downturn in the economy, Australia recently passed a carbon tax. Other countries and states, like the European Union, California, India, and China, are imposing or will impose similar taxes or restrictions. In the near future, data centers can expect to get a carbon tax bill with their increasing electricity bill.

Data Centers need to be investing now to reduce their energy costs. The good news is that there are technologies helping to dramatically reduce energy costs.

In August 2007, the EPA’s Report to Congress on Server and Data Center Energy Efficiency was released. This report showed that the electricity used by the nation’s servers and data centers in 2006 was more than double the amount consumed for that purpose in 2000.  In 2006, data centers and servers consumed about 61 billion kWh (1.5 percent of total U.S. electricity consumption) at a total electricity cost of about $4.5 billion.

Annual Energy Use by Server Type 2000-2006

Source: U.S. EPA Report to Congress on Server and Data Center Energy Efficiency

At this rate, energy consumption was expected to double by 2012. Fortunately, that growth rate was moderated significantly in a new study done by Stanford professor Jonathan Koomey, and reported in The New York Times, which showed that data center power consumption increased by only 36% from 2005 to 2010—rather than the 100% predicted in 2007. http://www.nytimes.com/2011/08/01/technology/data-centers-using-less-power-than-forecast-report-says.html?_r=1
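The two studies are easier to compare as annual growth rates. The short Python sketch below uses only the figures quoted above, with one assumption: “more than doubled” is treated as exactly 2x, so the first rate is a lower bound. Note that the two periods overlap but are not identical.

```python
def annual_growth_rate(growth_factor, years):
    """Compound annual growth rate implied by an overall growth factor."""
    return growth_factor ** (1 / years) - 1


# Figures quoted above; "more than doubled" is treated as exactly 2x, a lower bound.
epa_rate = annual_growth_rate(2.0, 2006 - 2000)
koomey_rate = annual_growth_rate(1.36, 2010 - 2005)

# Average electricity price implied by the EPA totals for 2006.
implied_price_per_kwh = 4.5e9 / 61e9

print(f"2000-2006 growth: {epa_rate:.1%} per year")
print(f"2005-2010 growth: {koomey_rate:.1%} per year")
print(f"Implied 2006 price: ${implied_price_per_kwh:.3f} per kWh")
```

On these numbers, growth slowed from at least 12% per year to a little over 6% per year, and the EPA totals imply an average price of about 7.4 cents per kWh.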

The two main reasons for this moderation were the economy, and the adoption of server virtualization, which greatly reduced the power consumption of volume servers. Virtualization of servers enabled one server with new multicore processors to replace 10 or more standalone servers. This reduction in servers also helped to reduce site infrastructure costs.

Servers were the biggest factor in data center power consumption, but what about storage? In the chart above, you can see that the power consumption of storage nearly tripled from 2000 to 2006. Server virtualization does not do anything for storage; a server, whether it is physical or virtual, will require the same amount of storage. In fact, it might require more storage as applications tend to proliferate when servers are virtual and essentially free after the initial cost of the physical server. Unfortunately, The New York Times report does not break down the power consumption between storage and servers, but I expect that storage power consumption continues to explode at the previous pace, and will soon become the dominant power consumer in the Data Center if it is not brought under control.

Fortunately, there are some recent improvements in storage and data management which can make storage sustainable. On the physical side, the first is the introduction of 2.5 inch disks, which consume half the power of the 3.5 inch disks that have been the standard until recently. Second, some vendors like HDS have gone to denser packaging of disk modules, which reduces floor space and cooling requirements, and have replaced batteries with solid state disks to protect volatile cache memory.

Storage virtualization can reduce the need for storage capacity and related power and cooling, as well as automate the tiering of less active data to larger capacity disks, thin provision over-allocated volumes, and consolidate silos of external storage into a common pool of storage resources. De-duplication also helps to reduce the need for disk capacity. All these improvements may make about a 40% to 60% improvement in power consumption for storage.

Another approach that may make an even greater improvement in the reduction of storage power consumption is the virtualization of data, which I covered in a previous post. Gartner and other analysts tell us that we make 10 to 20 copies of our data for backup, data analysis, development and test, data sharing, etc. If we virtualize the data in a content platform and mirror it in another content platform, we eliminate the need for copies and backup. Just like server virtualization enables multiple operating systems to run on one physical server, data virtualization enables multiple applications to run on one physical content platform or one copy of the data.

In 2012, we will see a focus on reducing power consumption for storage on the scale that we have seen for servers. We will see more focus on smaller form factor disks, storage virtualization, and data virtualization in order to reduce the power bill and carbon tax for data centers.

For Hu’s other 2012 trends, visit this bit.ly bundle: http://bitly.com/vXGP2T

Categories: HDS Blogs