Containing the flow of big data

“Big Data” is no longer just another IT buzzword, but rather a valuable business tool for organisations looking to gain a competitive advantage. Whether enterprises want to attract new customers or deliver innovative products, the power and benefits of big data are well documented.

However, big data comes with a unique set of challenges, particularly in terms of storage. Gartner research estimates that 40% of all organisations will double the size of their on-premises storage infrastructure by 2016, yet even then, big data storage requirements go well beyond the capabilities of traditional SAN and NAS storage systems, writes Gerald Sternagl, EMEA Business Unit Manager Storage at Red Hat.

Between advances in technology and changes to legal and regulatory requirements (particularly in consumer-facing industries), the sheer volume of data within enterprises is spiralling out of control. To make matters worse, the data is no longer uniform and structured: whether it is an email, an image, a video, an application log or a GPS signal from a mobile device, organisations are realising that every single bit of data has the potential to offer valuable insight. Traditional storage systems, with their expensive and proprietary components, were not designed to handle such volumes, and any attempt to use them to store big data will prove extremely costly, not to mention inefficient and inflexible.

It is estimated that businesses generate between 40% and 60% more data each year, making it almost impossible for IT managers to predict the necessary amount of storage capacity, something they would need to do upfront with traditional storage systems.

Finally, there is the issue of portability. Big data cannot be dealt with using traditional solutions such as aggregation and data integration, due to the sheer volume of data involved as well as the bandwidth limitations of wide-area networks (WANs). Furthermore, the emergence of cloud computing and virtualisation requires IT managers to deploy data storage solutions that can be easily transported from one environment to another.

If enterprises are to take advantage of big data and harness its potential, they need to address these storage challenges. Rather than simply investing in larger traditional storage systems, enterprises need to completely reinvent their approach to data storage. Fundamental to this approach is the need for enterprises to start thinking about their storage systems as data platforms rather than static data destinations. In doing so, enterprises will be able to meet both their current and future storage needs by satisfying five key success factors for managing big data:


1) Delivering cost-effective scale and storage capacity

Probably the most critical requirement for a big data storage platform is that it is agile and flexible enough to scale (in both capacity and performance) to match the enterprise's storage demands, while keeping CapEx and OpEx to a minimum.

Unlike traditional NAS and SAN storage systems, whose fixed capacity demands either a data purge or a very expensive ‘scale-up’ effort to accommodate extra data, big data storage platforms take a ‘scale-out’ approach, combining industry-standard commodity servers with virtual or cloud storage resources to create an easily managed storage system.
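
To make the scale-out idea concrete, the sketch below models a storage pool that grows simply by registering additional commodity nodes. It is a hypothetical illustration rather than any vendor's actual interface; the class, node names and capacities are assumptions.

```python
# Hypothetical sketch (not a real storage API): a scale-out pool grows by
# adding commodity nodes, so aggregate capacity is the sum of its members.

class ScaleOutPool:
    def __init__(self):
        self.nodes = {}          # node name -> usable capacity in terabytes

    def add_node(self, name, capacity_tb):
        """Register another commodity server; no forklift upgrade required."""
        self.nodes[name] = capacity_tb

    @property
    def capacity_tb(self):
        return sum(self.nodes.values())


pool = ScaleOutPool()
pool.add_node("server-01", 48)   # start small with standard x86 servers
pool.add_node("server-02", 48)
print(pool.capacity_tb)          # 96 TB

pool.add_node("server-03", 96)   # demand grows: just add another node
print(pool.capacity_tb)          # 192 TB, with no data purge or forklift upgrade
```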


2) Eliminating the need for data migration

Because traditional storage systems have fixed capacity, growing businesses need to balance their future data storage requirements against current budget constraints, which means periodically migrating their data to newer systems. Depending on the volume involved, these migrations can be very expensive, drain personnel, and often lead to hidden costs such as unplanned downtime or overrun leases. With some enterprises’ data now approaching petabyte scale, physically migrating it from one environment to another is something they need to avoid altogether.
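
One reason scale-out platforms can sidestep wholesale migrations is that many of them place data using consistent hashing, so adding a node relocates only the objects that fall on the new node's portion of the hash ring rather than the entire data set. The sketch below illustrates that general idea; it is not any particular product's placement algorithm, and the node and object names are assumptions.

```python
# Generic consistent-hashing sketch: adding a node moves only the objects on
# the new node's arc of the ring, instead of forcing a full migration.
import hashlib
from bisect import bisect

def ring_position(name):
    """Map a name to a fixed position on the hash ring."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

def place(objects, nodes):
    """Assign each object to the first node clockwise from its ring position."""
    ring = sorted((ring_position(n), n) for n in nodes)
    positions = [p for p, _ in ring]
    return {
        obj: ring[bisect(positions, ring_position(obj)) % len(ring)][1]
        for obj in objects
    }

objects = [f"object-{i}" for i in range(10_000)]
before = place(objects, ["node-a", "node-b", "node-c"])
after = place(objects, ["node-a", "node-b", "node-c", "node-d"])

moved = sum(before[o] != after[o] for o in objects)
# Only objects now owned by node-d relocate; everything else stays put.
print(f"{moved / len(objects):.0%} of objects relocated")
```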


3) Bridging legacy storage silos

To keep pace with the exponential growth in data, enterprises with traditional fixed-capacity storage systems often end up accumulating extra storage systems. Because these systems are disconnected from one another, the resulting ‘storage sprawl’ inhibits the enterprise’s ability to see the big picture and extract valuable insight from the data it has gathered. Rather than adding yet another system into the mix, a big data storage platform must be capable of bridging these legacy silos.


4) Ensuring global accessibility of data

Given the sheer volume of big data, limitations on WAN bandwidth and the consequences of a single point of failure, a centralised approach to data management is no longer practical.

Big data storage platforms must be able to manage data distributed across global enterprises as a single, unified pool.
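
In practice this means exposing data held at several sites behind a single namespace, so applications address the pool rather than individual locations. The following sketch is a deliberately simplified, hypothetical illustration of such a global namespace; the site names and lookup logic are assumptions, not a description of any specific product.

```python
# Hypothetical global-namespace sketch: one logical pool spanning several sites.

class GlobalNamespace:
    def __init__(self):
        self.sites = {}                      # site name -> {path: payload}

    def attach_site(self, name):
        self.sites[name] = {}

    def write(self, site, path, payload):
        self.sites[site][path] = payload

    def read(self, path):
        """Resolve a path against every attached site; callers never name a site."""
        for site, objects in self.sites.items():
            if path in objects:
                return site, objects[path]
        raise FileNotFoundError(path)


ns = GlobalNamespace()
for site in ("london", "dubai", "singapore"):
    ns.attach_site(site)

ns.write("dubai", "/logs/2015-11-02.log", b"...")
print(ns.read("/logs/2015-11-02.log"))       # found without knowing where it lives
```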


5) Protecting & maintaining the availability of data

Traditional storage systems rely on external backups and hardware redundancy to mitigate data loss and application downtime. Such a strategy is impractical for big data, from both an efficiency and a cost standpoint, given the volume and decentralised nature of the data. Big data storage platforms need to replicate data automatically to ensure that it remains instantly available and extremely robust.
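
As a rough illustration of what automatic replication involves, the sketch below writes every object to several nodes so that a copy remains readable if individual nodes fail. It is a simplified, hypothetical model; real platforms handle placement, consistency and self-healing far more rigorously.

```python
# Simplified replication sketch: every write lands on N nodes, so a single
# node failure does not make the data unavailable or force a restore from backup.

REPLICA_COUNT = 3

class ReplicatedStore:
    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}

    def put(self, key, value):
        """Write the object to REPLICA_COUNT nodes chosen deterministically."""
        ordered = sorted(self.nodes, key=lambda n: hash((key, n)))
        for name in ordered[:REPLICA_COUNT]:
            self.nodes[name][key] = value

    def get(self, key, failed=()):
        """Read from any surviving replica."""
        for name, objects in self.nodes.items():
            if name not in failed and key in objects:
                return objects[key]
        raise KeyError(key)


store = ReplicatedStore([f"node-{i}" for i in range(6)])
store.put("sensor/readings.csv", b"...")
# Two nodes down, yet a replica is still readable without touching a backup.
print(store.get("sensor/readings.csv", failed={"node-0", "node-1"}))
```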


If enterprises want to gain a competitive advantage by harnessing the power of big data, it is imperative that they take decisive action and address the capabilities of their data storage systems. Migrating from traditional fixed-capacity storage systems to an open, software-defined storage platform will allow enterprises to sidestep the pitfalls inherent in traditional systems at a fraction of the cost, and to reap the rewards for years to come.
