
The Speed vs. Richness Data Trade-Off

Let me start off with a simple business fact: it has never been more important for businesses to analyse and make sense of their data.

For precisely this reason, many are recognising the newfound responsibility of using data to create more value, whether it is to keep costs down, drive sales, engage customers or improve process efficiency. This part of the story is all well and good, but what isn't explored so often is the business plan that must be put in place to ensure data is put to work as effectively as possible in 2015.

Data Velocity: The Speed vs. Richness Trade-Off

To really succeed, it is critical that a business first understands just how quickly it must put data to work. What is more important to your business, for example: accessing a richly defined data set for broad analytic use, or getting hold of data quickly? Enter the New Data Velocity Spectrum, which recognises that some data needs curation time, rich definition and dimension to unlock its value, while other data needs to be put to work immediately, data that has come to be known as 'fast data'.

The simple premise here is that deep analysis requires more richly defined data that supports multiple views or perspectives and takes some time (latency) to assemble, sometimes referred to as "data at rest". If, for example, the primary goal is accessing rich data that can be used flexibly across a variety of analytic uses, the primary value comes from the dimensionality of the data.

Alternatively, instantly using transactional or event-based data to make real-time decisions is making use of "data in motion", a very different and potentially equally valuable paradigm. Before any business plan to unlock data is put into place, it is important to have a firm understanding of the trade-off between dimension and speed. Simply put, is it better to have a richer set of data dimensions with a delay, or do you need to make decisions based on real-time data? The answer to this question will ultimately determine the best information architecture to put in place.

Building a data architecture that suits you

Once the trade-off between speed and richness has been weighed, the good news is that a number of modern technologies are already available to help. Hadoop is fast becoming the modern data warehouse, increasingly used to address the need for so many new, high-volume, multi-structured data types. Variety in data is very much the norm, and rather than speed, this is all about the dimensionality of data. Thankfully, solving for rich definition and dimensions in data has never been easier or less expensive. As another example, the combination of a massively parallel analytic database (Vertica, Netezza, Greenplum) and a modern, in-memory business analytics platform (TIBCO Jaspersoft, TIBCO Spotfire) now often replaces (or surpasses) most of the functionality of traditional OLAP technologies at a fraction of the time and cost.
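
To make the "dimensionality" point concrete, here is a minimal, hypothetical Python sketch of a batch rollup over data at rest. The file name, column names and measure are assumptions for illustration only; in practice the equivalent aggregation would run as a query on Hadoop or an MPP analytic database rather than in a script.

```python
# Hypothetical sketch: a batch "data at rest" rollup across several dimensions.
# Assumes a sales.csv extract with region, product, channel and revenue columns.
import csv
from collections import defaultdict

def rollup(path, dimensions, measure="revenue"):
    """Aggregate a measure across every combination of the requested dimensions."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            key = tuple(row[d] for d in dimensions)
            totals[key] += float(row[measure])
    return totals

if __name__ == "__main__":
    # The richer the set of dimensions, the more analytic views the data supports.
    by_region_product = rollup("sales.csv", ["region", "product"])
    for key, total in sorted(by_region_product.items()):
        print(key, round(total, 2))
```

The same data could just as easily be rolled up by channel or by any other dimension it carries, which is precisely the flexibility that justifies the latency of assembling it.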

If speed is of the essence, Apache Storm, Amazon Kinesis and TIBCO StreamBase all provide immediate access to, and processing of, data feeds from nearly any data source or type. Today, streaming data feeds power both transactional and analytic uses of data, enabling rules to be applied to events as they arrive so that real-time results can be achieved. This helps deliver the insight behind security monitoring, fraud detection, service route optimisation and trade clearing across the globe.
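
As a purely illustrative sketch of that kind of "data in motion" rule, the plain Python below flags an account that exceeds an assumed spending threshold within a short sliding window. The event fields, threshold and window length are all hypothetical; Storm, Kinesis or StreamBase would apply a comparable rule inside their own streaming runtimes.

```python
# Hypothetical sketch of a streaming fraud rule: alert when an account spends
# more than a threshold within a sliding time window of incoming events.
from collections import defaultdict, deque

WINDOW_SECONDS = 60
SPEND_THRESHOLD = 1000.0

recent = defaultdict(deque)  # account id -> deque of (timestamp, amount)

def process(event):
    """Apply the rule to one event and return an alert string if it fires."""
    ts, account, amount = event["ts"], event["account"], event["amount"]
    window = recent[account]
    window.append((ts, amount))
    # Drop events that have fallen out of the sliding window.
    while window and ts - window[0][0] > WINDOW_SECONDS:
        window.popleft()
    total = sum(a for _, a in window)
    if total > SPEND_THRESHOLD:
        return f"ALERT: account {account} spent {total:.2f} in {WINDOW_SECONDS}s"
    return None

if __name__ == "__main__":
    stream = [
        {"ts": 0, "account": "A1", "amount": 400.0},
        {"ts": 20, "account": "A1", "amount": 350.0},
        {"ts": 45, "account": "A1", "amount": 300.0},  # pushes A1 over the threshold
    ]
    for event in stream:
        alert = process(event)
        if alert:
            print(alert)
```

The value here is latency: the rule fires on the third event as it arrives, rather than hours later when a batch job catches up.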

A Gartner survey released towards the end of last year confirmed that investment in big data technologies is continuing to expand, with 73 percent of organisations having either invested or set plans in motion to invest in big data over the next two years. For this very reason, matching the right information architecture to the technology best suited to the specific business need will be increasingly important. The good news is that special-purpose technologies are fast arriving to address the growing needs across the New Data Velocity Spectrum, and many stand to benefit if the correct infrastructure is put in place.


Duncan MacRae

Duncan MacRae is former editor and now a contributor to TechWeekEurope. He previously edited Computer Business Review's print/digital magazines and CBR Online, as well as Arabian Computer News in the UAE.
