The Speed vs. Richness Data Trade-Off
Brian Gentile, senior VP and GM at TIBCO Analytics, explores the trade-off between speed and richness that companies must weigh when analysing data
Let me start off with a simple business fact: it has never been more important for businesses to analyse and make sense of their data.
For precisely this reason, many are recognising a newfound responsibility to use data to create more value, whether that is to keep costs down, drive sales, engage customers or improve process efficiency. This part of the story is all well and good, but what is explored less often is the business plan that must be put in place to ensure data is being put to work as effectively as it can be in 2015.
Data Velocity: The Speed vs. Richness Trade-Off
To really succeed, a business must first understand just how quickly it needs to put data to work. What is more important to your business, for example: accessing a richly defined data set for broad analytic use, or getting hold of data quickly? Enter the New Data Velocity Spectrum, which recognises that some data needs curation time, rich definition and dimension to unlock its value, while other data needs to be put to work immediately and has come to be known as ‘fast data’.
The simple premise here is that deep analysis requires more richly defined data that supports multiple views or perspectives and takes some time (latency) to assemble; this is sometimes referred to as “data at rest”. If, for example, the primary goal is to access rich data that can be used flexibly across a variety of analytic uses, the primary value comes from the “dimensionality” of the data.
Alternatively, instantly using transactional or event-based data to make real-time decisions means working with “data in motion”, a very different and potentially equally valuable paradigm. Before any business plan to unlock data is put into place, it is important to have a firm understanding of the trade-off between dimension and speed. Simply put, is it better to have a richer set of data dimensions with a delay, or do you need to make decisions on real-time data as it arrives? The answer to this question will ultimately determine the best information architecture to put in place.
Building a Data Architecture That Suits You
Once the trade-off between speed and richness has been weighed, the good news is that a number of modern technologies are already available to help. Hadoop is fast becoming the modern data warehouse, increasingly used to address the need for so many new, high-volume, multi-structured data types. Variety in data is very much the norm, and rather than speed, this is all about the ‘dimensionality’ of data. Thankfully, solving for rich definition and dimension in data has never been easier or less expensive. As another example, the combination of a massively parallel analytic database (Vertica, Netezza, Greenplum) and a modern, in-memory business analytics platform (TIBCO Jaspersoft, TIBCO Spotfire) now often replaces (or surpasses) most of the functionality of traditional OLAP technologies at a fraction of the overall time and cost.
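To make that concrete, here is a minimal sketch of the kind of OLAP-style roll-up such a stack can answer ad hoc with plain SQL, with no pre-built cubes. It assumes a PostgreSQL-compatible MPP engine such as Greenplum; the sales table, its columns and the connection details are all hypothetical.

```python
# Minimal sketch: an OLAP-style roll-up expressed as plain SQL against an
# MPP analytic database. Assumes a PostgreSQL-compatible engine such as
# Greenplum; the `sales` schema and connection details are hypothetical.
import psycopg2

conn = psycopg2.connect(host="analytics-db.example.com",
                        dbname="warehouse", user="analyst",
                        password="change-me")  # placeholder credentials

# A cube-like aggregation (region x product x month) that a traditional
# OLAP tool would pre-compute; an MPP columnar engine answers it ad hoc.
query = """
    SELECT region,
           product,
           date_trunc('month', sale_date) AS month,
           SUM(amount)                    AS revenue
    FROM   sales
    GROUP  BY region, product, date_trunc('month', sale_date)
    ORDER  BY month, region, product;
"""

with conn, conn.cursor() as cur:
    cur.execute(query)
    for region, product, month, revenue in cur.fetchall():
        print(f"{month:%Y-%m}  {region:<10} {product:<15} {revenue:>12,.2f}")
```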
If speed is of the essence, Apache Storm, Amazon Kinesis and TIBCO StreamBase all provide immediate access to, and processing of, data feeds from nearly any data source or type. Today, streaming data feeds power both transactional and analytic uses of data, enabling rules to be established against which real-time results can be achieved. This all helps to garner the insight behind security monitoring, fraud detection, service route optimisation and trade clearing across the globe.
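As an illustration of such a rule, the sketch below flags a payment card that transacts too often within a sliding window, the essence of a real-time fraud check. It is a simplified, self-contained example: in production the same logic would run inside a stream processor such as Storm, Kinesis or StreamBase, and the event feed here is simulated.

```python
# Minimal sketch of a 'data in motion' rule: flag a payment card that makes
# more than 3 transactions inside a 60-second window. In production this
# logic would run inside a stream processor such as Apache Storm, Amazon
# Kinesis or TIBCO StreamBase; the event feed below is simulated.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_TXNS_PER_WINDOW = 3

recent = defaultdict(deque)  # card_id -> timestamps of recent transactions

def on_transaction(card_id, timestamp):
    """Apply the fraud rule to one event as it arrives (no batch latency)."""
    window = recent[card_id]
    window.append(timestamp)
    # Evict events that have aged out of the sliding window.
    while window and timestamp - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) > MAX_TXNS_PER_WINDOW:
        print(f"ALERT: card {card_id} made {len(window)} transactions "
              f"in {WINDOW_SECONDS}s")

# Simulated event feed (card_id, seconds offset); a real feed would arrive
# from a message bus or streaming API.
events = [("card-42", 0), ("card-42", 5), ("card-7", 8),
          ("card-42", 12), ("card-42", 20)]
start = time.time()
for card, offset in events:
    on_transaction(card, start + offset)
```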
A Gartner survey released towards the end of last year confirmed that investment in big data technologies is continuing to expand, with 73 percent of organisations having either invested in big data or set plans in motion to invest over the next two years. For this very reason, matching the right information architecture and the best available technology to the specific business need will be increasingly important. The good news is that special-purpose technologies are fast arriving to address the growing needs across the New Data Velocity Spectrum, something many stand to benefit from if the correct infrastructure is put in place.