Big data is big news, but it’s still in its infancy. While most enterprises at least talk about launching Big Data projects, the reality is that very few do in any significant way. In fact, according to new survey data from Dimensional, while 91% of corporate data professionals have considered investment in Big Data, only 5% actually put any investment into a deployment, and only 11% even had a pilot in place.
Real Time Gets Real
ReadWrite: Hadoop has been all about batch processing, but the new world of streaming analytics is all about real time and involves a different stack of technologies.
Langseth: Yes, however I would not entangle the concepts of real-time and streaming. Real-time data is obviously best handled as a stream. But it’s possible to stream historical data as well, just as your DVR can stream Gone with the Wind or last week’s American Idol to your TV.
This distinction is important, as we at Zoomdata believe that analyzing data as a stream adds huge scalability and flexibility benefits, regardless of whether the data is real-time or historical.
RW: So what are the components of this new stack? And how is this new big data stack impacting enterprise plans?
JL: The new stack is in some ways an extension of the old stack, and in some ways really new.
Data has always started its life as a stream. A stream of transactions in a point of sale system. A stream of stocks being bought and sold. A stream of agricultural goods being traded for valuable metals in Mesopotamia.
Traditional ETL processes would batch that data up and kill its stream nature. They did so because the data could not be transported as a stream; it needed to be loaded onto removable disks and tapes to be moved from place to place.
But now it is possible to take streams from their sources, through any enrichment or transformation processes, through analytical systems, and into the data’s “final resting place,” all as a stream. There is no real need to batch up data given today’s modern architectures such as Kafka and Kinesis, modern data stores such as MongoDB, Cassandra, HBase, and DynamoDB (which can accept and store data as a stream), and modern business intelligence tools like the ones we make at Zoomdata that are able to process and visualize these streams, as well as historical data, in a very seamless way.
Just like your home DVR can play live TV, rewind a few minutes or hours, or play movies from last century, the same is possible with data analysis tools like Zoomdata that treat time as a fluid.
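To make the source-to-store flow Langseth describes more concrete, here is a minimal sketch in plain Python. It is an illustration only, not Zoomdata's implementation: generators stand in for a transport such as Kafka or Kinesis, a list stands in for the data store, and every name in it (`source`, `enrich`, `running_revenue`, `sink`, `PRICES`, `SALES`) is hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical reference data and point-of-sale events, for illustration only.
PRICES: Dict[str, float] = {"apple": 0.5, "pear": 0.75}
SALES: List[dict] = [
    {"sku": "apple", "qty": 2},
    {"sku": "pear", "qty": 4},
    {"sku": "apple", "qty": 6},
]


def source(events: Iterable[dict]) -> Iterator[dict]:
    """Yield events one at a time, standing in for a message-broker consumer."""
    for event in events:
        yield event


def enrich(stream: Iterable[dict], prices: Dict[str, float]) -> Iterator[dict]:
    """Attach revenue to each event as it passes through."""
    for event in stream:
        yield {**event, "revenue": event["qty"] * prices[event["sku"]]}


def running_revenue(stream: Iterable[dict]) -> Iterator[Dict[str, float]]:
    """Emit an updated per-SKU revenue snapshot after every event."""
    totals: Dict[str, float] = {}
    for event in stream:
        totals[event["sku"]] = totals.get(event["sku"], 0.0) + event["revenue"]
        yield dict(totals)


def sink(stream: Iterable[Dict[str, float]], store: List[Dict[str, float]]) -> None:
    """Append each snapshot to the store, standing in for a streaming write."""
    for snapshot in stream:
        store.append(snapshot)


store: List[Dict[str, float]] = []
sink(running_revenue(enrich(source(SALES), PRICES)), store)
print(store[-1])
```

Each event travels through enrichment, analysis, and storage before the next one is read, so no stage ever waits for a batch to fill.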
Throw That Batch In The Stream
We also believe that those who have proposed a “Lambda Architecture,” effectively separating paths for real-time and batched data, are espousing an unnecessary trade-off, optimized for legacy tooling that simply wasn’t engineered to handle streams of data, be they historical or real-time.
At Zoomdata we believe that it is not necessary to put real-time and historical data on separate tracks, as there is now end-to-end tooling that can handle both, from sourcing, to transport, to storage, to analysis and visualization.
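The single-path idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of any vendor's engine: events are `(timestamp_seconds, payload)` tuples that arrive in timestamp order, and the names `tumbling_counts`, `historical`, and `live` are hypothetical. The same function consumes stored history and newly arriving events through one iterator.

```python
from itertools import chain
from typing import Iterable, Iterator, Optional, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, payload); an assumed shape


def tumbling_counts(events: Iterable[Event], window_s: int) -> Iterator[Tuple[int, int]]:
    """Emit (window_start, count) each time event time enters a new window.

    Assumes events arrive in timestamp order.
    """
    current: Optional[int] = None
    count = 0
    for ts, _payload in events:
        start = ts - ts % window_s
        if current is not None and start != current:
            yield current, count
            count = 0
        current = start
        count += 1
    if current is not None:
        yield current, count  # flush the final, still-open window


# Stored history, replayed as a stream.
historical = [(0, "a"), (20, "b"), (65, "c")]


def live() -> Iterator[Event]:
    """Stand-in for events still arriving from a broker."""
    yield (70, "d")
    yield (130, "e")


# One code path: history first, then live events, with no separate batch layer.
results = list(tumbling_counts(chain(historical, live()), window_s=60))
print(results)
```

The window starting at 60 seconds contains one replayed event and one live event, and it is counted without any step that merges a batch result with a real-time result.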
RW: So this shift toward streaming data is real, and not hype?
JL: It’s real. It’s affecting modern deployments right now, as architects realize that it isn’t necessary to ever batch up data, at all, if it can be handled as a stream end-to-end. This massively simplifies Big Data architectures if you don’t need to worry about batch windows, recovering from batch process failures, etc.
So again, even if you don’t need to analyze data from five seconds or even five minutes ago to make business decisions, it still may be simplest and easiest to handle the data as a stream. This is a radical departure from the way things in big data have been done before, as Hadoop encouraged batch thinking.
But it is much easier to just handle data as a stream, even if you don’t care at all, or perhaps not yet, about real-time analysis.
RW: So is streaming analytics what Big Data really means?
JL: Yes. Data is just like water, or electricity. You can put water in bottles, or electricity in batteries, and ship them around the world by planes, trains, and automobiles. For some liquids, such as Dom Perignon, this makes sense. For other liquids, and for electricity, it makes sense to deliver them as a stream through wires or pipes. It’s simply more efficient if you don’t need to worry about batching it up and dealing with it in batches.
Data is very similar. It’s easier to stream big data end-to-end than it is to bottle it up.
Article originally appeared HERE.