Big data analytics is the process of examining large amounts of data to uncover patterns, correlations and other useful marketing information that helps inform future business decisions. Data sources keep changing and multiplying (Web server logs, clickstream data, social media, mobile phones, etc.), and analyzing the vast amount of data they create is a real challenge. Big data analytics can be done with software tools or open source options such as NoSQL databases, Hadoop and MapReduce. These technologies form the core of a framework that supports the processing of large data sets across clustered systems.
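To make the MapReduce idea concrete, here is a minimal single-machine sketch in Python that counts page requests in a handful of Web server log lines. It is an illustration only, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the simplified "METHOD PATH STATUS" log format are assumptions made for this example. On a real cluster, the map and reduce steps would run in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (path, 1) pair for one log line; skip malformed lines."""
    parts = line.split()
    if len(parts) < 2:
        return []
    return [(parts[1], 1)]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical log lines in an assumed "METHOD PATH STATUS" format.
sample_log = [
    "GET /home 200",
    "GET /pricing 200",
    "GET /home 200",
    "bad",
    "POST /signup 302",
]

counts = run_job(sample_log)
print(counts)
```

Because each log line is mapped independently and each key is reduced independently, the same logic can be spread across clustered systems without changing the per-line or per-key code.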
Hadoop seems to have solidified its position as the cornerstone of the entire ecosystem, but there are still a number of competing distributions. Spark, an open source framework that can run on top of the Hadoop Distributed File System, is getting a lot of buzz because it promises to fill in the places where Hadoop has been weak, namely interactive speeds and good programming interfaces (early signs point to it fulfilling that promise). Some themes (e.g., in-memory or real-time processing) continue to stand out in customers’ minds. There’s also a whole new generation of data transformation, munging and wrangling tools (e.g., Trifacta, Paxata and DataTamer).
Another key discussion is whether enterprise data will truly move to the cloud and, if so, how quickly. Many will argue that Fortune 500 companies will keep their data (and the software to process it) on premises for years to come; a generation of Hadoop-in-the-cloud startups (Qubole, Mortar, etc.) will argue that all data is moving to the cloud in the long term.
Where Are We in the Market?
Overall, we’re still in the early part of this market. During the last couple of years, some promising companies have failed (for example, Drawn to Scale, Precog, Prior Knowledge, Lucky Sort, Rapleaf, Nodeable and Karmasphere). A handful saw more meaningful outcomes (for example, Infochimps, Causata, Streambase, ParAccel, Aspera, GNIP, BlueFin Labs and BlueKai).
Meanwhile, some companies seem to be reaching significant scale and have raised large amounts of capital (for example, MongoDB raised more than $230 million; Palantir, almost $900 million; and Cloudera, $1 billion). We’re also still early in the curve in terms of successful IPOs (Splunk and Tableau notwithstanding). Big companies are getting more acquisitive in the space (Oracle with BlueKai, IBM with Cloudant). In many segments, startups and large companies are jockeying for position, and no obvious leader has emerged.
Hype Meets Reality
A few years into a period of incredible hype, is big data still a thing? While big data is becoming less newsworthy, the next couple of years are going to be hugely important for this market as corporations start moving projects from experimentation to full production. Those deployments will lead to rapidly increasing revenues for some big data vendors, but they will also test whether big data can truly deliver on its promise. Meanwhile, the fundamental need for big data technology keeps increasing as the deluge of data keeps accelerating, powered in part by the rapidly emerging Internet of Things industry.
As predicted, the action has been slowly but surely moving to the application layer of big data. Some companies offer horizontal applications, such as big data-powered marketing, CRM tools or fraud detection solutions. Others use big data in vertical-specific applications. Finance and ad tech were early leaders in adopting big data, years before it was even called big data. Gradually, the use of big data is spreading to more industries, such as healthcare and biotech (particularly genomics) and education. We think that this is only the beginning.
Where to Go from Here
To speak with a Big Data specialist, call (631) 789-9595 or fill out our Information Request Form and a representative will call you back shortly.