The points below outline how some of the large vendors are thinking about big data and analytics.
Big data platforms in the cloud:
In 2013, most of the big data projects we’ve seen were deployed on bare-metal infrastructure in the enterprise. We expect to see an evolution toward a virtualized infrastructure in 2014. We’re seeing a lot of investment in products that make this happen, such as Serengeti for vSphere, Savanna for OpenStack and Ironfan for Amazon Web Services (AWS). These projects allow us to automate the deployment of big data platforms to a virtualized infrastructure.
This is the era of Analytics:
In 2013, enterprises learned a lot about how to use the big data infrastructure that was new to the market. This coming year, those lessons will be applied toward analytic applications, and we will see some great use cases happen on that big data infrastructure. This will be the year of "What can I do with big data?" rather than "What is big data?" Given this refocus on analytic applications, 2014 will create an even greater demand for people with skills in data science.
Hadoop platforms will consolidate:
We feel confident that the big data industry will consolidate down to a couple of Hadoop distributions. Currently, many distributions of Hadoop exist, some proprietary and some open source. In 2014, the industry will consolidate around two of them. The rest will become less relevant, either absorbed through acquisition by one of the survivors or exiting the market.
Streaming analytics will replace ETL processing:
Speaking of exits, serial extract, transform, load (ETL) processes will largely go away in 2014. As the velocity of data increases, especially social data, there’s more need to analyze data in real time as a stream. Currently, Hadoop is being pressed into service for this — something it’s not well suited for. In-memory analytics and complex event processing give us the capability to analyze these streams in real time and extract intelligence on the fly. That eliminates the need to perform the traditional ETL steps.
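The contrast above can be sketched in a few lines of plain Python. This is a minimal, hypothetical illustration (class and method names are my own, not from any real streaming product): instead of staging events through serial ETL, an in-memory aggregator updates its answer as each event arrives.

```python
from collections import defaultdict, deque

class StreamAggregator:
    """Incrementally aggregates events as they arrive, so a metric is
    available in real time instead of after a batch ETL run."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.events = defaultdict(deque)  # key -> timestamps within the window

    def observe(self, key, timestamp):
        q = self.events[key]
        q.append(timestamp)
        # Evict events that have fallen out of the sliding window.
        while q and q[0] < timestamp - self.window:
            q.popleft()

    def rate(self, key):
        """Events per second for `key` over the current window."""
        return len(self.events[key]) / self.window

agg = StreamAggregator(window_seconds=60)
for t in range(0, 120):
    agg.observe("login", t)   # one simulated login event per second
print(agg.rate("login"))      # roughly 1.0 event/second in the window
```

Real systems at the time (complex event processing engines, in-memory grids) follow the same shape: state is kept in memory and updated per event, so there is no separate extract-transform-load pass before the data becomes queryable.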
MDM will provide the dimensions:
Master data management (MDM) is used to create a single, authoritative definition of data from an internal standpoint. As people realize that external data sources add new dimensions to their internal problems, they will want that same single definition for data coming from the outside world. If you realize that external data sources help solve a problem, you’ll want an external MDM focus as well.
The consolidation of NoSQL:
NoSQL means "not only SQL" rather than "the absence of SQL," which means it is more inclusive than exclusive. NoSQL means there are many ways to look at data other than the structured and ordered approach that SQL requires. NoSQL was created to offer a way to look at data without forcing it into a concrete schema. That has been extremely successful, and we’re seeing massive growth in NoSQL. There will be no slowdown in the adoption of NoSQL, but just as with Hadoop distributions, the industry is beginning to settle on a few major players. 2014 will bring a similar consolidation of NoSQL database distributions.
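The schema-less idea can be shown without any particular NoSQL product. In this sketch, plain Python dicts stand in for documents in a document store (the data is made up for illustration): two records in the same collection carry different attributes, which a relational table would force into NULL-padded columns or separate tables.

```python
# Documents in one "collection" need not share a fixed set of columns.
products = [
    {"sku": "B-100", "title": "Paperback", "pages": 320},
    {"sku": "T-200", "title": "T-shirt", "size": "M", "color": "blue"},
]

# Each document carries only the attributes it actually has;
# no concrete schema is declared up front.
for doc in products:
    extras = {k: v for k, v in doc.items() if k not in ("sku", "title")}
    print(doc["sku"], extras)
```

This flexibility is what "not forcing data into a concrete schema" buys, and it is also why the consolidation question matters: document, key-value, column-family and graph stores each embody a different way of looking at data.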
Skill Set for Predictive Analytics:
- Bachelor’s degree or higher in Operations Research, Statistics, Economics, Finance, Computer Science or another quantitative field; advanced degree preferred
- Strong experience with exploratory data analysis
- Solid analytical and data manipulation skills including advanced knowledge of SQL
- Expert knowledge of statistical modeling techniques, machine learning, and big data analysis
- Deep familiarity with analysis tools such as SAS, R, or STATISTICA
- Advanced knowledge in Operations Research is a plus
- Working knowledge of data ETL techniques and OLAP architecture