10 Skills to get the Best of Big Data

by Shaku Atre


Big Data technology is new to most organizations, and so is awareness of the skills needed to get the best out of it. Acquiring these skills overnight is wishful thinking. As a result, in most organizations a large share of Big Data skills must be learned, recruited, or a little of both. Big volumes of data beg for analysis in order to glean correlations and inferences and to prove or disprove hypotheses. These methods point straight to Data Science.

In the past, Data Science was practiced only in the academic world. Now, to be competitive in the marketplace, every business is expected to possess these academic skills, with one big difference: in academia, results typically did not need to be obtained quickly when the problems and the data were very complex. Academics could take their time, something businesses cannot afford to do; time to results is of paramount importance for a business to succeed. Besides volume, then, the bigger problem is speed: the velocity with which the data arrives, with which it must be processed, and with which insights must be delivered to decision makers. Not only has the standard of “how much data” changed, but “how soon” has changed dramatically as well.

What you will learn

Analysts of Big Data should have the following strengths:

  • Familiarity with newer statistical languages like R
  • Understanding and use of analytics modeling techniques
  • Outstanding familiarity with the data to be analyzed
  • Risk-taking mentality to experiment with data

Technical skills needed are, among others:

  • Very good understanding of, and experience with, open source software
  • Data architecture for databases that hold terabytes of data and grow every minute
  • Experience managing software frameworks like Hadoop; expertise in NoSQL databases like Cassandra and HBase
  • Expertise with analytics programming languages and facilities such as R or Pig
  • Ability to manage hardware with hundreds or thousands of “small” CPUs for multiple terabytes of data

Soft skills that have little to do with Big Data itself are also needed in many organizations:

  • Understanding of the “ins and outs” of the business
  • Understanding of the “bottom line” of the business
  • Ability to discern which analytics will answer the bottom-line questions
  • Communication skills to explain the analytics results
  • Understanding not only transactions but also interactions and observations

Main Topics

  • Open Source: Apache Hadoop
  • Open Source: Apache Spark, an alternative to MapReduce
  • Some More Technologies: Python, Data Lake, NoSQL
  • SQL
  • General-Purpose Programming Languages: Java, C, Python, Scala
  • Data Mining and Machine Learning
  • Statistical and Quantitative Analysis
  • Data Visualization
  • Creativity
  • Problem Solving and Subject Matter Expertise