To AI or Not To AI


We have all heard about AI by now. It is a stack of capabilities, put together to achieve a business objective with a certain capacity for autonomy, ranging from expert systems to deep learning algorithms. In my many conversations, I have found a myriad of uninformed expectations from businesses about what AI is and what they want to achieve with it. The primary reason, IMHO, is limited understanding combined with an exploding technology landscape. Such a radical shift has left businesses with an imperfect understanding of what AI can do. While it is tempting to point out the challenges that businesses are facing today, it is important to understand the core problem. One company executive (let’s call him Steve) put it best: “in today’s times AI is pushed right in the 2nd sentence of almost every product pitch and almost every vendor is trying to sell with rarely anyone trying to tell. But most are unclear what are they doing with their AI and how it would affect us.” This hits the nail on the head.

Due to the buzz in the market and the push from top software companies to get their AI assistants to consumers, the market is exploding. This widespread investment and media buzz is doing a great job of keeping business anxiety high. While this is tiring for businesses and could potentially challenge their core strength (if not understood properly), businesses need to respond to this buzzword as innovation mavericks. Hopefully we’ll talk about it below. Still not convinced to investigate AI adoption and need a reason? Reportedly, 35.6 million voice-activated assistant devices will make their way into American homes. Out of a total of 125.82 million households, that works out to roughly 1 in 4 households (about 28%) having an AI assistant, if each device lands in a different home. This massive adoption is fueling the signal that everyone should consider AI in their corporate strategy, as AI is slowly sliding into our lives and our work will be next. After all, you don’t want to lose your meal to an AI.

So, hopefully you are almost on the edge of being convinced. Now, what are some of the considerations that businesses should (almost always) remember and use to build ground rules before venturing into a high dose of AI planning, execution and adoption?


AI is no silver bullet

While AI is good for lots of things, it’s not good for everything. AI solutions are great at clustering (likely events), predicting the future (based on the past) and finding anomalies (in large datasets), but they are certainly not great at bringing intuition to the table or at quantifying and qualifying culture. They still struggle to provide trusted results when they are equipped with underfitted or overfitted models. AI solutions are amazing at normalizing data to predict an outcome, which many times leaves the corners unseen. AI also has the bias problem that humans have been mastering for ages. So, go with AI, but keep critical decisions with the best intuitive algos around, with those who can do it best: yes, humans.


eAI (Enterprise AI) is in its infancy, so don’t yet give the launch codes to the kid

I am South Asian, and sometimes when I am in my Indian mode, my Indian accent jumps out and my interactions with Siri, Alexa and Google Home turn into an ego fest between what the AI thinks I am saying and what I am actually saying. The only difference is that the AI holds more power in those interactions. That is not yet scary, but I am sure it could be. If you have interacted with your AI assistant toys, you can relate to the experience of an AI responding, reacting or executing based on a misinterpretation. Now, if consumer toys can be programmed to react to misinformation, enterprise solutions can sometimes suggest fun and bizarre recommendations too. Don’t believe me? Read my previous blog: Want to understand AI bounds? Learn from its failures to learn more. So, it’s important for businesses to understand and define the boundaries of AI and keep it walled off, air-tight, from critical decision-making.


Focus on the journey and not the destination

Yes, I know you have heard this before, in almost every challenging streak you were about to take on. We have also heard the same quote with “Journey” and “Destination” reversed. But I like the former. It puts the emphasis on learning from the project and prepares decision makers not to rely on these technologies without robust, fail-safe qualifying criteria. Every step, every learning, every use case (intended and unintended) must be captured for analysis. The most successful deployment stories I have heard are the ones where AI drove the ROI hockey stick from unexpected corners. So, businesses should always provision for ears that listen to those unexpected corners. One of the most challenging conversations I find myself in involves clearly defined, uptight goals with no room for change: we need to achieve X by Y. While this is music to corporate ears, it is a headache in new, untested waters. Imagine jumping into a viscous slurry to get to the other corner in 10 minutes. Sure, you may or may not get there, but you’ll be too focused on getting to the other side and not focused enough on finding hacks to get you through the slurry faster the next time.


Building on the foundation is critical

Yes, let’s not talk about changing the laws of physics. Wait, let us change the laws of physics, but give respect to the fundamental ones. It is important to see the fundamentals REALLY fail before we try to ignore them. Avoid going against gravity, but experimenting with it should be allowed. Businesses exist because of their secret sauce: part culture, part process and part product. While it is very tempting to break with the legacy and build afresh, it is extremely painful and time-consuming to make the different aspects of a business work in harmony. Ignoring the old in favor of the new is one of the most underestimated and freakishly devastating oversights that innovation can put businesses through. Imagine a newly launched rockstar product and how everyone jumps to work on it. While it is cool to work on new challenges, it is critical to establish their compliance with the fundamental foundation of the business. There is no silver bullet as to what constitutes the foundation, but it’s a deep conversation that a business needs to have before venturing into self-learning, self-analyzing and self-reacting solutions.


Don’t rush to the road

I have a young child at home, and I remember getting into a conversation with an executive about young ones and AI (wait, this could be a great title). The idea is that you wouldn’t trust your young ones with the steering wheel of your car on the road without much practice and/or your confidence in their abilities. You would rather have them perfect their craft in a controlled backyard than get them onto highways early. Once they are mature and sane, then it is time to let them drive on controlled roads and, once confident, go all in. Current AI capabilities are no different. They require a lot of hand-holding, and their every outcome hits you with amusement. So, understand the play area for AI deployment and build strong ground rules. You don’t want anyone hurt by the overconfidence of a minor. So it is with these expert systems.


Be fallible

Time to get some meetings with your innovation team (if you have one yet); if not, time for you to create one. We are currently working on a tech stack where most of the technologies are undergoing disruption. The time is excitingly scary for tech folks, as tech is now the substitute spine of any business and yet is itself undergoing disruption. So, tech folks should be given enough ammo to fail, and they should be encouraged to fail. The scariest thing that could happen today would be a team executing a scenario and giving it a pass out of fear of failure. There need to be responsive checks and balances to understand and appreciate failures. This will help businesses work with an IT that is agile and yet robust enough to undertake any future disruptive change.


Understand the adoption challenges

If you are hurt that we spent so little time talking about adoption, my apologies; I hope to hear from you in the comment section below. Adoption holds the key to AI implementation. While you are undergoing digital transformation (you soon will be, if you aren’t already), you are driving your consumers, employees and executives crazy with this new paradigm, so adoption of yet another autonomy layer holds some challenges. Some of the adoption challenges can be attributed to how well capabilities are understood. From poor understanding of corporate fundamentals to the inability to deploy a fitted model that can be recalibrated once the ecosystem sees a shift, the adoption challenges are everywhere.

So, while it is great to jump on the AI bandwagon, it is important now (more than ever before) to understand the business. While IT could be a superhero amidst all this, understand that with more power comes more responsibility. So, help prepare your tech stack to be responsible and to keep its ears open.

If you have more to add, I welcome your thoughts in the comments below. I appreciate your interest.

Source by v1shal

The Beginner’s Guide to Predictive Workforce Analytics

Greta Roberts, CEO
Talent Analytics, Corp.

Human Resources Feels Pressure to Begin Using Predictive Analytics
Today’s business executives are increasingly applying pressure to their Human Resources departments to “use predictive analytics”.  This pressure isn’t unique to Human Resources as these same business leaders are similarly pressuring Sales, Customer Service, IT, Finance and every other line of business (LOB) leader, to do something predictive or analytical.

Every line of business (LOB) is clear on their focus. They need to uncover predictive analytics projects that somehow affect their bottom line. (Increase sales, increase customer service, decrease mistakes, increase calls per day and the like).

Human Resources Departments have a Different, and Somewhat Unique, Challenge not Faced By Most Other Lines of Business
When Human Resources analysts begin a predictive analytics initiative, what we see mirrors what every other line of business does. Somehow for HR, instead of having a great outcome it can be potentially devastating.

Unless the unique challenge HR faces is understood, it can trip up an HR organization for a long time, cause them to lose analytics project resources and funding, and continue to perplex HR as they have no idea how they missed the goal of the predictive initiative so badly.

Human Resources’ Traditional Approach to Predictive Projects
Talent Analytics’ experience has been that (like all other lines of business) when Human Resources focuses on predictive analytics projects, they look around for interesting HR problems to solve; that is, problems inside of the Human Resources departments. They’d like to know if employee engagement predicts anything, or if they can use predictive work somehow with their diversity challenges, or predict a flight risk score that is tied to how much training or promotions someone has, or see if the kind of onboarding someone has relates to how long they last in a role. Though these projects have tentative ties to other lines of business, these projects are driven from an HR need or curiosity.

HR (and everyone else) Needs to Avoid the “Wikipedia Approach” to Predictive Analytics
Our firm is often asked if we can “explore the data in the HR systems” to see if we can find anything useful. We recommend avoiding this approach as it is exactly the same as beginning to read Wikipedia from the beginning (like a book) hoping to find something useful.

When exploring HR data (or any data) without a question, what you’ll find are factoids that will be “interesting but not actionable”. They will make people say “really, I never knew that”, but nothing will result.  You’ll pay an external consultant a lot of money to do this, or have a precious internal resource do this – only to gain little value without any strategic impact.  Avoid using the Wikipedia Approach – at least at first.  Start with a question to solve.  Don’t start with a dataset.

Human Resources Predictive Project Results are Often Met with Little Enthusiasm
Like all other Lines of Business, HR is excited to show results of their HR focused predictive projects.

The important disconnect. HR shows results that are meaningful to HR only.

Perhaps there is a prediction that ties # of training classes to attrition, or correlates performance review ratings with how long someone would last in their role. This is interesting information to HR but not to the business.

Here’s what’s going on.

Business Outcomes Matter to the Business.  HR Outcomes Don’t.
Human Resources departments can learn from the Marketing Department who came before them on the predictive analytics journey. Today’s Marketing Departments, that are using predictive analytics successfully, are arguably one of the strongest and most strategic departments of the entire company.

Today’s Marketing leaders predict customers who will generate the most revenue (have high customer lifetime value). Marketing Departments did not gain any traction with predictive analytics when they were predicting how many prospects would “click”. They needed to predict how many customers would buy.

Early predictive efforts in the Marketing Department used predictive analytics to predict how many webinars they’d need to conduct to get 1,000 new prospects in their prospect database.  Or, how much they’d need to spend on marketing campaigns to get prospects to click on a coupon. (Adding new prospect names to a prospect database is a marketing goal, not a business goal.  Clicking on a coupon is a marketing goal, not a business goal.) Or, they could predict that customer engagement would go up if they gave a discount on a Friday (again, this is a marketing goal, not a business goal). The business doesn’t care about any of these “middle measures” unless they can be proved and tracked to the end business outcome.

Marketing Cracked the Code
Business wants to reliably predict how many people would buy (not click) using this coupon vs. that one.  When marketing predicted real business outcomes, resources, visibility and funding quickly became available.

When Marketing was able to show a predictive project that could identify what offer to make so that a customer bought and sales went up – business executives took notice. They took such close notice that they highlighted what Marketing was able to do, they gave Marketing more resources and funding and visibility. Important careers were made out of marketing folks who were / are part of strategic predictive analytics projects that delivered real revenue and / or real cost savings to the business’s bottom line.

Marketing stopped being “aligned” with the business, Marketing was the business.

Human Resources needs to do the same thing.

Best Approach for Successful and Noteworthy Predictive Workforce Projects
Many people get tangled up in definitions. Is it people analytics, workforce analytics, talent analytics or something else? It doesn’t matter what you call it – the point is that predictive workforce projects need to address and predict business outcomes not HR outcomes.

Like Marketing learned over time, when Human Resources begins predictive analytics projects, they need to approach the business units they support and ask them what kinds of challenges they are having that might be affected by the workforce.

There are 2 critical categories for strategic predictive workforce projects:

  • Measurably reducing employee turnover / attrition in a certain department or role

  • Measurably increasing specific employee performance (real performance, not performance review scores) in one role or department or another (e.g. more sales, fewer mistakes, higher customer service scores, fewer accidents).

I say “measurably” because to be credible, the predictive workforce initiative needs to measure and show business results both before and after the predictive model.

For Greatest ROI: Businesses Must Predict Performance or Flight Risk Pre-Hire
Once an employee is hired, the business begins pouring significant cost into the employee, typically made up of a) their salary and benefits and b) training time while they ramp up to speed and deliver little to no value. Our analytics work measuring true replacement costs shows us that, even for very entry-level roles, a conservative replacement estimate for a single employee (Call Center Rep, Bank Teller and the like) will be over $6,000.
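To make the “measurably” point concrete, here is a minimal sketch of the before-and-after arithmetic. The headcount and departure figures are hypothetical, the function names are made up for illustration, and the $6,000 figure is used only as the conservative floor mentioned above, not as a measured cost.

```python
# Conservative per-employee replacement cost floor taken from the text ("over $6,000").
REPLACEMENT_COST = 6000


def turnover_rate(departures, average_headcount):
    """Share of the role's headcount that left during the period."""
    return departures / average_headcount


def estimated_savings(departures_before, departures_after, replacement_cost):
    """Replacement cost avoided when fewer people leave in a comparable period."""
    return (departures_before - departures_after) * replacement_cost


# Hypothetical call center role with 200 seats, measured over two comparable periods.
headcount = 200
departures_before = 80  # before the predictive model was used in hiring (made-up number)
departures_after = 60   # after the predictive model was used in hiring (made-up number)

rate_before = turnover_rate(departures_before, headcount)
rate_after = turnover_rate(departures_after, headcount)
savings = estimated_savings(departures_before, departures_after, REPLACEMENT_COST)

print(f"Turnover before: {rate_before:.0%}, after: {rate_after:.0%}")
print(f"Estimated replacement cost avoided: ${savings:,}")
```

Reporting both rates alongside the dollar figure keeps the result in business terms rather than HR terms.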

A great example is the credit industry. Imagine a creditor extending credit to someone for a mortgage, and then applying analytics after the mortgage has been extended to predict which mortgage holders are a good credit risk. It’s preposterous.

The only thing the creditor can do after the relationship has begun is to try to coach, train, encourage, change the payment plan and the like. It’s too late after the relationship has begun.

Predicting credit risk (who will pay their bills) – is predicting human behavior.  Predicting who will make their sales quota, who will make happy customers, who will make mistakes, who will drive their truck efficiently – also is predicting human behavior.

HR needs to realize that predicting human behavior is a mature domain with decades of experience and time to hone approaches, algorithms and sensitivity to private data.

What is Human Resources’ Role in Predictive Analytics Projects?
The great news is that typically the Human Resources Department will already be aware of both of these business challenges. They just hadn’t considered that Human Resources could be a part of helping to solve these challenges using predictive analytics.

Many articles discuss how Human Resources needs to be an analytics culture, and that all Human Resources employees need to learn analytics. Though I appreciate the realization that analytics is here to stay, Human Resources of all people should know that there are some people with the natural mindset to “get” and love analytics and there are some that don’t and won’t.

As I speak around the world and talk to folks in HR, I can feel the fear felt today by people in HR who have little interest in this space. My recommendation would be to breathe, take a step back and realize that not everyone needs to know how to perform predictive analytics.  Realize there are many traditional HR functions that need to be accomplished. We recommend a best practice approach of identifying who does have the mindset and interest in the analytics space and let them partner with someone who is a true predictive analyst.

For those who know they are not cut out to be the person doing the predictive analytics there are still many roles where they can be incredibly useful in the predictive process. Perhaps they could identify problem areas that predictive analytics can solve, or perhaps they could be the person doing more of the traditional Human Resources work. I find this “analytics fear” paralyzes and demoralizes employees and people in general.

Loosely Identified, but Important Roles on a Predictive Workforce Analytics Project

  1. Someone to identify high turnover roles in the lines of business, or identify where there are a lot of employees not performing very well in their jobs

  2. A liaison: Someone to introduce the HR predictive analytics team to the lines of business with turnover or business performance challenges

  3. Someone to help find and access the data to support the predictive project

  4. Someone to actually “do” the predictive analytics work (the workforce analyst or data scientist)

  5. Someone who creates a final business report to show the results of the work (both positive and negative)

  6. Someone who presents the final business report

  7. A high level project manager to help keep the project moving along

  8. The business and HR experts that understand how things work and need to be consulted all along the way

These roles can sometimes all be the same person, and sometimes they can be many different people depending on the complexity of the project, the size of the predictive workforce organization, the number of lines of business that are involved in the project and / or the multiple areas where data needs to be accessed.

The important thing to realize is there are several non analytics roles inside of predictive projects. Not every role in a predictive project requires a predictive specialist or even an analytics savvy person.

High Value Predictive Projects Don’t Deliver HR Answers
Should HR begin with predictive projects that answer HR-only questions? We recommend no, at least not to begin with. We started by describing how business leaders are pressuring Human Resources to do predictive analytics projects. There is often little or no guidance given to HR about what predictive projects to do.

Here is my prediction and you can take it to the bank. I’ve seen it happen over and over again.

When HR departments use predictive analytics to solve real, Line of Business challenges that are driven by the workforce, HR becomes an instant hero. These Human Resources Departments are given more resources, their projects are funded, they receive more headcount for their analytics projects – and like Marketing, they will turn into one of the most strategic departments of the entire company.

Feeling Pressure to Get Started with Predictive?
If you’re feeling pressure from your executives to start using predictive analytics strategically and have a high volume role like sales or customer service you’d like to optimize, get in touch.

Want to see more examples of “real” predictive workforce business outcomes? Attend Predictive Analytics World for Workforce in San Francisco, April 3-6, 2016.

Greta Roberts is the CEO & Co-founder of Talent Analytics, Corp. She is the Program Chair of Predictive Analytics World for Workforce and a Faculty member of the International Institute for Analytics. Follow her on twitter @gretaroberts.

Source: The Beginner’s Guide to Predictive Workforce Analytics

April 10, 2017 Health and Biotech analytics news roundup

A DNA-testing company is offering patients a chance to help find a cure for their conditions: Invitae is launching the Patient Insights Network, where people can input their own genome data and help link it to other health data.

Congratulations, you’re a parasite!  Erick Turner and Kun-Hsing Yu won the first ‘Research Parasite’ award, given to highlight reanalysis of data. The name is a tongue-in-cheek reference to an infamous article decrying the practice.

IMI chief: ‘We need to learn how to share data in a safe and ethical manner’: Pierre Meulien discusses the EU’s Innovative Medicines Initiative, where public and private institutions collaborate.

5 Tips for Making Use of Big Data in Healthcare Production: Two pharmaceutical executives offer their opinions on using data in pharmaceutical manufacturing.

Originally Posted at: April 10, 2017 Health and Biotech analytics news roundup

Mar 07, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)



[ AnalyticsWeek BYTES]

>> Why Using the ‘Cloud’ Can Undermine Data Protections by analyticsweekpick

>> Leveraging Social Media to Showcase Your Expertise [Infographic] by v1shal

>> The Big Data Challenge: Generating Actionable Insight by analyticsweekpick



 Delta Risk CEO Scott Kaine Featured on “Insights & Intelligence” Cloud Security Podcast – Security Boulevard Under  Cloud Security




Data Mining


Data that has relevance for managerial decisions is accumulating at an incredible rate due to a host of technological advances. Electronic data capture has become inexpensive and ubiquitous as a by-product of innovations… more


Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython


Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored f… more


Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck to achieving comparable enterprise adoption. One of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users and community members need to step up to create awareness within the organization. An aware organization goes a long way toward getting quick buy-in and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.


Q: How do you control for biases?
A: * Choose a representative sample, preferably by a random method
* Choose an adequate size of sample
* Identify all confounding factors if possible
* Identify sources of bias and include them as additional predictors in statistical analyses
* Use randomization: by randomly recruiting or assigning subjects in a study, all our experimental groups have an equal chance of being influenced by the same bias

– Randomization: in randomized control trials, research participants are assigned by chance, rather than by choice to either the experimental group or the control group.
– Random sampling: obtaining data that is representative of the population of interest
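The two ideas above, random sampling and random assignment, can be sketched in a few lines. This is only an illustration under simple assumptions: the subject IDs are hypothetical, the helper names are made up, and a real study would also need to check sample size and confounders as listed above.

```python
import random


def random_sample(population, k, seed=None):
    """Simple random sampling: draw k distinct members, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def random_assignment(subjects, seed=None):
    """Randomization: shuffle subjects, then split into experimental and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical population of 100 subject IDs.
population = [f"subject_{i}" for i in range(100)]

sample = random_sample(population, 20, seed=42)
experimental, control = random_assignment(sample, seed=42)

print(len(sample), len(experimental), len(control))
```

Fixing the seed makes the draw reproducible for review; leaving it as `None` gives a fresh random draw each run.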



Decision-Making: The Last Mile of Analytics and Visualization



If we have data, let’s look at data. If all we have are opinions, let’s go with mine. – Jim Barksdale


@JohnTLangton from @Wolters_Kluwer discussed his #AI Lead Startup Journey #FutureOfData #Podcast



For a typical Fortune 1000 company, just a 10% increase in data accessibility will result in more than $65 million additional net income.

Sourced from: Analytics.CLUB #WEB Newsletter

Skill-Based Approach to Improve the Practice of Data Science

Our Big Data world requires the application of data science principles by data professionals. I’ve recently taken a look at what it means to practice data science as a data scientist. Our survey results of over 500 data professionals revealed that different types of data scientists possess proficiency in different types of data skills. In today’s post, I take another look at that data to identify the data skills that are essential for successful analytics projects. Additionally, I will present the Data Science Driver Matrix, a skill-based approach to identify how to improve the practice of data science.

Substandard Proficiency in Data Skills

In this ongoing study with AnalyticsWeek, we asked data professionals a variety of questions about their skills, job role, education level and more.

Data professionals were asked to rate their proficiency across 25 data skills in five skill areas (i.e., business, technology, programming, math & modeling and statistics) using the following scale:

  • Don’t know (0)
  • Fundamental Knowledge (20)
  • Novice (40)
  • Intermediate (60)
  • Advanced (80)
  • Expert (100)

Figure 1. Proficiency in Data Science Skills by Job Role.

The different levels of proficiency are defined around the data scientist’s ability to give help or need to receive it. In the instructions to the data professionals, the “Intermediate” level of proficiency was defined as the ability “to successfully complete tasks as requested.” We used that proficiency level (i.e., Intermediate) as the minimum acceptable level of proficiency for each data skill. The proficiency levels below the Intermediate level (i.e., Novice, Fundamental Knowledge, Don’t Know) were defined by an increasing need for help on the part of the data professional. Proficiency levels above the Intermediate level (i.e., Advanced, Expert) were defined by the data professional’s increasing ability to give help or be known by others as “a person to ask.”

We looked at the level of proficiency for the 25 different data skills across four different job roles. As is seen in Figure 1, data professionals tend to be skilled in areas that are appropriate for their job role (see green-shaded areas in Figure 1 where average proficiency ratings are 60 or above). Specifically, Business Management data professionals show the most proficiency in Business Skills. Researchers, on the other hand, show the lowest level of proficiency in Business Skills and the highest in Statistics Skills.

For many of the data skills, however, the typical data professional does not have the minimum level of proficiency to be successful at work, no matter their role (see yellow- and red-shaded areas in Figure 1 where average proficiency ratings are below 60). Specifically, there are 10 data skills in which the typical data professional does not have the minimum level of proficiency: Unstructured data, NLP, Machine Learning, Big and distributed data, Cloud management, Front-end programming, Optimization, Graphic models, Algorithms and Bayesian statistics. Furthermore, there are nine data skills in which only one type of data professional has the minimum level of proficiency to be successful at work: Product design, Business Development, Budgeting, Database Administration, Back-end Programming, Data Management, Math, Statistics/Statistical Modeling and Science/Scientific Method.

Not all Data Skills are Equally Important

Given that data professionals lack proficiency in many skill areas, where do they begin to improve their overall set of data skills? Are some data skills more critical to project success than others? Should data professionals focus on learning/developing certain skills instead of other, less important skills?

Table 1. Correlations of Proficiency of Different Data Skills with Satisfaction with Outcomes of Analytics Projects

In our study, data professionals were asked to rate their satisfaction with the outcomes of analytics projects on which they work. They provided their rating on a scale from 0 (Extremely Dissatisfied) to 10 (Extremely Satisfied). I used this score as a measure of project success.

For each data skill, I correlated data professionals’ proficiency ratings with the data professional’s satisfaction with outcomes to understand the link between a specific skill and the outcome of analytics projects. This exercise was done for each of the four job roles (See Table 1). Skills that show a high correlation with satisfaction with outcomes indicate that those skills are closely linked to project success (as defined by the satisfaction ratings). Skills listed in the top half of Table 1 are more essential to project outcomes compared to skills listed in the bottom half of Table 1.
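For readers who want to try this kind of analysis on their own survey data, here is a minimal sketch of a Pearson correlation between proficiency and satisfaction for a single skill. The article does not say which correlation coefficient was used, so Pearson’s r is an assumption here, and the ratings below are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical respondents in one job role, for one data skill:
# proficiency on the 0-100 scale, satisfaction with outcomes on the 0-10 scale.
proficiency = [20, 40, 60, 60, 80, 100]
satisfaction = [3, 5, 6, 7, 8, 9]

r = pearson_r(proficiency, satisfaction)
print(f"r = {r:.2f}")
```

Repeating this for each skill within each job role would produce a table with the same shape as Table 1.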

On average, we see that data skills are more closely linked to satisfaction with work outcomes for data professionals who are Business Managers (average r = .30) and Researchers (average r = .30) compared to data professionals who are Developers (average r = .18) and Creatives (average r = .18).

The ranking of data skills with respect to their impact on satisfaction also varies significantly by job role. The average correlation among the rankings of data skills across the four job roles is r = .01, suggesting that data skills that are essential to project outcomes for one type of data scientist are not essential for other types of data scientists.

The Data Science Driver Matrix: Graphing the Results

Figure 2. Data Science Driver Matrix: Skill-based approach to improve the practice of data science.

So, we now have two pieces of information for each of the 25 data skills: 1) average proficiency rating (in Figure 1) and 2) correlation with work outcome (in Table 1). For each job role, I plotted both pieces of information for the 25 data skills in a 2×2 table (see Figure 2). I call this diagram the Data Science Driver Matrix (DSDM). In the DSDM, the x-axis represents the average proficiency rating for each data skill. The y-axis represents how essential the skill is to project outcome, measured by its correlation with satisfaction.

The midpoints on the x- and y-axes are 60 (minimum level of proficiency needed to be successful at work) and .30 (~average correlation of skills with satisfaction), respectively.

Interpreting the Results: Improving the Practice of Data Science

Each of the data skills will fall into one of the four quadrants of the DSDM. In Table 1, I list the quadrant number for each data skill for the separate job roles. The decisions you make about a specific data skill (e.g., whether to learn it or not) depend on the quadrant in which it falls:

  1. Quadrant 1 (upper left): Quadrant 1 houses skills that are essential to the outcome of the project and in which the proficiency is below the minimum requirement. These data skills reflect good areas for potential improvement efforts because we have ample room for improvement. Improvements in proficiency could come in the form of investments in hiring data professionals with these skills, investments in training your current data professionals to acquire these skills or creation of teams with members that have complementary skills.
  2. Quadrant 2 (upper right): Quadrant 2 houses skills that are essential to the outcome of the project and in which the proficiency is above the minimum requirement. These skills reflect data professionals’ strength that we know improves the success in analytics projects. You’ll likely want to stay the course on these data skills.
  3. Quadrant 3 (lower right): Quadrant 3 houses skills in which the proficiency is above the minimum requirement but are not very essential to the outcome of the project. Be careful not to over-invest in improving these skills as they are not necessarily essential for the success of analytics projects.
  4. Quadrant 4 (lower left): Quadrant 4 houses skills in which the proficiency is below the minimum requirement but are not very essential to the outcome of the project. Consider divesting resources from these skills and redirecting them to skills falling in Quadrant 1. These skills are of low priority because, despite the fact that proficiency is low for these skills, they do not have a substantial impact on the outcome of the analytics projects.
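
The quadrant rules above can be sketched as a small Python function. The two midpoints (60 and .30) come from the text. The skill values are hypothetical, the function name `dsdm_quadrant` is illustrative, and treating a value exactly at a midpoint as "above" is an assumption, since the text does not say how ties are handled.

```python
PROFICIENCY_MIDPOINT = 60    # minimum proficiency needed, per the text
CORRELATION_MIDPOINT = 0.30  # approximate average correlation, per the text

def dsdm_quadrant(proficiency, correlation):
    """Return the DSDM quadrant (1 to 4) for one skill."""
    # Assumption: values exactly at a midpoint count as "above".
    essential = correlation >= CORRELATION_MIDPOINT
    proficient = proficiency >= PROFICIENCY_MIDPOINT
    if essential and not proficient:
        return 1  # upper left: essential, low proficiency (invest here)
    if essential and proficient:
        return 2  # upper right: essential, high proficiency (stay the course)
    if proficient:
        return 3  # lower right: not essential, high proficiency
    return 4      # lower left: not essential, low proficiency

# Hypothetical (proficiency, correlation) pairs, not values from the study.
skills = {
    "Skill A": (45, 0.42),
    "Skill B": (75, 0.35),
    "Skill C": (80, 0.10),
    "Skill D": (30, 0.05),
}

for name, (proficiency, correlation) in skills.items():
    print(f"{name}: Quadrant {dsdm_quadrant(proficiency, correlation)}")
```

Running the same function over all 25 skills for a given job role would reproduce the quadrant assignments listed in Table 1, provided the same midpoints and tie handling were used.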

Data Science Driver Matrices for Different Data Roles

I created a DSDM for each of the four job roles: Business Manager, Developer, Creative and Researcher. For this exercise, I will focus primarily on data skills that fall into Quadrant 1 (i.e., low proficiency in highly essential data skills).

1. Business Managers

For data professionals who self-identify as Business Managers (see Figure 3), we see that none of the skills fall into Quadrant 2 (high proficiency in highly essential skills), while 12 skills fall into Quadrant 1 (low proficiency in highly essential skills). Skills in Quadrant 1 include:

Figure 3. Data Science Driver Matrix for Business Managers.

  • Statistics / Statistical Modeling
  • Data Mining
  • Science / Scientific Method
  • Big and distributed data
  • Machine Learning
  • Bayesian Statistics
  • Optimization
  • Unstructured data
  • Structured data
  • Algorithms

Figure 4. Data Science Driver Matrix for Developers.

2. Developers

For data professionals who identify as Developers (see Figure 4), most of the skills fall into Quadrant 4 (low proficiency in non-essential skills). Only two skills fall into Quadrant 1:

  • Systems Administration
  • Data Mining

Figure 5. Data Science Driver Matrix for Creatives.

3. Creatives

For data professionals who identify as Creatives (see Figure 5), most of the skills fall in Quadrant 4 (low proficiency in non-essential skills). Five skills fall into Quadrant 1:

  • Math
  • Data Mining
  • Business Development
  • Graphical Models
  • Optimization

4. Researchers

For data professionals who identify as Researchers (see Figure 6), six skills fall into Quadrant 1 (low proficiency in essential skills):

Figure 6. Data Science Driver Matrix for Researchers.

  • Algorithms
  • Big and distributed data
  • Data Management
  • Product Design
  • Machine Learning
  • Bayesian Statistics

Researchers appear to lack proficiency in areas that are critical to the success of analytics projects.


Applying the right data skills to analytics projects is key to successful project outcomes. I proposed a skill-based approach to improve the practice of data science to help identify the essential data skills for different types of data professionals. Businesses can use these results to ensure they bring the right data professionals with the right skills to bear on their Big Data analytics projects.

There are a few conclusions we can draw from the current analyses.

  1. Data Mining was the only data skill that ranked among the top 4 skills essential to the project outcome across the job roles. No matter your role as a data professional, a key ingredient to project success is your ability to mine insights from data.
  2. Proficiency in data skills appears to be more important for data professionals in the Business Manager and Researcher roles than for those in the Developer and Creative roles. Improving proficiency in data skills to increase satisfaction with work appears to be a more realistic approach for Business Managers and Researchers.
  3. Data professionals could likely be happier about the outcomes of their projects if they possessed specific data skills. Surprisingly, for Business Managers, business-related data skills are not critical to the outcome of their analytics work. Instead, what drives their work satisfaction is the extent to which they are proficient in statistical and technological skills. Unfortunately, these Business Management workers typically do not possess adequate proficiency in these types of skills.

Improving the practice of data science can be accomplished in a variety of ways. While the current analysis suggests that you can improve analytics project outcomes by improving skills for specific data professionals, another approach is to build data science teams with data professionals who have complementary skills. As I’ve found before, Business Managers are more satisfied with the outcomes of analytics projects when they are paired with data professionals with strong statistics skills compared to Business Managers who work alone. Likewise, Researchers are more satisfied with the outcomes of analytics projects when they are paired with data professionals with strong business acumen. Using either approach, organizations can leverage the practice of data science to address their analytics projects.

Source: Skill-Based Approach to Improve the Practice of Data Science