Using Big Data to Kick-start Your Career

Gordon Square Communications and WAAT offer tips on how to make the most of online resources to land a dream job – all without spending a penny.

Left to right: Vamory Traore, Sylvia Arthur and Grzegorz Gonciarz

You are probably familiar with Monster.com or Indeed.com, huge jobs websites where you can upload your CV along with 150 million other people every month.

The bad news, attendees of the London Technology Week event Using Tech to Find a Job at Home or Abroad discovered, is that your CV is unlikely ever to be seen on one of these websites.

“There are too many people looking for a small number of jobs,” says Sylvia Arthur, Communications Consultant at Gordon Square Communications and author of the book Get Hired!, out on 30th June.

“The problem is that only 20% of jobs are advertised, while 25% of people are seeking a new job. If you divide twenty by twenty-five, the result of the equation is that you lose,” explains Ms Arthur.

So, how can we use technology to effectively find a job?

The first step is to analyse the “Big Data” – all the information that tells us about trends or associations, especially relating to human behaviour.

For example, if we were looking for a job in IT, we could read in the news that a new IT company has opened in Shoreditch, and from there understand that there are new IT jobs available in East London.

Big Data also tells us about salaries and cost of living in different areas, or what skills are required.

“Read job boards not so much to find a job as to understand which sectors are growing and what the jobs of the future will be,” is Ms Arthur’s advice.

Once you know where to go with the skills you have, you need to bear in mind that most recruiters receive thousands of CVs for a single job and they would rather ask a colleague for a referral than scan through all of them.

So if you are not lucky enough to have connections, you need to be proactive and make yourself known in the industry. “Comment, publish, be active in your area, showcase your knowledge,” says Ms Arthur.

“And when you read about an interesting opportunity, be proactive and contact the CEO, tell them what you know and what you can do for them. A LinkedIn Premium free trial is a great tool for getting in touch with these people.”

Another piece of good advice is to follow the key people in your sector on social media. Of all the jobs posted on social media, 51% are on Twitter, compared to only 23% on LinkedIn.

And for those looking for jobs in the EEA, it is worth checking out EURES, a free online platform where job seekers across Europe are connected with validated recruiters.

“In Europe there are some countries with a shortage of skilled workers and others with high unemployment,” explain Grzegorz Gonciarz and Vamory Traore from WAAT.

“The aim of EURES is to tackle this problem.”

Advisers with local knowledge also help jobseekers to find more information about working and living in another European country before they move.

As for recent graduates looking for experience, a new EURES program called Drop’pin will start next week.

The program aims to fill the skills gap that separates young people from recruitment through free training sessions both online and on location.


Originally Posted at: Using Big Data to Kick-start Your Career

Mar 08, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data Mining

[ AnalyticsWeek BYTES]

>> February 13, 2017 Health and Biotech analytics news roundup by pstein

>> Genomics England exploits big data analytics to personalise cancer treatment by analyticsweekpick

>> Looking for Building Machine Learning Solution? Learn From a Bartender by v1shal


[ NEWS BYTES]

>> Syngenta Signs Long-Term Licensing of NRGene’s Data Analytics Platform – CropLife Under Big Data Analytics

>> IT managers view data security as biggest priority – LocalGov.co.uk … – LocalGov Under Data Security

>> A mysterious radiation cloud spread over Europe in September. Russia finally acknowledged it. – Vox Under Cloud


[ FEATURED COURSE]

Statistical Thinking and Data Analysis


This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and n… more

[ FEATURED READ]

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking


Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the “data-analytic thinking” necessary for e… more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from a zombie apocalypse of unscalable models
One living and breathing zombie in today’s analytical models is the absence of error bars. Not every model is scalable or holds up as data grows. The error bars attached to almost every model should be duly calibrated: as business models take in more data, error bars keep them sensible and in check. If error bars are not accounted for, our models become susceptible to failures, leading to a Halloween we never want to see.
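As a rough illustration of what calibrated error bars can look like in practice, here is a minimal sketch (my own, with made-up per-fold accuracies) that bootstraps a confidence interval around a model score:

```python
import numpy as np

def bootstrap_error_bar(metric_values, n_boot=10_000, alpha=0.05, seed=0):
    """Bootstrap a (1 - alpha) confidence interval around a metric's mean."""
    rng = np.random.default_rng(seed)
    metric_values = np.asarray(metric_values)
    # Resample with replacement and record the mean of each resample
    boot_means = np.array([
        rng.choice(metric_values, size=len(metric_values), replace=True).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.percentile(boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return metric_values.mean(), (lo, hi)

# Hypothetical per-fold accuracies from some model evaluation
mean_acc, (lo, hi) = bootstrap_error_bar([0.81, 0.79, 0.84, 0.78, 0.82])
print(f"accuracy = {mean_acc:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

If the interval widens or drifts as the data grows, the model is telling you it does not scale.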

[ DATA SCIENCE Q&A]

Q: Provide examples of machine-to-machine communications.
A: Telemedicine
– Heart patients wear specialized monitors which gather information about the state of the heart
– The collected data is sent to an implanted electronic device which delivers electric shocks to the patient to correct abnormal rhythms

Product restocking
– Vending machines can message the distributor whenever an item is running out of stock
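As a toy sketch of the vending-machine case (the endpoint URL and payload fields are invented for illustration), the machine might post its stock levels to the distributor like this:

```python
import json
import urllib.request

def report_stock(machine_id: str, stock: dict, url: str) -> int:
    """Send current stock levels to the distributor; returns HTTP status."""
    payload = json.dumps({"machine_id": machine_id, "stock": stock}).encode()
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Hypothetical usage: alert the distributor when an item runs out
stock = {"cola": 0, "water": 7}
if any(count == 0 for count in stock.values()):
    report_stock("VM-042", stock, "https://distributor.example.com/restock")
```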


[ VIDEO OF THE WEEK]

@SidProbstein / @AIFoundry on Leading #DataDriven Technology Transformation #FutureOfData #Podcast


[ QUOTE OF THE WEEK]

Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom. – Clifford Stoll

[ PODCAST OF THE WEEK]

#FutureOfData with @CharlieDataMine, @Oracle discussing running analytics in an enterprise


[ FACT OF THE WEEK]

A quarter of decision-makers surveyed predict that data volumes in their companies will rise by more than 60 per cent by the end of 2014, with the average of all respondents anticipating a growth of no less than 42 per cent.

Sourced from: Analytics.CLUB #WEB Newsletter

Benchmarking the share of voice of Coca-Cola, Red Bull and Pepsi

Today we’re comparing three soft drink brands: Coca Cola, Pepsi and Red Bull. All are big names in the beverage industry. We’ll use BuzzTalk’s benchmark tool to find out which brand is talked about the most and how people feel about it. As you probably know, it’s not enough for people to talk about your brand; you want them to be positive and enthusiastic.

Coca Cola has the largest Share of Voice

In order to benchmark these brands, we’ve created three Media Reports in BuzzTalk, all set up the same way. We include news sites, blogs, journals and Twitter for the time period starting 23 September 2013. We didn’t include printed media in these reports.

softdrinks share of buzz

As you can see, Coca Cola (blue) is the dominant brand online. Nearly 45% of the publications mention Coca Cola. Red Bull (green) and Pepsi Cola (red) follow close to each other at 29% and 26%.
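The arithmetic behind such a share-of-voice chart is straightforward; a minimal sketch with made-up mention counts (not BuzzTalk’s data):

```python
# Hypothetical mention counts per brand over the reporting period
mentions = {"Coca Cola": 4480, "Red Bull": 2900, "Pepsi Cola": 2620}
total = sum(mentions.values())
for brand, count in mentions.items():
    print(f"{brand}: {100 * count / total:.1f}% share of voice")
```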

Benchmarking the Buzz as not all buzz is created equal

Coca Cola doesn’t dominate everywhere on the web. On closer inspection, its dominance is driven predominantly by its share of tweets. When we zoom in on news sites, we notice it’s Red Bull that has the biggest piece of the pie. On blogs (not shown), Coca Cola and Red Bull match up.

buzz by content type

Is Coca Cola’s dominance on Twitter due to Beliebers?

About 99.6% of Coca Cola related publications are on Twitter. Most of these tweets relate to the Coca-Cola.FM radio station in South America in connection with Justin Bieber. On 12th November Coca Cola streamed a concert by the young pop star, and what we’re seeing here is the effect of ‘Beliebers’ on the share of voice.

coca cola hashtag justin bieber

The Coca Cola Christmas effect can still be detected

The Bieber effect is even stronger than Christmas (42,884 versus 2,764 tweets).

coca cola hashtag xmas

Last year we demonstrated what marks the countdown to the holidays: the release of the new Coca Cola TV commercial. What we noticed then was a sudden increase in the mood state ‘tension’. In the following graph you can see it’s still there (Coca Cola is still in blue).

coca cola tension time november

The mood state ‘tension’ relates to both anxiety and excitement. It’s the emotion we pick up during large product releases. If this is the first time you’re reading about mood states, we recommend reading this blog post as an introduction. Mood states are an interesting add-on to sentiment for use in predictions about human behavior. The ways in which actual predictions can be made are the subject of ongoing research.

How do we feel about these brands?

Let’s examine some more mood states and see whether we can find one that’s clearly associated with a brand. As you can see in the graphs below, each soft drink brand gets its fair share of the mood state tension. Tension is not specific to Coca Cola, though it is more prominent during the countdown towards Christmas.

mood states by brand

Pepsi Cola evokes the most ‘confusion’ and slightly more ‘anger’. The feelings of confusion are often related to feeling guilty after drinking (too much) Pepsi.

how do we feel

Red Bull generates the most mood states: it dominates not only for fatigue but also, to a lesser extent, for depression, tension and vigor.

 

The number of publications for Red Bull in which the mood state fatigue can be detected is striking. They say “Red Bull gives you wings”, and this tagline has become famous. People now associate tiredness with a desire for Red Bull. But people also blame Red Bull for (still) feeling tired or more tired. At least it’s good to see Red Bull also has its share in the ‘vigor’ mood state department.


Originally Posted at: Benchmarking the share of voice of Coca-Cola, Red Bull and Pepsi by analyticsweekpick

Mar 01, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Pacman

[ NEWS BYTES]

>> How Business Schools Can Integrate Data Analytics into the … – The CPA Journal Under Business Analytics

>> The Next Phase Of Machine Learning – SemiEngineering Under Machine Learning

>> Senior Analytics Analyst – Enova | Built In Chicago – Built In Chicago Under Analytics


[ FEATURED COURSE]

Master Statistics with R


In this Specialization, you will learn to analyze and visualize data in R and create reproducible data analysis reports, demonstrate a conceptual understanding of the unified nature of statistical inference, perform fre… more

[ FEATURED READ]

Storytelling with Data: A Data Visualization Guide for Business Professionals


Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You’ll discover the power of storytelling and the way to make data a pivotal point in your story. Th… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, a project is oftentimes about the business, not the technology. The same type of thinking applies to data analysis: it’s not always about the technicality but about the business implications. Data science project success criteria should include project management success criteria as well. This will ensure smooth adoption, easy buy-ins, room for wins and cooperating stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q: You have data on the durations of calls to a call center. Generate a plan for how you would code and analyze these data. Explain a plausible scenario for what the distribution of these durations might look like. How could you test, even graphically, whether your expectations are borne out?
A: 1. Exploratory data analysis
* Histogram of durations
* Histogram of durations per service type, per day of week, per hour of day (durations can be systematically longer from 10am to 1pm, for instance), per employee…
2. Distribution: lognormal?

3. Test graphically with a QQ plot: sample quantiles of log(durations) vs. normal quantiles
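A minimal sketch of that analysis plan in Python (the durations here are simulated, standing in for real call data):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Simulated call durations; a lognormal is a plausible generating process
rng = np.random.default_rng(0)
durations = rng.lognormal(mean=4.0, sigma=0.8, size=2000)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# 1. Exploratory view: histogram of raw durations
axes[0].hist(durations, bins=50)
axes[0].set(title="Call durations", xlabel="seconds")

# 2./3. If durations are lognormal, log(durations) should be normal:
#       the QQ plot of log(durations) vs. normal quantiles should be a line
stats.probplot(np.log(durations), dist="norm", plot=axes[1])
axes[1].set_title("QQ plot of log(durations)")

plt.tight_layout()
plt.show()
```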


[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Joe DeCosmo, @Enova


[ QUOTE OF THE WEEK]

The world is one big data problem. – Andrew McAfee

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData with Jon Gibs(@jonathangibs) @L2_Digital


[ FACT OF THE WEEK]

In late 2011, IDC Digital Universe published a report indicating that some 1.8 zettabytes of data would be created that year.

Sourced from: Analytics.CLUB #WEB Newsletter

What Crying Baby Could Teach Big Data Discovery Solution Seekers?


Yes, you read that right. It is a light title for a serious problem. I spoke with big data scientists at some Fortune 100 companies and tried to probe them on their strategy for tackling big data and how they are figuring out the method/tool that works best for them. It was interesting to hear their stories, to learn all the options available to them and how they ended up picking their tools. I was trying to understand the problem when, one night, I saw my two-year-old daughter cry non-stop. We all huddled to find out what was troubling her. Then it occurred to me that this is the same situation companies are facing today.

First, let me explain what happened, and then I will try to make the connection to why and how it is relevant. Once in a blue moon, my daughter, who had just turned two, starts acting fussier than her normal state. There were some guests at home, so, as normal parents, we started figuring out what was bothering her in order to calm her down, but nothing seemed to be working. One of the guests put forward a suggestion for the reason for her fussiness, and then other theories got added. All of us were trying to find the right reason for her fussiness from our individual experience, and soon a combination of various tricks worked and she found her peace. The reason for the fussiness is not important here; the good part is that she became relaxed.

Now, this is the problem that most companies are facing today. Like my daughter, they are all fussy: they all have a big data problem, with a lot of unknowns hiding in their data. They can barely understand how to find those unknowns, let alone how to put them to use. And if we compare visualization tools to the guests, parents and everybody else around my daughter, each trying to figure out their own version of what is happening, it’s chaos. If you let just one of the many figure out their version of what is going on, they may be off track for quite some time, which can be painful, discomforting and wrong. On the other hand, a model of collective wisdom worked best, as everyone gave their quick thoughts, which helped us collaborate, iterate on the information and figure out the best path.

Now consider companies using multiple tools on their problem, babysitting them for days, months or years and burning time, money and resources. These tools could end up becoming the best nanny there is, or the worst one. The outcome is anyone’s guess, but if you get a good tool, will you ever find out whether there is a better or best tool out there? That is the problem the big data industry is facing today. Unlike other traditional appliances/tools, a big data tool requires a considerable cash influx and time/resource commitment, so going through a long sales cycle and marrying a single tool should not be high on anyone’s charts.

Before you set out hunting, make sure to create a small data set that best defines your business chaos. The data should contain almost every aspect of your business, in a way that lets it work as a good recruiting tool for a data discovery platform. I will go a bit deeper into what some good preparatory steps entail before you go shopping. But for this blog, let’s make sure we have our basic data set ready for testing the tools.

Now, the best approach to recruiting the best visualization framework goes through one of three ways:
1. Hiring an independent consultant: just as we consult pediatricians for their expertise in dealing with baby problems, we could hire a specialized shop that works closely with your business and with data visualization vendors. These consultants help companies recruit tools by acting as a mediation layer, filtering out any bias or technological challenge that restricts your decision-making capabilities. They can sit with your organization, understand its requirements and go fishing for tools, recommending the one that best suits your needs.

2. Maximizing the use of trial periods for platforms: just as we quickly try things out and validate which method pacifies the kid fastest instead of getting into a long cycle of failures, we can treat tool selection the same way. This technique is painful but still does relatively less damage than going full throttle with one tool on a long journey of failure. It requires the mindset and the tactical and strategic agenda to hire/fire tools fast and pick the tool that delivers maximum value per dataset. This technique is relatively expensive among the three, and it could introduce some bias into the decision making.

3. Going with platform plays: similar to a pediatric clinic, where you can find almost everything that could help pacify the situation, some vendors provide a platform that lets you experiment with all those methodologies and pick the best combination for your system. These vendors are not tied to any one visualization technique; they make everything available to clients and help them end up with the best package out there. With such a system you can make sure that your business interest gets the highest precedence, and not any specific visualization/discovery technique. To keep the blog free of shout-outs, I will keep company names out of the text, but do let me know if you are interested in knowing which companies provide a platform play for you to experiment with.

And that way, you can make the baby stop crying in the fastest, most cost-effective and most business-responsive manner.

Originally Posted at: What Crying Baby Could Teach Big Data Discovery Solution Seekers?

Lavastorm Democratizing Big Data Analytics in Face of Skills Shortage

Democratizing Big Data refers to the growing movement of making products and services more accessible to other staffers, such as business analysts, along the lines of “self-service business intelligence” (BI).

In this case, the democratized solution is “the all-in-one Lavastorm Analytics Engine platform,” the Boston company said in today’s announcement of product improvements. It “provides an easy-to-use, drag-and-drop data preparation environment to provide business analysts a self-serve predictive analytics solution that gives them more power and a step-by-step validation for their visualization tools.”

It addresses one of the main challenges to successful Big Data deployments, as listed in study after study: lack of specialized talent.

“Business analysts typically encounter a host of core problems when trying to utilize predictive analytics,” Lavastorm said. “They lack the necessary skills and training of data scientists to work in complex programming environments like R. Additionally, many existing BI tools are not tailored to enable self-service data assembly for business analysts to marry rich data sets with their essential business knowledge.”

The Lavastorm Analytics Engine (source: Lavastorm Analytics)

That affirmation has been confirmed many times. For example, a recent report by Capgemini Consulting, “Cracking the Data Conundrum: How Successful Companies Make Big Data Operational,” says that lack of Big Data and analytics skills was reported by 25 percent of respondents as a key challenge to successful deployments. “The Big Data talent gap is something that organizations are increasingly coming face-to-face with,” Capgemini said.

Other studies indicate they haven’t been doing such a good job facing the issue, as the self-service BI promises remain unfulfilled.

Enterprises are trying many different approaches to solving the problem. Capgemini noted that some companies are investing more in training, while others try more unconventional techniques, such as partnering with other companies in employee exchange programs that share more skilled workers or teaming up with or outright acquiring startup Big Data companies to bring skills in-house.

Others, such as Altiscale Inc., offer Hadoop-as-a-Service solutions, or, like BlueData, provide self-service, on-premises private clouds with simplified analysis tools.

Lavastorm, meanwhile, uses the strategy of making the solutions simpler and easier to use. “Demand for advanced analytic capabilities from companies across the globe is growing exponentially, but data scientists or those with specialized backgrounds around predictive analytics are in short supply,” said CEO Drew Rockwell. “Business analysts have a wealth of valuable data and valuable business knowledge, and with the Lavastorm Analytics Engine, are perfectly positioned to move beyond their current expertise in descriptive analytics to focus on the future, predicting what will happen, helping their companies compete and win on analytics.”

The Lavastorm Analytics Engine comes in individual desktop editions or in server editions for use in larger workgroups or enterprise-wide.

New predictive analytics features added to the product, as listed today by Lavastorm, include the following (a generic sketch of these techniques follows the list):

  • Linear Regression: Calculate a line of best fit to estimate the values of a variable of interest.
  • Logistic Regression: Calculate probabilities of binary outcomes.
  • K-Means Clustering: Form a user-specified number of clusters out of data sets based on user-defined criteria.
  • Hierarchical Clustering: Form a user-specified number of clusters out of data sets by using an iterative process of cluster merging.
  • Decision Tree: Predict outcomes by identifying patterns from an existing data set.
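Purely as an illustration of the same five techniques, here is a rough scikit-learn sketch (my own; Lavastorm’s components are drag-and-drop, and this is not its API):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for a business data set
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y_cont = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
y_bin = (y_cont > 0).astype(int)

line = LinearRegression().fit(X, y_cont)             # line of best fit
probs = LogisticRegression().fit(X, y_bin)           # probabilities of binary outcomes
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)  # user-specified k
hier = AgglomerativeClustering(n_clusters=3).fit(X)  # iterative cluster merging
tree = DecisionTreeClassifier(max_depth=3).fit(X, y_bin)  # pattern-based prediction

print(line.coef_, probs.predict_proba(X[:2]),
      kmeans.labels_[:5], hier.labels_[:5], tree.predict(X[:2]))
```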

These and other new features are available today, Lavastorm said, with more analytical component enhancements to the library on tap.

The company said its approach to democratizing predictive analytics gives business analysts drag-and-drop capabilities specifically designed to help them master predictive analytics.

“The addition of this capability within the Lavastorm Analytics Engine’s visual, data flow-driven approach enables a fundamentally new method for authoring advanced analyses by providing a single shared canvas upon which users with complementary skill sets can collaborate to rapidly produce robust, trusted analytical applications,” the company said.

About the Author: David Ramel is an editor and writer for 1105 Media.

Originally posted via “Lavastorm Democratizing Big Data Analytics in Face of Skills Shortage”


Feb 22, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Accuracy

[ NEWS BYTES]

>> HP stealthily installs new spyware called HP Touchpoint Analytics Client – Computerworld Under Analytics

>> Customer segmentation with big data at hand – Business Matters Under Prescriptive Analytics

>> 5 ways analytics can help health systems optimize their collection strategies – Becker’s Hospital Review Under Analytics


[ FEATURED COURSE]

Hadoop Starter Kit


Hadoop learning made easy and fun. Learn HDFS, MapReduce and introduction to Pig and Hive with FREE cluster access…. more

[ FEATURED READ]

Data Science from Scratch: First Principles with Python


Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn … more

[ TIPS & TRICKS OF THE WEEK]

Data aids, not replaces, judgement
Data is a tool and a means to help build consensus and facilitate human decision-making, not to replace it. Analysis converts data into information; information via context leads to insight. Insights lead to decisions, which ultimately lead to outcomes that bring value. So data is just the start; context and intuition also play a role.

[ DATA SCIENCE Q&A]

Q: Is it better to spend 5 days developing a 90% accurate solution, or 10 days for 100% accuracy? Does it depend on the context?
A: * “Premature optimization is the root of all evil”
* At the beginning, a quick-and-dirty model is better
* Optimize later
Other answer:
– It depends on the context
– Is error acceptable? (e.g., fraud detection, quality assurance)


[ VIDEO OF THE WEEK]

Rethinking classical approaches to analysis and predictive modeling


[ QUOTE OF THE WEEK]

Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom. – Clifford Stoll

[ PODCAST OF THE WEEK]

Venu Vasudevan @VenuV62 (@ProcterGamble) on creating a rockstar data science team #FutureOfData #Podcast


[ FACT OF THE WEEK]

100 terabytes of data are uploaded to Facebook daily.

Sourced from: Analytics.CLUB #WEB Newsletter

2018 Trends in Data Governance: Heightened Expectations

Organizations will contend with an abundance of trends impacting data governance in the coming year. The data landscape has effectively become decentralized, producing more data, quicker, than it ever has before. Ventures in the Internet of Things and Artificial Intelligence are reinforcing these trends, escalating the need for consistent data governance. Increasing regulatory mandates such as the General Data Protection Regulation (GDPR) compound this reality.

Other than regulations, the most dominant trend affecting data governance in the new year involves customer experience. The demand to reassure consumers that organizations have effective, secure protocols in place to safely govern their data has never been higher in the wake of numerous security breaches.

According to Stibo Systems Chief Marketing Officer Prashant Bhatia, “Our expectations, both as individuals as well as from a B2B standpoint, are only getting higher. In order for companies to keep up, they’ve got to have [governance] policies in place. And, consumers want to know that whatever data they share with a third party is trusted and secure.”

The distributed nature of consumer experience—and the heightened expectations predicated on it—is just one of the many drivers for homogeneous governance throughout a heterogeneous data environment. Governing that data in a centralized fashion may be the best way of satisfying the decentralized necessities of contemporary data processes because, according to Bhatia:

“Now you’re able to look at all of those different types of data and data attributes across domains and be able to centralize that, cleanse it, get it to the point where it’s usable for the rest of the enterprise, and then share that data out across the systems that need it regardless of where they are.”

Metadata Management Best Practices
The three preeminent aspects of a centralized approach to governing data are the deployment of a common data model, common taxonomies, and “how you communicate that data for…integration,” Bhatia added. Whether integrating (or aggregating) data between different sources either within or outside of the enterprise, metadata management plays a crucial role in doing so effectually. The primary advantage metadata yields in this regard is in contextualizing the underlying data to clarify both their meaning and utility. “Metadata is a critical set of attributes that helps provide that overall context as to why a piece of data matters, and how it may or may not be used,” Bhatia acknowledged. Thus, in instances in which organizations need to map to a global taxonomy—such as for inter-organizational transmissions between supply chain partners or to receive data from global repositories established between companies—involving metadata is of considerable benefit.

According to Bhatia, metadata “has to be accounted for in the overall mapping because ultimately it needs to be used or associated with throughout any other business process that happens within the enterprise. It’s absolutely critical because metadata just gives you that much more information for contextualization.” When attempting to integrate or aggregate various decentralized sources, such an approach is also useful. Mapping between varying taxonomies and data models becomes essential when bringing sources from decentralized environments into a centralized one, as does involving metadata in these efforts. Mapping metadata is so advantageous because “the more data you can have, the more context you can have, the more accurate it is, [and] the better you’re going to be able to use it within a… business process going forward,” Bhatia mentioned.
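As a toy illustration of mapping a local taxonomy onto a shared global one while carrying the metadata along (all attribute names here are invented):

```python
# Hypothetical local record with its metadata
record = {
    "sku": "A-1001",
    "category": "fizzy drinks",          # local taxonomy term
    "metadata": {"source": "eu_catalog", "last_cleansed": "2017-11-02"},
}

# Mapping from local taxonomy terms to the shared global taxonomy
LOCAL_TO_GLOBAL = {"fizzy drinks": "beverages/carbonated", "crisps": "snacks/chips"}

def to_global(record: dict) -> dict:
    """Re-express a record in the global taxonomy, preserving its metadata."""
    mapped = dict(record)
    mapped["category"] = LOCAL_TO_GLOBAL[record["category"]]
    # Record the original term in the metadata so context is not lost
    mapped["metadata"] = {**record["metadata"], "mapped_from": record["category"]}
    return mapped

print(to_global(record))
```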

Regulatory Austerity
Forrester’s 2018 predictions identify the GDPR as one of the fundamental challenges organizations will contend with in the coming year. The GDPR issue is so prominent because it exists at the juncture between a number of data governance trends. It represents the greater need to satisfy consumer expectations as part of governance, alludes to the nexus between governance and security for privacy concerns, and illustrates the overarching importance of regulations in governance programs. The European Union’s GDPR creates stringent mandates about how consumer information is stored and what rights people have regarding data about them. Its penalties are some of the more convincing drivers for formalizing governance practices.

“Once the regulation is in place, you no longer have a choice,” Bhatia remarked about the GDPR. “Whether you are a European company or you have European interactions, the fact of the matter is you’ve got to put governance in place because the integrity of what you’re sending, what you’re receiving, when you’re doing it, and how you’re doing it…All those things no longer becomes a ‘do I need to’, but now ‘I have to’.” Furthermore, the spring 2018 implementation of GDPR highlights the ascending trend towards regulatory compliance—and stiff penalties—associated with numerous vertical industries. Centralized governance measures are a solution for providing greater utility for the data stewardship and data lineage required for compliance.

Data Stewardship
The focus on regulations and distributed computing environments only serves to swell the overall complexity of data stewardship in 2018. However, dealing with decentralized data sources in a centralized manner abets the role of a data steward in a number of ways. Stewards primarily exist to implement and maintain the policies handed down by governance councils. Centralizing data management and its governance via the plethora of means available for doing so today (including Master Data Management, data lakes, enterprise data fabrics and others) enables the enterprise to “cultivate the data stewardship aspect into something that’s executable,” Bhatia said. “If you don’t have the tools to actually execute and formalize a governance process, then all you have is a process.” Conversely, the stewardship role is so pivotal because it supervises those processes at the point where they converge with technological action. “If you don’t have the process and the rules of engagement to allow the tools to do what they need to do, all you have is the technology,” Bhatia reflected. “You don’t have a solution.”

Data Lineage
One of the foremost ways in which data stewards can positively impact centralized data governance—as opposed to parochial, business unit or use case-based governance—is by facilitating data provenance. Doing so may actually be the most valuable part of data stewardship, especially when one considers the impact of data provenance on regulatory compliance. According to Bhatia, provenance factors into “ensuring that what was expected to happen did happen” in accordance with governance mandates. Tracing how data was used, stored, transformed, and analyzed can deliver insight vital to regulatory reporting. Evaluating data lineage is a facet of stewardship that “measures the results and the accuracy [of governance measures] by which we can determine have we remained compliant and have we followed the letter of the law,” commented Bhatia. Without this information gleaned from data provenance capabilities, organizations “have a flawed process in place,” Bhatia observed.

As such, there is a triad between regulations, stewardship, and data provenance. Addressing one of these realms of governance will have significant effects on the other two, especially when leveraging centralized means of effecting the governance of distributed resources. “The ability to have a history of where data came from, where it might have been cleansed and how it might emerge, who it was shared with and when it was shared, all these different transactions and engagements are absolutely critical from a governance and compliance standpoint,” Bhatia revealed.
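A bare-bones sketch of the kind of lineage trail described here (where data came from, how it was cleansed, who it was shared with and when); the field names are my own invention:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageEvent:
    dataset: str
    action: str          # e.g. "ingested", "cleansed", "shared"
    actor: str
    detail: str = ""
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# An audit trail is just an append-only list of such events
trail = [
    LineageEvent("customers", "ingested", "etl_job_7", "from CRM export"),
    LineageEvent("customers", "cleansed", "steward_jane", "dedup + address fix"),
    LineageEvent("customers", "shared", "api_gateway", "sent to partner X"),
]
for e in trail:
    print(e.at.isoformat(), e.dataset, e.action, e.actor, e.detail)
```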

Governance Complexities
The complexities attending data governance in the next couple of years show few signs of decreasing. Organizations are encountering more data than ever before from a decentralized paradigm characterized by an array of on-premise and cloud architectures that complicate various facets of governance hallmarks such as data modeling, data quality, metadata management, and data lineage. Moreover, data is produced much more rapidly than before with an assortment of machine-generated streaming options. When one considers the expanding list of regulatory demands and soaring consumer expectations for governance accountability, the pressures on this element of data management become even more pronounced. Turning to a holistic, centralized means of mitigating the complexities of today’s data sphere may be the most viable way of effecting data governance.

“As more data gets created the need, which was already high, for having a centralized platform to share data and push it back out, only becomes more important,” Bhatia said.

And, with an assortment of consumers, regulators, and C-level executives evincing a vested interest in this process, organizations won’t have many chances to do so correctly.

Originally Posted at: 2018 Trends in Data Governance: Heightened Expectations by jelaniharper

Feb 15, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data security


[ AnalyticsWeek BYTES]

>> Big big love, how big data’s influencing the future of the online dating scene by analyticsweekpick

>> From Data Scientist to Diplomat by tony

>> Wrapping my head around Big-data problem by v1shal


[ NEWS BYTES]

>> Equinix Agrees to Buy Australian Data Center Firm Metronode for … – Data Center Knowledge Under Data Center

>> TIBCO Named a Leader in Streaming Analytics by Top Independent Research Firm – CSO Australia Under Streaming Analytics

>> Twistlock Ties Container and Serverless Security Into a Single Platform – SDxCentral Under Cloud Security


[ FEATURED COURSE]

Tackle Real Data Challenges


Learn scalable data management, evaluate big data technologies, and design effective visualizations…. more

[ FEATURED READ]

Data Science from Scratch: First Principles with Python


Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn … more

[ TIPS & TRICKS OF THE WEEK]

Data aids, not replaces, judgement
Data is a tool and a means to help build consensus and facilitate human decision-making, not to replace it. Analysis converts data into information; information via context leads to insight. Insights lead to decisions, which ultimately lead to outcomes that bring value. So data is just the start; context and intuition also play a role.

[ DATA SCIENCE Q&A]

Q: How do you know if one algorithm is better than another?
A: * In terms of performance on a given data set?
* In terms of performance on several data sets?
* In terms of efficiency?

In terms of performance on several data sets:
– “Does learning algorithm A have a higher chance of producing a better predictor than learning algorithm B in the given context?”
– “Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets”, A. Lacoste and F. Laviolette
– “Statistical Comparisons of Classifiers over Multiple Data Sets”, Janez Demsar

In terms of performance on a given data set:
– One wants to choose between two learning algorithms
– Need to compare their performances and assess the statistical significance

One approach (not preferred in the literature):
– Multiple k-fold cross-validation: run CV multiple times and take the mean and sd
– You have: algorithm A (mean and sd) and algorithm B (mean and sd)
– Is the difference meaningful? (paired t-test)

Sign test (classification context):
Simply counts the number of times A has a better metric than B and assumes this count comes from a binomial distribution. Then we can obtain a p-value for the H0 test: A and B are equal in terms of performance.

Wilcoxon signed-rank test (classification context):
Like the sign test, but the wins (A is better than B) are weighted and assumed to come from a symmetric distribution around a common median. Then we obtain a p-value for the H0 test.

Other (without hypothesis testing):
– AUC
– F-score
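A minimal SciPy sketch of the paired comparisons above, assuming per-fold metric scores for algorithms A and B from repeated cross-validation (the numbers are made up):

```python
import numpy as np
from scipy import stats

# Hypothetical per-fold accuracies from repeated k-fold CV
a = np.array([0.81, 0.79, 0.84, 0.78, 0.82, 0.80, 0.83, 0.79, 0.81, 0.80])
b = np.array([0.78, 0.77, 0.80, 0.76, 0.79, 0.78, 0.81, 0.75, 0.77, 0.78])

# Paired t-test on the fold-wise differences
t, p_t = stats.ttest_rel(a, b)

# Sign test: wins of A over B under a Binomial(n, 0.5) null (scipy >= 1.7)
wins = int((a > b).sum())
n = int((a != b).sum())
p_sign = stats.binomtest(wins, n, 0.5).pvalue

# Wilcoxon signed-rank test: weighted wins, symmetric-distribution assumption
w, p_w = stats.wilcoxon(a, b)

print(f"paired t-test p={p_t:.4f}, sign test p={p_sign:.4f}, Wilcoxon p={p_w:.4f}")
```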


[ VIDEO OF THE WEEK]

Andrea Gallego(@risenthink) / @BCG on Managing Analytics Practice #FutureOfData #Podcast


[ QUOTE OF THE WEEK]

Without big data, you are blind and deaf and in the middle of a freeway. – Geoffrey Moore

[ PODCAST OF THE WEEK]

#FutureOfData with @CharlieDataMine, @Oracle discussing running analytics in an enterprise


[ FACT OF THE WEEK]

Distributed computing (performing computing tasks using a network of computers in the cloud) is very real. Google uses it every day to involve about 1,000 computers in answering a single search query, which takes no more than 0.2 seconds to complete.

Sourced from: Analytics.CLUB #WEB Newsletter

Know your Enterprise Feedback Management Provider


Provider 1 Word Cloud - TOP WORDS: CUSTOMER, EXPERIENCE, MANAGEMENT, SOLUTIONS, SERVICES
Provider 2 Word Cloud - TOP WORDS: CUSTOMER, CONTACT, CENTER, SERVICES, PARTNERS
Provider 3 Word Cloud - TOP WORDS: SURVEY, ONLINE, FEEDBACK, CUSTOMER, MANAGEMENT
Provider 4 Word Cloud - TOP WORDS: CUSTOMER, SURVEYS, ENGAGE, SOFTWARE, FEEDBACK
Provider 5 Word Cloud - TOP WORDS: CUSTOMER, SURVEYS, FEEDBACK, MANAGEMENT, MEASURE
Provider 6 Word Cloud - TOP WORDS: BUSINESS, SERVICES, CUSTOMER, SUPPORT, EXPERIENCE
Provider 7 Word Cloud - TOP WORDS: RESEARCH, CUSTOMER, MANAGEMENT, MARKET, FEEDBACK

I recently wrote about the value of Enterprise Feedback Management vendors. EFM is the process of collecting, managing, analyzing and disseminating feedback from different sources (e.g., customers, employees, partners). EFM vendors help companies facilitate their customer experience management (CEM) and voice of the customer (VoC) efforts, hoping to improve the customer experience and increase customer loyalty. This week, I take a non-scientific approach to understanding the EFM space, wondering how EFM/CEM vendors try to differentiate themselves from each other.

Using a word cloud-generating site, tagxedo.com, I created word clouds for 7 EFM/CEM vendors based on content from their respective Web sites. Word clouds are used to visualize free-form text. I generated the word clouds by simply inputting each vendor’s URL into the tagxedo.com site (done on 7/15/2011 – prior to the announcement of the Vovici acquisition by Verint). I used the same tagxedo.com parameters when generating each vendor’s word cloud. For each word cloud, I manually removed company/proper names and trademarked words (e.g., Net Promoter Score) that would easily identify the vendor. The resulting word clouds appear to the right (labeled Provider 1 through 7). The seven vendors I used in this exercise are (in alphabetical order):

  • Allegiance
  • Attensity
  • MarketTools
  • Medallia
  • Mindshare
  • Satmetrix
  • Vovici
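The original clouds were generated with tagxedo.com; as a rough reproduction of the same exercise, the open-source wordcloud Python package can build a cloud from a page’s text (the package choice and URL below are my own, not what was used here):

```python
# pip install wordcloud requests beautifulsoup4
import requests
from bs4 import BeautifulSoup
from wordcloud import WordCloud, STOPWORDS

def cloud_from_url(url: str, extra_stopwords=()) -> WordCloud:
    """Fetch a page, strip the HTML, and build a word cloud from its text."""
    html = requests.get(url, timeout=10).text
    text = BeautifulSoup(html, "html.parser").get_text(" ")
    stops = STOPWORDS | set(extra_stopwords)  # e.g. company names, trademarks
    return WordCloud(stopwords=stops, max_words=100).generate(text)

# Hypothetical usage: one cloud per vendor, identifying names removed
cloud = cloud_from_url("https://vendor.example.com", extra_stopwords={"acme"})
cloud.to_file("provider.png")
```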

Can you match the vendors to their word cloud? Can you even identify the vendor your company uses (given it’s in the list, of course)? Answers to the word cloud matching exercise appear at the end of this post.

Before you read the answers, here is some help. There is clearly much similarity among these EFM vendors. They all do similar things: they use technology to capture, analyze and disseminate feedback. Beyond their core solutions, how do they try to differentiate themselves? Giving the word clouds the standard inter-ocular test (aka eyeballing the data), I noticed that, although “Customer” appears as a top word for all vendors, some top words are unique to a particular vendor:

  • Provider 1: Experience and Solutions
  • Provider 2: Contact, Center and Partners
  • Provider 3: Online
  • Provider 4: Engage and Software
  • Provider 5: Measure
  • Provider 6: Business
  • Provider 7: Market and Research

Maybe this differentiation, however subtle, can help you with the matching exercise. Let me know how you did. If you have thoughts on how EFM/CEM vendors can better differentiate themselves from the pack, please share them. More importantly, how can these vendors provide more value to their customers? One way is to help their clients integrate their technology solutions into their VoC program. Those EFM vendors who can do that will be more likely to succeed than those who simply want to sell technology as a solution (remember the CRM problem?).

—–

Answers to the EFM/CEM vendor word clouds:  Medallia (Provider 1); Attensity (Provider 2); Vovici (Provider 3); Allegiance (Provider 4); Mindshare (Provider 5); Satmetrix (Provider 6); MarketTools (Provider 7)

Source: Know your Enterprise Feedback Management Provider by bobehayes