Smart Data Modeling: From Integration to Analytics

There are numerous reasons why smart data modeling, which is predicated on semantic technologies and open standards, is one of the most effective approaches to everything from integration to analytics in data management.

  • Business-Friendly—Smart data models are innately understood by business users. These models describe entities and their relationships to one another in terms that business users are familiar with, which serves to empower this class of users in myriad data-driven applications.
  • Queryable—Semantic data models are able to be queried, which provides a virtually unparalleled means of determining provenance, source integration, and other facets of regulatory compliance.
  • Agile—Ontological models readily evolve to include additional business requirements, data sources, and even other models. Thus, modelers are not responsible for defining all requirements upfront, and can easily modify them at the pace of business demands.

According to Cambridge Semantics Vice President of Financial Services Marty Loughlin, the most frequently cited benefit of this approach to data modeling is operational: “There are two examples of the power of semantic modeling of data. One is being able to bring the data together to ask questions that you haven’t anticipated. The other is using those models to describe the data in your environment to give you better visibility into things like data provenance.”

Implicit in those advantages is an operational efficacy that pervades most aspects of the data sphere.

Smart Data Modeling
The operational applicability of smart data modeling hinges on its flexibility. Semantic models, also known as ontologies, exist independently of infrastructure, vendor requirements, data structure, or any other characteristic related to IT systems. As such, they can incorporate attributes from all systems or data types in a way that is aligned with business processes or specific use cases. “This is a model that makes sense to a business person,” Loughlin revealed. “It uses terms that they’re familiar with in their daily jobs, and is also how data is represented in the systems.” Even better, semantic models do not necessitate all modeling requirements prior to implementation. “You don’t have to build the final model on day one,” Loughlin mentioned. “You can build a model that’s useful for the application that you’re trying to address, and evolve that model over time.” That evolution can include other facets of conceptual models, industry-specific models (such as FIBO), and aspects of new tools and infrastructure. The combination of smart data modeling’s business-first approach, adaptable nature and relatively rapid implementation speed is greatly contrasted with typically rigid relational approaches.
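The incremental evolution Loughlin describes can be pictured with a toy schema-as-data sketch (the class and property names below are invented for illustration): because a semantic model is itself data, adding a concept is an insert rather than a migration.

```python
# Day-one model: only what the first application needs.
model = {
    "classes": {"Customer": {"properties": ["name", "accountId"]}},
}

# Later, evolve the model in place: a new class and a new relationship,
# perhaps borrowed from an industry ontology (FIBO-style concepts).
model["classes"]["Loan"] = {"properties": ["principal", "rate"]}
model["classes"]["Customer"]["properties"].append("holdsLoan")

print(sorted(model["classes"]))                    # ['Customer', 'Loan']
print(model["classes"]["Customer"]["properties"])  # ['name', 'accountId', 'holdsLoan']
```

Contrast this with a relational schema, where the same change would typically mean an ALTER TABLE migration coordinated across every dependent system.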

Smart Data Integration and Governance
Perhaps the most cogent application of smart data modeling is its deployment as a smart layer between any variety of IT systems. By utilizing platforms reliant upon semantic models as a staging layer for existing infrastructure, organizations can simplify data integration while adding value to their existing systems. The key to integration frequently depends on mapping. When mapping from source to target systems, organizations have traditionally relied upon experts from each of those systems to create what Loughlin called “a source-to-target document” for transformation, which is given to developers to facilitate ETL. “That process can take many weeks, if not months, to complete,” Loughlin remarked. “The moment you’re done, if you need to make a change to it, it can take several more weeks to cycle through that iteration.”

However, since smart data modeling involves common models for all systems, integration merely requires mapping source and target systems to that common model. “Using common conceptual models to drive existing ETL tools, we can provide high quality, governed data integration,” Loughlin said. The ability of integration platforms based on semantic modeling to automatically generate the code for ETL jobs not only reduces time to action, but also increases data quality while reducing cost. Additional benefits include the relative ease with which systems and infrastructure are added to this process, the option of deploying smart models as a catalog for data mart extraction, and the means to avoid lock-in with any particular ETL vendor.

Smart Data Analytics—System of Record
The components of data quality and governance that are facilitated by deploying semantic models as the basis for integration efforts also extend to analytics. Since the underlying smart data models are able to be queried, organizations can readily determine provenance and audit data through all aspects of integration—from source systems to their impact on analytics results. “Because you’ve now modeled your data and captured the mapping in a semantic approach, that model is queryable,” Loughlin said. “We can go in and ask the model where data came from, what it means, and what transformation happened to that data.” Smart data modeling thereby provides a system of record well suited to the nature of the analytics involved. As Loughlin explained, “You’re bringing the data together from various sources, combining it together in a database using the domain model the way you described your data, and then doing analytics on that combined data set.”
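A minimal sketch of what “the model is queryable” means in practice, using an invented toy triple store in place of a real RDF/SPARQL stack (every entity and predicate name below is hypothetical):

```python
# Toy triple store: facts about where a report's data came from and what
# it means. A production system would hold these as RDF and use SPARQL.
TRIPLES = [
    ("report:Q3Revenue", "derivedFrom", "table:CRM.Sales"),
    ("report:Q3Revenue", "derivedFrom", "table:ERP.Invoices"),
    ("table:CRM.Sales",  "loadedVia",   "job:NightlyETL"),
    ("table:CRM.Sales",  "means",       "Closed sales transactions"),
]

def query(s=None, p=None, o=None):
    """Match triples; None acts as a wildcard, like a SPARQL variable."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Ask the model where data came from, what it means..."
sources = [o for _, _, o in query("report:Q3Revenue", "derivedFrom")]
meaning = query("table:CRM.Sales", "means")[0][2]
print(sources)   # ['table:CRM.Sales', 'table:ERP.Invoices']
print(meaning)   # Closed sales transactions
```

Because the mapping itself lives in the model, the same query mechanism answers both analytics questions and audit questions.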

Smart Data Graphs
By leveraging these models on a semantic graph, users are able to reap a host of analytics benefits that they otherwise couldn’t, because such graphs are focused on the relationships between nodes. “You can take two entities in your domain and say, ‘find me all the relationships between these two entities’,” Loughlin commented about solutions that leverage smart data modeling in RDF graph environments. Consequently, users are able to discover relationships that they did not know existed, and to ask more questions based on those relationships than they otherwise could. The result is richer analytics, grounded in the context between relationships that the underlying smart data models provide. The nature and number of questions asked, as well as the sources incorporated for such queries, are virtually unlimited. “Semantic graph databases, from day one have been concerned with ontologies…descriptions of schema so you can link data together,” explained Franz CEO Jans Aasman. “You have descriptions of the object and also metadata about every property and attribute on the object.”
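The “find me all the relationships between these two entities” query Loughlin describes can be sketched over a toy graph (entity and edge names invented; a real deployment would use SPARQL property paths over an RDF store):

```python
from collections import deque

# Toy labeled graph; all names are invented for illustration.
EDGES = [
    ("Alice", "worksFor", "AcmeBank"),
    ("Bob",   "worksFor", "AcmeBank"),
    ("Alice", "owns",     "Account42"),
    ("Account42", "heldAt", "AcmeBank"),
]

def paths_between(start, goal, max_hops=3):
    """Return every edge-label path from start to goal, treating edges
    as undirected so indirect relationships are also discovered."""
    nbrs = {}
    for s, p, o in EDGES:
        nbrs.setdefault(s, []).append((p, o))
        nbrs.setdefault(o, []).append((p, s))   # traverse both directions
    results = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if node == goal and path:
            results.append(path)   # found a relationship chain; stop here
            continue
        if len(path) < max_hops:
            for label, nxt in nbrs.get(node, []):
                queue.append((nxt, path + [label]))
    return results

# Both the direct link and the indirect one via the account are found.
print(paths_between("Alice", "AcmeBank", max_hops=2))
# [['worksFor'], ['owns', 'heldAt']]
```

The second result is the kind of relationship a user “did not know existed”: Alice is connected to the bank not only as an employee but through an account she owns.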

Modeling Models
When one considers the different facets of modeling that smart data modeling includes—business models, logical models, conceptual models, and many others—it becomes apparent that the true utility in this approach is an intrinsic modeling flexibility upon which other approaches simply can’t improve. “What we’re actually doing is using a model to capture models,” Cambridge Semantics Chief Technology Officer Sean Martin observed. “Anyone who has some form of a model, it’s probably pretty easy for us to capture it and incorporate it into ours.” The standards-based approach of smart data modeling provides the sort of uniform consistency required at an enterprise level, which functions as means to make data integration, data governance, data quality metrics, and analytics inherently smarter.


Hacking the Data Science


In my previous blog on the convoluted world of the data scientist, I shed some light on who exactly a data scientist is. There was a brief mention of how the data scientist is a mix of business intelligence, statistical modeling, and computer-savvy IT skills. The write-up discussed how businesses should look at data science as a capability in their workforce, rather than at data scientist as a job title. One area that has not been given its due is how to get going in building a data science practice, and how businesses should proceed in filling their data science space. So, in this blog, I will spend some time explaining an easy hack to get going on your data science journey without bankrupting yourself by hiring a boatload of data scientists.

Let’s first revisit what is already published in this area. A quick image comes to mind: the one that shows data science as three overlapping circles. One is business, one is the statistical modeler, and one is technology, and the common area shared by all three is labeled data science. This is a great representation of where data science lies, but it sometimes confuses the viewer as well. From the look of it, one could guess that the overlapping region comprises professionals who possess all three talents—that it is about people. Whereas all it is really suggesting is that the overlap region contains common use cases that require all three areas: business, statistics, and technology.

Also, the image of three overlapping circles does not convey the complete story. It suggests an overlap of some common use cases, but not how resources will work across the three silos. We need a better representation to convey the accurate story, one that will help us understand how businesses should go about attacking the data science vertical effectively—a representation that keeps in mind the workforce these circles represent. Let’s call the resources in these three broad silos: business intelligence folks, represented by the business circle; statistical modelers, represented by the statistician circle; and IT/computer engineers, represented by the technology circle. For simplicity, let’s assume these circles are not touching each other and are drawn as three separate islands. This provides a clear canvas of where we are and where we need to get.

This journey from three separated circles to three overlapping circles signals how to achieve this capability from the talent perspective. We are given three separate islands and a task to join them. A few alternatives come to mind:

1. Hire data scientists and have them build their own circle in the middle. Let them keep expanding their capability in all three directions (business, statistics, and technology). This keeps growing the middle bubble to the point where it touches and overlaps all three circles. This particular approach has resulted in the mass hiring of data scientists by mid-to-large-scale enterprises. Many of these data scientists were not given real use cases and are left trying to work out how to bring the three circles closer to make the overlap happen. It does not take long to see that this is not the most efficient process, as it costs businesses a bundle. The method sounds juicy because it gives human resources a good night’s sleep: HR gets to acquire data scientist talent in an area that is high in demand. But now everyone has to work with those hires and teach them the three circles and the culture associated with them. The good thing about this method is that these professionals are extremely skilled and can get rolling pretty quickly. However, they may take their own pace when it comes to identifying use cases and justifying their investment.

2. Another approach is to set aside some professionals from each circle and have them start digging their way toward the common area where all three can meet and learn from each other. Collaboration brings out the best in most people. Working collaboratively, these professionals bring their respective cultures and their subject-matter expertise in their lines of business to the mix. This method looks like the optimal solution, as it requires no outside help, and it does provide an organic way to build data science capability. But it could take forever before the three camps get on the same page, and this slowness trips up what would otherwise be one of the most efficient methods.

3. If one method is expensive but fast and the other is cost-effective but slow, what is the best method? It lies somewhere on the slider between fast-and-expensive and slow-and-cost-effective, so a hybrid looks to bring the best of both worlds. Having a data scientist and a council of SMEs from the respective circles work together keeps expenses in check and at the same time brings the three camps closer faster via their SMEs. How many data scientists to begin with? The answer depends on the company’s size, complexity, and wallet. You could further hack the process by hiring a contract data scientist to work as a liaison until the three camps find their overlap in a professional capacity. This is the method businesses could explore to hack the data science world and make themselves big-data-compliant and data-driven.

So, experimenting with a hybrid of shared responsibility among the three circles of business, statistics, and technology, with a data scientist as a manager, will bring businesses up to speed when it comes to adapting themselves to the big data way of doing things.

Source by v1shal

New MIT algorithm rubs shoulders with human intuition in big data analysis

We all know that computers are pretty good at crunching numbers. But when it comes to analyzing reams of data and looking for important patterns, humans still come in handy: We’re pretty good at figuring out what variables in the data can help us answer particular questions. Now researchers at MIT claim to have designed an algorithm that can beat most humans at that task.


Max Kanter, who created the algorithm as part of his master’s thesis at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) along with his advisor Kalyan Veeramachaneni, entered the algorithm into three major big data competitions. In a paper to be presented this week at IEEE International Conference on Data Science and Advanced Analytics, they announced that their “Data Science Machine” has beaten 615 of the 906 human teams it’s come up against.

The algorithm didn’t get the top score in any of its three competitions. But in two of them, it created models that were 94 percent and 96 percent as accurate as those of the winning teams. In the third, it managed to create a model that was 87 percent as accurate. The algorithm used raw datasets to make models predicting things such as when a student would be most at risk of dropping an online course, or what indicated that a customer during a sale would turn into a repeat buyer.

Kanter and Veeramachaneni’s algorithm isn’t meant to throw human data scientists out — at least not anytime soon. But since it seems to do a decent job of approximating human “intuition” with much less time and manpower, they hope it can provide a good benchmark.


“If the Data Science Machine performance is adequate for the purposes of the problem, no further work is necessary,” they wrote in the study.

That might not be sufficient for companies relying on intense data analysis to help them increase profits, but it could help answer data-based questions that are being ignored.

“We view the Data Science Machine as a natural complement to human intelligence,” Kanter said in a statement. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”

This post has been updated to clarify that Kalyan Veeramachaneni also contributed to the study. 


Originally Posted at: New MIT algorithm rubs shoulders with human intuition in big data analysis

From Mad Men to Math Men: Is Mass Advertising Really Dead?

The Role of Media Mix Modeling and Measurement in Integrated Marketing.

At the end of the year, Ogilvy is coming out with a new book on the role of mass advertising in this new world of digital media, omni-channel, and quantitative marketing in which we now live. Forrester Research speculated about a more balanced approach to marketing measurement at its recent conference. Forrester proclaimed that gone are the days of the unmeasured Mad Men approach to advertising, with its large spending on ad buys that largely drove only soft metrics such as brand impressions and customer consideration. The new, balanced approach to ad measurement—a more mathematical approach in which programs have hard metrics, many of them financial (sales, ROI, quality customer relationships)—is here to stay. The hypothesis Forrester put forward in their talk was that marketing has almost gone too far toward quantitative marketing, and they suggested the best blend is one where marketing research as well as quantitative and behavioral data both have a role in measuring integrated campaigns. So what does that mean for mass advertising, you ask?

Firstly—and the ad agencies can breathe a sigh of relief—mass marketing is not dead, but it is subject to many new standards, namely:

  • Mass will be a smaller part of the omni-channel mix of activities that CMOs can allocate their spending toward, but that allocation should be guided by concrete measures; for very large buckets of spend, media mix or marketing mix modeling can help with decision making. The last statistic we saw from Forrester was that digital media spend was about to surpass, or had already surpassed, mass media ad spend. So CMOs should not be surprised to see SEM, Google AdWords, digital/social, and direct marketing make up a significant 50% or more of the overall investment.
  • CFOs are asking CMOs for the returns on programmatic and digital media buys. How much money does each media buy make, and how do you know it’s working? Gone are the days of “always on” mass advertising that could get away with reporting back only GRPs or brand health measures. The faster CMOs get on board with this shift, the better they can ensure a dynamic process for marketing investments.
  • A willingness to turn off and match-market test mass media to ensure that it is still working. Many firms need to assess whether TV and print work for their brand or campaign in light of their current target markets. Many ad agencies and marketing service providers (Nielsen, Acxiom, and many more) have advanced audience selection and matching tools to help with this problem. These tools typically integrate audience profiling as well as privacy-protected behavior information.
  • The need to run more integrated campaigns with standard offers across channels, and a smarter way of connecting the call to action and the drivers to omni-channel within each medium—for example, mentioning in other online advertising that consumers can download the firm’s app from the app store. The integration of channels within a campaign will require more complex measurement attribution rules as well as additional test marketing and test-and-learn principles.
In this post we briefly explore two of these changes, namely media mix modeling and advertising measurement options. If you want more specifics, please feel free to contact us.

Firstly, let’s establish that there is always a way to measure mass advertising, and that it is not true that you need to leave it turned on for all eternity to do so. For example, if you want to understand whether a media buy in NYC, or in a matched market like LA (large markets), brings in the level of sales and inquiries to other channels that you need at a certain threshold, we posit that you can always:

1. Conduct a simple pre- and post-campaign lift analysis to determine the level of sales and other performance metrics prior to, during, and after the ad campaign has run.
2. Hold out a group of matched markets to serve as a control for performance comparison against the market you are running the ad in.
3. Shut off the media in a market for a very brief period of time. This allows you to compare “dark” period performance with the “on” program period, with seasonality adjustments, to derive some intelligence about performance and perhaps create a performance factor or baseline from which to measure going forward. Such a factor can be leveraged for future measurement without shutting off programs. This is what we call dues-paying in marketing analytics: you may have to sacrifice a small amount of sales to read the results each year, but it is one way to ensure measurement of mass advertising.
4. Finally, study any behavioral changes in the cross-sell and upsell rates of current customers who may increase their relationship because of the campaign you are running.
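As a rough sketch of the matched-market comparison just described (all figures invented), the lift calculation itself is simple arithmetic: compare the test market's pre-to-post growth against a matched control market that saw no advertising.

```python
def matched_market_lift(test_pre, test_post, control_pre, control_post):
    """Lift = test-market growth relative to control-market growth."""
    test_growth = test_post / test_pre
    control_growth = control_post / control_pre
    return test_growth / control_growth - 1.0

# Hypothetical numbers: NYC ran the campaign; matched market LA did not.
lift = matched_market_lift(test_pre=100_000, test_post=126_500,
                           control_pre=80_000, control_post=88_000)
print(f"{lift:.1%}")   # 15.0%
```

The control market absorbs seasonality and macro effects, so the residual 15% is a cleaner read on the campaign than a raw pre/post comparison in the test market alone.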
Another point: enterprise marketing automation can help with the tracking and measuring of ad campaigns. For really large integrated marketing budgets, we recommend media mix modeling or marketing mix modeling. There are a number of firms (MarketShare Partners Inc. is one) that provide these models, and we can discuss them in future posts. The basic mechanics of marketing mix modeling (MMM) are as follows:

MMM uses econometrics to understand the relationship between sales and the various marketing tactics that drive sales. It:

  • Combines a variety of marketing tactics (channels, campaigns, etc.) with large data sets of historical performance data
  • Performs regression modeling and statistical analysis on the available data to estimate the impact of various promotional tactics on sales and to forecast future sets of promotional tactics
  • Predicts sales based on mathematical correlations to historical marketing drivers and market conditions
  • Ultimately uses predictive analytics to optimize future marketing investments to drive increases in sales, profits, and share

In practice, this analysis:

  • Allows you to understand ROI for each media channel, campaign, and execution strategy
  • Shows which media vehicles and campaigns are most effective at driving revenue, profit, and share
  • Shows what your incremental sales would be at different levels of media spend
  • Identifies the optimal spending mix by channel to generate the most sales
  • Establishes a link between individual drivers and sales
  • Allows you to identify a sales response curve to advertising
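As a minimal sketch of the regression step at the heart of MMM—one invented media driver, ordinary least squares, and none of the adstock or saturation transforms a real model would include:

```python
def fit_ols(spend, sales):
    """Closed-form simple linear regression: sales ~ intercept + slope*spend."""
    n = len(spend)
    mean_x = sum(spend) / n
    mean_y = sum(sales) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(spend, sales))
    var = sum((x - mean_x) ** 2 for x in spend)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical weekly ad spend ($k) vs. sales ($k), a perfectly linear toy series.
spend = [10, 20, 30, 40]
sales = [120, 190, 260, 330]
base, roi = fit_ols(spend, sales)
print(base, roi)         # 50.0 7.0  -> each $1k of spend adds ~$7k of sales
# The fitted response can then forecast sales at other spend levels:
print(base + roi * 50)   # 400.0
```

A production MMM extends this to many drivers at once (TV, search, social, price, seasonality), which is what lets it attribute incremental sales and optimize the mix across channels.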
So the good news is… mass advertising is far from dead! Its effectiveness will be looked at in the broader context of integrated campaigns, with an eye toward contributing hard returns such as more customers and quality relationships. In addition, mass advertising will be looked at in the context of how it integrates with digital: for example, when the firm runs a TV ad, do searches for the firm’s products and brand increase, and do those searches then convert to sales? The funnel in the new omni-channel world is still being established.

In summary, mass advertising must be understood at the segment, customer, and brand level, as we believe it has a role in the overall marketing mix when targeted and used in the most efficient way. A more thoughtful view of marketing efficiency is now emerging, including aspects such as matching the TV ad in the right channel to the right audience, understanding metrics and measures, and finding the integration points where mass can complement digital channels as part of an integrated omni-channel strategy. Viewing advertising as a discipline separate from digital marketing is on its way to disappearing, and marketers must be well versed in both online and offline, as the lines will continue to blur, change, and optimize. Flexibility is key. Organizations inside companies will continue to merge to reflect this integration and to avoid siloed thinking and sub-par results.

We look forward to dialoguing and getting your thoughts and experience on these topics, and to understanding counterpoints and other points of view.

Thanks Tony Branda


Mad Men and Math Men

Source: From Mad Men to Math Men: Is Mass Advertising Really Dead?

Better Recruiting Metrics Lead to Better Talent Analytics


According to Josh Bersin in Deloitte’s 2013 report, Talent Analytics: From Small Data to Big Data, 75% of HR leaders acknowledge analytics are important to the success of their organizations. But 51% have no formal talent analytics plans in place. Nearly 40% say they don’t have the resources to conduct sound talent analytics. And asked to rate their own workforce analytics skills, another 56% said “poor.”

As Bersin further noted in a recent PeopleFluent article, HR Forecast 2014, “Only 14% of the companies we studied are even starting to analyze people-related data in a statistical way and correlate it to business outcomes. The rest are still dealing with reporting, data cleaning and infrastructure challenges.”

There’s a striking gap between the large number of companies that recognize the importance of metrics and talent analytics and the smaller number that actually have the means and expertise to put them to use.

Yes, we do need to gather and maintain the right people data first, such as when and where applicants apply for jobs, and the specific skills an employee has. But data is just information captured by the recruiting system or software already in place. It doesn’t tell any story.

Compare data against goals or thresholds, and it turns into insight, a.k.a. workforce metrics: measurements with a goal in mind, otherwise known as key performance indicators (KPIs), which gauge quantifiable components of a company’s performance. Metrics reflect critical factors for success and help a company measure its progress toward strategic goals.

But here’s where it gets sticky. You don’t set off on a cross-country road trip until you know how to read the map.

For companies, it’s important to agree on the right business metrics, and it all starts with recruiting. Even with standard metrics for retention and attrition in place, some companies also track dozens of meaningless metrics, not tied to specific business goals and not helping to improve business outcomes.

I’ve seen recruiting organizations spend all their time in the metrics-gathering phase, and never get around to acting on the results — in industry parlance, “boiling the ocean.” You’re far better off gathering a limited number of metrics that you actually analyze and then act upon.

Today many organizations are focused on developing recruiting metrics and analytics because there’s so much data available on candidates and internal employees (regardless of classification). Based on my own recruiting experience and that of many other recruiting leaders, here are what I consider the Top 5 Recruiting Metrics:

1. New growth vs. attrition rates. What percentage of the positions you fill are new hires vs. attrition? This shows what true growth really looks like. If you are hiring mostly due to attrition, it would indicate that selection, talent engagement, development and succession planning need attention. You can also break this metric down by division/department, by manager and more.

2. Quality of hires. More and more, the holy grail of hiring. Happily, all measurable: what individual performances look like, how long they stay, whether or not they are top performers, what competencies comprise their performance, where are they being hired from and why.

3. Sourcing. Measuring not just the what but the why of your best talent pools: job boards, social media, other companies, current employees, etc. This metric should also be applied to quality of hire: you’ll want to know where the best candidates are coming from. Also, if you want to know the percentage rate for a specific source, divide the number of source hires by the number of external hires. (For example, total Monster job board hires divided by total external hires.)

4. Effectiveness ratio. How many openings do you have versus how many you’re actually filling?  You can also measure your recruitment rate by dividing the total number of new hires per year by the total number of regular headcount reporting to work each year. Your requisitions filled percent can be tallied by dividing the total number of filled requisitions by the total number of approved requisitions.

5. Satisfaction rating. An important one, because it’s often not paid much attention when your other metrics are in good shape. Satisfaction ratings can be gleaned from surveys of candidates, new hires and current employees looking for internal mobility. While your overall metrics may be positive, it’s important to find out how people experience your hiring process.
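The ratio metrics in items 3 and 4 above are simple divisions; a quick sketch with invented headcounts:

```python
def source_rate(source_hires, external_hires):
    """Share of external hires from one source (e.g., a job board), as a %."""
    return source_hires / external_hires * 100

def recruitment_rate(new_hires_per_year, regular_headcount):
    """New hires per year relative to regular headcount, as a %."""
    return new_hires_per_year / regular_headcount * 100

def requisitions_filled_pct(filled, approved):
    """Filled requisitions relative to approved requisitions, as a %."""
    return filled / approved * 100

print(source_rate(12, 48))              # 25.0 -> a quarter of external hires
print(recruitment_rate(30, 600))        # 5.0
print(requisitions_filled_pct(45, 60))  # 75.0
```

Tracked per quarter or per department, these few ratios already support the kind of trend analysis the article recommends over "boiling the ocean" with dozens of metrics.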

As your business leaves behind those tedious spreadsheets and manual reports and moves into Talent Analytics, metrics are going to be what feeds those results. Consider which metrics are the most appropriate for your business — and why. And then, the real analysis can begin, and help your organization make better talent-related decisions.

Article originally appeared HERE.

Source by analyticsweekpick

4 Tips to Landing Your Dream Job in Big Data


It is no longer enough to apply for a data-driven job armed with an engineering degree and the ability to program. Over the past few years, big data has expanded outside of IT and business analytics and into departments as diverse as marketing, manufacturing, product development and sales.

As demand for data expertise grows across industries, the bar for what makes an acceptable, or better yet excellent, hire rises at pace. Competition is fierce, particularly among high-growth startups that tout big paychecks, pre-IPO stock options and premium perks. The most coveted data jobs require not only hard skills — like coding in Python or knowing how to stand up a Hadoop cluster — but critical soft skills such as how you present your findings to your company colleagues.

Working in the database industry for the past four years, I have spent numerous hours with candidates vying for an opportunity to join the big-data movement. Building the next great database company requires more than hiring a competent team of engineers with diverse backgrounds, skills and experiences. You have to find people who dare to innovate and embrace unorthodox approaches to work.

After interviewing a range of professionals working in the big data arena, I have found four mantras that distinguish top candidates in the process:

1. Subscribe to a multi-model mindset, not a monolithic one.

We no longer live in a one-solution world. The best solutions are often a mix of ideas and strategies, coalesced from diverse sources. Modern challenges require a distributed approach, which mirrors the distributed nature of the technology platforms emerging today.

2. Have a thesis about data and its positive impact on the world.

Know why you want to work in big data. Your future colleagues want to hear a compelling narrative about the ways in which you believe data is shaping the world, and why it matters. Use concrete examples to bolster your thesis and highlight any work you have done with data previously.

3. Hit the circuit; get the t-shirt.

Making strong interpersonal connections has always been important in business, so hanging out behind your screen all day will not cut it. Make it a point to meet tons of people in the data field. Go to conferences, meetups and other gatherings and actively, as well as respectfully, engage on social-media channels. One of the strongest indicators for success within a technology company is if you were referred in by an existing team member, so it is worth expanding your network.

4. Get your hands dirty.

Download a new software package or try a new tool regularly. New technologies are readily available online on GitHub, and many vendors offer free community editions of their software. Build things. Break things. Fix things. Repeat. Post to your GitHub account, so your online profile gets noticed too. This is a great way to show you are a self-motivated, creative problem solver with an active toolkit at your disposal.

As you enter the big data world, it is important to stay in tune with what makes you unique. When in doubt, keep your high-level goals in mind and push yourself to learn new things, meet people outside of your regular band of cohorts and embrace the twists and turns of an evolving industry. You will be a standout in no time.


Source: 4 Tips to Landing Your Dream Job in Big Data by analyticsweekpick

The Evolution Of The Geek [Infographics]

The word geek is a slang term for odd or non-mainstream people, with different connotations ranging from “a computer expert or enthusiast” to “a person heavily interested in a hobby”, with a general pejorative meaning of “a peculiar or otherwise dislikable person, esp[ecially] one who is perceived to be overly intellectual”.[1]

Although often considered as a pejorative, the term is also often used self-referentially without malice or as a source of pride. – Wikipedia


Here is an infographic on “The Evolution of the Geek” by Visually.

The Evolution of the Geek

Source: The Evolution Of The Geek [Infographics]

A beginner’s guide to data analytics

This is Part I of our three-part June 2015 print cover story on healthcare analytics. Part I focuses on the first steps of launching an analytics program, Part II on intermediate strategies, and Part III on the advanced stages of analytics use.

This first part may sting a bit: To those healthcare organizations in the beginning stages of rolling out a data analytics program, chances are you’re going to do it completely and utterly wrong.

At least that’s according to Eugene Kolker, chief data scientist at Seattle Children’s Hospital, who has been working in data analysis for the past 25 years. Speaking about the initial metrics work, he tells Healthcare IT News: “The majority of places, whether they’re small or large, they’re going to do it wrong.” And when you’re dealing with people’s lives, that’s hardly something to take lightly.

Kolker would much prefer that not to be the case, but from his experience and what he’s seen transpire in the analytics arena across other industries, there are some unfortunate implications for healthcare beginners.



But it doesn’t have to be this way. Careful, methodical planning can position an organization for success, he said. But there are more than a few things you have to pay serious attention to.

First, you need to get executive buy-in. Data analytics can help the organization improve performance in myriad arenas. It can save money in the changing value-based reimbursement world. Better yet, it can save lives. And, if you’re trying to meet strategic objectives, it may be a significant part of the equation there too.

As Kolker pointed out in a presentation given at the April 2015 CDO Summit in San Jose, California, data and analytics should be considered a “core service,” similar to that of finance, HR and IT.

Once you get your executive buy-in, it’s on to the most important part of it all: the people. If you don’t have people who have empathy, if you don’t have a team that communicates and manages well, you can count on a failed program, said Kolker, who explained that this process took him years to get right. People. Process. Technology – in that order of importance.

“Usually data scientists are data scientists not because they like to work with people but because they like to work with data and computers, so it’s a very different mindset,” he said. It’s important, however, “to have those kind of people who can be compassionate,” who can do analysis without bias.

And why is that? “What’s the worst that can happen if Amazon screws up (with analytics)?” Kolker asked. “It’s not life and death like in healthcare,” where “it’s about very complex issues about very complex people. … The pressure for innovation is much much higher.”

[Part II: Clinical & business intelligence: the right stuff]

[Part III: Advanced analytics: All systems go]

When in the beginning stages of anything analytics, the aim is to start slow but not necessarily to start easy, wrote Steven Escaravage and Joachim Roski, principals at Booz Allen, in a 2014 Health Affairs piece on data analytics. Both have worked on 30 big data projects with various federal health agencies and put forth their lessons learned for those ready to take a similar path.

One of those lessons?

Make sure you get the right data that addresses the strategic healthcare problem you’re trying to measure or compare, not just the data that’s easiest to obtain.

“While this can speed up a project, the analytic results are likely to have only limited value,” they explained. “We have found that when organizations develop a ‘weighted data wish list’ and allocate their resources toward acquiring high-impact data sources as well as easy-to-acquire sources, they discover greater returns on their big data investment.”

So this may lead one to ask: What exactly is the right data? What metrics do you want? Don’t expect a clear-cut answer here, as it’s subjective by organization.

First, “you need to know the strategic goals for your business,” added Kolker. “Then you start working on them, how are you going to get data from your systems, how are you going to compare yourself outside?”

In his presentation at the CDO Summit this April, Kolker described one of Seattle Children’s data analytics projects that sought to evaluate the effectiveness of a vendor tool that predicted severe clinical deterioration, or SCD, of a child’s health versus the performance of a home-grown internal tool that had been used by the hospital since 2009.

After looking at cost, performance, development and maintenance, utility, EHR integration and algorithms, Kolker and his team found, on the buy-versus-build question, that the external vendor tool was not usable for predicting SCD but could be tested for something else. Furthermore, the home-grown tool needed to be integrated into the EHR.

Kolker and his team have also helped develop a metric to identify medically complex patients after the hospital’s chief medical officer came to them wanting to boost outcomes for these patients. Medically complex patients typically have high readmissions and consume considerable hospital resources, and SCH wanted to improve outcomes for this group without increasing costs for the hospital.

For folks at the Nebraska Methodist Health System, utilizing a population risk management application with a variety of built-in metrics was a big help, explained Linda Burt, chief financial officer of the health system, in a webinar from Healthcare IT News’ sister publication this past April.


“The common ones you often hear of such as admissions per 1,000, ED visits per 1,000, high-risk high end imaging per 1,000,” she said. Using the application, the health system was able to identify that a specific cancer presented the biggest opportunity for cost alignment.
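Metrics like these come down to simple rate arithmetic. As a sketch only (the function name and all member and event counts below are invented for illustration, not figures from the health systems in the article), an annualized per-1,000 rate can be computed from member-month exposure:

```python
# Hypothetical per-1,000 utilization metrics; every figure here is
# made up for illustration purposes.
def per_thousand(events: int, member_months: int) -> float:
    """Annualized events per 1,000 members, given member-month exposure."""
    member_years = member_months / 12      # convert exposure to years
    return events / member_years * 1000    # rate per 1,000 member-years

admissions_rate = per_thousand(events=540, member_months=72_000)
ed_visit_rate = per_thousand(events=2_100, member_months=72_000)
```

With 6,000 member-years of exposure, 540 admissions works out to roughly 90 admissions per 1,000 members per year.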

And health system CFO Katrina Belt’s advice? “We like to put a toe in the water and not do a cannon ball off the high dive,” she advised. Belt, the CFO at Baptist Health in Montgomery, Alabama, said a predictive analytics tool is sifting through various clinical and financial data to identify opportunities for improvement.

In a Healthcare Finance webinar this April, Belt said Baptist Health started by looking at its self-pay population and discovered that although its ER visits were declining, intensive care visits by patients with acute care conditions were on an upward trend.

Belt recommended starting with claims data.

“We found that with our particular analytics company, we could give them so much billing data that was complete and so much that we could glean from just the 835 and 837 files that it was a great place for us to start,” she said. Start with something you can get from your billing data, Belt continued, and once you learn “to slice and dice it,” share it with your physicians. “Once they see the power in it,” she said, “that’s when we started bringing in the clinical data,” such as tackling CAUTIs.

But some argue an organization shouldn’t start with an analytics platform. Rather, as Booz Allen’s Escaravage and Roski wrote, start with the problem; then go to a data scientist for help with it.

One federal health agency they worked with on an analytics project, for instance, failed to allow the data experts “free rein” to identify new patterns and insight, and instead provided generic BI reports to end users. Ultimately, the results were disappointing.

“We strongly encouraged the agency to make sure subject matter experts could have direct access to the data to develop their own queries and analytics,” Escaravage and Roski continued.

Overall, when in the beginning phases of any analytics project, one thing to keep in mind, as Kolker reinforced: “Don’t do it yourself.” If you do, “you’re going to fail,” he said. Instead, “do your homework; talk to people who did it.”

To read the complete article on Healthcare IT News, click here.


3 Vendors Lead the Wave for Big Data Predictive Analytics

Enterprises have lots of solid choices for big data predictive analytics.

That’s the key takeaway from Forrester’s just released Wave for Big Data Predictive Analytics Solutions for the second quarter of 2015.

That being said, the products Forrester analysts Mike Gualtieri and Rowan Curran evaluated are quite different.

Data scientists are more likely to appreciate some, while business analysts will like others. Some were built for the cloud, others weren’t.

They all can be used to prepare data sets, develop models using both statistical and machine learning algorithms, and deploy and manage predictive analytics lifecycles, and they all offer tools for data scientists, business analysts and application developers.

General Purpose

It’s important to note that there are plenty of strong predictive analytics solution providers that weren’t included in this Wave, and it’s not because their offerings aren’t any good.

Instead Forrester focused specifically on “general purpose” solutions rather than those geared toward more specific purposes like customer analytics, cross-selling, smarter logistics, e-commerce and so on. BloomReach, Qubit, Certona, Apigee and FusionOps, among others, are examples of vendors in the aforementioned categories.

The authors also noted that the open source software community is driving predictive analytics into the mainstream. Developers have an abundant selection of APIs within reach that they can leverage via popular programming languages like Java, Python and Scala to prepare data and build predictive models.
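As a minimal sketch of what that looks like in practice, assuming Python and scikit-learn as the open-source choice (the bundled data set below is just a stand-in, and equivalent pipelines exist for Java and Scala, e.g. via Spark MLlib):

```python
# Data preparation plus a predictive model, built entirely on
# open-source Python APIs (scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)      # bundled stand-in data set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# One pipeline covers both data preparation (scaling) and the model.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)          # held-out accuracy
```

A dozen lines of freely available library code cover the prepare/model/evaluate loop that was once the preserve of commercial suites.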

Not only that but, according to the report, many BI platforms also offer “some predictive analytics capabilities.” Information Builders, MicroStrategy and Tibco, for example, integrate with R easily.

The “open source nature” of BI solutions like BIRT, OpenText and Tibco Jaspersoft makes R integration simpler.

Fractal Analytics, Opera Solutions, Teradata’s Think Big, Beyond the Arc and the like also provide worthwhile solutions and were singled out as alternatives to buying software. The authors also noted that larger consulting companies like Accenture, Deloitte, Infosys and Virtusa all have predictive analytics and/or big data practices.

In total, Forrester looked at 13 vendors: Alpine Data Labs, Alteryx, Angoss, Dell, FICO, IBM, KNIME, Microsoft, Oracle, Predixion Software, RapidMiner, SAP and SAS.

Forrester’s selection criteria, in the most general sense, rate solution providers on their Current Offering (components include architecture, security, data, analysis, model management, usability and tooling, and business applications) and Strategy (components include acquisition and pricing, ability to execute, implementation support, solution road map, and go-to-market growth rate). Each main category carries 50 percent weight.
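To make the arithmetic concrete, here is a sketch of how such a two-category weighted rubric combines. Only the 50/50 top-level split comes from the report; the criterion scores and sub-weights below are invented for illustration:

```python
# Wave-style weighted scoring; all sub-criteria scores and weights
# are hypothetical, not Forrester's actual figures.
def weighted_score(criteria):
    """criteria: list of (score, weight) pairs, weights summing to 1.0."""
    return sum(score * weight for score, weight in criteria)

# Stand-in criteria, e.g. (data, usability) and (pricing, road map).
current_offering = weighted_score([(5.0, 0.5), (3.0, 0.5)])
strategy = weighted_score([(4.0, 0.6), (2.0, 0.4)])

# Each main category carries 50 percent of the overall score.
overall = 0.5 * current_offering + 0.5 * strategy
```

The 50/50 split means a vendor with a strong product but weak strategy (or vice versa) lands mid-pack rather than leading the Wave.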

Leading the Wave

IBM, SAS and SAP — three tried and trusted providers — lead this Forrester Wave.

IBM achieved perfect scores in seven of the twelve criteria: Data, Usability and Tooling, Model Management, Ability to Execute, Implementation Support, Solution Road Map and Go-to-Market Growth Rate. “With customers deriving insights from data sets with scores of thousands of features, IBM’s predictive analytics has the power to take on truly big data and emerge with critical insights,” note the report’s authors. Where does IBM fall short? Mostly in the Acquisition and Pricing category.

SAS is the granddaddy of predictive analytics and, like IBM, it achieved a perfect score many times over. It’s interesting to note that it scored highest among all vendors in Analysis. It was weighed down, however, by its strategy in areas like Go-to-Market Growth Rate and Acquisition and Pricing. This may not be as big a problem by next year, at least if Gartner was right in its most recent MQ on BI and Analytics Platforms Leaders, where it noted that SAS was aware of the drawback and was addressing the issue.

“SAP’s relentless investment in analytics pays off,” Forrester notes in its report. And as we’ve reiterated many times, the vendor’s predictive offerings include some snazzy differentiating features: analytics tools that you don’t have to be a data scientist to use, a visual tool that lets users analyze several databases at once, and, for SAP HANA customers, the Predictive Analytics Library (PAL) for analyzing big data.

The Strong Performers

Not only does RapidMiner’s predictive analytics platform include more than 1,500 methods across all stages of the predictive analytics life cycle, but with a single click they can also be integrated into the cloud. There’s also a nifty “wisdom of the crowds” feature that Forrester singles out; it helps users sidestep mistakes made by others in the past and get to insights quicker. What’s the downside? Implementation support and security.

Alteryx takes the pain out of data prep, which is often the hardest and most miserable part of a data worker’s job. It also offers a visual tool that helps data scientists collaborate with business users. Add to that an analytical apps gallery that helps users share their data prep and modeling workflows, and you’ve set a company up with what it needs to bring forth actionable insights. While Alteryx shines in areas like Data, Ability to Execute, and Go-to-Market Growth Rate, there’s room for improvement in Architecture and Security.

Oracle rates as a strong performer, even though it doesn’t offer a single-purpose solution. Instead, its Oracle SQL Developer tool includes a visual interface that allows data analysts to create analytical workflows and models, according to Forrester. Not only that, but Oracle also takes advantage of open-source R for analysis, and has revised a number of its algorithms to take advantage of Oracle’s database architecture and Hadoop.

FICO, yes, Forrester’s talking about the credit scoring people, has taken its years of experience in actionable predictive analytics, built a solution and taken it to the cloud, where its use is frictionless and available to others. It could be a gem for data scientists who are continuously building and deploying models. FICO’s market offering has lots of room for improvement in areas like Data and Business Applications, though.

Angoss aims to make it easier for non-data scientists to get busy with predictive analytics tools via support services and intuitive interfaces for developing predictive models. While the solution provider focused its go-to-market offerings on decision trees until recently, it now also offers a Strategy Tree capability, which helps advanced users create complex cohorts from trees.

Alpine Data Labs offers “the most comprehensive collaboration tools of all the vendors in the Forrester Wave, and still manages to make the interface simple and familiar to users of any mainstream social media site,” wrote Gualtieri and Curran in the report. The fact that not enough people buy Alpine products seems to be the problem. It might be a matter of acquisition and pricing options; it’s here that Alpine scores lowest among all vendors in the Wave.

Dell plans to go big in the big data and predictive analytics game. It bought its way into the market when it acquired Statistica. “Statistica has a comprehensive library of analysis algorithms and modeling tools and a significant installed base,” say the authors. Dell scores second lowest among Wave vendors in architecture, so it has a lot of room for improvement there.

KNIME is the open source contender in Forrester’s Wave. And though “free” isn’t the selling point of open source, it still rates, perhaps second only to the passion of its developers. “KNIME’s flexible platform is supported by a community of thousands of developers who drive the continued evolution of the platform by contributing extensions essential to the marketplace: such as prebuilt industry APIs, geospatial mapping, and decision tree ensembles,” write the researchers. KNIME competes only with Microsoft for a low score on business applications and is in last place, by itself, when it comes to architecture. It has a perfect score when it comes to data.

Make Room for Contenders

Both Microsoft and Predixion Software bring something to the market that others do not.

They seem to be buds waiting to blossom. Microsoft, for its part, has its new Azure Machine Learning offering as well as the assets of Revolution Analytics, which it recently acquired. Not only that, but the company’s market reach and deep pockets cannot be overstated. While Microsoft brought home lower scores than many of the vendors evaluated in this Wave, it’s somewhat forgivable because its big data predictive analytics solutions may be the youngest.

Predixion Software, according to Forrester, offers a unique tool, MSLM, a machine learning semantic model that packages up transformations, analysis, and scoring of data and can be deployed in .NET or Java OSGi containers. “This means that users can embed entire predictive workflows in applications,” says the report.

Plenty of Good Choices

The key takeaways from Forrester’s research indicate that more classes of users can now have access to “modern predictive power” and that predictive analytics now allow organizations to embed intelligence and insight.

The analysts, of course, suggest that you download their report, which, in fact, might be worthwhile doing. This is a rapidly evolving market and vendors are upgrading their products at a rapid clip. We know this because there’s rarely a week where a new product announcement or feature does not cross our desks.

And if it’s true that the organizations who best leverage data will win the future, then working with the right tools might be an important differentiator.



Source: 3 Vendors Lead the Wave for Big Data Predictive Analytics