Apr 30, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[ COVER OF THE WEEK ]

image
Big Data knows everything  Source

[ AnalyticsWeek BYTES]

>> Emerging Applications of Deep Learning: “Making Data Speak” by jelaniharper

>> October 3, 2016 Health and Biotech Analytics News Roundup by pstein

>> How Airbnb Uses Big Data And Machine Learning To Guide Hosts To The Perfect Price by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

R Basics – R Programming Language Introduction

image

Learn the essentials of R Programming – R Beginner Level!… more

[ FEATURED READ]

Superintelligence: Paths, Dangers, Strategies

image

The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but … more

[ TIPS & TRICKS OF THE WEEK]

Grow at the speed of collaboration
Research by Cornerstone OnDemand pointed out the need for better collaboration within the workforce, and the data analytics domain is no different. A rapidly changing and growing industry like data analytics is very difficult for an isolated workforce to keep up with. A good collaborative work environment facilitates a better flow of ideas, improved team dynamics, rapid learning, and a greater ability to cut through the noise. So, embrace collaborative team dynamics.

[ DATA SCIENCE Q&A]

Q:When would you use random forests vs. SVM, and why?
A: * For a multi-class classification problem: SVM requires a one-against-all approach, which is memory-intensive
* If you need to know variable importance (random forests can compute it directly)
* If you need a model quickly (SVM takes a long time to tune: you must choose an appropriate kernel and its parameters, for instance sigma and epsilon)
* In a semi-supervised learning context (random forests with a dissimilarity measure): SVM works only in a supervised learning mode
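The trade-offs above can be sketched with scikit-learn (an illustrative sketch only; the synthetic dataset and all hyperparameters here are placeholders, not a recommendation):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# A synthetic 3-class problem: SVC must build one-vs-one sub-models
# internally, while the forest handles multi-class natively.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, n_classes=3,
                           n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)

# Variable importance comes for free with the forest; an RBF-kernel
# SVM has no direct analogue.
top5 = rf.feature_importances_.argsort()[::-1][:5]
print("RF accuracy: ", rf.score(X_te, y_te))
print("SVM accuracy:", svm.score(X_te, y_te))
print("Most important features:", top5)
```

Note how the forest needed no kernel or parameter search, while the SVM's C and gamma would normally require cross-validated tuning.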

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Nathaniel Lin (@analytics123), @NFPA


Subscribe to YouTube

[ QUOTE OF THE WEEK]

We chose it because we deal with huge amounts of data. Besides, it sounds really cool. – Larry Page

[ PODCAST OF THE WEEK]

@AnalyticsWeek #FutureOfData with Robin Thottungal(@rathottungal), Chief Data Scientist at @EPA


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

140,000 to 190,000: the projected shortfall of people with deep analytical skills needed to fill the demand for Big Data jobs in the U.S. by 2018.

Sourced from: Analytics.CLUB #WEB Newsletter

Can the internet be decentralized through blockchain technology?

When it comes to cybersecurity debates, one raging these days concerns freedom on the internet. In the current Age of Information, several authorities, including governmental agencies, have been striving to gain complete control over the internet. A frequently raised aspect of internet freedom is ‘decentralization,’ which refers to a rather idealistic version of the internet with no centralized servers.

Taking the notion of decentralization into account, several cybersecurity specialists and researchers have shone a spotlight on blockchain technology, and the powerful role it could play in decentralizing the flow of data on the web.

However, when it comes to propagating the notions of security and transparency on the internet, we feel it is our moral obligation to ponder the implications that implementing blockchain technology might have.

In a digital landscape where people have to create petitions and fight for the smallest snippets of online freedom, it is crucial that we dig deep and analyze the problem, along with the proposed solution, from all sides and perspectives. Before we do that, however, we’d like to start with an overview of why the decentralization of the internet matters.

Why does the decentralization of the internet matter?

Before we can elaborate on what decentralization is and the importance it bears on the current digital landscape, we need a brief recap of what centralization is and the problems it poses for the functioning of the web today.

Centralization refers to a handful of entities ‘owning’ the internet, which creates a power dynamic between smaller and larger tech companies in which the dice almost always roll in favor of the bigger fish in the pond.

Decentralization, by contrast, takes a much more democratic approach to the internet, whereas centralization amounts to something closer to totalitarian rule over the web. Although likening the present centralized state of the internet to living under a totalitarian regime might seem excessive, one need look no further than China for a real-life demonstration of the dire impact that unregulated control over the internet can have.

Keeping all this in mind, it comes as no surprise that freedom on the internet continues to decline, according to a report published by Freedom House. The same report also points a finger at China as the worst abuser of internet freedom for the fourth consecutive year.

Moreover, the report’s findings depict a bleak future for internet freedom as well: of the 65 countries surveyed in the report, 33 had seen a decline in internet freedom (compared to 2018), while only 16 had seen an increase. Given the dire implications of these stats, the need for a decentralized internet becomes apparent, since the present-day internet is dominated by a group of companies known as FAANG: the Silicon Valley giants Facebook, Apple, Amazon, Netflix, and Google.

The notion of a decentralized web arises from widespread concern. It might not come as a surprise to many, but data-mining practices on the internet are rising in popularity, rendering personal information a commodity for sellers and advertisers to exploit. Furthermore, the centralization of the internet today leaves a lot of room for the manipulation of data by those at the top of the hierarchy, along with several loopholes that cybercriminals can exploit to gain access to sensitive data.

The importance of decentralizing the internet also becomes clear when you consider that hackers often steal databases aggregating personal information, which bad actors then use to open financial accounts under the stolen identities. Such data includes credit card details, social security numbers, and other information attackers can use to benefit themselves. Since hackers usually obtain this information through phishing attacks, individuals must know how to combat them; the majority of inbox phishing scams in 2018 involved credential and email scams, so everyone should know how to respond to those as well.

The concept of a decentralized internet offers a fix to the oligopoly in action over the internet today, since a decentralized web has no need for a centralized data-storage server. If the notion of a decentralized internet were brought to life, the web would rely on a network of many participating computers, all sharing equally in the momentous responsibility of storing the internet’s valuable information.

How can blockchain help decentralize the internet?

If what we’ve talked about above sounds eerily familiar, chances are you’ve heard or read about blockchain technology. When it comes to decentralizing the internet, blockchain technology can prove to be crucial since it employs a peer-to-peer network protocol, in which data is stored across multiple computers or “nodes.”

At this point, however, the only popular application of blockchain is in cryptocurrency, where it plays a fundamental role in authorizing transactions through a native coin or token.

In a similar way, blockchain technology can be used to decentralize the web; however, the process isn’t going to be an easy one. One of the greatest drawbacks of the centralized internet is how tedious data management on the web has become, and an analysis of web-hosting industry stats by HostScore suggests the problem will only grow in complexity as the web-hosting market keeps expanding.

Furthermore, the HostScore analysis also brought to light that global hosting saturation is expected to grow at a staggering rate of 13.25%, which paints a rather bleak picture of worldwide data storage and the problems it poses.

That said, the HostScore analysis in no way means that blockchain is of no use in decentralizing the internet; we just need to look for alternatives rooted in the technology that aren’t too impractical to rely on.

Two of these more ‘practical’ approaches to the integration of blockchain technology include shifting data storage to network backbones (using cryptographic ledgers) and relying on a ledger-based retrieval and storage system.

Both of these alternatives are based on a cryptographic ledger, so they relieve central servers of a large share of their storage burden. Migrating to a network backbone via a cryptographic ledger can also prove extremely lucrative for small businesses, since a network backbone could help them upgrade their upload and download speeds and provide a much-needed extra level of security.

For larger enterprises, however, cybersecurity specialists propose a reliance on cryptographic ledgers and retrieval, which encourages larger companies to store data through blockchain, secured with identity-authorization tokens. This in turn allows for the safe transmission and transaction of data within organizations.
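The “cryptographic ledger” both approaches depend on can be illustrated with a toy hash chain (a deliberately simplified sketch, not a production blockchain: real systems add consensus, digital signatures, and distribution across nodes):

```python
import hashlib
import json

def block_hash(block):
    """Hash over a block's contents (everything except its stored hash)."""
    payload = {k: block[k] for k in ("records", "prev")}
    return hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()).hexdigest()

def make_block(records, prev_hash):
    """A block stores its payload, the previous block's hash, and its own hash."""
    block = {"records": records, "prev": prev_hash}
    block["hash"] = block_hash(block)
    return block

def verify(chain):
    """Valid only if every stored hash matches its contents and every
    link points at the previous block's hash."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False            # block contents were tampered with
        if i > 0 and block["prev"] != chain[i - 1]["hash"]:
            return False            # chain link is broken
    return True

genesis = make_block(["genesis"], prev_hash="0" * 64)
chain = [genesis, make_block(["alice pays bob"], genesis["hash"])]
print(verify(chain))                # True
chain[0]["records"] = ["tampered"]  # alter stored data after the fact
print(verify(chain))                # False
```

The property the articles lean on is visible here: because each block embeds the previous block’s hash, rewriting any stored record invalidates the whole chain downstream, which is what makes a ledger tamper-evident without a central authority.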

In conclusion

To close the article, we’d like to restate the idea we’ve mentioned above: a decentralized internet is absolutely crucial for securing our digital landscape for generations to come. Moreover, blockchain technology still has a lot of untapped potential, which elevates the tech behind cryptocurrency to a staple in the cybersecurity diet of many.

 

The post Can the internet be decentralized through blockchain technology? appeared first on Big Data Made Simple.

Source: Can the internet be decentralized through blockchain technology? by administrator

Dickson Tang (@imDicksonT) on Building a Career Edge over Robots using #3iFramework #JobsOfFuture #Podcast

 

In this podcast, Dickson Tang shares his perspective on building a future-ready, open-minded organization by working on its three I’s: Individuals, Infrastructure, and Ideas. He discusses the various organization types and individuals who could benefit from this 3iFramework, elaborated in detail in his book, “Leadership for future of work: 9 ways to build career edge over robots with human creativity”. This podcast is great for anyone seeking ways to be an open, innovative change agent within an organization.

Dickson’s Book:

Leadership for future of work: 9 ways to build career edge over robots with human creativity by Dickson Tang amzn.to/2McxeIS

Dickson’s Recommended Read:
The Creative Economy: How People Make Money From Ideas by John Howkins amzn.to/2MdLotA

Podcast Link:
iTunes: math.im/jofitunes
Youtube: math.im/jofyoutube

Dickson’s BIO:
Dickson Tang is the author of Leadership for future of work: 9 ways to build career edge over robots with human creativity. He helps senior leaders (CEOs, MDs, and HR) build creative and effective teams in preparation for the future/robot economy. Dickson is a leadership-ideas expert, focusing on how leadership will evolve in the future of work. He has 15+ years of experience in management, business consulting, marketing, organizational strategies, and training & development, with corporate experience at several leading companies such as KPMG Advisory, Gartner, and Netscape Inc.

Dickson’s expertise in leadership, creativity, and the future of work has earned him invitations and opportunities to work with leaders and professionals from organizations such as Cartier, CITIC Telecom, DHL, Exterran, Hypertherm, JVC Kenwood, Mannheim Business School, Montblanc, and others.

He lives in Singapore, Asia.
LinkedIN: www.linkedin.com/in/imDicksonT
Twitter: www.twitter.com/imDicksonT
Facebook: www.facebook.com/imDicksonT
Youtube: www.youtube.com/channel/UC2b4BUeMnPP0fAzGLyEOuxQ

About #Podcast:
#JobsOfFuture was created to spark the conversation around the future of work, the worker, and the workplace. This podcast invites movers and shakers in the industry who are shaping, or helping us understand, the transformation of work.

Wanna Join?
If you or anyone you know wants to join in,
Register your interest @ analyticsweek.com/

Want to sponsor?
Email us @ info@analyticsweek.com

Keywords:
#JobsOfFuture #FutureOfWork #FutureOfWorker #FutureOfWorkplace #Work #Worker #Workplace

Source: Dickson Tang (@imDicksonT) on Building a Career Edge over Robots using #3iFramework #JobsOfFuture #Podcast

Apr 23, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[ COVER OF THE WEEK ]

image
Data interpretation  Source

[ AnalyticsWeek BYTES]

>> Data Virtualization: A Spectrum of Approaches – A GigaOm Market Landscape Report by analyticsweekpick

>> Getting to Love: Customer Word Clouds by bobehayes

>> IBM and Hadoop Challenge You to Use Big Data for Good by bobehayes

Wanna write? Click Here

[ FEATURED COURSE]

Statistical Thinking and Data Analysis

image

This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and n… more

[ FEATURED READ]

The Industries of the Future

image

The New York Times bestseller, from leading innovation expert Alec Ross, a “fascinating vision” (Forbes) of what’s next for the world and how to navigate the changes the future will bring…. more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better way than to talk about our increasing dependence on data analytics to help with our decision making. Data- and analytics-driven decision making is rapidly sneaking its way into our core corporate DNA, and we are not churning out practice grounds to test those models fast enough. Such snug-looking models have hidden nails that could induce uncharted pain if they go unchecked. This is the right time to start thinking about putting an Analytics Club [Data Analytics CoE] in your workplace to lab out best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q:Do you know / have you used data reduction techniques other than PCA? What do you think of step-wise regression? What kinds of step-wise techniques are you familiar with?
A: Data reduction techniques other than PCA:
Partial least squares: like PCR (principal component regression), but chooses the principal components in a supervised way, giving higher weights to variables that are most strongly related to the response

Step-wise regression:
– the choice of predictive variables is carried out using a systematic procedure
– usually it takes the form of a sequence of F-tests, t-tests, adjusted R-squared, AIC, or BIC
– at any given step, the model is fit using unconstrained least squares
– can get stuck in local optima
– Better: Lasso

Step-wise techniques:
– Forward selection: begin with no variables, adding them when they improve a chosen model-comparison criterion
– Backward selection: begin with all the variables, removing them when doing so improves a chosen model-comparison criterion

When is the full data better than reduced data?
Example 1: if all the components have high variance, which components can be discarded with a guarantee of no significant loss of information?
Example 2 (classification):
– with 2 classes, the within-class variance may be very high compared to the between-class variance
– PCA might discard the very information that separates the two classes

When is the full data better than a sample?
– When the number of variables is high relative to the number of observations
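The step-wise procedure described above can be sketched as a greedy forward selection driven by AIC (an illustrative sketch on synthetic data; the AIC formula here is the Gaussian-likelihood form up to a constant):

```python
import numpy as np

def aic(X, y):
    """AIC (up to a constant) for an OLS fit of y on the columns of X."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * k

def forward_select(X, y):
    """Greedy forward selection: repeatedly add the column that most
    improves AIC; stop when no addition helps (a local optimum)."""
    p = X.shape[1]
    chosen, remaining = [0], list(range(1, p))   # column 0 = intercept
    best = aic(X[:, chosen], y)
    while remaining:
        score, j = min((aic(X[:, chosen + [c]], y), c) for c in remaining)
        if score >= best:
            break                                # no criterion improvement
        best, chosen = score, chosen + [j]
        remaining.remove(j)
    return chosen

# Columns 1 and 3 drive the response; the other columns are pure noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 5))])
y = 3 * X[:, 1] - 2 * X[:, 3] + rng.normal(scale=0.5, size=200)
print(forward_select(X, y))
```

The early stop when AIC fails to improve is exactly the “can get stuck in local optima” behavior noted above; swapping the criterion for BIC or adjusted R-squared changes only the `aic` function.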

Source

[ VIDEO OF THE WEEK]

@JustinBorgman on Running a data science startup, one decision at a time #Futureofdata #Podcast


Subscribe to YouTube

[ QUOTE OF THE WEEK]

Data beats emotions. – Sean Rad, founder of Ad.ly

[ PODCAST OF THE WEEK]

@JohnTLangton from @Wolters_Kluwer discussed his #AI Lead Startup Journey #FutureOfData #Podcast


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

According to Twitter’s own research in early 2012, it sees roughly 175 million tweets every day, and has more than 465 million accounts.

Sourced from: Analytics.CLUB #WEB Newsletter

Perfecting Sensor Data Analytics with Cyberforaging

One of the most significant points of evolution in the modern data landscape is the inexorable transition of a centralized model of data management to a decentralized, distributed one. The harbingers of this change include:

  • Big Data: The sheer amounts of data, and speeds at which they are generated today, result in frequently occurring situations in which data is engendered outside of the enterprise, as opposed to within it.
  • Cloud Ubiquity: The copious quantities of data have made the cloud an increasingly viable medium for accessing and processing them, particularly in terms of big data, social media and sentiment analysis, and the integration of external unstructured data with internal, structured data.
  • The Internet of Things: The merging of big data and the cloud has resulted in the transformative power of the IoT, which emphasizes the increased appositeness of mobile technologies in accounting for this convergence.
  • Sensor Data: The continuity of data created by remote sensors and the analytics rigors associated with gleaning insight from such data typify the combination of the foregoing factors.

According to Ryft Vice President of Engineering Pat McGarry, the emergence of these developments has created a common problem for organizations today: “There’s just so much data growing exponentially, and if there’s one problem we have it’s how the heck do we analyze it to get the answers we need quickly enough?”

The solution is partly based on either supplementing or replacing centralized methods of transforming, transmitting, and analyzing data (which carry considerable infrastructure, network, and temporal costs) with edge computing, in which analytics are performed at the cloud’s edge, closer to the devices that need them and therefore faster.

It also involves extending the utility of the cloud’s edge via the phenomenon known as cyberforaging: the sharing of computational resources between mobile devices to bolster the advantages of edge computing and address the challenges of analyzing sensor data.

Sensor Data Analytics
Currently, the business value generated by cyberforaging is most notably found in marketing. But as McGarry pointed out, the greater significance of this amalgamation of big data, cloud computing, edge computing, and mobile technologies is that “More importantly, it’s a way to analyze sensor data in real time. That’s really what’s happening there.” The tailored marketing efforts of real-time sensor data analysis arguably exceed any others, since they are predicated on actual behaviors of customers as they are engaging in them. Contextually-based advertising systems can leverage the sensor capabilities of mobile devices and cloudlets by determining what marketing materials are sent, in what order, and in what context. Typically this process begins with designing enterprise apps that customers download to their phones. As McGarry noted, when “you walk into a store, you have their app and you’re on Wi-Fi, you’re connected right there at their store. They can literally talk to you directly then, and they can offload things about you, directly analyze them, and send the results back.”

Power of the End Device
In some respects, cyberforaging is possible due to the increased power of end devices. Whether such devices are mobile (in the form of tablets, smart phones, or laptops) or otherwise, the increased computational power found at the edge of networks makes performing computations and issuing the results there at the cloud’s fringe advantageous. Cyberforaging is largely predicated on the sensor capabilities of mobile devices, which can be used to detect additional devices nearby (known as cloudlets) to offload computations or even vital data preparation for performing analytics. Cloudlets are any variety of devices with servers and internet connections that function between end points and the cloud for the purposes of offloading. They also work without an existing internet connection—once their resources have been provisioned. The benefit of cloudlets is that they are closer to the fringe of the network and enable preparation and analytics without having to constantly transmit data generated at the cloud’s edge to a few centralized locations.

Computational and Preparation Off-Loading
The power of mobile devices (such as smartphones) is greatly augmented by cloudlet-based cyberforaging. Heavy analytics jobs can be performed on the fly with reduced costs and infrastructure compared to those for maintaining centralized data centers. Remote data preparation can be used to facilitate ETL for BI and analytics purposes, or in some instances, be avoided altogether. These cyberforaging benefits also pertain to greater flexibility and autonomy for data access while furthering the self-service movement. The classic example of cyberforaging is found in telecommunication style applications, but the potential for this phenomenon to revamp sensor data analytics and analytics at the edge of networks on the whole is substantial. “It’s being done in some places, but it’s not being done to the extent it needs to be done,” McGarry remarked. “It has to be this way. There’s so much data being generated now at the edge of these networks; you have to do something. You can’t ship it all to a central location anymore. You’ve got to ship it somewhere else, or you’ve got to process it right there as locally as possible.”
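The offloading decision described above can be sketched as a simple cost model (a hypothetical heuristic for illustration; the parameter names and example numbers are assumptions, not figures from Ryft or McGarry):

```python
def should_offload(input_bytes, job_flops, device_flops_per_s,
                   cloudlet_flops_per_s, bandwidth_bytes_per_s, rtt_s):
    """Classic cyberforaging heuristic: offload a job to a nearby
    cloudlet only when transfer time plus remote compute time beats
    computing locally on the end device."""
    local_time = job_flops / device_flops_per_s
    remote_time = (rtt_s                                   # network round trip
                   + input_bytes / bandwidth_bytes_per_s   # ship the input
                   + job_flops / cloudlet_flops_per_s)     # remote compute
    return remote_time < local_time

# A 10 GFLOP analytics job on a 1 GFLOP/s phone vs. a 50 GFLOP/s
# cloudlet over local Wi-Fi (25 MB payload, 30 MB/s, 10 ms round trip).
print(should_offload(25e6, 10e9, 1e9, 50e9, 30e6, 0.01))   # True
```

The same model explains why cloudlets beat distant data centers: a larger round-trip time and lower effective bandwidth to a central location can flip the decision back to local computation even when the remote hardware is faster.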

Real-Time Mobile Analytics
The concept of cyberforaging can be as simple as utilizing a popular search engine’s cluster to perform an operation and send the results back to a mobile device. Its centrality to solidifying the distributed paradigm of data management lies in the way it reduces physical infrastructure, network and bandwidth strain, and the costs associated with analytics. End users can reap the benefits of distributed analytics without having to wait on centralized approaches and the bottlenecks they create. Additionally, cyberforaging increases the power of mobile devices (or any device at the edge of the cloud), their speed, and the autonomy that is part of the self-service movement in data management. These advantages will resonate even further as the IoT continues to gain credence. Moreover, they help bolster the utility of sensor data analytics, making it less complicated and more viable for organizations seeking to exploit this technology.

“Throughout history, knowledge has been power,” McGarry stated. “We’re generating so much data, and the trick is getting knowledge out of that data to use it as a competitive advantage over somebody else. That is really at the crux of it, and all of this stuff is related directly to that. So I think businesses in the end have to figure this stuff out.”

Originally Posted at: Perfecting Sensor Data Analytics with Cyberforaging by jelaniharper

Announcing RStudio and Databricks Integration

At Databricks, we are thrilled to announce the integration of RStudio with the Databricks Unified Analytics Platform. You can try it out now with this RMarkdown notebook (Rmd | HTML) or visit us at www.databricks.com/rstudio.

For R practitioners looking at scaling out R-based advanced analytics to big data, Databricks provides a Unified Analytics Platform that gets up and running in seconds, integrates with RStudio to provide ease of use, and enables you to automatically run and execute R workloads at unprecedented scale across single or multiple nodes.

Integrating Databricks and RStudio together allows data scientists to address a number of challenges including:

  1. Increase productivity among your data science teams: Data scientists can keep their favorite IDE and use SparkR or sparklyr to seamlessly execute jobs on Spark, scaling their R-based analytics. At the same time, you can get your environment up and running quickly to provide scale without the need for cluster management.
  2. Simplify access and provide the best possible dataset: R users get access to the full ETL capabilities of Databricks, including optimizing data formats, cleaning up data, and joining datasets to provide the perfect dataset for your analytics.
  3. Scale R-based analytics to big data: Move from data science to big data science by scaling current R-based analyses to big data volumes with Apache Spark running on Databricks. At the same time, you can keep costs under control with Databricks auto-scaling, which automatically scales usage up and down based on your analytics needs.

Introducing Databricks RStudio Integration
With the Databricks RStudio Integration, both popular R packages for interacting with Apache Spark, SparkR and sparklyr, can be used inside the RStudio IDE on Databricks. When multiple users share a cluster, each creates a separate SparkR context or sparklyr connection, but they all talk to a single Databricks-managed Spark application, allowing unique opportunities for collaboration between users. Together, RStudio can take advantage of Databricks’ cluster management and Apache Spark to perform tasks such as the massive model selection noted in the figure below.

You can run this demo on your own using this k-nearest neighbors (KNN) RMarkdown regression demo (Rmd | HTML).

Next Steps
Our goal is to make R-based analytics easier to use and more scalable with RStudio and Databricks. To dive deeper into the RStudio integration architecture, the technical details of how users can access RStudio on Databricks clusters, and examples of the power of distributed computing and the interactivity of RStudio, visit www.databricks.com/rstudio and get started today.

Try Databricks for free. Get started today.

The post Announcing RStudio and Databricks Integration appeared first on Databricks.

Source by analyticsweek

Apr 16, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[  COVER OF THE WEEK ]

Data Storage  Source

[ AnalyticsWeek BYTES]

>> @ReshanRichards on creating a learning startup for preparing for #jobsoffuture #podcast by admin

>> Experience the magic of shuffling columns in Talend Dynamic Schema by analyticsweekpick

>> The Potential Of Big Data In Africa by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

CS229 – Machine Learning


This course provides a broad introduction to machine learning and statistical pattern recognition. … more

[ FEATURED READ]

How to Create a Mind: The Secret of Human Thought Revealed


Ray Kurzweil is arguably today’s most influential—and often controversial—futurist. In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse… more

[ TIPS & TRICKS OF THE WEEK]

Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck to enterprise adoption, and one of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members need to step up and create awareness within the organization. An aware organization goes a long way toward quick buy-ins and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q: Which kernels do you know? How do you choose a kernel?
A: Common kernels:
* Gaussian (RBF) kernel
* Linear kernel
* Polynomial kernel
* Laplace kernel
* Esoteric kernels: string kernels, chi-square kernels

Rules of thumb for choosing:
* If the number of features is large relative to the number of observations (e.g. text classification with many words but few training examples): SVM with a linear kernel
* If the number of features is small and the number of observations is intermediate: Gaussian kernel
* If the number of features is small and the number of observations is small: linear kernel
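For concreteness, the first four kernels in the list can be written out directly. A minimal Python sketch, using toy vectors and the standard library only:

```python
import math

def linear(x, y):
    # Plain dot product
    return sum(a * b for a, b in zip(x, y))

def polynomial(x, y, degree=3, c=1.0):
    # (x . y + c)^d
    return (linear(x, y) + c) ** degree

def gaussian(x, y, sigma=1.0):
    # exp(-||x - y||^2 / (2 sigma^2)); equals 1 when x == y
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def laplace(x, y, sigma=1.0):
    # Like Gaussian but with an L1 distance
    l1 = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-l1 / sigma)

x, y = (1.0, 2.0), (2.0, 0.0)
print(linear(x, y))    # 2.0
print(gaussian(x, x))  # 1.0: identical points have maximal similarity
```

Each function takes two feature vectors and returns a similarity score; an SVM never needs the mapped feature space explicitly, only these pairwise values (the kernel trick).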

Source

[ VIDEO OF THE WEEK]

@AnalyticsWeek Panel Discussion: Marketing Analytics

Subscribe to YouTube

[ QUOTE OF THE WEEK]

You can use all the quantitative data you can get, but you still have to distrust it and use your own intelligence and judgment. – Alvin Toffler

[ PODCAST OF THE WEEK]

Pascal Marmier (@pmarmier) @SwissRe discusses running data driven innovation catalyst

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Estimates suggest that by better integrating big data, healthcare could save as much as $300 billion a year — that’s equal to reducing costs by $1000 a year for every man, woman, and child.

Sourced from: Analytics.CLUB #WEB Newsletter

3 Questions to Ask Your Embedded Analytics Support Team

Finding the right embedded analytics partner is about a lot more than initial showmanship. A flashy sales process doesn’t guarantee success with embedded analytics. Especially if you’re an OEM or ISV, you have to make sure your embedded analytics solution offers the support you need to get into production and succeed long term.

Ask potential vendors what kind of technical and account support you’ll get after the sales process is over. These three questions will guide your search for a partner that helps you every step of the way.

>> Related: Gartner’s 5 Best Practices to Choosing an Embedded Analytics Platform Provider <<

  1. Will your vendor support you on two levels, providing customer account assistance and technical support?

Your post-sales customer success team should be more than just a standby contact. Look for a support team that provides help on both a technical and account level. They should act as trusted teammates who will be liaisons for your short- and long-term needs. Can the support team demonstrate and articulate their role as a resource?

At the end of the day, your customer and technical success teams should act as a consolidated resource to bring you the tools—whether it’s one-on-one support, product documentation, or new capabilities—to ensure your success with embedded analytics. They should keep you informed on the latest capabilities and market progressions. To that end, look for an embedded analytics vendor that regularly introduces new advancements, updates, and products to the embedded analytics market.

  2. How often will your vendor get involved in your production roll-out? Will they also support future enhancements?

Your customer success team will be critical in getting your analytics project into production. Ask about the cadence of engagements to get a sense of their future involvement. For instance, how often will your customer success team engage with your development and product teams?

At a minimum, expect quarterly check-ins with your customer support and technical account management teams. Look for a vendor that works with most of their customers to enhance offerings and create competitive differentiation. Your support team should lift some of the ongoing burden from your shoulders. After all, you have your own mission-critical application to build, manage, and deploy. Your customer success team should act as trusted advisors who help shape product roadmaps as your application evolves. 

  3. Does your vendor specialize in embedding dashboards, reports, and analytics in existing applications?

Most large-scale data discovery applications were designed as quick, formulaic, out-of-the-box solutions for business analysts to create data visualizations for a handful of internal team members. But the world of embedded analytics is different. To successfully embed dashboards and reports in your application, you need a vendor that will seamlessly integrate into your core product.

Look for a partner that specializes in helping OEMs, ISVs, and SaaS companies with embedded analytics. Ask them to walk you through the key milestones and timelines to successfully brand, secure, deploy, and scale information to your user base—all while maintaining your unique application experience.

Conclusion: With out-of-the-box analytics solutions, it’s easy to feel like you are only a login ID in a support ticketing system—not a cherished partner. Look for a vendor with a customer success structure that goes beyond the veneer of “sideline support.” If they don’t demonstrate honesty and trustworthiness, they won’t be hyper-focused on your success and growth. Asking the three questions outlined here will help you find the right vendor for both your immediate and long-term requirements.

See how Dresner rates analytics vendors in the 2018 Dresner Advisory Services Embedded Business Intelligence Market Study >

 

Source

The Five Faces of the Analytics Dream Team

chasm2

The chasm between Business and IT is well documented and has existed since the first punch-card mainframe dimmed the lights of MIT to solve the ballistic trajectory of WWII munitions.  Analytics and now Data Science are trapped in the middle.  Everyone hopes they’ll deliver the productivity gains, but the jury is still out.

Some studies suggest that analytics projects have an 80% failure rate. A recent HBR article put it at 100% for data science projects. That’s abysmal. And there are dozens of reasons why it’s so poor. In this article, we’ll look at the team.

A helpful starting point is to imagine your dream team. Who would you hire, and what would their roles be? I suggest that there are five distinct job descriptions.

So who’s on the analytics dream team?

Data Steward – this skillset is alive and well in most organizations. Almost everyone has a data warehouse, talks about the ETL process, and has had discussions around the business rules of cleaning up and storing their data. What they should be talking about is how to get the data out more quickly and cleanly. A typical project is 80% data wrangling, so don’t skimp on number or quality here. The data steward will use tools such as MongoDB, MySQL, and Oracle, and if she’s a superstar, she’ll dabble in Python and web scraping and know the difference between JSON and XML. Maybe you’ll give her a raise and call her a data engineer.

Analytic Explorer – this skillset is a tough one to find.  It requires math, statistics, and modeling along with a healthy dose of creativity and skepticism. This is a person who can spin straw into gold or write tomorrow’s news today.  His job is to ask the right questions, explore your data, and distill it down to insights that will support your most critical decisions.  He’ll use tools such as TensorFlow, R, MATLAB, ArcInfo, SAS, Tableau, and SPSS.  If he’s a superstar, he’ll know all about Reinforcement Learning, Bayes, Optimization, and the difference between precision, accuracy, and skill. 

Information Artist – This is the role for a creative analytical mind. Her goal is to sell the results to the decision-maker. And the lack of emphasis on this skillset is one of the reasons analytics is such a failure (and why Apple is such a success). Edward Tufte – the godfather of data visualization – speculates that the lack of good data design contributed to both the Columbia and Challenger space shuttle tragedies. Think of this person as being as crucial as your sales force. In fact, that is her job – to sell the right answer. But she’s a whole lot more than a graphic designer. She gets aggregation, normalization, and signal versus noise. And she also gets mood, white space, and font kerning. Excel and PowerPoint may be her go-to tools, although she’s more likely to use Photoshop, Moqups, and D3. If she’s a superstar, she’ll be as comfortable talking about the math behind the visuals as she is talking about the psychology behind her design.

Automator – If the Explorer finds the path through the dark forest to the fountain of youth, and the visualizer designs a beautiful bottle for the elixir, then the Automator turns that path into an eight-lane highway and builds a factory to bottle that stuff as soon as it comes out of the ground. His job is to operationalize the work of the Explorer and Visualizer. He makes sure that results are timely and fast. He adds scale. He might use traditional coding methods like C# or Java, or he might fiddle with JavaScript and D3. Or he might even be the guru of Vue.js or React.

The Champion – The champion stands with one foot in the land of “gut feel”, and the other planted firmly in the side of “evidence”.  She can speak the language of the geeks, and translate it to that of the battle-hardened general.  She believes strongly in data-driven decision making, but also recognizes the value of deep domain experience.  She’s tireless in her efforts to sculpt the processes of the organization to support analytics. She aims to harvest the brightest insights from the sharp young analysts and the cleverest hacks from the wily old veterans.  Her focus is adoption and impact. If she’s a superstar, she’ll make you believe that this analytics thing was your idea in the first place.

faces2

So that’s your dream team: a steward, an explorer, an artist, an automator and a champion.

The dream team in the real world.

But there’s a problem.  This team rarely exists in the wild.  Most companies hire the Data Steward, and then try to do the rest through a major software implementation.  Unfortunately, the software is not meant to explore and discover.  And it was designed by engineers who don’t understand the psychology of data visuals.  It’s like expecting your bookkeeper to be your CFO.  Sure they can both do accounting, but you won’t be happy with the results.

In other instances, organizations will try to shoehorn engineers into the roles “in their spare time”.  Again, with neither the training nor the time to explore the data or design the results, they’re doomed to fail.  These skillsets are distinct, and they shouldn’t be ignored.

So what’s the canny company to do?  If you’re extremely lucky, you’ll find the unicorn of the 21st century known as the Data Scientist, pay her a quarter million, and watch the magic happen. (A data scientist can do all five roles.) Or you can try to develop these skills in-house.  Or you can hire contractors – perhaps engaging a consulting firm to take on the Explorer or Visualizer roles for a time.  Or you can outsource the whole thing.

What’s important is that you recognize that each of these roles is necessary.  Neither software nor “Dave in Engineering” can replace them. Happy hunting.

Also in this series:

Source: The Five Faces of the Analytics Dream Team

Apr 09, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)


[  COVER OF THE WEEK ]

Statistics  Source

[ AnalyticsWeek BYTES]

>> The Competitive Advantage of Managing Relationships with Multi-Domain Master Data Management by jelaniharper

>> Data Scientists and the Practice of Data Science by bobehayes

>> Who Is Your ‘Biggest Fan’ on Facebook? Navigating the Facebook Graph API ~ 2016 Tutorial by nbhaskar

Wanna write? Click Here

[ FEATURED COURSE]

Applied Data Science: An Introduction


As the world’s data grow exponentially, organizations across all sectors, including government and not-for-profit, need to understand, manage and use big, complex data sets—known as big data…. more

[ FEATURED READ]

The Industries of the Future


The New York Times bestseller, from leading innovation expert Alec Ross, a “fascinating vision” (Forbes) of what’s next for the world and how to navigate the changes the future will bring…. more

[ TIPS & TRICKS OF THE WEEK]

Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but not being able to handle it can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, or The Ultimate Metric: the metric that matters most to your startup. Some advantages of TUM: it answers the most important business question, it cleans up your goals, it inspires innovation, and it helps you understand the entire quantified business.

[ DATA SCIENCE Q&A]

Q: How do you optimize algorithms (parallel processing and/or faster algorithms)? Provide examples of both.
A: “Premature optimization is the root of all evil” – Donald Knuth

Parallel processing (for instance in R, on a single machine):
– the doParallel and foreach packages
– doParallel: registers a parallel backend across n cores of the machine
– foreach: distributes loop iterations across those cores
– using Hadoop on a single node
– using Hadoop on multiple nodes
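The R packages above are one route; the same divide-work-across-cores idea can be illustrated in Python with the standard library's multiprocessing.Pool (a sketch, not part of the original answer — the task here is a trivial stand-in for real work):

```python
from multiprocessing import Pool

def slow_square(n):
    # Stand-in for an expensive task with no shared state;
    # independent tasks like this parallelize cleanly
    return n * n

if __name__ == "__main__":
    # Spread the 8 calls across 4 worker processes
    with Pool(processes=4) as pool:
        results = pool.map(slow_square, range(8))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Like foreach with a doParallel backend, pool.map keeps the sequential code shape while the library handles distributing the iterations and collecting results in order.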

Faster algorithms:
– In computer science a Pareto principle applies: roughly 90% of execution time is spent in 10% of the code, so profile first and optimize the hot spots
– Data structures: the right choice strongly affects performance
– Caching: avoid recomputing work that has already been done
– Improvements at the source-code level
For instance, on early C compilers, while(1) was slower than for(;;), because while evaluated its condition and issued a conditional jump on every iteration, whereas for(;;) compiled to a single unconditional jump.
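The caching point is easy to demonstrate with Python's functools.lru_cache: memoizing a naive recursive function turns exponential recomputation into linear work (an illustrative example, not from the original answer):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Naive recursion recomputes the same subproblems exponentially often;
    # the cache ensures each n is computed exactly once, making this O(n)
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(60))  # 1548008755920, returned instantly thanks to memoization
```

Without the decorator, fib(60) would take on the order of phi^60 recursive calls; with it, 61 calls suffice.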

Source

[ VIDEO OF THE WEEK]

Decision-Making: The Last Mile of Analytics and Visualization

Subscribe to YouTube

[ QUOTE OF THE WEEK]

Numbers have an important story to tell. They rely on you to give them a voice. – Stephen Few

[ PODCAST OF THE WEEK]

@BrianHaugli @The_Hanover on Building a #Leadership #Security #Mindset #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

14.9 percent of marketers polled in Crain’s BtoB Magazine are still wondering ‘What is Big Data?’

Sourced from: Analytics.CLUB #WEB Newsletter