Using sparklyr with Microsoft R Server

The sparklyr package (by RStudio) provides a high-level interface between R and Apache Spark. Among many other things, it allows you to filter and aggregate data in Spark using the dplyr syntax. In Microsoft R Server 9.1, you can now connect to a Spark session using the sparklyr package as the interface, allowing you to combine the data-preparation capabilities of sparklyr and the data-analysis capabilities of Microsoft R Server in the same environment.

In a presentation at the Spark Summit (embedded below, and you can find the slides here), Ali Zaidi shows how to connect to a Spark session from Microsoft R Server and use the sparklyr package to extract a data set. He then shows how to build predictive models on this data (specifically, a deep neural network and a boosted trees classifier). He also shows how to build general ensemble models and cross-validate hyper-parameters in parallel, and even gives a preview of forthcoming streaming analysis capabilities.

[youtube https://www.youtube.com/watch?v=8-xvKlz26vg?rel=0&w=500&h=281]

An easy way to try out these capabilities is with Azure HDInsight 3.6, which provides a managed Spark 2.1 instance with Microsoft R Server 9.1.

Spark Summit: Extending the R API for Spark with sparklyr and Microsoft R Server

Originally Posted at: Using sparklyr with Microsoft R Server

Nov 01, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Accuracy check  Source

[ AnalyticsWeek BYTES]

>> The User Experience of State Government Websites by analyticsweek

>> Marginal gains: the rise of data analytics in sport by analyticsweekpick

>> The Pitfalls of Using Predictive Models by bobehayes

[ NEWS BYTES]

>> How to Avoid the Trap of Fragmented Security Analytics – Security Intelligence (blog) Under Analytics

>> Are You Spending Too Much (or Too Little) on Cybersecurity? – Data Center Knowledge Under Data Center

>> Most UK businesses are not insured against security breaches and data loss, says study – Information Age Under Data Security

[ FEATURED COURSE]

Python for Beginners with Examples

A practical Python course for beginners with examples and exercises…

[ FEATURED READ]

Superintelligence: Paths, Dangers, Strategies

The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but …

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today, a data-driven leader, data scientist, or data-driven expert is constantly put to the test, helping the team solve problems with their skills and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgment and taints our suggestions. Most skilled professionals understand and handle these biases well, but in a few cases we give in to tiny traps and find ourselves caught in biases that impair our judgment. So it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q: You have data on the durations of calls to a call center. Generate a plan for how you would code and analyze these data. Explain a plausible scenario for what the distribution of these durations might look like. How could you test, even graphically, whether your expectations are borne out?
A: 1. Exploratory data analysis
* Histogram of durations
* Histograms of durations per service type, per day of week, per hour of day (durations can be systematically longer from 10am to 1pm, for instance), per employee…
2. Distribution: lognormal?
3. Test graphically with a QQ plot: sample quantiles of log(durations) vs. normal quantiles

Source
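
The QQ check in step 3 can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the durations are synthetic (drawn from an assumed lognormal distribution), and the helper names `qq_points` and `qq_correlation` are hypothetical. It pairs the sorted values of log(durations) with standard normal quantiles and summarizes how straight the resulting QQ line is with a correlation coefficient.

```python
import math
import random
from statistics import NormalDist, fmean


def qq_points(durations):
    """Return (theoretical, sample) quantiles of log(durations) vs. a standard normal."""
    logs = sorted(math.log(d) for d in durations)
    n = len(logs)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, logs


def qq_correlation(theoretical, sample):
    """Pearson correlation of the QQ points; values near 1 mean a nearly straight line."""
    mx, my = fmean(theoretical), fmean(sample)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, sample))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in sample)
    return sxy / math.sqrt(sxx * syy)


# Synthetic call durations in seconds (assumed lognormal, purely for illustration).
rng = random.Random(42)
durations = [rng.lognormvariate(5.0, 0.6) for _ in range(2000)]

theoretical, sample = qq_points(durations)
r = qq_correlation(theoretical, sample)
print(f"QQ correlation for log(durations): {r:.4f}")
```

With real call data, the `(theoretical, sample)` pairs could be passed to any plotting library to draw the QQ plot itself. Points that bend away from a straight line, especially in the tails, would suggest the lognormal assumption does not hold.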

[ VIDEO OF THE WEEK]

@AnalyticsWeek #FutureOfData with Robin Thottungal(@rathottungal), Chief Data Scientist at @EPA

[ QUOTE OF THE WEEK]

Everybody gets so much information all day long that they lose their common sense. – Gertrude Stein

[ PODCAST OF THE WEEK]

Solving #FutureOfOrgs with #Detonate mindset (by @steven_goldbach & @geofftuff) #FutureOfData #Podcast

[ FACT OF THE WEEK]

Brands and organizations on Facebook receive 34,722 Likes every minute of the day.

Sourced from: Analytics.CLUB #WEB Newsletter

@DrJasonBrooks talked about the Fabric and Future of Leadership #JobsOfFuture #Podcast

[youtube https://www.youtube.com/watch?v=SB29nSaCppU]

In this podcast, Jason talked about the fabric of great transformative leadership. He shared some tactical steps that current leaders could follow to ensure their relevance and their association with transformative teams. Jason emphasized the role of the team, the leader, and the organization in creating a healthy, future-proof culture. It is a good session for the leadership of tomorrow.

Jason’s Recommended Read:
Reset: Reformatting Your Purpose for Tomorrow’s World by Jason Brooks https://amzn.to/2rAuywh
Essentialism: The Disciplined Pursuit of Less by Greg McKeown https://amzn.to/2jOX8Xi

Podcast Link:
iTunes: http://math.im/itunes
GooglePlay: http://math.im/gplay

Jason’s BIO:
Dr. Jason Brooks is an executive, entrepreneur, consulting and leadership psychologist, bestselling author, and speaker with over 24 years of demonstrated results in the design, implementation and evaluation of leadership and organizational development, organizational effectiveness, and human capital management solutions. He works to grow leaders and enhance workforce performance and overall individual and company success. He is a results-oriented, high-impact executive leader with experience in start-up, high-growth, and operationally mature multi-million and multi-billion dollar companies in multiple industries.

About #Podcast:
The #JobsOfFuture podcast is a conversation starter that brings leaders, influencers and lead practitioners on the show to discuss their journey in creating the data-driven future.

Wanna Join?
If you or anyone you know wants to join in,
Register your interest @ info@analyticsweek.com

Want to sponsor?
Email us @ info@analyticsweek.com

Keywords:
#JobsOfFuture #Leadership #Podcast #Future of #Work #Worker & #Workplace

Source: @DrJasonBrooks talked about the Fabric and Future of Leadership #JobsOfFuture #Podcast by v1shal