May 25, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Big Data knows everything  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Your Agile Data Warehousing Architect: Excel by v1shal

>> The future of marketing automation depends on data analytics at scale by anum

>> Google loses data as lightning strikes by analyticsweekpick

Wanna write? Click Here

[ NEWS BYTES]

>> Big Data Startup Tamr Wins Financial Investment From GE Ventures – CRN Under Big Data

>> Securonix Unveils Big Data Security Analytics Platform With Unprecedented Threat Prediction, Detection and … – Broadway World Under Big Data Security

>> Installing Ubuntu On Windows 10 — On vSphere – Virtualization Review Under Virtualization

More NEWS ? Click Here

[ FEATURED COURSE]

Hadoop Starter Kit

Hadoop learning made easy and fun. Learn HDFS, MapReduce and introduction to Pig and Hive with FREE cluster access…. more

[ FEATURED READ]

Machine Learning With Random Forests And Decision Trees: A Visual Guide For Beginners

If you are looking for a book to help you understand how the machine learning algorithms “Random Forest” and “Decision Trees” work behind the scenes, then this is a good book for you. Those two algorithms are commonly u… more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from the zombie apocalypse of unscalable models
One living, breathing zombie in today’s analytical models is the glaring absence of error bars. Not every model scales or holds its ground as data grows. The error bars attached to almost every model should be duly calibrated. As business models rake in more data, error bars keep the results sensible and in check. If error bars are not accounted for, our models become susceptible to failure, leading us to a Halloween we never want to see.
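
One way to put an error bar on a model metric is a bootstrap interval. The snippet below is only a minimal sketch: the `scores` list is made-up accuracy data, and the helper name `bootstrap_interval` and its parameters are our own choices for illustration, not part of any library.

```python
import random

def bootstrap_interval(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical per-fold accuracy scores of some model (invented numbers).
scores = [0.71, 0.74, 0.69, 0.80, 0.77, 0.73, 0.75, 0.70, 0.78, 0.72]

low, high = bootstrap_interval(scores)
print(f"mean accuracy {sum(scores) / len(scores):.3f}, "
      f"95% interval [{low:.3f}, {high:.3f}]")
```

Rerunning the same helper on a larger sample should give a narrower interval, which is one quick check of whether the error bars behave sensibly as data grows.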

[ DATA SCIENCE Q&A]

Q: What are collaborative filtering, n-grams, and cosine distance?
A: Collaborative filtering:
– Technique used by some recommender systems
– Filtering for information or patterns using techniques involving collaboration of multiple agents: viewpoints, data sources.
1. A user expresses his/her preferences by rating items (movies, CDs, etc.)
2. The system matches this user’s ratings against other users’ ratings and finds the people with the most similar tastes
3. The system recommends items that these similar users have rated highly but that this user has not yet rated
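
The three steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production recommender: the `ratings` data is invented, the helper names `similarity` and `recommend` are ours, and cosine similarity over shared items is just one of several possible ways to compare users.

```python
from math import sqrt

# Step 1: hypothetical ratings, user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "bob":   {"Alien": 5, "Brazil": 5, "Casablanca": 1, "Dune": 5},
    "carol": {"Alien": 1, "Brazil": 1, "Casablanca": 5, "Eraserhead": 4},
}

def similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, k=1, min_rating=4):
    """Items the k most similar users rated highly that `user` has not rated."""
    mine = ratings[user]
    # Step 2: rank the other users by how closely their tastes match.
    neighbours = sorted(
        (other for other in ratings if other != user),
        key=lambda other: similarity(mine, ratings[other]),
        reverse=True,
    )[:k]
    # Step 3: collect their highly rated items that `user` has not seen.
    picks = []
    for other in neighbours:
        for item, score in ratings[other].items():
            if item not in mine and score >= min_rating and item not in picks:
                picks.append(item)
    return picks

print(recommend("alice", ratings))  # ['Dune']
```

Here bob's ratings are closest to alice's, so his highly rated but unseen item is the one recommended.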

n-grams:
– Contiguous sequence of n items from a given sequence of text or speech
– Example sentence: “Andrew is a talented data scientist”
– Bi-grams: “Andrew is”, “is a”, “a talented”, “talented data”, “data scientist”
– Tri-grams: “Andrew is a”, “is a talented”, “a talented data”, “talented data scientist”
– An n-gram model models sequences using statistical properties of n-grams; see: Shannon Game
– More concisely, n-gram model: P(X_i | X_{i-(n-1)}, …, X_{i-1}), i.e. a Markov model
– N-gram model: each word depends only on the last n-1 words
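
As a rough sketch of the points above, the code below extracts n-grams from the example sentence and builds a maximum-likelihood bigram model (n = 2, so each word depends only on the previous one). The helper names `ngrams` and `bigram_model` and the tiny second corpus are our own illustrative choices.

```python
from collections import Counter, defaultdict

def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bigram_model(tokens):
    """Maximum-likelihood estimate of P(next word | previous word)."""
    counts = defaultdict(Counter)
    for prev, nxt in ngrams(tokens, 2):
        counts[prev][nxt] += 1
    return {
        prev: {w: c / sum(following.values()) for w, c in following.items()}
        for prev, following in counts.items()
    }

tokens = "Andrew is a talented data scientist".split()
print(ngrams(tokens, 2)[:3])  # [('Andrew', 'is'), ('is', 'a'), ('a', 'talented')]
print(ngrams(tokens, 3)[0])   # ('Andrew', 'is', 'a')

# Hypothetical toy corpus in which "the" is followed by two different words.
model = bigram_model("the cat sat on the mat".split())
print(model["the"])  # {'cat': 0.5, 'mat': 0.5}
```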

Issues:
– Sparse data: infrequent n-grams get unreliable probability estimates, and unseen n-grams get a probability of zero
– Solution: smooth the probability distributions by assigning non-zero probabilities to unseen words or n-grams
– Methods: Good-Turing, Backoff, Kneser-Ney smoothing
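
The methods listed above take more than a few lines to implement, so the sketch below swaps in add-one (Laplace) smoothing, which is not one of them, purely because it is the shortest way to show how unseen bigrams end up with non-zero probability. The function name `laplace_bigram_prob`, the `alpha` parameter and the toy corpus are assumptions made for illustration.

```python
from collections import Counter

def laplace_bigram_prob(tokens, prev, word, alpha=1.0):
    """P(word | prev) with add-alpha smoothing over the corpus vocabulary."""
    vocab = set(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))  # counts of (prev, next)
    prev_counts = Counter(tokens[:-1])              # how often prev starts a bigram
    return (pair_counts[(prev, word)] + alpha) / (
        prev_counts[prev] + alpha * len(vocab)
    )

corpus = "the cat sat on the mat".split()

seen = laplace_bigram_prob(corpus, "the", "cat")    # bigram occurs once
unseen = laplace_bigram_prob(corpus, "the", "sat")  # bigram never occurs
print(seen, unseen)  # 2/7 and 1/7: the unseen bigram is no longer zero
```

The smoothed probabilities for a given previous word still sum to 1 over the vocabulary, which is a useful sanity check for any smoothing scheme.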

Cosine distance:
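– How similar are two documents?
– Based on cosine similarity, the cosine of the angle between the two document vectors
– Perfect similarity/agreement: cosine similarity of 1
– No agreement: cosine similarity of 0 (orthogonality)
– Measures the orientation, not magnitude
– Cosine distance is 1 minus the cosine similarity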

Given two vectors A and B representing word frequencies:
cosine-similarity(A, B) = ⟨A, B⟩ / (||A|| · ||B||)
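
A minimal Python sketch of this formula on word-frequency vectors follows. The helper names `word_counts` and `cosine_similarity` and the sample phrases are our own, and whitespace tokenization is a simplifying assumption.

```python
from collections import Counter
from math import sqrt

def word_counts(text):
    """Word-frequency vector of a text, using naive whitespace tokenization."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """<A, B> / (||A|| * ||B||) for two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)

doc = word_counts("big data")
repeated = word_counts("big data big data")  # same orientation, larger magnitude
unrelated = word_counts("small models")      # no words in common

print(round(cosine_similarity(doc, repeated), 6))   # 1.0
print(cosine_similarity(doc, unrelated))            # 0.0
print(1 - cosine_similarity(doc, unrelated))        # cosine distance: 1.0
```

Repeating a document changes the length of its vector but not its direction, which is why the first comparison still scores 1.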

Source

[ VIDEO OF THE WEEK]

@AnalyticsWeek: Big Data Health Informatics for the 21st Century: Gil Alterovitz


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Torture the data, and it will confess to anything. – Ronald Coase

[ PODCAST OF THE WEEK]

#FutureOfData Podcast: Conversation With Sean Naismith, Enova Decisions


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Data volumes are exploding: more data has been created in the past two years than in the entire previous history of the human race.

Sourced from: Analytics.CLUB #WEB Newsletter
