[ COVER OF THE WEEK ]
[ NEWS BYTES]
[ FEATURED COURSE]
[ FEATURED READ]
In the world’s top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask. In The Mast…
[ TIPS & TRICKS OF THE WEEK]
Data Have Meaning
We live in a Big Data world in which everything is quantified. While discussion of Big Data has focused on the three characteristics of data (the infamous three Vs), we need to remember that data have meaning. That is, the numbers in your data represent something of interest, an outcome that is important to your business. The meaning of those numbers is a question of the veracity of your data.
[ DATA SCIENCE Q&A]
Q: How frequently must an algorithm be updated?
A: You want to update an algorithm when:
– You want the model to evolve as data streams through infrastructure
– The underlying data source is changing
– Example: a retail store model that remains accurate as the business grows
– Dealing with non-stationarity
– Incremental algorithms: the model is updated every time it sees a new training example
Note: simple, and you always have an up-to-date model, but you can't incorporate data to different degrees.
Sometimes mandatory: when data must be discarded once seen (privacy)
– Periodic re-training in batch mode: simply buffer the relevant data and update the model every so often
Note: more decisions and more complex implementations
– Is the sacrifice worth it?
– Data horizon: how quickly do you need the most recent training example to be part of your model?
– Data obsolescence: how long does it take before data is irrelevant to the model? Are some older instances more relevant than the newer ones?
Economics: generally, newer instances are more relevant than older ones. However, data from the same month or quarter of the previous year can be more relevant than data from other periods of the current year. Likewise, during a recession, data from previous recessions can be more relevant than newer data from a different phase of the economic cycle.
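The trade-off between the two update strategies in the answer above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production recipe: the "model" is just a mean, and the class names IncrementalMean and PeriodicBatchMean, along with the parameters retrain_every and window, are hypothetical names made up for this example.

```python
from collections import deque


class IncrementalMean:
    """Hypothetical incremental model: updated on every new example.

    Each example is used once and can then be discarded, but all
    examples end up weighted equally.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n  # running-mean update


class PeriodicBatchMean:
    """Hypothetical batch model: buffers data and re-fits every so often."""

    def __init__(self, retrain_every, window):
        self.retrain_every = retrain_every  # data horizon: delay before new data counts
        self.buffer = deque(maxlen=window)  # data obsolescence: older examples drop out
        self.seen = 0
        self.mean = None  # no model until the first re-fit

    def update(self, x):
        self.buffer.append(x)
        self.seen += 1
        if self.seen % self.retrain_every == 0:
            # Re-fit from scratch on the buffered examples only.
            self.mean = sum(self.buffer) / len(self.buffer)


# A toy stream whose level shifts from 10 to 20 (non-stationarity).
stream = [10.0] * 6 + [20.0] * 6

incremental = IncrementalMean()
batch = PeriodicBatchMean(retrain_every=3, window=4)
for x in stream:
    incremental.update(x)
    batch.update(x)

print(incremental.mean)  # approximately 15.0: old and new data weighted equally
print(batch.mean)        # 20.0: fitted only on the 4 most recent examples
```

Shrinking window makes the batch model forget old data faster, while raising retrain_every lengthens the delay before a new example influences the model. Those are the extra decisions the batch approach asks you to make.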
[ VIDEO OF THE WEEK]
[ QUOTE OF THE WEEK]
He uses statistics as a drunken man uses lamp posts, for support rather than for illumination. Andrew Lang
[ PODCAST OF THE WEEK]
[ FACT OF THE WEEK]
More than 200bn HD movies which would take a person 47m years to watch.
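As a quick sanity check on those two figures, the sketch below works out the average movie length they imply. It assumes non-stop viewing and a 365.25-day year. Neither assumption is stated in the fact itself, and "more than 200bn" is treated as exactly 200bn.

```python
# Figures quoted in the fact above.
movies = 200e9        # "200bn" HD movies, treated as an exact count (assumption)
viewing_years = 47e6  # "47m years" of watching

# Assumption: continuous viewing, with an average year of 365.25 days.
hours_per_year = 365.25 * 24

hours_per_movie = viewing_years * hours_per_year / movies
print(round(hours_per_movie, 2))  # roughly 2 hours per movie
```

The two numbers are consistent with each other if a typical HD movie runs about two hours.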