Sep 24, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data Mining  Source

[ AnalyticsWeek BYTES]

>> Make or Buy? by analyticsweek

>> Datumbox Machine Learning Framework 0.7.0 Released by administrator

>> An Update on Project Zen: Improving Apache Spark for Python Users by analyticsweekpick

[ FEATURED COURSE]

Introduction to Apache Spark

Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals…. more

[ FEATURED READ]

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored f… more

[ TIPS & TRICKS OF THE WEEK]

Grow at the speed of collaboration
Research by Cornerstone OnDemand pointed to the need for better collaboration within the workforce, and the data analytics domain is no different. A rapidly changing and growing industry like data analytics is very difficult for an isolated workforce to keep up with. A good collaborative work environment facilitates a better flow of ideas, improved team dynamics, rapid learning, and a growing ability to cut through the noise. So, embrace collaborative team dynamics.

[ DATA SCIENCE Q&A]

Q: Explain what a long-tailed distribution is and provide three examples of relevant phenomena that have long tails. Why are they important in classification and regression problems?
A: * In long-tailed distributions, a high-frequency population is followed by a low-frequency population, which gradually tails off asymptotically
* Rule of thumb: the majority of occurrences (more than half, and 80% when the Pareto principle applies) are accounted for by the first 20% of items in the distribution
* The least frequently occurring 80% of items are more important as a proportion of the total population
* Zipf’s law, Pareto distribution, power laws

Examples:
1. Natural language
– Given some corpus of natural language, the frequency of any word is inversely proportional to its rank in the frequency table (Zipf’s law)
– The most frequent word will occur twice as often as the second most frequent, three times as often as the third most frequent…
– “The” accounts for about 7% of all word occurrences in the Brown Corpus (roughly 70,000 out of 1 million)
– “of” accounts for about 3.5%, followed by “and”…
– Only 135 vocabulary items are needed to account for half the Brown Corpus!

2. Allocation of wealth among individuals: the larger portion of the wealth of any society is controlled by a smaller percentage of the people

3. File size distribution of Internet Traffic

Additional: Hard disk error rates, values of oil reserves in a field (a few large fields, many small ones), sizes of sand particles, sizes of meteorites
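
The Zipf relationship in example 1 can be checked numerically. The following is a minimal sketch, assuming an idealized Zipf law with exponent 1 and a hypothetical vocabulary of 50,000 words; the function names are illustrative and do not come from the source.

```python
from itertools import accumulate


def zipf_shares(vocab_size):
    """Share of all occurrences per rank under an idealized Zipf law (exponent 1)."""
    weights = [1.0 / rank for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ranks_needed(shares, coverage):
    """Smallest number of top-ranked items whose shares sum to at least `coverage`."""
    for count, cumulative in enumerate(accumulate(shares), start=1):
        if cumulative >= coverage:
            return count
    return len(shares)


# Assumption: a vocabulary of 50,000 distinct words (hypothetical size).
shares = zipf_shares(50_000)

# The top word is twice as frequent as the second and three times the third.
print(shares[0] / shares[1], shares[0] / shares[2])

# Number of top-ranked words needed to cover half of all occurrences.
print(ranks_needed(shares, 0.5))
```

Under these assumptions a very small fraction of the vocabulary covers half of all occurrences, which mirrors the Brown Corpus observation. The exact count depends on the assumed vocabulary size and exponent.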

Importance in classification and regression problems:
– Skewed distribution
– Which metrics to use? Accuracy paradox (classification), F-score, AUC
– Issue when using models that make assumptions on the linearity (linear regression): need to apply a monotone transformation on the data (logarithm, square root, sigmoid function…)
– Issue when sampling: your data becomes even more unbalanced! Use stratified sampling instead of random sampling, SMOTE (“Synthetic Minority Over-sampling Technique”, N. V. Chawla) or an anomaly detection approach
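
Two of the points above, the accuracy paradox and stratified sampling, can be illustrated with a small sketch. It assumes a hypothetical data set of 980 "common" and 20 "rare" labels; the helper names are illustrative, and a real project would more likely rely on an established library.

```python
import random
from collections import Counter, defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, positive):
    """F1 score for one class, returning 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def stratified_sample(labels, fraction, seed=0):
    """Sample indices so that each class keeps roughly its original proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for indices in by_class.values():
        k = max(1, round(len(indices) * fraction))  # keep at least one per class
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Assumption: a hypothetical, heavily imbalanced label set.
labels = ["common"] * 980 + ["rare"] * 20

# Accuracy paradox: always predicting the majority class looks good on accuracy
# but never finds a single rare case.
always_common = ["common"] * len(labels)
print(accuracy(labels, always_common))              # 0.98
print(f1_for_class(labels, always_common, "rare"))  # 0.0

# Stratified sampling keeps the rare class represented in a 10% sample.
sample = stratified_sample(labels, 0.1)
sample_counts = Counter(labels[i] for i in sample)
print(sample_counts)
```

A plain random 10% sample could by chance contain very few or no rare cases, whereas the stratified version draws from each class separately.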

Source

[ VIDEO OF THE WEEK]

Want to fix #DataScience ? fix #governance by @StephenGatchell @Dell #FutureOfData #Podcast

[ QUOTE OF THE WEEK]

It is a capital mistake to theorize before one has data. Insensibly, one begins to twist the facts to suit theories, instead of theories to suit facts. – Arthur Conan Doyle (Sherlock Holmes)

[ PODCAST OF THE WEEK]

George (@RedPointCTO / @RedPointGlobal) on becoming an unbiased #Technologist in #DataDriven World #FutureOfData #Podcast

[ FACT OF THE WEEK]

Data is growing faster than ever before and by the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.

Sourced from: Analytics.CLUB #WEB Newsletter

Building your data ecosystem with French solutions

With the current health crisis, many people have started looking into the benefits of relocating strategic industries to Europe. But what about the data and digital marketing sectors – both of which are highly dependent on the US tech giants? Here are the views of four leading French players in the data sector: AB Tasty, AT Internet, Commanders Act and Reeport. 

The dangers of the GAFA monopoly 

The GAFA have successfully invaded and colonised the European digital sector, particularly in the data value chain. In the process, they have made many companies dependent on their tools.  

Here are the risks of US monopolies for Europe and its citizens – and why it’s vital to relocate the data. 

The Internet was designed to be open and decentralised. The GAFA’s objective is to close off the entire digital ecosystem around their offers.

Mathieu Llorens, AT Internet 

Mathieu Llorens, CEO of AT Internet, explains how the GAFA monopoly is an insidious problem – “Data is the raw material of the economy, and we leave the majority of this data intelligence (marketing, customer knowledge, etc.) in the hands of the GAFA. It is important to remember that the Internet was designed to be open and decentralised, and the GAFA’s objective is to close off the entire digital ecosystem around their respective offers. Google made its position perfectly clear during the implementation of the GDPR. The US company refuses to accept any responsibility in its general terms and conditions, it enforces its terms and conditions and passes on the responsibility of the related consent to its customers. And its approach is ‘take it or leave it’.” 

Digitalisation is not a neutral topic. It is at the centre of major societal, economic and political issues. 

Michael Froment, Commanders Act 

According to Michael Froment, CEO of Commanders Act, there is a possible danger to the security of states and companies if the data is no longer circulating due to potential political issues.  
“Digitalisation is not a neutral topic… It is at the centre of strong societal, economic and political issues (protection of privacy, freedom of the press, democracy, etc.). All of our businesses in the data sector rely on data-based decision support. If they are no longer controlled or if their access is blocked, the danger for our economy is real.” 

Rémi Aubert, president and co-founder of AB Tasty, points out the dominance of some GAFA in specific areas – “Although they have a strong monopoly on advertising for example, there is a distinction between Google, which has all the customer data, and Microsoft, which is more of a technology provider.” 

Etienne Gautheron, product and operations director at Reeport, highlights Google’s role as judge and jury – “The marketer must always have a clear understanding of their provider’s business model. Without transparency, and if we rely simply on figures, we can lose autonomy and decision-making quality.” 

On the issue of impartiality, Mathieu Llorens also raises the problem of the statistical bias of Google Analytics – “GA selectively reassigns a share of traffic to itself, encouraging brands to invest more in… Google, while blocking the direct traffic of the biggest advertisers.”

The advantages of Made in France 

With the current crisis, many people have been calling for French and European alternatives to Google and other US publishers. But is the Made in France approach feasible and sustainable for brands that use data and data marketing solutions? 

The main advantage of French solution providers is their high quality.  

Etienne Gautheron, Reeport 

Reeport’s Etienne Gautheron didn’t really see a strong trend for Made in France during this crisis. Nevertheless, he highlights the “general awareness of data protection with the implementation of the GDPR, and the conservation of data in Europe.” He also mentions the need to “listen very specifically to French customers, who want to understand and be involved in their solution provider’s roadmap,” and that “the advantage of French solution providers is their high quality.” 

The customer relationship with French suppliers is different – it is far closer and there is more dialogue.  

Rémi Aubert, AB Tasty 

AB Tasty’s Rémi Aubert doesn’t believe in a Made in France approach to strategic issues other than health and defence – “Made in France does not necessarily win over the customer… In the field of data, we must first compete on a technological level.” He explains that “the advantages of the AB Tasty personalisation solution are that it can provide both quantitative and qualitative analyses (which its US competitors do not necessarily do)… There is also a different customer relationship when it comes from a French or European player, which is far closer and involves more dialogue. And the crisis has highlighted these differences with the US players, who tend to put the customer in the background in terms of their direct relationship.” 

Michael Froment points out that the French solutions (mentioned here) are successful and are capable of penetrating international markets. “Made in US is not necessarily a guarantee of superior quality. Obviously, there needs to be an equivalent value proposition in terms of French players’ functionalities – it’s important to note that some American brands also trust European solutions.” 

If the data is strategic, you cannot work with a biased and one-sided solution.

Mathieu Llorens, AT Internet

“When the data is truly strategic, you cannot work with a biased and one-sided solution. When the entry level is free, you need to position yourself on a more advanced and more qualitative offer which is aimed at mature companies in terms of data use,” explains Mathieu Llorens. “French players do not have the marketing power of the GAFA – it is not Made in France that makes the difference, and these players can only develop based on the quality of their functional offer and their expertise. Only the product can make the difference.” 

The illusion of free access and solution integration 

What is the positioning of French solutions in the face of competition from Google, which offers free and fully integrated tools? 

According to Etienne Gautheron, “the free Google Data Studio was not particularly bad news for Reeport. Indeed, it has helped to evangelize the market. 90% of Reeport’s users are former Google Data Studio users. Reeport’s offer is aimed at complex organisations that inevitably have more advanced needs in terms of governance and compliance with legislation, which Google does not meet. The interconnection of data solutions is also an essential requirement and must be very simple, as is the case today with AT Internet or Content Square. The national or European framework naturally places constraints, but from constraints comes innovation. It is therefore an opportunity to learn how to adapt and be pragmatic in order to address very specific uses.” 

Google’s built-in tools are not up to standard, and that’s why we exist. 

Rémi Aubert, AB Tasty 

For Rémi Aubert, Google’s integrated tools are not up to standard, which is why French solutions are present and gaining market share – “The problem with the free service is that it shifts contract signatures to less mature customers… there is also a need to put human (and therefore financial) resources into a data tool to achieve efficiency. However, if companies do not put resources into the tool, they will not put them into the human resources. When it comes to integration, it is the role of players such as AB Tasty, AT Internet, etc. to make it as fluid and simple as possible.” 

Free access often leads to a problem of data ownership.  

Mathieu Llorens, AT Internet 

Mathieu Llorens explains that “as far as the analytical part is concerned, being free of charge is relative. The Google Premium service (offer equivalent to AT Internet’s), for small audience volumes, is 10 times more expensive. In other words, you need to have a solid basis for comparison. The free service also leads to a problem of ownership of the data. In its general conditions, Google mentions that it can reuse the data for its own purposes, which is contrary to the entire spirit of GDPR.” He highlights that “free access is often only temporary in order to ensnare the user in a closed ecosystem and destroy any competition before imposing a paid solution. Take the example of Google Maps, which wiped out most of its competitors with an almost free offer, before multiplying its prices by 50 in a few weeks. As for the integration of Google tools, when it excludes third party tools, it raises a legal problem of vertical integration, which is called abuse of dominant position.” 

According to Michael Froment, the technical integration of French solutions is not even an issue – “Technically, the interconnections are already simple and efficient. The main challenge is to make these connectors known to users. It is therefore more of a marketing challenge than a technological one.” 

The end of cookies? 

With the announced end of cookies on Chrome, Google is threatening the data players who use them in their technologies. It promises to be able to combine paths and measure conversion with an alternative to cookies. However, are there any particular concerns?

Google’s strategy is to monopolise the information it is simply supposed to organise.   

Michael Froment, Commanders Act 

“The market is gradually shrinking,” explains Michael Froment. “First there was a ban on third party cookies on Google’s network, then the deletion of natural keywords in the search engine, and now the removal of the standard cookie. Google’s strategy is basically to gain access to the information that it is simply supposed to organise. The market will probably change with the emergence of players who will want to make the link between the brand’s information (first-party cookies) and the information in the advertising ecosystem (Google and Facebook).” 

On the issue of cookies and to compensate for their removal, Rémi Aubert says “it will be important to work on user intention algorithms. With consumer behaviour being instantaneous, the lack of information for path reconciliation does not necessarily represent a genuine challenge.” 

We are reaching a tipping point, but there are still reasons to be optimistic.

Mathieu Llorens, AT Internet 

Mathieu Llorens talks about a tipping point – “With the end of cookies, data control is about to be entirely in the hands of Google. The American company has managed to hijack the GDPR in favour of its Chrome browser and its logged-in environments. Nevertheless, there is still reason for optimism, as almost all US states have filed complaints related to these privacy and abuse of dominance issues. The entire digital ecosystem, in France and elsewhere, has an interest in keeping the Internet open and free of cartels, for healthy competition that promotes the development of businesses.”

A brave new world 

How can we reverse the trend of an Internet that is closing down? How will French solutions be able to conquer new market segments? What strategies should be put in place and what can be expected from public policies for the data sector? 

French technology is well developed, but we need to be able to better market our know-how.

Rémi Aubert, AB Tasty 

Rémi Aubert looks at the specific nature of the US market, which is much larger than the European market – “Europe is not sufficiently incorporated in legal and business terms. A global alignment of European markets would be a good basis for the expansion of companies in the digital and data sector. European technology is state-of-the-art, but the point of improvement is in marketing to be able to better sell our know-how.” 

“French companies are well supported by the State,” explains Mathieu Llorens. “I do not expect national economic favouritism, but existing laws should be enforced, especially in terms of international taxation.” 

We need a European label for technology, to unify the market and gain recognition.

Michael Froment, Commanders Act

Michael Froment believes in the establishment of a European label for tech, which would unify the market and ensure that French solutions are recognised in all countries – “The notion of industrial policy and the planning of the tools that are lacking (an alternative to Zoom, for example) must also be brought back to the forefront. In certain areas, Europe is absent, which inevitably leaves room for US alternatives.” 

So, is it possible? 

Is it possible to build a data ecosystem with French solutions? 

For Michael Froment, it is possible, desirable, and indeed necessary in some cases – “The way in which we communicate about the interconnectivity of solutions is key. The other important point is to have buyers of our French solutions, which would allow us to build a real digital industrial sector.” 

Mathieu Llorens is convinced that the choice of French solutions is safer (for all the reasons mentioned above), and ultimately more economical – “It is also essential to raise awareness and improve communication about connectors between solutions.” 

Etienne Gautheron refers to the diversity of solutions that can meet multiple needs – “The people-centred culture of the French players is necessary and a genuine advantage.” 

To round off, Rémi Aubert underlines the international aspect of the French and European players. “Our strength lies in our ability to address markets all over the world in compliance with legal rules.” 

Article Building your data ecosystem with French solutions first appeared on Digital Analytics Blog.

Source by analyticsweekpick

Sep 17, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Weak data  Source

[ FEATURED COURSE]

CS229 – Machine Learning

This course provides a broad introduction to machine learning and statistical pattern recognition. … more

[ FEATURED READ]

Storytelling with Data: A Data Visualization Guide for Business Professionals

Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You’ll discover the power of storytelling and the way to make data a pivotal point in your story. Th… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, the project is oftentimes about the business, not the technology. With data analysis, the same type of thinking goes. It’s not always about the technicality but about the business implications. Data science project success criteria should include project management success criteria as well. This will ensure smooth adoption, easy buy-ins, room for wins and co-operating stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q: Why is naive Bayes so bad? How would you improve a spam detection algorithm that uses naive Bayes?
A: * Naïve: the features are assumed to be independent/uncorrelated
* This assumption does not hold in many cases
* Improvement: decorrelate the features (transform them so that the covariance matrix becomes the identity matrix)
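
One way to turn a covariance matrix into the identity matrix is whitening. The sketch below is an illustration only: it assumes two hypothetical continuous features and uses a Cholesky-based whitening transform, which is one of several possible choices and is not named in the source. Note that whitening removes linear correlation only; it does not guarantee full statistical independence.

```python
import math
import random


def sample_covariance(points):
    """Return (var_x, var_y, cov_xy) for a list of two-feature points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return var_x, var_y, cov_xy


def whiten_2d(points):
    """Center the data and apply Cholesky whitening so the covariance becomes identity."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x, var_y, cov_xy = sample_covariance(points)
    # Cholesky factor L of the covariance matrix: [[l11, 0], [l21, l22]]
    l11 = math.sqrt(var_x)
    l21 = cov_xy / l11
    l22 = math.sqrt(var_y - l21 * l21)
    whitened = []
    for x, y in points:
        cx, cy = x - mean_x, y - mean_y
        z1 = cx / l11                # forward substitution, first row of L
        z2 = (cy - l21 * z1) / l22   # forward substitution, second row of L
        whitened.append((z1, z2))
    return whitened


# Assumption: two hypothetical, strongly correlated features.
rng = random.Random(42)
points = []
for _ in range(500):
    x = rng.gauss(0, 1)
    points.append((x, 0.8 * x + 0.3 * rng.gauss(0, 1)))

before = sample_covariance(points)
after = sample_covariance(whiten_2d(points))
print(before)  # clearly non-zero covariance between the two features
print(after)   # variances close to 1, covariance close to 0
```

After the transform, a Gaussian naive Bayes model would see features whose sample covariance is the identity, which is closer to what its independence assumption expects.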

Source

[ VIDEO OF THE WEEK]

@JohnTLangton from @Wolters_Kluwer discussed his #AI Lead Startup Journey #FutureOfData #Podcast

[ QUOTE OF THE WEEK]

We chose it because we deal with huge amounts of data. Besides, it sounds really cool. – Larry Page

[ PODCAST OF THE WEEK]

Understanding #BigData #BigOpportunity in Big HR by @MarcRind #FutureOfData #Podcast

[ FACT OF THE WEEK]

In 2008, Google was processing 20,000 terabytes of data (20 petabytes) a day.

Sourced from: Analytics.CLUB #WEB Newsletter

How Big Data Is Transforming The Fight Against Cancer

When we talk about Big Data, we often talk about it in terms of business, and in particular how it can be used to generate money. But it’s important to remember that the possibilities go way further. Science has the task of expanding humanity’s horizons–whether that’s by exploring space or discovering more about the tiny organisms that make up the natural world. From the start, data has played a role in this. But in the age of Big Data–the field which has emerged thanks to the explosion in the amount of data we create and capture, and the advanced computer analysis that has become possible in recent years–it’s more important than ever.

The fight against cancer is the search for the Holy Grail of medicine. Almost everyone will be affected at some point in their lives, either personally or by proxy through a loved one. So it’s no surprise that Big Data is being put to use in many ways to aid the task of improving care, identifying risks and hopefully eventually producing cures.

One such project is the American Society for Clinical Oncology’s CancerLinQ initiative, which aims to collate data from every cancer patient in the US and make it available for analysis in the hope that it will reveal patterns that lead to new insights.

These could be useful for doctors providing treatment–accessing up-to-date information on how thousands of others have reacted to a proposed treatment plan will enable them to tailor treatments to individual patients and provide the best chance of a positive outcome.

AI-Driven Diagnosis

It was recently announced that 14 cancer institutes across the United States and Canada would be using IBM’s Watson analytics engine to match cancer patients with the treatments most likely to help them.

As well as recommending the relevant cancer drug most likely to treat a particular patient’s cancer, Watson can even recommend drugs that have not been used to treat cancer before. Since it is programmed with specific details of how thousands of medicines interact with the human body, Watson can suggest anything which it thinks might interact beneficially with the cell affected by mutation which is causing the cancer. Of course, a doctor will probably have to take many other issues into consideration before prescribing whatever the AI-driven Watson suggests, but it surely will speed up the process.

Specialist programs also exist for research into particular types of cancer–for example the Dragon Master Foundation partnered with five U.S. pediatric hospitals to create a database of tissue samples from patients with rare childhood brain tumors.

This research is all based around our understanding of the fact that cancer is caused by cell mutations, and clues to when and how it will develop are often hidden within the complex genetic data contained in our bodies–our genome.

Genomic Data

Genomic data is BIG. To store the raw code output by a genome sequencer on a computer takes around 200 gigabytes. Attempts to use comparative analysis of these genomes to isolate factors which make us susceptible to cancer involve ever-growing databases – one is planned to contain one million – putting it firmly within the realms of Big Data.

This isn’t a challenge for Big Data, it’s merely a question of scale, and programs such as the Folding@home initiative exist to harness a worldwide, distributed network of processing power, which has vastly accelerated the rate at which this data is being harvested and decoded since the first human genome was completed in 2003.

A greater challenge lies in the messy, unstructured data that may hold clues to the causes of–and possible defenses against–the Big C. Thankfully, making sense of messy, unstructured data is a task at which Big Data excels.

Decoding Messy data

Flatiron Health has created the OncologyCloud–a big data program which aims to collect data from the medical records, doctors’ notes, interactions with carers and billing information of the 96% of cancer patients who do not take part in clinical trials.

Data from these patients is effectively “siloed” in systems maintained by the individual organizations diagnosing, delivering and arranging the finances for the care. As the company states “Most of the ‘clinically valuable’ [cancer] data resides in doctor and nurse notes, pathology reports, PDFs and other unstructured forms […] Traditional population health analytics are built using claims data, which can attain quick results but lack the depth required to understand a disease as complex as cancer […] To run analytics for oncology off of claims data would be the equivalent of analyzing an iceberg by only looking at what’s above the surface. If you want the ‘clinical truth’ you have to get into the details.”

The Flatiron project aims to take all of this messy, unstructured data captured at every stage of the patient’s interaction with care providers, and structure it so it too can be used for comparative analysis, alongside the data of millions of others.

So that, in a nutshell, is the current despatch from the Big Data front of the war between science and cancer. It’s a fight that’s far from won but significant advances are being made. Just this year a study concluded that thanks to the advances in spotting and treating cancer, by 2050 no one under 80 will be dying from the disease. Big Data-powered research and treatment programs will undoubtedly play a part in that victory, just as they continue to give us answers in every field of science.

To read the original article on Forbes, click here.

Source

Sep 10, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Extrapolating  Source

[ AnalyticsWeek BYTES]

>> Beyond Speed, What Do Consumers Want from 5G? by administrator

>> Customer analytics for the CMO by analyticsweekpick

>> Essential Construction Site Safety Checklists and Resources for Australia and Singapore by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Probability & Statistics

This course introduces students to the basic concepts and logic of statistical reasoning and gives the students introductory-level practical ability to choose, generate, and properly interpret appropriate descriptive and… more

[ FEATURED READ]

Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, 4th Edition

The eagerly anticipated Fourth Edition of the title that pioneered the comparison of qualitative, quantitative, and mixed methods research design is here! For all three approaches, Creswell includes a preliminary conside… more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better time to talk about our increasing dependence on data analytics to help with our decision making? Data- and analytics-driven decision making is rapidly sneaking its way into our core corporate DNA, and we are not creating practice grounds to test those models fast enough. Such snug-looking models have hidden nails that could inflict uncharted pain if left unchecked. This is the right time to start thinking about putting an Analytics Club [Data Analytics CoE] in your workplace to lab out the best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q:Examples of NoSQL architecture?
A: * Key-value: in a key-value NoSQL database, all of the data consists of an indexed key and a value. Example: DynamoDB
* Column-based: designed for storing data tables as sections of columns of data rather than as rows of data. Examples: HBase, Cassandra (wide-column stores), SAP HANA (a column-oriented store)
* Document database: maps a key to a document that contains structured information. The key is used to retrieve the document. Examples: MongoDB, CouchDB
* Graph database: designed for data whose relations are well represented as a graph, with interconnected elements and an undetermined number of relations between them. Example: Neo4j
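
As a rough illustration of the four models above, here is a minimal Python sketch that mimics each one with plain in-memory structures. It does not use any real database client; every name and sample record (`key_value`, `column_store`, `document_store`, `graph`, `neighbors`) is hypothetical and chosen only for this example.

```python
# Illustrative only: plain Python structures standing in for each NoSQL model.
# No real database is involved; all names and records here are hypothetical.

# Key-value: an opaque value looked up by a unique key.
key_value = {"user:42": "Alice Example"}

# Column-based: the values of each column are stored together, keyed by row id.
column_store = {
    "name": {"row1": "Alice", "row2": "Bob"},
    "city": {"row1": "Paris", "row2": "Oslo"},
}

# Document: a key maps to a structured document that is retrieved as a whole.
document_store = {
    "user:42": {"name": "Alice", "languages": ["English", "French"]},
}

# Graph: each node carries an open-ended list of (relation, target) edges.
graph = {
    "Alice": [("KNOWS", "Bob")],
    "Bob": [("KNOWS", "Alice"), ("OWNS", "Bike")],
}


def neighbors(node, relation):
    """Return the targets reachable from `node` through edges named `relation`."""
    return [target for rel, target in graph.get(node, []) if rel == relation]
```

The point of the sketch is the shape of the data rather than performance: a column store answers "all cities" by reading one inner dictionary, while a graph store answers "whom does Bob know" by walking edges.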

Source

[ VIDEO OF THE WEEK]

#GlobalBusiness at the speed of The #BigAnalytics

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

The temptation to form premature theories upon insufficient data is the bane of our profession. – Sherlock Holmes

[ PODCAST OF THE WEEK]

Pascal Marmier (@pmarmier) @SwissRe discusses running data driven innovation catalyst

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

40% projected growth in global data generated per year vs. 5% growth in global IT spending.

Sourced from: Analytics.CLUB #WEB Newsletter

Four levels of Hadoop adoption maturity

Photo courtesy of Apache Spark

So you’ve been monitoring or are already on the journey with Hadoop — and you’re wondering: Where are we on the adoption curve compared to the market in general?

Based on my interactions with numerous companies, I want to share what I think that curve looks like so that you can orientate your organization and decide if you’re leading the way or lagging behind. Neither is inherently bad, but you do need to be conscious of where you are and why.

Stage 1: Monitoring

The monitoring stage is where you generally find two types of organizations:

  1. The first type are those who don’t believe they need to deal with big data. These companies might be struggling with other issues and feel that big data is just too far into the future — or they believe they’ll never have the need to deal with big data in their market segment.
  2. The second type are those who believe there are other technologies more mature than Hadoop which can meet their needs. Sometimes a few people in such organizations have played with Hadoop and decided it doesn’t meet their needs or lacks specific capabilities. From that point on, it’s a case of keeping an eye on the market and how it’s developing.

The big questions for these organizations are: What would it take for them to look at Hadoop? Is that well defined and agreed upon? Are they ready to move if there’s a sudden need to spring into action?

Stage 2: Investigating

In this stage, organizations generally have a small group playing with Hadoop technology (when I say Hadoop technology, I mean the ecosystem — not just Map/Reduce, etc.). The group experimenting with Hadoop is usually in IT, or is an IT-savvy group in a business unit.

In these organizations, there’s no real Hadoop mandate yet, but the investigations are designed to determine what Hadoop might be useful for or how to begin addressing big data challenges on the horizon.

In this case, companies are using either the free Apache Hadoop distribution download or one of the free downloads from an established commercial Hadoop distribution vendor like Cloudera, Hortonworks or MapR.

All of the effort in this level is useful. It’s not a commitment to Hadoop, but it is building the skills and knowledge necessary to consider Hadoop’s IT/business implications — and also to be ready to quickly move to the next level should the time come.

The big questions for these organizations are: What would it take for them to move beyond trials and into using Hadoop? Is that well-defined and agreed upon? Are they looking at more than just storing data? Are they also looking at how to utilize that stored data through visualization and analytics?

Stage 3: Implementing

In this stage, organizations have deployed a Hadoop cluster and have at least one project running on it. They’ve largely moved on from the Apache distribution because they needed additional capabilities offered by one of the commercial vendors, such as support, backup, management tools, other SQL data stores, etc.

The companies in this stage generally have up to three Hadoop projects, either in production or close to it. Initial projects often focus on new business challenges, or on using data that was not previously accessible. Where possible, existing end-user toolsets are used to minimize the need for training and to deliver quick ROI on those early projects.

This phase is the riskiest one. If ROI is not delivered, the value of Hadoop can be undermined. At the same time, skills are at a premium and experimentation is likely happening as organizations build production projects out. At this stage you could say the bubble is at risk of bursting for some organizations. Few of the organizations I’m working with are past this phase yet.

Stage 4: Established

Established organizations have a number of projects in production with plans for many more and a large Hadoop cluster. Generally speaking, these companies will also be working on a broad enterprise architecture where Hadoop is taking on an increasingly important role in their five-year vision.

They’ll be working with a commercial Hadoop distribution to influence the development of features and functions they require to support their future architecture, and will be growing the size of their clusters rapidly. This is the group hoping for large returns from their investment both in savings, and also perhaps in disruptive market changes they’re trying to enable through the use of Hadoop and big data.

These advanced organizations will help drive requirements for the next generation of Hadoop — all aligned to the business issues they need solving. If an industry is not well represented in this group, it’s possible that industry’s needs will take a back seat to those of more well-represented industries.

Conclusion

There are already many companies publicly moving forward with Hadoop, such as Yahoo, Home Depot, Rogers, Schlumberger, Barclays Bank, Symantec, Verizon, British Telecom, ING, Port of Rotterdam, British Airways, Truecar, EDF, Sanoma, Octo Technology, HSBC, Orange France, Shazam and CERN, to name but a few.

You can see these companies, and more, on the websites of the commercial distributions such as Cloudera, Hortonworks and MapR, or by reviewing the various recordings from Strata or Hadoop conferences. If you’re exploring Hadoop, it may be that now is the time to put your foot down and accelerate a little more.

These are some of my initial impressions. If you’re using Hadoop, do you fit into one of these buckets or is there another that I might be missing? Would you name the buckets differently? Any other characteristics you would add to any of the groups?

To read the original article on SAS Voices, click here.

Source: Four levels of Hadoop adoption maturity

Sep 03, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Trust the data  Source

[ AnalyticsWeek BYTES]

>> A Gentle Introduction to Linear Regression With Maximum Likelihood Estimation by administrator

>> The Only Recipe For A Data Story You’ll Ever Need by analyticsweek

>> 9th edition of Aegis Graham Bell Award nominations open by administrator

Wanna write? Click Here

[ FEATURED COURSE]

Tackle Real Data Challenges

Learn scalable data management, evaluate big data technologies, and design effective visualizations…. more

[ FEATURED READ]

On Intelligence

Jeff Hawkins, the man who created the PalmPilot, Treo smart phone, and other handheld devices, has reshaped our relationship to computers. Now he stands ready to revolutionize both neuroscience and computing in one strok… more

[ TIPS & TRICKS OF THE WEEK]

Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck to achieving comparable enterprise adoption. One of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members need to step up and create awareness within the organization. An aware organization goes a long way toward getting quick buy-in and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q:Provide examples of machine-to-machine communications?
A: Telemedicine
– Heart patients wear a specialized monitor which gathers information about the state of the heart
– The collected data is sent to an implanted electronic device, which delivers electric shocks to the patient to correct abnormal rhythms

Product restocking
– Vending machines are capable of messaging the distributor whenever an item is running out of stock
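
The product restocking case can be sketched in a few lines of Python. This is a minimal illustration only: the threshold, the machine id, and the JSON message layout are assumptions made for the example, not part of any real vending or telemetry protocol, and the message is merely printed rather than sent over a network.

```python
import json

# Hypothetical threshold: report any item with this many units or fewer left.
RESTOCK_THRESHOLD = 2


def restock_messages(machine_id, stock, threshold=RESTOCK_THRESHOLD):
    """Build one JSON message per item whose count is at or below the threshold."""
    messages = []
    for item, count in sorted(stock.items()):
        if count <= threshold:
            messages.append(json.dumps(
                {"machine": machine_id, "item": item, "remaining": count},
                sort_keys=True,
            ))
    return messages


if __name__ == "__main__":
    # In a real deployment each message would be transmitted to the distributor;
    # here it is only printed.
    for msg in restock_messages("vm-001", {"cola": 1, "chips": 7, "water": 0}):
        print(msg)
```

No human is involved at any step: the machine inspects its own stock and emits a message that another system can act on, which is the defining trait of machine-to-machine communication.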

Source

[ VIDEO OF THE WEEK]

@AngelaZutavern & @JoshDSullivan @BoozAllen discussed Mathematical Corporation #FutureOfData

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Numbers have an important story to tell. They rely on you to give them a voice. – Stephen Few

[ PODCAST OF THE WEEK]

#FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

140,000 to 190,000: the projected shortfall of people with deep analytical skills needed to fill Big Data jobs in the U.S. by 2018.

Sourced from: Analytics.CLUB #WEB Newsletter