Month: September 2020
Sep 24, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)
[ COVER OF THE WEEK ]
Data Mining Source
[ AnalyticsWeek BYTES]
>> Make or Buy? by analyticsweek
>> Datumbox Machine Learning Framework 0.7.0 Released by administrator
>> An Update on Project Zen: Improving Apache Spark for Python Users by analyticsweekpick
[ FEATURED READ]
Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
[ TIPS & TRICKS OF THE WEEK]
Grow at the speed of collaboration
Research by Cornerstone OnDemand pointed to the need for better collaboration within the workforce, and the data analytics domain is no different. A rapidly changing and growing industry like data analytics is very difficult for an isolated workforce to keep up with. A good collaborative work environment facilitates a better flow of ideas, improved team dynamics, rapid learning, and a growing ability to cut through the noise. So, embrace collaborative team dynamics.
[ DATA SCIENCE Q&A]
Q: Explain what a long-tailed distribution is and provide three examples of relevant phenomena that have long tails. Why are they important in classification and regression problems?
A: * In long-tailed distributions, a high-frequency population is followed by a low-frequency population, which gradually tails off asymptotically
* Rule of thumb: the majority of occurrences (more than half, and 80% when the Pareto principle applies) are accounted for by the first 20% of items in the distribution
* The least frequently occurring 80% of items are more important as a proportion of the total population
* Related concepts: Zipf’s law, Pareto distribution, power laws
Examples:
1. Natural language
– Given some corpus of natural language, the frequency of any word is inversely proportional to its rank in the frequency table
– The most frequent word will occur twice as often as the second most frequent, three times as often as the third most frequent, and so on
– ‘The’ accounts for 7% of all word occurrences (70,000 out of 1 million)
– ‘of’ accounts for 3.5%, followed by ‘and’
– Only 135 vocabulary items are needed to account for half the English corpus!
2. Allocation of wealth among individuals: the larger portion of the wealth of any society is controlled by a smaller percentage of the people
3. File size distribution of Internet traffic
Additional: hard disk error rates, values of oil reserves in a field (a few large fields, many small ones), sizes of sand particles, sizes of meteorites
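The rank-frequency rule from the natural language example can be sketched in a few lines of Python. This is only an illustration, assuming an idealized Zipf distribution with exponent 1 and a hypothetical vocabulary of 50,000 words; the function names are made up for this sketch, and real corpora follow the law only approximately.

```python
def zipf_shares(vocab_size, exponent=1.0):
    """Share of all occurrences for each rank under an idealized Zipf law."""
    weights = [1.0 / rank ** exponent for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ranks_to_cover(shares, target):
    """Number of top-ranked items needed to reach the target cumulative share."""
    covered = 0.0
    for count, share in enumerate(shares, start=1):
        covered += share
        if covered >= target:
            return count
    return len(shares)


shares = zipf_shares(50_000)  # hypothetical vocabulary size (assumption)
print(f"rank 1 vs rank 2: {shares[0] / shares[1]:.1f}x")
print(f"rank 1 vs rank 3: {shares[0] / shares[2]:.1f}x")
print("ranks needed for half of all occurrences:", ranks_to_cover(shares, 0.5))
```

In this idealized setting a very small fraction of the ranks covers half of all occurrences, which is the same pattern as the 135-word figure quoted above.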
Importance in classification and regression problems:
– Skewed distribution
– Which metrics to use? Accuracy paradox (classification), F-score, AUC
– Issue when using models that assume linearity (linear regression): a monotone transformation of the data may be needed (logarithm, square root, sigmoid function)
– Issue when sampling: your data becomes even more unbalanced! Use stratified sampling instead of random sampling, SMOTE (‘Synthetic Minority Over-sampling Technique’, N. V. Chawla), or an anomaly detection approach
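The accuracy paradox mentioned above can be shown with a toy example. This is a minimal sketch on made-up labels (5 positives out of 100), using hand-written metric helpers rather than any particular library: a classifier that always predicts the majority class scores high on accuracy while finding none of the rare cases.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives, so the score is treated as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up, heavily imbalanced data: 5 rare positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A "classifier" that always predicts the majority class.
y_pred = [0] * 100

print("accuracy:", accuracy(y_true, y_pred))  # 0.95
print("F1 score:", f1_score(y_true, y_pred))  # 0.0
```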
[ VIDEO OF THE WEEK]
Want to fix #DataScience ? fix #governance by @StephenGatchell @Dell #FutureOfData #Podcast
[ QUOTE OF THE WEEK]
It is a capital mistake to theorize before one has data. Insensibly, one begins to twist the facts to suit theories, instead of theories to suit facts. – Sherlock Holmes
[ FACT OF THE WEEK]
Data is growing faster than ever before and by the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.
Sourced from: Analytics.CLUB #WEB Newsletter
Building your data ecosystem with French solutions
With the current health crisis, many people have started looking into the benefits of relocating strategic industries to Europe. But what about the data and digital marketing sectors – both of which are highly dependent on the US tech giants? Here are the views of four leading French players in the data sector: AB Tasty, AT Internet, Commanders Act and Reeport.
The dangers of the GAFA monopoly
The GAFA have successfully invaded and colonised the European digital sector, particularly in the data value chain. In the process, they have made many companies dependent on their tools.
Here are the risks of US monopolies for Europe and its citizens – and why it’s vital to relocate the data.
The Internet was designed to be open and decentralised. The GAFA’s objective is to close off the entire digital ecosystem around their offers.
Mathieu Llorens, AT Internet
Mathieu Llorens, CEO of AT Internet, explains how the GAFA monopoly is an insidious problem – “Data is the raw material of the economy, and we leave the majority of this data intelligence (marketing, customer knowledge, etc.) in the hands of the GAFA. It is important to remember that the Internet was designed to be open and decentralised, and the GAFA’s objective is to close off the entire digital ecosystem around their respective offers. Google made its position perfectly clear during the implementation of the GDPR. The US company refuses to accept any responsibility in its general terms and conditions, it enforces its terms and conditions and passes on the responsibility of the related consent to its customers. And its approach is ‘take it or leave it’.”
Digitalisation is not a neutral topic. It is at the centre of major societal, economic and political issues.
Michael Froment, Commanders Act
According to Michael Froment, CEO of Commanders Act, there is a possible danger to the security of states and companies if the data is no longer circulating due to potential political issues.
“Digitalisation is not a neutral topic… It is at the centre of strong societal, economic and political issues (protection of privacy, freedom of the press, democracy, etc.). All of our businesses in the data sector rely on data-based decision support. If they are no longer controlled or if their access is blocked, the danger for our economy is real.”
Rémi Aubert, president and co-founder of AB Tasty, points out the dominance of some GAFA in specific areas – “Although they have a strong monopoly on advertising for example, there is a distinction between Google, which has all the customer data, and Microsoft, which is more of a technology provider.”
Etienne Gautheron, product and operations director at Reeport, highlights Google’s role as judge and jury – “The marketer must always have a clear understanding of their provider’s business model. Without transparency, and if we rely simply on figures, we can lose autonomy and decision-making quality.”
On the issue of impartiality, Mathieu Llorens also raises the problem of the statistical bias of Google Analytics – “GA selectively reassigns a share of traffic to itself, encouraging brands to invest more in… Google, while blocking the direct traffic of the biggest advertisers.”
The advantages of Made in France
With the current crisis, many people have been calling for French and European alternatives to Google and other US publishers. But is the Made in France approach feasible and sustainable for brands that use data and data marketing solutions?
The main advantage of French solution providers is their high quality.
Etienne Gautheron, Reeport
Reeport’s Etienne Gautheron didn’t really see a strong trend for Made in France during this crisis. Nevertheless, he highlights the “general awareness of data protection with the implementation of the GDPR, and the conservation of data in Europe.” He also mentions the need to “listen very specifically to French customers, who want to understand and be involved in their solution provider’s roadmap,” and that “the advantage of French solution providers is their high quality.”
The customer relationship with French suppliers is different – it is far closer and there is more dialogue.
Rémi Aubert, AB Tasty
AB Tasty’s Rémi Aubert doesn’t believe in a Made in France approach to strategic issues other than health and defence – “Made in France does not necessarily win over the customer… In the field of data, we must first compete on a technological level.” He explains that “the advantages of the AB Tasty personalisation solution are that it can provide both quantitative and qualitative analyses (which its US competitors do not necessarily do)… There is also a different customer relationship when it comes from a French or European player, which is far closer and involves more dialogue. And the crisis has highlighted these differences with the US players, who tend to put the customer in the background in terms of their direct relationship.”
Michael Froment points out that the French solutions (mentioned here) are successful and are capable of penetrating international markets. “Made in US is not necessarily a guarantee of superior quality. Obviously, there needs to be an equivalent value proposition in terms of French players’ functionalities – it’s important to note that some American brands also trust European solutions.”
If the data is strategic, you cannot work with a biased and one-sided solution.
Mathieu Llorens, AT Internet
“When the data is truly strategic, you cannot work with a biased and one-sided solution. When the entry level is free, you need to position yourself on a more advanced and more qualitative offer which is aimed at mature companies in terms of data use,” explains Mathieu Llorens. “French players do not have the marketing power of the GAFA – it is not Made in France that makes the difference, and these players can only develop based on the quality of their functional offer and their expertise. Only the product can make the difference.”
The illusion of free access and solution integration
What is the positioning of French solutions in the face of competition from Google, which offers free and fully integrated tools?
According to Etienne Gautheron, “the free Google Data Studio was not particularly bad news for Reeport. Indeed, it has helped to evangelize the market. 90% of Reeport’s users are former Google Data Studio users. Reeport’s offer is aimed at complex organisations that inevitably have more advanced needs in terms of governance and compliance with legislation, which Google does not meet. The interconnection of data solutions is also an essential requirement and must be very simple, as is the case today with AT Internet or Content Square. The national or European framework naturally places constraints, but from constraints comes innovation. It is therefore an opportunity to learn how to adapt and be pragmatic in order to address very specific uses.”
Google’s built-in tools are not up to standard, and that’s why we exist.
Rémi Aubert, AB Tasty
For Rémi Aubert, Google’s integrated tools are not up to standard, which is why French solutions are present and gaining market share – “The problem with the free service is that it shifts contract signatures to less mature customers… there is also a need to put human (and therefore financial) resources into a data tool to achieve efficiency. However, if companies do not put resources into the tool, they will not put them into the human resources. When it comes to integration, it is the role of players such as AB Tasty, AT Internet, etc. to make it as fluid and simple as possible.”
Free access often leads to a problem of data ownership.
Mathieu Llorens, AT Internet
Mathieu Llorens explains that “as far as the analytical part is concerned, being free of charge is relative. The Google Premium service (offer equivalent to AT Internet’s), for small audience volumes, is 10 times more expensive. In other words, you need to have a solid basis for comparison. The free service also leads to a problem of ownership of the data. In its general conditions, Google mentions that it can reuse the data for its own purposes, which is contrary to the entire spirit of GDPR.” He highlights that “free access is often only temporary in order to ensnare the user in a closed ecosystem and destroy any competition before imposing a paid solution. Take the example of Google Maps, which wiped out most of its competitors with an almost free offer, before multiplying its prices by 50 in a few weeks. As for the integration of Google tools, when it excludes third party tools, it raises a legal problem of vertical integration, which is called abuse of dominant position.”
According to Michael Froment, the technical integration of French solutions is not even an issue – “Technically, the interconnections are already simple and efficient. The main challenge is to make these connectors known to users. It is therefore more of a marketing challenge than a technological one.”
The end of cookies?
With the announced end of cookies on Chrome, Google is threatening the data players who use them in their technologies. It promises to be able to combine paths and measure conversion with an alternative to cookies. However, are there any particular concerns?
Google’s strategy is to monopolise the information it is simply supposed to organise.
Michael Froment, Commanders Act
“The market is gradually shrinking,” explains Michael Froment. “First there was a ban on third party cookies on Google’s network, then the deletion of natural keywords in the search engine, and now the removal of the standard cookie. Google’s strategy is basically to gain access to the information that it is simply supposed to organise. The market will probably change with the emergence of players who will want to make the link between the brand’s information (first-party cookies) and those in the advertising ecosystem (Google and Facebook).”
On the issue of cookies and to compensate for their removal, Rémi Aubert says “it will be important to work on user intention algorithms. With consumer behaviour being instantaneous, the lack of information for path reconciliation does not necessarily represent a genuine challenge.”
We are reaching a tipping point, but there are still reasons to be optimistic.
Mathieu Llorens, AT Internet
Mathieu Llorens talks about a tipping point – “With the end of cookies, data control is about to be entirely in the hands of Google. The American company has managed to hijack the GDPR in favour of its Chrome browser and its logged-in environments. Nevertheless, there is still reason for optimism, as almost all US states have filed complaints related to these privacy and abuse of dominance issues. The entire digital ecosystem, in France and elsewhere, has an interest in keeping the Internet open and free of cartels, for healthy competition that promotes the development of businesses.”
A brave new world
How can we reverse the trend of an Internet that is closing down? How will French solutions be able to conquer new market segments? What strategies should be put in place and what can be expected from public policies for the data sector?
French technology is well developed, but we need to be able to better market our know-how.
Rémi Aubert, AB Tasty
Rémi Aubert looks at the specific nature of the US market, which is much larger than the European market – “Europe is not sufficiently integrated in legal and business terms. A global alignment of European markets would be a good basis for the expansion of companies in the digital and data sector. European technology is state-of-the-art, but the area for improvement is marketing, so that we can better sell our know-how.”
“French companies are well supported by the State,” explains Mathieu Llorens. “I do not expect national economic favouritism, but existing laws should be enforced, especially in terms of international taxation.”
We need a European label for technology, to unify the market and gain recognition.
Michael Froment, Commanders Act
Michael Froment believes in the establishment of a European label for tech, which would unify the market and ensure that French solutions are recognised in all countries – “The notion of industrial policy and the planning of the tools that are lacking (an alternative to Zoom, for example) must also be brought back to the forefront. In certain areas, Europe is absent, which inevitably leaves room for US alternatives.”
So, is it possible?
Is it possible to build a data ecosystem with French solutions?
For Michael Froment, it is possible, desirable, and indeed necessary in some cases – “The way in which we communicate about the interconnectivity of solutions is key. The other important point is to have buyers of our French solutions, which would allow us to build a real digital industrial sector.”
Mathieu Llorens is convinced that the choice of French solutions is safer (for all the reasons mentioned above), and ultimately more economical – “It is also essential to raise awareness and improve communication about connectors between solutions.”
Etienne Gautheron refers to the diversity of solutions that can meet multiple needs – “The people-centred culture of the French players is necessary and a genuine advantage.”
To round off, Rémi Aubert underlines the international aspect of the French and European players. “Our strength lies in our ability to address markets all over the world in compliance with legal rules.”
Article Building your data ecosystem with French solutions first appeared on Digital Analytics Blog.
Sep 17, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)
[ COVER OF THE WEEK ]
Weak data Source
[ FEATURED READ]
Storytelling with Data: A Data Visualization Guide for Business Professionals
[ TIPS & TRICKS OF THE WEEK]
Strong business case could save your project
Like anything in corporate culture, a project is often about the business, not the technology. The same thinking applies to data analysis: it’s not always about the technical details but about the business implications. Success criteria for a data science project should include project management success criteria as well. This will ensure smooth adoption, easy buy-in, room for wins and cooperative stakeholders. So, a good data scientist should also possess some qualities of a good project manager.
[ DATA SCIENCE Q&A]
Q: Why is naive Bayes so bad? How would you improve a spam detection algorithm that uses naive Bayes?
A: * Naïve: the features are assumed to be independent/uncorrelated
* This assumption does not hold in many cases
* Improvement: decorrelate the features (transform them so that the covariance matrix becomes the identity matrix)
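The cost of the independence assumption can be illustrated with a small Python sketch. The probabilities below are made up, and the helper function is hypothetical: it shows the problem (a perfectly correlated, duplicated feature being counted twice), not the decorrelation fix itself.

```python
def naive_bayes_posterior(prior_spam, likelihoods_spam, likelihoods_ham):
    """P(spam | features), multiplying per-feature likelihoods as if independent."""
    score_spam = prior_spam
    score_ham = 1.0 - prior_spam
    for p_spam, p_ham in zip(likelihoods_spam, likelihoods_ham):
        score_spam *= p_spam
        score_ham *= p_ham
    return score_spam / (score_spam + score_ham)


# Made-up numbers: a word seen in 80% of spam and 20% of legitimate mail.
once = naive_bayes_posterior(0.5, [0.8], [0.2])

# The same evidence supplied twice, e.g. two features that always co-occur.
# No new information was added, yet the posterior becomes more confident.
twice = naive_bayes_posterior(0.5, [0.8, 0.8], [0.2, 0.2])

print(f"feature counted once:  {once:.3f}")   # 0.800
print(f"feature counted twice: {twice:.3f}")  # 0.941
```

Merging or decorrelating such features before training is one way to keep the model from counting the same evidence more than once.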
[ VIDEO OF THE WEEK]
@JohnTLangton from @Wolters_Kluwer discussed his #AI Lead Startup Journey #FutureOfData #Podcast
[ QUOTE OF THE WEEK]
We chose it because we deal with huge amounts of data. Besides, it sounds really cool. – Larry Page
[ PODCAST OF THE WEEK]
Understanding #BigData #BigOpportunity in Big HR by @MarcRind #FutureOfData #Podcast
[ FACT OF THE WEEK]
In 2008, Google was processing 20,000 terabytes of data (20 petabytes) a day.
How Big Data Is Transforming The Fight Against Cancer
When we talk about Big Data, we often talk about it in terms of business, and in particular how it can be used to generate money. But it’s important to remember that the possibilities go way further. Science has the task of expanding humanity’s horizons, whether that’s by exploring space or discovering more about the tiny organisms that make up the natural world. From the start, data has played a role in this. But in the age of Big Data (the field which has emerged thanks to the explosion in the amount of data we create and capture, and the advanced computer analysis that has become possible in recent years) it’s more important than ever.
The fight against cancer is the search for the Holy Grail of medicine. Almost everyone will be affected at some point in their lives, either personally or by proxy through a loved one. So it’s no surprise that Big Data is being put to use in many ways to aid the task of improving care, identifying risks and hopefully eventually producing cures.
One such project is the American Society of Clinical Oncology’s CancerLinQ initiative, which aims to collate data from every cancer patient in the US and make it available for analysis in the hope that it will reveal patterns that lead to new insights.
These could be useful for doctors providing treatment: accessing up-to-date information on how thousands of others have reacted to a proposed treatment plan will enable them to tailor treatments to individual patients and provide the best chance of a positive outcome.
AI-Driven Diagnosis
It was recently announced that 14 cancer institutes across the United States and Canada would be using IBM’s Watson analytics engine to match cancer patients with the treatments most likely to help them.
As well as recommending the relevant cancer drug most likely to treat a particular patient’s cancer, Watson can even recommend drugs that have not been used to treat cancer before. Since it is programmed with specific details of how thousands of medicines interact with the human body, Watson can suggest anything which it thinks might interact beneficially with the cell affected by the mutation which is causing the cancer. Of course, a doctor will probably have to take many other issues into consideration before prescribing whatever the AI-driven Watson suggests, but it surely will speed up the process.
Specialist programs also exist for research into particular types of cancer. For example, the Dragon Master Foundation partnered with five U.S. pediatric hospitals to create a database of tissue samples from patients with rare childhood brain tumors.
This research is all based around our understanding of the fact that cancer is caused by cell mutations, and clues to when and how it will develop are often hidden within the complex genetic data contained in our bodies: our genome.
Genomic Data
Genomic data is BIG. Storing the raw code output by a genome sequencer for a single genome takes around 200 gigabytes. Attempts to use comparative analysis of these genomes to isolate factors which make us susceptible to cancer involve ever-growing databases (one is planned to contain one million genomes), putting it firmly within the realms of Big Data.
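The scale is easy to check with a back-of-the-envelope calculation. The sketch below assumes that the 200 gigabytes figure applies to each genome, that the planned database holds one million genomes, and that decimal units are used (1 GB = 10^9 bytes, 1 PB = 10^15 bytes); the function name is hypothetical.

```python
BYTES_PER_GB = 10**9   # decimal gigabyte (assumption)
BYTES_PER_PB = 10**15  # decimal petabyte (assumption)


def total_storage_pb(genomes, gb_per_genome=200):
    """Raw storage needed for a number of sequenced genomes, in petabytes."""
    return genomes * gb_per_genome * BYTES_PER_GB / BYTES_PER_PB


print(total_storage_pb(1_000_000), "PB")  # 200.0 PB
```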
This isn’t a challenge for Big Data, it’s merely a question of scale, and programs such as the Folding@home initiative exist to harness a worldwide, distributed network of processing power which has vastly accelerated the rate at which this data is being harvested and decoded since the first genome was completed in 2003.
A greater challenge exists – and possible defenses against – the Big C. Thankfully, making sense of messy, unstructured data is a task at which Big Data excels.
Decoding Messy Data
Flatiron Health has created the OncologyCloud, a big data program which aims to collect data from the medical records, doctors’ notes, interactions with carers and billing information of the 96% of cancer patients who do not take part in clinical trials.
Data from these patients is effectively “siloed” in systems maintained by the individual organizations diagnosing, delivering and arranging the finances for the care. As the company states: “Most of the ‘clinically valuable’ [cancer] data resides in doctor and nurse notes, pathology reports, PDFs and other unstructured forms […] Traditional population health analytics are built using claims data, which can attain quick results but lack the depth required to understand a disease as complex as cancer […] To run analytics for oncology off of claims data would be the equivalent of analyzing an iceberg by only looking at what’s above the surface. If you want the ‘clinical truth’ you have to get into the details.”
The Flatiron project aims to take all of this messy, unstructured data captured at every stage of the patient’s interaction with care providers, and structure it so it too can be used for comparative analysis, alongside the data of millions of others.
So that, in a nutshell, is the current despatch from the Big Data front of the war between science and cancer. It’s a fight that’s far from won but significant advances are being made. Just this year a study concluded that thanks to the advances in spotting and treating cancer, by 2050 no one under 80 will be dying from the disease. Big Data-powered research and treatment programs will undoubtedly play a part in that victory, just as they continue to give us answers in every field of science.
To read the original article on Forbes, click here.
Sep 10, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)
[ COVER OF THE WEEK ]
Extrapolating Source
[ AnalyticsWeek BYTES]
>> Beyond Speed, What Do Consumers Want from 5G? by administrator
>> Customer analytics for the CMO by analyticsweekpick
>> Essential Construction Site Safety Checklists and Resources for Australia and Singapore by analyticsweekpick
[ FEATURED READ]
Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, 4th Edition
[ TIPS & TRICKS OF THE WEEK]
Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better time to talk about our increasing dependence on data analytics for decision making? Data- and analytics-driven decision making is rapidly working its way into our core corporate DNA, and we are not creating practice grounds to test those models fast enough. Such snug-looking models have hidden nails that could cause uncharted pain if left unchecked. This is the right time to start thinking about setting up an Analytics Club [Data Analytics CoE] in your workplace to lab out best practices and provide a test environment for those models.
[ DATA SCIENCE Q&A]
Q: Examples of NoSQL architecture?
A:
* Key-value: all of the data in the database consists of an indexed key and a value. Examples: DynamoDB, and Cassandra (though Cassandra is more often classified as a wide-column store)
* Column-based: designed for storing data tables as sections of columns of data rather than as rows of data. Examples: HBase, SAP HANA
* Document database: maps a key to a document that contains structured information. The key is used to retrieve the document. Examples: MongoDB, CouchDB
* Graph database: designed for data whose relations are well represented as a graph, with interconnected elements and an undetermined number of relations between them. Example: Neo4j
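To make the four models concrete, here is a minimal sketch that uses plain Python structures as stand-ins for each kind of store. It is not the API of any of the databases named above; every name and sample record in it is made up for illustration.

```python
# Illustrative only: in-memory Python structures standing in for each
# NoSQL data model. All names and sample records are hypothetical.

# Key-value: an opaque value looked up by a single indexed key.
key_value = {"user:42": b'{"name": "Ada"}'}

# Column-based: the values of each column are stored together,
# so reading one column does not touch the others.
column_store = {
    "name": ["Ada", "Grace"],
    "city": ["London", "New York"],
}

# Document: a key maps to a structured document whose fields can be read.
documents = {"user:42": {"name": "Ada", "languages": ["en", "fr"]}}

# Graph: nodes with an arbitrary number of relations to other nodes.
graph = {"Ada": {"Grace"}, "Grace": {"Ada", "Linus"}, "Linus": set()}


def get_value(key):
    """Key-value lookup: returns the stored value as-is."""
    return key_value[key]


def read_column(name):
    """Column read: returns every value of one column."""
    return column_store[name]


def get_field(key, field):
    """Document lookup: fetch the document by key, then read one field."""
    return documents[key][field]


def neighbours(node):
    """Graph traversal step: the nodes directly related to `node`."""
    return graph[node]
```

The difference lies in what each model lets you ask cheaply: a whole value by key, one column across all rows, a field inside a document, or the relations of a node.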
[ VIDEO OF THE WEEK]
#GlobalBusiness at the speed of The #BigAnalytics
[ QUOTE OF THE WEEK]
The temptation to form premature theories upon insufficient data is the bane of our profession. – Sherlock Holmes
[ PODCAST OF THE WEEK]
Pascal Marmier (@pmarmier) @SwissRe discusses running data driven innovation catalyst
[ FACT OF THE WEEK]
40% projected growth in global data generated per year vs. 5% growth in global IT spending.
Sourced from: Analytics.CLUB #WEB Newsletter
Four levels of Hadoop adoption maturity

So you've been monitoring Hadoop or are already on the journey with it, and you're wondering: Where are we on the adoption curve compared to the market in general?
Based on my interactions with numerous companies, I want to share what I think that curve looks like so that you can orientate your organization and decide if you're leading the way or lagging behind. Neither is inherently bad, but you do need to be conscious of where you are and why.
Stage 1: Monitoring
The monitoring stage is where you generally find two types of organizations:
- The first type are those who don't believe they need to deal with big data. These companies might be struggling with other issues and feel that big data is just too far in the future, or they believe they'll never need to deal with big data in their market segment.
- The second type are those who believe there are other technologies more mature than Hadoop that can meet their needs. Sometimes a few people in such organizations have played with Hadoop and decided it doesn't meet their needs or lacks specific capabilities. From that point on, it's a case of keeping an eye on the market and how it's developing.
The big questions for these organizations are: What would it take for them to look at Hadoop? Is that well defined and agreed upon? Are they ready to move if there's a sudden need to spring into action?
Stage 2: Investigating
In this stage, organizations generally have a small group playing with Hadoop technology (when I say Hadoop technology, I mean the whole ecosystem, not just MapReduce). The group experimenting with Hadoop is usually in IT, or is an IT-savvy group in a business unit.
In these organizations, there's no real Hadoop mandate yet, but the investigations are designed to determine what Hadoop might be useful for or how to begin addressing big data challenges on the horizon.
In this case, companies are using either the free Apache Hadoop distribution or one of the free downloads from an established commercial Hadoop distribution vendor such as Cloudera, Hortonworks or MapR.
All of the effort at this level is useful. It's not a commitment to Hadoop, but it builds the skills and knowledge necessary to consider Hadoop's IT and business implications, and to be ready to move quickly to the next level should the time come.
The big questions for these organizations are: What would it take for them to move beyond trials and into using Hadoop? Is that well-defined and agreed upon? Are they looking at more than just storing data? Are they also looking at how to utilize that stored data through visualization and analytics?
Stage 3: Implementing
In this stage, organizations have deployed a Hadoop cluster and have at least one project running on it. They've largely moved on from the Apache distribution because they needed additional capabilities offered by one of the commercial vendors, such as support, backup, management tools, other SQL data stores, etc.
The companies in this stage generally have up to three Hadoop projects, either in production or close to it. Initial projects often focus on new business challenges, or on using data that was not previously accessible. Where possible, existing end-user toolsets are used to minimize the need for training and to deliver quick ROI on those early projects.
This phase is the riskiest one. If ROI is not delivered, the value of Hadoop can be undermined. At the same time, skills are at a premium and experimentation is likely happening as organizations build production projects out. At this stage you could say the bubble is at risk of bursting for some organizations. Few of the organizations I'm working with are past this phase yet.
Stage 4: Established
Established organizations have a large Hadoop cluster and a number of projects in production, with plans for many more. Generally speaking, these companies will also be working on a broad enterprise architecture where Hadoop is taking on an increasingly important role in their five-year vision.
They'll be working with a commercial Hadoop distribution to influence the development of features and functions they require to support their future architecture, and will be growing the size of their clusters rapidly. This is the group hoping for large returns from their investment, both in savings and perhaps also in the disruptive market changes they're trying to enable through the use of Hadoop and big data.
These advanced organizations will help drive requirements for the next generation of Hadoop, all aligned to the business issues they need solved. If an industry is not well represented in this group, it's possible that industry's needs will take a back seat to those of better-represented industries.
Conclusion
Many companies are already publicly moving forward with Hadoop, such as Yahoo, Home Depot, Rogers, Schlumberger, Barclays Bank, Symantec, Verizon, British Telecom, ING, Port of Rotterdam, British Airways, Truecar, EDF, Sanoma, Octo Technology, HSBC, Orange France, Shazam and CERN, to name but a few.
You can see these companies, and more, on the websites of commercial distributions such as Cloudera, Hortonworks and MapR, or by reviewing the various recordings from Strata or Hadoop conferences. If you're exploring Hadoop, it may be that now is the time to put your foot down and accelerate a little more.
These are some of my initial impressions. If you're using Hadoop, do you fit into one of these buckets or is there another that I might be missing? Would you name the buckets differently? Any other characteristics you would add to any of the groups?
To read the original article on S.A.S. Voices, click here.
Sep 03, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)
[ COVER OF THE WEEK ]
Trust the data
[ AnalyticsWeek BYTES]
>> A Gentle Introduction to Linear Regression With Maximum Likelihood Estimation by administrator
>> The Only Recipe For A Data Story You’ll Ever Need by analyticsweek
>> 9th edition of Aegis Graham Bell Award nominations open by administrator
[ TIPS & TRICKS OF THE WEEK]
Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck to achieving comparable enterprise adoption. One of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members need to step up and create awareness within the organization. An aware organization goes a long way toward getting quick buy-in and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.
[ DATA SCIENCE Q&A]
Q: Provide examples of machine-to-machine communications.
A: Telemedicine
– Heart patients wear specialized monitors that gather information about the state of the heart
– The collected data is sent to an implanted electronic device, which delivers electric shocks to the patient to correct abnormal rhythms
Product restocking
– Vending machines can message the distributor whenever an item is running out of stock
[ VIDEO OF THE WEEK]
@AngelaZutavern & @JoshDSullivan @BoozAllen discussed Mathematical Corporation #FutureOfData
[ QUOTE OF THE WEEK]
Numbers have an important story to tell. They rely on you to give them a voice. – Stephen Few
[ PODCAST OF THE WEEK]
#FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership
[ FACT OF THE WEEK]
140,000 to 190,000: the projected shortage of people with deep analytical skills needed to fill Big Data jobs in the U.S. by 2018.