Jul 02, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

[Image] Big Data knows everything (Source)

[ AnalyticsWeek BYTES]

>> Ex P&G’s Top Boss Shares 5 Secrets To Success by v1shal

>> What’s New in Zoomdata 4.9: Microservices for Elastic Scalability by analyticsweek

>> @BrianHaugli @The_Hanover on Building a #Leadership #Security #Mindset by v1shal

Wanna write? Click Here

[ FEATURED COURSE]

Hadoop Starter Kit

Hadoop learning made easy and fun. Learn HDFS, MapReduce and an introduction to Pig and Hive, with FREE cluster access… more

[ FEATURED READ]

The Black Swan: The Impact of the Highly Improbable

A black swan is an event, positive or negative, that is deemed improbable yet causes massive consequences. In this groundbreaking and prophetic book, Taleb shows in a playful way that Black Swan events explain almost eve… more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better time to talk about our increasing dependence on data analytics in decision making. Data- and analytics-driven decision making is rapidly working its way into our core corporate DNA, yet we are not building practice grounds to test those models fast enough. Snug-looking models can hide nails that cause uncharted pain if they go unchecked. Now is the right time to start putting an Analytics Club (a Data Analytics CoE) in your workplace to incubate best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q: Is it better to design robust or accurate algorithms?
A: A. The ultimate goal is to design systems with good generalization capacity, that is, systems that correctly identify patterns in data instances not seen before
B. The generalization performance of a learning system strongly depends on the complexity of the model assumed
C. If the model is too simple, the system can only capture the actual data regularities in a rough manner. In this case, the system has poor generalization properties and is said to suffer from underfitting
D. By contrast, when the model is too complex, the system can identify accidental patterns in the training data that need not be present in the test set. These spurious patterns can be the result of random fluctuations or of measurement errors during the data collection process. In this case, the generalization capacity of the learning system is also poor. The learning system is said to be affected by overfitting
E. Spurious patterns, which are only present by accident in the data, tend to have complex forms. This is the idea behind the principle of Occam’s razor for avoiding overfitting: simpler models are preferred if more complex models do not significantly improve the quality of the description for the observations
Quick response: apply Occam’s razor. It depends on the learning task: choose the right balance between robustness and accuracy
F. Ensemble learning can help balance bias and variance (several weak learners combined make a strong learner); the trade-off is illustrated in the sketch below
Source
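
The under/overfitting trade-off in C and D is easy to see empirically: fit models of increasing complexity to noisy data and compare training and test error. The following is a minimal sketch, with synthetic data and arbitrary polynomial degrees chosen purely for illustration:

```python
# Minimal sketch of the bias/variance trade-off: polynomial models of
# increasing degree fit to noisy synthetic data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.uniform(0, 1, 200).reshape(-1, 1)
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.3, size=200)  # signal + noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for degree in (1, 4, 15):  # too simple, about right, too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    print(degree,
          mean_squared_error(y_tr, model.predict(X_tr)),   # train error keeps falling
          mean_squared_error(y_te, model.predict(X_te)))   # test error rises again
```

Degree 1 underfits (both errors high); degree 15 drives training error down while test error climbs, which is exactly the overfitting pattern described in D.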

[ VIDEO OF THE WEEK]

@AnalyticsWeek Panel Discussion: Big Data Analytics

Subscribe on YouTube

[ QUOTE OF THE WEEK]

The data fabric is the next middleware. – Todd Papaioannou

[ PODCAST OF THE WEEK]

#DataScience Approach to Reducing #Employee #Attrition

Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

In late 2011, the IDC Digital Universe study reported that some 1.8 zettabytes of data would be created that year.

Sourced from: Analytics.CLUB #WEB Newsletter

Building Big Analytics as a Sustainable Competitive Advantage

With every passing day, more organizations are investing more resources in building a stronger analytics strategy for handling their big data. That is certainly a great move if done right. So, when everyone is focusing on analytics, why not start thinking about building it into a competitive advantage?

In recent research, 96% of respondents said that analytics will become more important to their organizations in the next three years, according to a Deloitte report based on a mix of 100 online surveys and 35 interviews conducted with senior executives at 35 companies in North America, the UK and Asia.

With the increased hype around big data, analytics is already a crucial resource in almost every company, but it still has a lot of room for improvement. As with every new technology, communication channels need to be established, buy-in needs to happen, and a charter needs to be approved and budgeted. But the world is moving there. So, why not rethink the current analytics strategy and build it to provide a sustained competitive edge over the long term, helping the business stay ahead of the curve against its competitors?

No, it is not rocket science, nor is it extremely difficult to do. All it requires is a roadmap, consistent deployment and frequent, iterative execution. As a roadmap, consider the following 5 steps to move the organization toward an analytics-driven competitive edge.

Acquire the resources: Assembling the right team, infrastructure, partnerships and relationships to get going on the data analytics journey is of utmost importance. With big-data hype hitting the market, resources are in great demand, so securing the right mix to build your analytics practice is important, and every effort should be made to acquire the right resources. An early head start will go a long way toward ensuring timely delivery of the analytics framework when the business needs it most. How those resources should be acquired is a topic for another blog, but being prepared to hire is important.

Build analytics across business verticals: Once you have the resources needed to build the analytics framework, the next step is to make sure it encapsulates the various business verticals. The finance department is usually the most accustomed to having an analytics platform. Other departments, such as marketing, customer service, sales and operations, should also capture data that can be analyzed for insights and findings that help the business understand itself better. More captured data does not necessarily translate into better insights; data quality plays an important role. But at the start it is difficult to assess data quality, so when the importance of individual data fields is unknown, the more data we capture, the better we cover the business through analytics.

Utilize analytics for decision-making: Good data and better insights do not automatically translate into better decisions. Processes need to be deployed that leverage analytics insights and factor them into business decisions. A data-backed decision tends to be more accurate than one made without data, and decision quality increases with quantified backing. Analytics-driven decision-making should happen not only at senior business levels but also among front-line workers dealing directly with customers.

Coordinate and align analytics: Another important thing to note is that analytics often lacks a clear head or home vertical, which causes quality issues, lack of ownership and inefficiencies. A coordination charter needs to be built and approved to make sure all stakeholders and processes contribute to the analytics framework and keep it functioning effectively. This should include clear communication of roles and interactions across the various groups and approved processes.

Create a long-term strategy: With the right infrastructure, a laid-out communication charter, well-managed processes and proper utilization of findings, the business will start using analytics proactively to create a competitive edge. One thing to consider is building all of this for the long term. A long-term strategy provides a consistent and sustained roadmap for execution. It also sheds light on how to view data over the long term and how to shape it for faster learning and more effective analytics.

Covering all 5 steps above will provide a framework that energizes the organization with a competitive edge grounded in core data and analytics. Done effectively, the data and its analysis become a learning system that sustainably delivers relevant insights in real time for faster execution.

Source: Building Big Analytics as a Sustainable Competitive Advantage

Jun 25, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

[Image] Ethics (Source)

[ AnalyticsWeek BYTES]

>> How The Guardian’s Ophan analytics engine helps editors make better decisions by analyticsweekpick

>> May 23, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> Data Matching with Different Regional Data Sets by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Applied Data Science: An Introduction

As the world’s data grow exponentially, organizations across all sectors, including government and not-for-profit, need to understand, manage and use big, complex data sets—known as big data…. more

[ FEATURED READ]

Data Science from Scratch: First Principles with Python

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn … more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better time to talk about our increasing dependence on data analytics in decision making. Data- and analytics-driven decision making is rapidly working its way into our core corporate DNA, yet we are not building practice grounds to test those models fast enough. Snug-looking models can hide nails that cause uncharted pain if they go unchecked. Now is the right time to start putting an Analytics Club (a Data Analytics CoE) in your workplace to incubate best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q: Provide a simple example of how an experimental design can help answer a question about behavior. How does experimental data contrast with observational data?
A: * You are researching the effect of listening to music on studying efficiency
* You might divide your subjects into two groups: one would listen to music and the other (the control group) wouldn’t listen to anything
* You give both groups a test
* Then, you compare grades between the two groups

Differences between observational and experimental data:
– Observational data: measures the characteristics of a population by studying individuals in a sample, but doesn’t attempt to manipulate or influence the variables of interest
– Experimental data: applies a treatment to individuals and attempts to isolate the effects of the treatment on a response variable

Observational data: find 100 women aged 30, of whom 50 have been smoking a pack a day for 10 years while the other 50 have been smoke-free for 10 years. Measure lung capacity for each of the 100 women. Analyze, interpret and draw conclusions from the data.

Experimental data: find 100 women aged 20 who don’t currently smoke. Randomly assign 50 of the 100 women to the smoking treatment and the other 50 to the no-smoking treatment. Those in the smoking group smoke a pack a day for 10 years while those in the control group remain smoke-free for 10 years. Measure lung capacity for each of the 100 women. Analyze, interpret and draw conclusions from the data.

Source
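
Once the music experiment above has produced scores for both groups, a two-sample t-test is one standard way to compare them. The sketch below is illustrative only: the scores are fabricated, and the group means are arbitrary assumptions.

```python
# Illustrative analysis of the music-vs-control experiment described above.
# All numbers are made up; only the procedure is the point.
import numpy as np
from scipy import stats

rng = np.random.RandomState(1)
music = rng.normal(loc=72, scale=10, size=50)    # hypothetical scores, music group
control = rng.normal(loc=75, scale=10, size=50)  # hypothetical scores, control group

# Welch's two-sample t-test: does mean performance differ between groups?
t_stat, p_value = stats.ttest_ind(music, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# Because subjects were randomly assigned, a significant difference can be
# attributed to the treatment; with observational data it could not.
```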

[ VIDEO OF THE WEEK]

@DrewConway on creating socially responsible data science practice #FutureOfData #Podcast

Subscribe on YouTube

[ QUOTE OF THE WEEK]

Torture the data, and it will confess to anything. – Ronald Coase

[ PODCAST OF THE WEEK]

Solving #FutureOfWork with #Detonate mindset (by @steven_goldbach & @geofftuff) #JobsOfFuture #Podcast

Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

Decoding the human genome originally took 10 years to process; now it can be achieved in one week.

Sourced from: Analytics.CLUB #WEB Newsletter

An Introduction to Central Limit Theorem

In machine learning, statistics plays a significant role in understanding data distributions and in inferential statistics. A data scientist must understand the math behind sample data, and the Central Limit Theorem answers most of those questions. Let us discuss the concept of the Central Limit Theorem. It assumes that the distribution in the sample […]

The post An Introduction to Central Limit Theorem appeared first on GreatLearning.

Source
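
As a quick illustration of the theorem (a sketch added here, not part of the original post): means of samples drawn from even a strongly skewed distribution become approximately normal, with a standard deviation that shrinks as sigma/sqrt(n).

```python
# Minimal CLT demonstration: means of samples from a skewed (exponential)
# distribution approach a normal distribution as predicted.
import numpy as np

rng = np.random.RandomState(42)
sample_size = 50      # n observations per sample
n_samples = 10_000    # number of repeated samples

# Each row is one sample; take the mean of each row.
means = rng.exponential(scale=1.0, size=(n_samples, sample_size)).mean(axis=1)

# CLT prediction: mean of means ~ 1.0, std of means ~ 1/sqrt(50) ~ 0.141
print(means.mean())
print(means.std(), 1 / np.sqrt(sample_size))
```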

April 3, 2017 Health and Biotech analytics news roundup

Advantages of a Truly Open-Access Data-Sharing Model: Traditionally, data from clinical trials has been siloed, but now there is support for making such data open for all to use. Project Data Sphere is one such effort.

Detecting mutations could lead to earlier liver cancer diagnosis: Aflatoxin induces a mutation that can cause liver cancer. Now, MIT researchers have developed a method to detect this mutation before cancer develops.

Australia launches machine-learning centre to decrypt the personal genome: Geneticists and computer scientists have launched the Garvan-Deakin Program in Advanced Genomic Investigation (PAGI). They hope to work out the complex genetic causes of diseases.

More genomic sequencing announced for Victoria: Selected Australian patients will have access to genomic sequencing. This project is intended to help track drug-resistant “superbugs” as well as 4 other personalized conditions.

No, We Can’t Say Whether Cancer Is Mostly Bad Luck: Last week’s news on the mutations that cause cancer is disputed among cancer scientists.

Source: April 3, 2017 Health and Biotech analytics news roundup by pstein

Jun 18, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

[Image] Statistics (Source)

[ AnalyticsWeek BYTES]

>> Proactive Services Data Management in the Age of Hyper-Distribution by analyticsweekpick

>> How to install and use the Datumbox Machine Learning Framework by administrator

>> Oct 10, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

Wanna write? Click Here

[ FEATURED COURSE]

Artificial Intelligence

This course includes interactive demonstrations which are intended to stimulate interest and to help students gain intuition about how artificial intelligence methods work under a variety of circumstances…. more

[ FEATURED READ]

How to Create a Mind: The Secret of Human Thought Revealed

Ray Kurzweil is arguably today’s most influential—and often controversial—futurist. In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse… more

[ TIPS & TRICKS OF THE WEEK]

Finding success in your data science career? Find a mentor
Yes, most of us don’t feel the need, but most of us really could use one. Since most data science professionals work in isolation, getting an unbiased perspective is not easy. Many times it is also not easy to see how a data science career will progress. A network of mentors addresses these issues easily: it gives data professionals an outside perspective and an unbiased ally. It is extremely important for successful data science professionals to build a mentor network and use it throughout their careers.

[ DATA SCIENCE Q&A]

Q: How do you take millions of users with 100’s of transactions each, amongst 10k’s of products, and group the users together in meaningful segments?
A: 1. Some exploratory data analysis (to get a first insight)

* Transactions by date
* Count of customers vs number of items bought
* Total items vs total basket per customer
* Total items vs total basket per area

2. Create new features (per customer):

Counts:

* Total baskets (unique days)
* Total items
* Total spent
* Unique product id

Distributions:

* Items per basket
* Spent per basket
* Product id per basket
* Duration between visits
* Product preferences: proportion of items per product category per basket

3. Too many features? Apply dimensionality reduction, e.g. PCA

4. Clustering:

* Cluster on the PCA-reduced features (e.g. k-means)

5. Interpreting model fit
* View the clustering by principal component axis pairs (PC1 vs PC2, PC2 vs PC3, etc.)
* Interpret each principal component in terms of the linear combination it is obtained from; example: PC1 = “spendy” axis (proportion of baskets containing spendy items, raw counts of items and visits). A compressed sketch of steps 2 to 5 follows below.

Source
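
Here is a compressed, hedged sketch of steps 2 to 5 on a per-customer feature table. The data, the number of components, and the cluster count are all placeholder assumptions:

```python
# Illustrative segmentation pipeline: scale per-customer features,
# reduce with PCA, cluster, then inspect the components.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)
# Stand-in for engineered features: total baskets, total items, total
# spent, unique products, items per basket, spend per basket, ...
X = rng.lognormal(mean=1.0, sigma=0.5, size=(10_000, 12))

X_scaled = StandardScaler().fit_transform(X)   # comparable scales
pca = PCA(n_components=4).fit(X_scaled)        # dimensionality reduction (step 3)
X_reduced = pca.transform(X_scaled)

labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X_reduced)  # step 4

# Step 5: interpret each principal component via its loadings, then
# inspect the clusters on PC pairs (e.g. PC1 vs PC2).
print(pca.explained_variance_ratio_)
print(np.bincount(labels))                     # segment sizes
```

At the scale in the question (millions of users), MiniBatchKMeans or a distributed implementation would stand in for plain k-means.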

[ VIDEO OF THE WEEK]

Data-As-A-Service (#DAAS) to enable compliance reporting

Subscribe on YouTube

[ QUOTE OF THE WEEK]

We chose it because we deal with huge amounts of data. Besides, it sounds really cool. – Larry Page

[ PODCAST OF THE WEEK]

Unconference Panel Discussion: #Workforce #Analytics Leadership Panel

Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

Data is growing faster than ever before and by the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.

Sourced from: Analytics.CLUB #WEB Newsletter

6 things that you should know about VMware vSphere 6.5

vSphere 6.5 offers a resilient, highly available, on-demand infrastructure that is the perfect groundwork for any cloud environment. It provides innovation that will assist digital transformation for the business and make the job of the IT administrator simpler, freeing up most of their time so they can pursue new projects instead of maintaining the status quo. Furthermore, vSphere is the foundation of VMware’s hybrid cloud strategy and is necessary for cross-cloud architectures. Here are essential features of the new and updated vSphere.

vCenter Server appliance

vCenter is an essential backend tool that controls the virtual infrastructure of VMware. vCenter 6.5 has lots of innovative upgraded features. It has a migration tool that aids in shifting from vSphere 5.5 or 6.0 to vSphere 6.5. The vCenter Server appliance also includes the VMware Update Manager that eliminates the need for restarting external VM tasks or using pesky plugins.

vSphere client

In the past, the front-end client used for accessing vCenter Server was old-fashioned and clunky. The vSphere client has now undergone the necessary HTML5 overhaul. Aside from the expected performance upgrades, the change also makes the tool cross-browser compatible and more mobile-friendly. Plugins are no longer needed, and the UI has been switched to a more modern aesthetic based on the VMware Clarity UI.

Backup and restore

The backup and restore capability of vSphere 6.5 is an excellent feature that enables clients to back up data on any Platform Services Controller appliance or the vCenter Server directly from the Application Programming Interface (API) or the Virtual Appliance Management Interface (VAMI). In addition, it can back up both VUM and Auto Deploy embedded within the appliance. The backup mainly consists of files that are streamed to a preferred storage device over the SCP, FTP(S) or HTTP(S) protocols.

Superior automation capabilities

With regard to automation, VMware vSphere 6.5 shines because of the new upgrades. The new PowerCLI release has been an excellent addition because it is completely module-based, and its APIs are currently in very high demand. This enables IT administrators to fully automate tasks down to the virtual machine level.

 Secure boot

The secure boot element of vSphere covers EFI-enabled virtual machines. This feature is available for both Linux and Windows VMs, and it allows secure boot to be enabled by ticking a single checkbox in the VM properties. Once enabled, only properly signed VMs can boot in the virtual environment.

 Improved auditing

vSphere 6.5 offers clients improved audit-quality logging. This gives access to more forensic detail about user actions, making it easier to determine what was done, when, and by whom, and whether any investigation of anomalies or security threats is warranted.

VMware’s vSphere developed out of the complexity and necessity of an expanding virtualization market. The earlier server products were not robust enough to deal with the increasing demands of IT departments. As businesses invested in virtualization, they had to consolidate and simplify their physical server farms into virtualized ones, and this triggered the need for virtual infrastructure. With these vSphere 6.5 features in mind, you can unleash its full potential. Make the switch today to the new and innovative VMware vSphere 6.5.

 

Source by thomassujain

The power of data in the financial services industry

Changes in business and technology in the financial services industry have opened up new possibilities. With the rapidly growing number of customer interactions through digital banking, there is a huge volume of customer data now available that can provide strategic opportunities for business growth and tremendous prospects for improved management tools.

However, after experiencing the challenges of the recent financial crisis, most Philippine financial services companies are understandably more focused on compliance and risk management, rather than on growth opportunities resulting from improved data and analytics. They are still dominated by data management solutions and have yet to truly embed analytics into business decisions. Data is used operationally and not strategically. They have yet to embrace the key awareness that, in this digital age, acknowledging the value of data as a strategic asset, deploying sophisticated analytics to realize the benefits of that asset, and converting information into insights and practical actions create a competitive advantage.

The industry has always used big data, particularly in credit analysis. Many analytics tools are currently available such as ACL, SQL, SAS, Falcon, Lavastorm, Tableau, Spotfire and Qlikview. At present, the business functions that are most advanced in terms of analytics are finance and risk management. There is also increased use in compliance and internal audits. However, the power of data has remained largely unexploited and untapped. Insights from big data can also be used to make well-informed strategic decisions by using data to effectively extract value from customers, identify risks, and improve operational efficiency.

A few high-growth financial services companies in the Philippines, mostly foreign, are beginning to embed data analytics in sales, marketing, budgeting and planning. They understand that product and service models must be fine-tuned to respond to changing customer preferences and expectations. Using big data techniques can help enhance customer targeting, as well as refine pricing and resource allocation. Other companies in the financial services industry should consider adopting these initiatives in order to do well in light of increasing competition.

THE CHALLENGES
As with any new idea, gaining an appreciation for the opportunities in data analytics is not without difficulty. Regulations, data privacy, fragmentation and skills shortages are among the challenges facing the financial services industry in this regard.

• Regulation and data privacy concerns still dominate the financial services industry because failure may cause irreversible financial and reputational damage; after all, these businesses rely largely on credibility. The industry has also become a target for cyber attacks. Cybercriminals have developed advanced techniques to infiltrate businesses and fraudulently access sensitive information such as usernames, passwords and credit card details. Top cyber attacks in the financial services industry include phishing (unsolicited e-mails sent without the recipients’ consent to steal login credentials and banking details), and remote access Trojans (fraudulently gaining access to sensitive and private information). Consequently, customers continue to take issue with digital banking.

This should not, however, dissuade companies in the financial services industry; this challenge does not prevent them from exploiting the full potential of data analytics. The industry must find ways to use big data to improve customer service without violating privacy concerns. It must continually reassure customers that their data is valuable and that their privacy has not been violated.

To retain confidence in their ability to safeguard customer data, financial services companies will need to consistently update information security policies, systems and infrastructures, and ensure that they are abreast with best practices.

• The infrastructure of many financial services companies is set up along products or business lines using legacy IT systems that are poorly connected and are unable to communicate with one another. Bridging the gaps between these fragmented systems and replacing them with new platforms represent a serious challenge, making it difficult to introduce new technology solutions. It requires simultaneously running the business while unwinding the legacy systems and migrating smoothly to a new platform with a central data system.

• Another important technical challenge is the lack of skilled data specialists who are able to understand and manage the complexity of the data from the emerging tools and technology and provide high-class analytics with business implications.

STRONG LEADERSHIP AND GOVERNANCE
Strong leadership and governance are the key to success in the use of data analytics. Leaders with vision and character who are attuned to the fast, continuous growth of business and technology must first make a firm decision to give more impetus to data analytics: integrate the company-wide data management team, hire skilled data analysts, and orchestrate the extraction and exploitation of big data to achieve competitive advantage.

Data analysis was previously considered an IT-level matter. Digitization and data analysis must now be treated as core strategic issues, moved to the top level of management and given due attention.

Effective data governance requires an integrated approach. Leaders should commit not just to the technology, but must also see the need to invest in the people, processes and structures necessary to ensure that technology delivers value throughout the business.

Part of the task requires re-educating the organization. Formalized data governance processes must be disseminated, understood, and complied with throughout the business.

Potential data issues should be identified through regular data quality audits, continuously training staff on governance policies and procedures, and conducting regular risk assessments aimed at identifying potential data vulnerabilities.

With these complex requirements and tasks, it may take time for companies to fully appreciate the advantages of data analytics. But with the rapid evolution of technology and increasing competition, forward-looking organizations might seriously consider fast-tracking the necessary steps to fully appreciate the power of data.

Veronica Mae A. Arce is a Senior Director of SGV & Co.

 

Originally posted via “The power of data in the financial services industry”

Source: The power of data in the financial services industry by analyticsweekpick

Jun 11, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

[Image] Data analyst (Source)

[ AnalyticsWeek BYTES]

>> Stephen Wunker on future of customer success through cost innovation and data by v1shal

>> Hit the “Easy” Button with Talend & Databricks to Process Data at Scale in the Cloud by analyticsweekpick

>> Voices in AI – Episode 98 – A Conversation with Jerome Glenn by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Probability & Statistics

This course introduces students to the basic concepts and logic of statistical reasoning and gives the students introductory-level practical ability to choose, generate, and properly interpret appropriate descriptive and… more

[ FEATURED READ]

Hypothesis Testing: A Visual Introduction To Statistical Significance

Statistical significance is a way of determining whether an outcome occurred by random chance, or whether something caused that outcome to differ from the expected baseline. Statistical significance calculations find their … more

[ TIPS & TRICKS OF THE WEEK]

Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but not being able to handle that data can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, or The Ultimate Metric. This is the metric that matters most to your startup. Some advantages of TUM: it answers the most important business question, it cleans up your goals, it inspires innovation and helps you understand the entire quantified business.

[ DATA SCIENCE Q&A]

Q: What is the life cycle of a data science project?
A: 1. Data acquisition
Acquiring data from both internal and external sources, including social media and web scraping. In a steady state, data extraction routines should be in place, and new sources, once identified, are acquired following the established processes

2. Data preparation
Also called data wrangling: cleaning the data and shaping it into a form suitable for later analysis. Involves exploratory data analysis and feature extraction.

3. Hypothesis & modelling
As in data mining, but with all the data rather than with samples. Applying machine learning techniques to all the data. A key sub-step is model selection: preparing a training set for the model candidates, plus validation and test sets for comparing model performance, selecting the best performing model, gauging model accuracy and preventing overfitting (a minimal sketch of this split follows this answer)

4. Evaluation & interpretation

Steps 2 to 4 are repeated as many times as needed; as the understanding of the data and the business becomes clearer and results from initial models and hypotheses are evaluated, further tweaks are performed. These iterations may sometimes include step 5 and be performed in a pre-production environment.

5. Deployment

6. Operations
Regular maintenance and operations, including performance tests that measure model performance and can alert when it drifts beyond an acceptable threshold

7. Optimization
Can be triggered by failing performance, by the need to add new data sources and retrain the model, or to deploy new versions of an improved model

Note: with increasing maturity and well-defined project goals, pre-defined performance criteria can help evaluate the feasibility of the data science project early in the data science life cycle. This early comparison helps the team refine the hypothesis, discard the project if it is non-viable, or change approaches.

Source
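
For the model-selection sub-step in phase 3, a common pattern is a three-way split: train the candidates, pick on validation, report once on a held-out test set. A minimal sketch follows; the candidate models and split ratios are arbitrary choices for illustration:

```python
# Sketch of model selection with train/validation/test sets (step 3 above).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# 60% train, 20% validation, 20% held-out test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
for model in candidates.values():
    model.fit(X_train, y_train)

# Select on validation accuracy, then report once on the untouched test set.
best = max(candidates, key=lambda name: candidates[name].score(X_val, y_val))
print(best, candidates[best].score(X_test, y_test))
```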

[ VIDEO OF THE WEEK]

@TimothyChou on World of #IOT & Its #Future Part 2 #FutureOfData #Podcast

Subscribe on YouTube

[ QUOTE OF THE WEEK]

Everybody gets so much information all day long that they lose their common sense. – Gertrude Stein

[ PODCAST OF THE WEEK]

@DrewConway on fabric of an IOT Startup #FutureOfData #Podcast

Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

More than 200 billion HD movies – which would take a person 47 million years to watch.

Sourced from: Analytics.CLUB #WEB Newsletter

Improving Self-Service Business Intelligence and Data Science

The heterogeneous complexities of big data present the foremost challenge in delivering that data to the end users who need them most. Those complexities are characterized by:

  • Disparate data sources: The influx of big data multiplied the sheer number of data sources almost exponentially, both external and internal. Moreover, the quantity of sources required today is made more complex by…
  • Multiple technologies powering those sources: For almost every instance in which SQL is still deployed, there is seemingly another application, use case, or data source which involves an assortment of alternative technologies. Moreover, accounting for the plethora of technologies in use today is frequently aggravated by contemporary…
  • Architecture and infrastructure complications: With numerous advantages for deployments in the cloud, on-premise, and in hybrid manifestations of the two, contemporary enterprise architecture and infrastructure is increasingly ensnared in a process which protracts time to value for accessing data. The dilatory nature of this reality is only worsened in the wake of…
  • Heightened expectations for data: As data becomes ever entrenched in the personal lives of business users, the traditional lengthy periods of business intelligence and data insight are becoming less tolerable. According to Dremio Chief Marketing Officer Kelly Stirman, “In our personal lives, when we want to use data to answer questions, it’s just a few seconds away on Google…And then you get to work, and your experience is nothing like that. If you want to answer a question or want some piece of data, it’s a multi-week or multi-month process, and you have to ask IT for things. It’s frustrating as well.”

However, a number of recent developments have taken place within the ever-shifting data landscape to substantially accelerate self-service BI and certain aspects of data science. The end result is that despite the variegated factors characterizing today’s big data environments, “for a user, all of my data looks like it’s in a single high performance relational database,” Stirman revealed. “That’s exactly what every analytical tool was designed for. But behind the scenes, your data’s spread across hundreds of different systems and dozens of different technologies.”

Avoiding ETL

Conventional BI platforms were routinely hampered by the ETL process, a prerequisite for both integrating and loading data into tools with schema at variance with that of source systems. The ETL process was significant for three reasons. It was the traditional way of transforming data for application consumption. It was typically the part of the analytics process which absorbed a significant amount of time—and skill—because it required the manual writing of code. Furthermore, it resulted in multiple copies of data which could be extremely costly to organizations. Stirman observed that, “Each time you need a different set of transformations you’re making a different copy of the data. A big financial services institution that we spoke to recently said that on average they have eight copies of every piece of data, and that consumes about 40 percent of their entire IT budget which is over a billion dollars.” ETL is one of the facets of the data engineering process which monopolizes the time and resources of data scientists, who are frequently tasked with transforming data prior to leveraging them.

Modern self-service BI platforms eschew ETL with automated mechanisms that provide virtual (instead of physical) copies of data for transformation. Thus, each subsequent transformation is applied to a virtual replication of the data with swift in-memory technologies that not only accelerate the process but also eliminate the need to dedicate resources to physical copies. “We use a distributed process that can run on thousands of servers and take advantage of the aggregate RAM across thousands of servers,” Stirman said. “We can execute these transformations dynamically and give you a great high-performance experience on the data, even though we’re transforming it on the fly.” End users can drive this process visually, without writing scripts.
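
The “virtual copy” idea is analogous to database views: the transformation is defined over the source once and computed on read, rather than materialized as another physical dataset. A toy sketch with SQLite follows; it illustrates only the concept, not Dremio’s distributed in-memory engine:

```python
# Toy illustration of virtual vs. physical copies: a SQL view holds the
# transformation, so no second copy of the table is ever stored.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER, region TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 1250, "east"), (2, 830, "west"), (3, 4100, "east")])

# A "virtual copy": the transformation (cents -> dollars, region filter)
# lives in the view definition, not in a duplicated dataset.
conn.execute("""
    CREATE VIEW east_orders_usd AS
    SELECT id, amount_cents / 100.0 AS amount_usd
    FROM orders WHERE region = 'east'
""")

for row in conn.execute("SELECT * FROM east_orders_usd"):
    print(row)  # computed on the fly from the single physical table
```
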
Reflections

Today’s self-service BI and data science platforms have also expedited time to insight by making data more available than traditional solutions did. Virtual replications of datasets are useful in this regard because they are stored in the underlying BI solution—instead of in the actual source of data. Thus, these platforms can access that data without retrieving them from the initial data source and incurring the intrinsic delays associated with architectural complexities or slow source systems. According to Stirman, the more of these “copies of the data in a highly optimized format” such a self-service BI or data science solution has, the faster it is at retrieving relevant data for a query. Stirman noted this approach is similar to one used by Google, in which there are not only copies of web pages available but also “all these different ways of structuring data about the data, so when you ask a question they can give you an answer very quickly.” Self-service analytics solutions which optimize their data copies in this manner produce the same effect.

Prioritizing SQL

Competitive platforms in this space are able to account for the multiplicity of technologies the enterprise has to contend with in a holistic fashion. Furthermore, they’re able to do so by continuing to prioritize SQL as the preferred query language which is rewritten into the language relevant to the source data’s technology—even when it isn’t SQL. By rewriting SQL into the query language of the host of non-relational technology options, users effectively have “a single, unified future-proof way to query any data source,” Stirman said. Thus, they can effectively query any data source without understanding its technology or its query language, because the self-service BI platform does. In those instances in which “those sources have something you can’t express in SQL, we augment those capabilities with our distributed execution engine,” Stirman remarked.
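
The single-SQL-interface experience can be approximated locally with an embedded engine such as DuckDB, which lets plain SQL address files in different formats. This is only an analogy for the idea, not the platform described above; the file names and columns are hypothetical:

```python
# Analogy for "a single, unified way to query any data source": DuckDB
# exposes plain SQL over heterogeneous formats (CSV, Parquet, ...).
# File names and columns are hypothetical placeholders.
import duckdb

con = duckdb.connect()
# Join a CSV "source" with a Parquet "source" in one SQL statement;
# the engine handles per-format access behind the scenes.
rows = con.execute("""
    SELECT c.customer_id, c.name, SUM(o.amount) AS total_spend
    FROM read_csv_auto('customers.csv') AS c
    JOIN read_parquet('orders.parquet') AS o USING (customer_id)
    GROUP BY c.customer_id, c.name
    ORDER BY total_spend DESC
""").fetchall()
print(rows[:5])
```
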
User Experience

The crux of self-service platforms for BI and data science is that by eschewing ETL for quicker forms of transformation, leveraging in-memory technologies to access virtual copies of data, and rewriting queries from non-relational technologies into familiar relational ones, users can rely on their tool of choice for analytics. Business end users can choose Tableau, Qlik, or any other preferred tool, while data scientists can use R, Python, or any other popular data science platform. The fact that these solutions are able to facilitate these advantages at scale and in cloud environments adds to their viability. Consequently, “You log in as a consumer of data and you can see the data, and you can shape it the way you want to yourself without being able to program, without knowing these low level IT skills, and you get the data the way you want it through a powerful self-service model instead of asking IT to do it for you,” Stirman said. “That’s a fundamentally very different approach from the traditional approach.”

 

Source by jelaniharper