Jun 27, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Cover image: Accuracy (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Tackling 4th Industrial Revolution with HR4.0 by v1shal

>> Three Types Of Context To Make Your Audience Care About Your Data by analyticsweek

>> Remote DBA Experts- Improve Business Intelligence with The Perfect Analytical Experts by thomassujain

Wanna write? Click Here

[ FEATURED COURSE]

Learning from data: Machine learning course

This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applicati… more

[ FEATURED READ]

Introduction to Graph Theory (Dover Books on Mathematics)

A stimulating excursion into pure mathematics aimed at “the mathematically traumatized,” but great fun for mathematical hobbyists and serious mathematicians as well. Requiring only high school algebra as mathematical bac… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today, a data-driven leader, data scientist, or data expert is constantly put to the test by helping the team solve problems with their skills and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgment and taints the suggestions. Most skilled professionals understand and handle their biases well, but in a few cases we give in to tiny traps and can find ourselves caught in biases that impair our judgment. So it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q:What are confounding variables?
A: * An extraneous variable in a statistical model that correlates, directly or inversely, with both the dependent and the independent variable
* A spurious relationship is a perceived relationship between an independent variable and a dependent variable that has been estimated incorrectly
* The estimate fails to account for the confounding factor (see the simulation sketch below)
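
To make the definition concrete, here is a minimal Python simulation (not part of the original Q&A; all names and coefficients are illustrative). Two variables that do not influence each other are both driven by a confounder, which produces a strong spurious correlation that vanishes once the confounder is controlled for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Confounder: e.g. daily temperature
temperature = rng.normal(25.0, 5.0, n)

# Both variables depend on the confounder, not on each other
ice_cream_sales = 2.0 * temperature + rng.normal(0.0, 5.0, n)
drownings = 0.5 * temperature + rng.normal(0.0, 5.0, n)

# Naive correlation looks strong: a spurious relationship
print("raw corr:", np.corrcoef(ice_cream_sales, drownings)[0, 1])

def residuals(y, x):
    """Remove the linear effect of x from y."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Partial correlation: near zero once temperature is accounted for
r = np.corrcoef(residuals(ice_cream_sales, temperature),
                residuals(drownings, temperature))[0, 1]
print("corr controlling for the confounder:", r)
```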

Source

[ VIDEO OF THE WEEK]

@DrewConway on creating socially responsible data science practice #FutureOfData #Podcast

Subscribe on YouTube

[ QUOTE OF THE WEEK]

What we have is a data glut. – Vernor Vinge

[ PODCAST OF THE WEEK]

@JohnNives on ways to demystify AI for enterprise #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

By 2020, we will have over 6.1 billion smartphone users globally (overtaking basic fixed phone subscriptions).

Sourced from: Analytics.CLUB #WEB Newsletter

New Mob4Hire Report “The Impact of Mobile User Experience on Network Operator Customer Loyalty” Ranks Performance Of Global Wireless Industry

Mob4Hire, in collaboration with leading customer loyalty scientist Business Over Broadway, today announced the Summer 2010 report of its “Impact of Mobile User Experience on Network Operator Customer Loyalty” international research, conducted during the spring. The 111-country survey analyzes the impact of mobile apps across many dimensions of the app ecosystem as it relates to customer loyalty toward network operators.

Read the full press release here: http://www.prweb.com/releases/2010/08/prweb4334684.htm. The report is available at http://www.mob4hire.com/services/global-mobile-research for $495 (Individual License, 1-3 people) or $995 (Corporate License, 3+ people).

Source by bobehayes

Jun 20, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Cover image: Insights (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> How Airbnb Uses Big Data And Machine Learning To Guide Hosts To The Perfect Price by analyticsweekpick

>> My Conversation with Oracle on Customer Experience Management by bobehayes

>> Accelerating Discovery with a Unified Analytics Platform for Genomics by analyticsweek

Wanna write? Click Here

[ FEATURED COURSE]

R Basics – R Programming Language Introduction

Learn the essentials of R Programming – R Beginner Level!… more

[ FEATURED READ]

Antifragile: Things That Gain from Disorder

Antifragile is a standalone book in Nassim Nicholas Taleb’s landmark Incerto series, an investigation of opacity, luck, uncertainty, probability, human error, risk, and decision-making in a world we don’t understand. The… more

[ TIPS & TRICKS OF THE WEEK]

Data Analytics Success Starts with Empowerment
Being data driven is not so much a tech challenge as an adoption challenge, and adoption has its roots in the cultural DNA of any organization. Great data-driven organizations weave the data-driven culture into their corporate DNA. A culture of connection, interaction, sharing, and collaboration is what it takes to be data driven. It is about being empowered more than it is about being educated.

[ DATA SCIENCE Q&A]

Q:Is mean imputation of missing data acceptable practice? Why or why not?
A: * Bad practice in general
* If just estimating means: mean imputation preserves the mean of the observed data
* Leads to an underestimate of the standard deviation
* Distorts relationships between variables by “pulling” estimates of the correlation toward zero
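
These points are easy to verify empirically. Below is a small Python sketch (illustrative, not from the source): 40% of one variable is deleted completely at random and imputed with the observed mean, which preserves the mean while shrinking the standard deviation and pulling the correlation toward zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Two correlated variables
x = rng.normal(0.0, 1.0, n)
y = 0.8 * x + rng.normal(0.0, 0.6, n)

# Delete 40% of y completely at random, then mean-impute
y_obs = y.copy()
y_obs[rng.random(n) < 0.4] = np.nan
y_imp = np.where(np.isnan(y_obs), np.nanmean(y_obs), y_obs)

print("mean:", y.mean(), "->", y_imp.mean())        # preserved
print("std :", y.std(), "->", y_imp.std())          # underestimated
print("corr:", np.corrcoef(x, y)[0, 1],
      "->", np.corrcoef(x, y_imp)[0, 1])            # pulled toward zero
```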

Source

[ VIDEO OF THE WEEK]

Understanding Data Analytics in Information Security with @JayJarome, @BitSight

Subscribe on YouTube

[ QUOTE OF THE WEEK]

For every two degrees the temperature goes up, check-ins at ice cream shops go up by 2%. – Andrew Hogue, Foursquare

[ PODCAST OF THE WEEK]

Nick Howe (@Area9Nick @Area9Learning) talks about fabric of learning organization to bring #JobsOfFuture #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

39 percent of marketers say that their data is collected ‘too infrequently or not real-time enough.’

Sourced from: Analytics.CLUB #WEB Newsletter

Life Might Be Like a Box of Chocolates, But Your Data Strategy Shouldn’t Be

“My momma always said, ‘Life was like a box of chocolates. You never know what you’re gonna get.’” Even if everyone’s life remains full of surprises, the truth is that what applied to Forrest Gump in Robert Zemeckis’s 1994 movie shouldn’t apply to your data strategy. As you take the very first steps of your data strategy, you first need to know what’s inside your data, and this part is critical. To do so, you need the tools and methodology to step up your data-driven strategy.

<<ebook: Download our full Definitive Guide to Data Governance>>

Why Data Discovery?

With the increased affordability and accessibility of data storage in recent years, data lakes have grown in popularity. This has left IT teams with a growing number of diverse known and unknown datasets polluting the data lake in volume and variety every day. As a consequence, everyone is facing a data backlog. It can take weeks for IT teams to publish new data sources in a data warehouse or data lake, while it takes hours for line-of-business workers or data scientists to find, understand, and put all that data into context. IDC found that only 19 percent of the time spent by data professionals and business users can really be dedicated to analyzing information and delivering valuable business outcomes.

Given this new reality, the challenge is now to overcome these obstacles by bringing clarity, transparency, and accessibility to your data, and by extracting value from legacy systems and new applications alike. Wherever the data resides (in a traditional data warehouse or in a cloud data lake), you need to establish proper data screening, so you can get the full picture and a complete view of the data flowing in and out of your organization.

Know Your Data

When it’s time to start working on your data, it’s critical to explore the different data sources you wish to manage. The good news is that the newly released Talend Data Catalog, coupled with Talend Data Fabric, is here to help.

As mentioned in this post, Talend Data Catalog will intelligently discover all the data coming into your data lake so you get an instant picture of what’s going on in any of your datasets.

One of the many interesting use cases of Talend Data Catalog is identifying and screening any datasets that contain sensitive data, so that you can further reconcile them and apply data masking, for example, to let the relevant people use them across the entire organization. This helps reduce the burden on any data team wishing to operationalize regulatory compliance across all data pipelines. To discover more about how Talend Data Catalog can help with GDPR compliance, take a look at this Talend webcast.
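
Talend applies masking through its own tooling, but to make the idea concrete, here is a minimal, hypothetical Python sketch of two common masking tactics (pseudonymization by hashing and partial redaction). The table, columns, and rules are invented for illustration only.

```python
import hashlib
import pandas as pd

# Hypothetical customer table flagged by discovery as containing PII
df = pd.DataFrame({
    "customer": ["Ada Lovelace", "Alan Turing"],
    "email": ["ada@example.com", "alan@example.com"],
    "balance": [1200.50, 340.00],
})

def mask_email(email: str) -> str:
    """Keep the domain for analytics; hide the local part."""
    local, _, domain = email.partition("@")
    return f"{'*' * len(local)}@{domain}"

masked = df.assign(
    # Pseudonymize names with a one-way hash (truncated for readability)
    customer=df["customer"].map(lambda s: hashlib.sha256(s.encode()).hexdigest()[:12]),
    email=df["email"].map(mask_email),
)
print(masked)  # analysts see structure and domains, not identities
```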

Auto Profiling for All with Data Catalog

The auto-profiling capabilities of Talend Data Catalog make data screening accessible to non-technical people in your organization. Simply put, the data catalog provides automated discovery and intelligent documentation of the datasets in your data lake. It comes with easy-to-use profiling capabilities that help you quickly assess data at a glance. With trusted, auto-profiled datasets, you get powerful, visual profiling indicators, so users can easily find the right data in a few clicks.

Not only can Talend Data Catalog bring all of your metadata together in a single place, but it can also automatically draw the links between datasets and connect them to a business glossary. In a nutshell, this allows organizations to:

  • Automate the data inventory
  • Leverage smart semantics for auto-profiling, relationship discovery, and classification
  • Document and drive usage now that the data has been enriched and becomes more meaningful

Go further with Data Profiling

Data profiling is a technology that enables you to discover your datasets in depth and accurately assess multiple data sources against the six dimensions of data quality. It helps you identify whether, and how, your data is inaccurate, inconsistent, or incomplete.

Let’s put this in context: think about a doctor’s exam to assess a patient’s health. Nobody wants to undergo surgery without a precise and close examination first. The same applies to data profiling: you need to understand your data before fixing it. Since data often comes into the organization inoperable, in hidden formats, or unstructured, an accurate diagnosis gives you a detailed overview of the problem before you fix it. This saves time for you, your team, and your entire organization, because you will have mapped the potential minefield up front.
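
Talend’s profiling is driven through its products’ interfaces, but the underlying idea is easy to sketch. Below is a minimal, hypothetical Python/pandas profiler computing a few basic indicators (completeness, cardinality, and a format-validity rule); the dataset, column names, and regex are invented for illustration.

```python
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Basic per-column profiling: type, completeness, cardinality."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null_pct": df.notna().mean() * 100,   # completeness
        "distinct": df.nunique(),                  # cardinality
        "sample": df.apply(
            lambda c: c.dropna().iloc[0] if c.notna().any() else None),
    })

# Illustrative dataset with typical quality problems
df = pd.DataFrame({
    "email": ["a@x.com", "b@y.org", "not-an-email", None],
    "age": [34, 29, None, 41],
})

print(profile(df))

# A simple validity rule (email format); real tools ship libraries of these
valid = df["email"].dropna().str.match(r"[^@\s]+@[^@\s]+\.[^@\s]+")
print("valid email pct:", 100 * valid.mean())
```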

Easy profiling for power users with Talend Data Preparation: Data profiling shouldn’t be complicated. Rather, it should be simple, fast, and visual. For use cases such as Salesforce data cleansing, you may wish to gauge data quality by delegating some of the basic profiling activities to business users, who can then run quick profiles on their favorite datasets. With tools like Talend Data Preparation, you get powerful yet simple built-in profiling capabilities to explore datasets and assess their quality with the help of indicators, trends, and patterns.

Advanced profiling for data engineers: Using Talend Data Quality in Talend Studio, data engineers can connect to data sources to analyze their structure (catalogs, schemas, and tables) and store descriptions of their metadata in its metadata repository. They can then define the available data quality analyses, including database, content, column, table, redundancy, and correlation analyses, and more. These analyses carry out the data profiling processes that define the content, structure, and quality of highly complex data structures, and the results are displayed visually.

To go further into data profiling take a look at this webcast: An Introduction to Talend Open Studio for Data Quality.

Keep in mind that your data strategy should first and foremost start with data discovery. Failing to profile your data would obviously put your entire data strategy at risk. It’s really about surveying the ground to make sure your data house can be built on solid foundations.

The post Life Might Be Like a Box of Chocolates, But Your Data Strategy Shouldn’t Be appeared first on Talend Real-Time Open Source Data Integration Software.

Originally Posted at: Life Might Be Like a Box of Chocolates, But Your Data Strategy Shouldn’t Be

Jun 13, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Cover image: Data shortage (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Geeks Vs Nerds [Infographics] by v1shal

>> @DarrWest / @BrookingsInst on the Future of Work: AI, Robots & Automation #JobsOfFuture by v1shal

>> User Experience Salaries & Calculator (2018) by analyticsweek

Wanna write? Click Here

[ FEATURED COURSE]

Machine Learning

6.867 is an introductory course on machine learning which gives an overview of many concepts, techniques, and algorithms in machine learning, beginning with topics such as classification and linear regression and ending … more

[ FEATURED READ]

The Signal and the Noise: Why So Many Predictions Fail–but Some Don’t

People love statistics. Statistics, however, do not always love them back. The Signal and the Noise, Nate Silver’s brilliant and elegant tour of the modern science-slash-art of forecasting, shows what happens when Big Da… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today, a data-driven leader, data scientist, or data expert is constantly put to the test by helping the team solve problems with their skills and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgment and taints the suggestions. Most skilled professionals understand and handle their biases well, but in a few cases we give in to tiny traps and can find ourselves caught in biases that impair our judgment. So it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q:You are compiling a report for user content uploaded every month and notice a spike in uploads in October. In particular, a spike in picture uploads. What might you think is the cause of this, and how would you test it?
A: * Halloween pictures?
* Look at uploads in countries that don’t observe Halloween as a sort of counter-factual analysis
* Compare mean uploads in October with mean uploads in September: hypothesis testing (a sketch follows below)
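
As a sketch of the suggested test, the hypothetical Python snippet below compares September and October daily upload counts with a two-sample t-test. The counts are simulated and purely illustrative.

```python
import numpy as np
from scipy import stats

# Hypothetical daily photo-upload counts (illustrative numbers only)
rng = np.random.default_rng(2)
september = rng.poisson(1000, 30)
october = rng.poisson(1150, 31)   # suspected Halloween-driven spike

# Two-sample Welch t-test: are October uploads significantly higher?
t_stat, p_value = stats.ttest_ind(october, september, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Counterfactual idea: run the same comparison for countries that
# don't observe Halloween; a spike there would weaken the hypothesis.
```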

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @ScottZoldi, @FICO

Subscribe on YouTube

[ QUOTE OF THE WEEK]

Data really powers everything that we do. – Jeff Weiner

[ PODCAST OF THE WEEK]

@SidProbstein / @AIFoundry on Leading #DataDriven Technology Transformation #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

In 2008, Google was processing 20,000 terabytes of data (20 petabytes) a day.

Sourced from: Analytics.CLUB #WEB Newsletter

Tutorial: Using R for Scalable Data Analytics

At the recent Strata conference in San Jose, several members of the Microsoft Data Science team presented the tutorial Using R for Scalable Data Analytics: Single Machines to Spark Clusters. The materials are all available online, including the presentation slides and hands-on R scripts. You can follow along with the materials at home, using the Data Science Virtual Machine for Linux, which provides all the necessary components like Spark and Microsoft R Server. (If you don’t already have an Azure account, you can get $200 credit with the Azure free trial.)

The tutorial covers many different techniques for training predictive models at scale and deploying the trained models as predictive engines within production environments. Among the technologies you’ll use are Microsoft R Server running on Spark, the SparkR package, the sparklyr package, and H2O (via the rsparkling package). It also touches on some non-Spark methods, like the bigmemory and ff packages for R (and various other packages that make use of them), and the foreach package for coarse-grained parallel computations. You’ll also learn how to create prediction engines from these trained models using the mrsdeploy package.

[image: mrsdeploy]

The tutorial also includes scripts for comparing the performance of these various techniques, both for training the predictive model:

[chart: training performance comparison]

and for generating predictions from the trained model:

[chart: scoring performance comparison]

(The above tests used 4 worker nodes and 1 edge node, all with 16 cores and 112 GB of RAM.)

You can find the tutorial details, including slides and scripts, at the link below.

Strata + Hadoop World 2017, San Jose: Using R for scalable data analytics: From single machines to Hadoop Spark clusters

Source

Startup Movement Vs Momentum, a Classic Dilemma

In my last post on 80/20 Rules for Startups, I briefly touched on Movement vs Momentum and got some interest in shedding more light on it, so here it goes. A startup is all about working hard to validate hypotheses; as you resolve one, another pops up. So it should be all about validating the highest-priority hypotheses one after the other. Startups that find the fastest route through problem solving to revenue win. In this struggle, startups work really hard to move and establish a business, but not all hard work translates to business, and that is the primary difference between movement and momentum.

 

During this never-ending run to make a startup successful, it is extremely important to prioritize and not spend time running in circles, so it is very important to distinguish movement from momentum. To a working brain, it is easy to confuse the two, as they appear to be identical traits. Momentum is moving things in the desired direction, whereas motion/movement is the state of doing something. Not every motion is momentum, which is why ever-increasing mountains of work hours can yield fewer and fewer accomplishments. Startups are no different; they need some smart help to work efficiently toward success.

 

This is why we need to understand how startups can work smarter and waste less. Sometimes it is not only about moving faster but also about moving smarter. It is also important to understand that it is very difficult to tell movement from momentum. As we all know, all momentum is movement, but not all movement is momentum. So, instead of stacking up long, inflated expectations to validate before you launch, make small, digestible assumptions and keep validating them to prove your idea’s validity. Chase the most important assumptions first. Startups in their early stages make several zig-zag diversions and get overworked. Every effort should be made to push in the right direction, to help the startup save resources and get to results quickly.

The following five signs suggest that your startup could be moving but might not have momentum:

1. You have not spoken to any prospective customers/users about the need for your product. Many startups suffer from this. Too often, startups focus on building the best product and do not get the product idea validated early on. To me, this is the primary reason most startups struggle: they invest a lot of time in building the product and defining the feature set, without much validation. Sure, a disruptive technology often defines and creates its own market, but how many companies do we know that reach Foursquare, Pinterest, Uber, Airbnb, or Twitter stardom? The same probability applies to new and upcoming startups. The safer approach is to validate the problem-solution fit with prospective clients. Maybe that results in some quick iterations that magnify the impact. Remember that the best ideas often come from clients/prospects. So, if a startup is investing a lot of time building a product without getting it validated, it could be moving, just not in the right direction, which increases the chances of failure.

2. Your product has several features solving various issues and is going through a long development cycle before customers see what you are up to. This is another issue that keeps some good teams from getting their products out. Last week, I spoke with two startups; both are very ambitious about what they are building and not yet ready to let prospects see the product. They are heads-down perfecting it. This is another sign that you could be overworking the problem. Remember that your ordinary could be someone else’s awesome, and there is no better way to check that than getting your prospects on the platform and letting them play a little. This will not only surface fundamental issues that crop up early in development but also keep you close to the needs of your prospects/customers, which has always been a recipe for startup success. So, stop adding lots of features and work on the one crazy feature that solves a real problem for your customers/users. This gets the product into your customers’ hands quickly and efficiently and saves you from overworking or overthinking.

3. You are investing the majority of your time executing but not focusing enough on planning how best to execute. I am sure we recognize this problem. As a coder/doer, my tendency is also to jump on solving the problem instead of taking a step back and thinking about the best strategy for solving it. Often, the best solution is not the first one that comes to mind. So, if you find yourself doing more and planning less, treat it as a red flag and revisit the strategy. Needless to say, doing more can also work, but then success rests on the probability of having picked the right strategy. It is a good idea to revisit the execution strategy; you may find a faster, cheaper, and more optimal way to execute the same thing.

4. You are not iterating enough. Yes, iteration also reveals whether you are working more than you should. A good execution strategy is to make small hypotheses, validate them, and keep iterating until no assumption is left unchecked. If you have one product and you work on it forever without revisiting your strategy, that could be a problem. As with the point above, the first or current solution might not be the best way to approach a problem, so keep yourself open to iterating on your work and chasing the best way to execute. If you have been working really hard on something and have not yet iterated, it could be a sign that you are moving in circles rather than in the direction of a successful startup.

5. You are still not working toward getting your first check. This might not apply to everyone, but you can read it any way that fits. The idea is that a startup should deliver real value, and a paying customer or a satisfied user endorses that value. The quicker you get to that stage, the sooner you will see the validation you need to move on. If you are not landing your first paid engagement quickly, you could end up spinning in circles because your idea is still unvalidated: it is still unknown whether anyone will actually pay for the service. It is always great to reach the stage where you are delivering real value to a real company or user; missing that leaves your startup’s ability to deliver any value uncertain.

 

Now, you could argue that life is good on the other side of the fence as well. Sure, startups can succeed in any shape, form, ideology, or strategy. It is just good practice to follow a path where risk is minimized. Whether a startup will succeed is anyone’s guess, but a more calculated strategy makes success much more predictable and certain. So, it does not hurt to adopt a strategy that helps you gain momentum and not just move in any direction.

As a treat, here is a video on 10 things startups should focus on.

Source by v1shal

Jun 06, 19: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Cover image: Statistics (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Choosing an Analytics Development Approach: How to Optimize Your Business Benefits by analyticsweek

>> Voices in AI – Episode 87: A Conversation with Sameer Maskey by analyticsweekpick

>> PHP Exceeds the Generic Human Expectations. Here’s how the Brand got it Done by thomassujain

Wanna write? Click Here

[ FEATURED COURSE]

R, ggplot, and Simple Linear Regression

Begin to use R and ggplot while learning the basics of linear regression… more

[ FEATURED READ]

How to Create a Mind: The Secret of Human Thought Revealed

Ray Kurzweil is arguably today’s most influential—and often controversial—futurist. In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today, a data-driven leader, data scientist, or data expert is constantly put to the test by helping the team solve problems with their skills and expertise. Believe it or not, part of that decision tree is derived from intuition, which adds a bias to our judgment and taints the suggestions. Most skilled professionals understand and handle their biases well, but in a few cases we give in to tiny traps and can find ourselves caught in biases that impair our judgment. So it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q:What is star schema? Lookup tables?
A: The star schema is a traditional database schema with a central (fact) table (the “observations”, with database “keys” for joining with satellite tables, and with several fields encoded as IDs). Satellite tables map IDs to physical names or descriptions and can be joined to the central fact table using the ID fields; these tables are known as lookup tables and are particularly useful in real-time applications, as they save a lot of memory. Sometimes star schemas involve multiple layers of summarization (summary tables, from granular to less granular) to retrieve information faster.

Lookup tables:
– An array that replaces runtime computations with a simpler array-indexing operation (see the sketch below)
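
As a tiny illustration of both ideas (not part of the original answer), the Python/pandas sketch below joins a fact table to two lookup tables, then uses a precomputed lookup to replace a runtime computation. All table and column names are invented.

```python
import pandas as pd

# Central fact table: observations keyed by IDs
fact_sales = pd.DataFrame({
    "product_id": [1, 2, 1, 3],
    "store_id":   [10, 10, 20, 20],
    "units":      [5, 3, 7, 2],
})

# Satellite lookup (dimension) tables: map IDs to descriptions
dim_product = pd.DataFrame({"product_id": [1, 2, 3],
                            "product":    ["pen", "pad", "ink"]})
dim_store = pd.DataFrame({"store_id": [10, 20],
                          "city":     ["Boston", "Austin"]})

# Join fact to lookups only when names are needed; the fact table
# stays compact because it stores integer keys, not strings.
report = (fact_sales
          .merge(dim_product, on="product_id")
          .merge(dim_store, on="store_id"))
print(report)

# Lookup table as array indexing: a precomputed value per key
# replaces a runtime computation (hypothetical tax rates per store).
tax_rate = {10: 0.0625, 20: 0.0825}
report["tax"] = report["store_id"].map(tax_rate)
```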

Source

[ VIDEO OF THE WEEK]

Surviving Internet of Things

Subscribe on YouTube

[ QUOTE OF THE WEEK]

The temptation to form premature theories upon insufficient data is the bane of our profession. – Sherlock Holmes

[ PODCAST OF THE WEEK]

Understanding #FutureOfData in #Health & #Medicine - @thedataguru / @InovaHealth #FutureOfData #Podcast

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

In 2015, a staggering 1 trillion photos will be taken and billions of them will be shared online. By 2017, nearly 80% of photos will be taken on smartphones.

Sourced from: Analytics.CLUB #WEB Newsletter

Data center location – your DATA harbour

[image: data center location choice]

When it comes to storing and processing your business’s data with a third-party data center and/or hosting provider, there are many factors to consider, and not all of them can be verified by you with a complete guarantee of being true. But there is one aspect you can investigate fairly easily and get a decent idea about. Anyone who owns a home or is familiar with the real estate business knows that when purchasing property, one factor has a tremendous influence on the price, and it is… you probably guessed it by now: “location, location, location.” That’s what it is all about.

Considering that almost no business today can operate without depending on data processing, storage, and mobile data access, and that information technology infrastructure has become a commodity, we tend to employ third-party providers to host our data in their facilities. The whereabouts of your data have become a vital factor in choosing a colocation or cloud provider. Have a look at the “Critical decision making points when choosing your IT provider” and let’s focus on the location factor in deciding whom to entrust your data to.

Considering the location
There are certain key factors to consider about the location of a provider’s facilities in order to make the most suitable choice for your business:

  1. Natural disasters: What is the likelihood, historically and statistically, of environmental calamities like hurricanes, tornadoes, catastrophic hail, major flooding, and earthquakes in the area? Natural disaster hazards are a serious threat because we cannot always forecast them and have no control over them at all. Having a disaster recovery or failover site in a location prone to natural disasters is dangerous and defeats the purpose of the precaution. If your primary data center is located in an accident-prone area, make sure that your disaster recovery and backup sites are outside the high-risk zones.
  2. Connectivity and latency: The location of a data center has a tremendous impact on the selection and number of available carriers providing network services; remote and hard-to-reach locations suffer from a smaller selection. Data centers in the vicinity of Internet exchange and network access points enjoy a rich selection of carriers, and thus often lower latency and higher bandwidth at lower cost. Ideally, a multi-tenant data center should be a carrier-neutral facility, which means that the company owning the data center is entirely independent of any network provider and thus not in direct competition with them. This is usually a great incentive for carriers to offer their services in such a facility.
  3. Security: Are the facilities in an area that is easy to secure and designated for business? Considering crime and accident rates, is this a low-risk area? Is the facility unmarked and hard to spot during a random check? Data center structures should be hard for passersby to detect and should be in areas that are easy to secure and monitor. Areas with high traffic and crime increase the risk of your data being stolen through physical access.
  4. Political and economic stability: Political and economic stability are critical factors in choosing the location. Countries with a track record of civil distress and economic struggle can prove high-risk, due to the possibility of facilities being seized for political reasons or a higher risk of provider bankruptcy. The threat of a government being overthrown and your data seized, or of your colocation provider filing for bankruptcy after a currency devaluation, is a huge no-go any way you look at it. Stability is key to guaranteeing your business continuity.
  5. Local laws and culture: Both can have a negative impact on your business, from losing ownership of data to not being able to operate to the standards you are used to and expect. Make sure that you are not breaking any laws in the country your data resides in; what is allowed in one country could be illegal elsewhere. For example, infringement and copyright laws vary greatly between countries, and some of your customers’ data could put you in a tight spot. Furthermore, make sure that language and cultural barriers will not turn out to be showstoppers in your daily operations or when you need troubleshooting.
  6. Future growth: You might think it hasty and pointless to look into the expansion and growth possibilities your provider can sustain, but nothing is further from the truth. Finding out that your provider cannot accommodate your growth when you do need to expand can turn into a very pricey endeavor, leading to splitting your infrastructure across multiple locations or even forcing a provider switch. Always make sure the data center has room to grow, not only in space but, most importantly, in power; today’s data centers will sooner cope with a shortage of power than with a shortage of space. Find out what their growth potential is in space and power and how soon it can be realized, as you need to know how quickly they can adapt to buffer all their customers’ growth in a multi-tenant facility.
  7. Redundancy: The location of the facility also has an impact on its redundancy. A few things that matter for running mission-critical applications: continuous access to power from multiple sources in case of outages, and multiple fiber entries into the facility to increase network redundancy. Redundant cooling and environmental controls are also a must to guarantee continuous operation. These are the basic redundancy factors; depending on your specific availability requirements, you might need to look much deeper than facility infrastructure redundancy. Talk to your provider about this; they will offer you advice.
  8. Accessibility: This factor matters for multiple reasons, from security concerns to daily operations and disaster situations. Access for specialized maintenance staff and emergency services in a crisis, and transportation of equipment and supplies within a reasonable amount of time, are of vital importance. Facilities outside the immediate reach of such amenities carry an increased risk of failure and longer recovery times after incidents. There may also be a question of your staff needing physical access to the equipment, but with today’s providers offering remote hands and, usually, managed services, you can avoid such complications and have the provider’s local personnel take care of all physical repairs and troubleshooting.

Consequences
The choice of your provider and its location will have severe consequences for your business if things go wrong, and they do go wrong. Data Center Knowledge has compiled an overview of last year’s top 10 outages, and as you can see, some big players in the industry have been brought to their knees.
Such disruptions of service carry tremendous costs for the data centers, and those expenses have been increasing yearly. To give you an idea of the kind of money lost, take a look at the findings of the Emerson Network Power and Ponemon Institute study.

“The study of U.S.-based data centers quantifies the cost of an unplanned data center outage at slightly more than $7,900 per minute.”
– Emerson Network Power, Ponemon Institute

The companies included in the study listed various reasons for outages, from faulty equipment to human error, but 30 percent named weather as the root cause. At that rate, even a 30-minute outage costs roughly $237,000.
You can bet that these losses need to be compensated somewhere, whether by price increases or cuts in staff pay (which usually means hiring less qualified personnel), to name a few corners that could be cut. While you might be aiming to accomplish more with steady or even shrinking IT budgets, such actions on the provider’s side will prove counterproductive to achieving your goals.

These are the average costs of outages for the companies running the data centers, and we haven’t even touched the damage to the businesses suffering from such outages. Your services being unavailable can cost you money directly, by missing your SLAs or losing customers’ orders when your operation comes to an abrupt stop. The long-term impact might be a hit to your reputation, and thus decreased trust in your business’s abilities. Additionally, if you actively run marketing campaigns and invest in trade shows and other public promotional activities, a service outage can blunt their effect on your brand’s popularity. A number of factors influence just how much downtime actually costs your business, strictly tied to how much your entire operation depends on information technology and how much of it is affected by outages; that, however, is outside the scope of this article.

Bottom line
Depending on the dynamics of your business, requirements from compliance, law, budgeting, and even the personal bias of the decision makers will render some of the factors above more or less important, but in the end this decision will have a short- and long-term impact on your business continuity. So if you are involved in the decision-making process, my advice is: do your homework, talk to the providers, visit the facilities if possible, and take into account the points above. If you are expecting growth, or perhaps want to move off legacy systems (if they are part of the technology supporting your current operations) and are considering leveraging Infrastructure-as-a-Service models, talk to the providers on your list and see if they offer such services and can accommodate your needs. Following these steps, you can reduce and even completely avoid the negative impact of a data center location choice on your business’s bottom line!

Source