Nov 30, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
SQL Database  Source

[ AnalyticsWeek BYTES]

>> Getting a 360° View of the Customer – Interview with Mark Myers of IBM by bobehayes

>> The Blueprint for Becoming Data Driven: Data Quality by jelaniharper

>> May 04, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

Wanna write? Click Here

[ NEWS BYTES]

>> Why Google’s Artificial Intelligence Confused a Turtle for a Rifle – Fortune Under Artificial Intelligence

>> Microsoft Workplace Analytics helps managers understand worker … – TechCrunch Under Analytics

>> Storytelling – Two Essentials for Customer Experience Professionals – Customer Think Under Customer Experience

More NEWS ? Click Here

[ FEATURED COURSE]

A Course in Machine Learning

image

Machine learning is the study of algorithms that learn from data and experience. It is applied in a vast variety of application areas, from medicine to advertising, from military to pedestrian. Any area in which you need… more

[ FEATURED READ]

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking

image

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the “data-analytic thinking” necessary for e… more

[ TIPS & TRICKS OF THE WEEK]

Strong business case could save your project
Like anything in corporate culture, a project is often about the business, not the technology, and the same thinking applies to data analysis: success is measured less by technical elegance than by business impact. Data science project success criteria should therefore include project management success criteria as well. This ensures smooth adoption, easy buy-in, room for wins and cooperative stakeholders. So, a good data scientist should also possess some qualities of a good project manager.

[ DATA SCIENCE Q&A]

Q:How to clean data?
A: 1. First: detect anomalies and contradictions
Common issues:
* Tidy data (Hadley Wickham’s paper):
column names are values, not names, e.g. 26-45…
multiple variables are stored in one column, e.g. m1534 (male, 15–34 years old)
variables are stored in both rows and columns, e.g. tmax, tmin in the same column
multiple types of observational units are stored in the same table, e.g. song dataset and rank dataset in the same table
*a single observational unit is stored in multiple tables (can be combined)
* Data-Type constraints: values in a particular column must be of a particular type: integer, numeric, factor, boolean
* Range constraints: number or dates fall within a certain range. They have minimum/maximum permissible values
* Mandatory constraints: certain columns can’t be empty
* Unique constraints: a field must be unique across a dataset, e.g. each person must have a unique SSN
* Set-membership constraints: the values for a column must come from a set of discrete values or codes, e.g. gender must be female or male
* Regular expression patterns: for example, phone number may be required to have the pattern: (999)999-9999
* Misspellings
* Missing values
* Outliers
* Cross-field validation: certain conditions that utilize multiple fields must hold. For instance, in laboratory medicine, the different white blood cell percentages must sum to 100 (they are all percentages). In a hospital database, a patient’s date of discharge can’t be earlier than the admission date
2. Clean the data using:
* Regular expressions: misspellings, regular expression patterns
* KNN-impute and other missing values imputing methods
* Coercing: data-type constraints
* Melting: tidy data issues
* Date/time parsing
* Removing observations
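A minimal pandas sketch of several of the cleaning steps above (the column names, values, and phone formats are hypothetical illustrations, not from any real dataset):

```python
import pandas as pd

# Tidy-data fix (melting): column names "26-45", "46-65" are values, not
# variable names, so reshape them into an age_range column.
wide = pd.DataFrame({"religion": ["A", "B"], "26-45": [10, 20], "46-65": [5, 15]})
tidy = wide.melt(id_vars="religion", var_name="age_range", value_name="count")

# Data-type coercion: force a column to numeric; bad entries become NaN
# and can then be handled as missing values.
nums = pd.to_numeric(pd.Series(["12", "7", "oops"]), errors="coerce")

# Regular-expression pattern check: flag phone numbers that do not match
# the required (999)999-9999 pattern.
phones = pd.Series(["(555)123-4567", "5551234567"])
valid = phones.str.match(r"^\(\d{3}\)\d{3}-\d{4}$")

# Cross-field validation: a discharge date earlier than the admission
# date violates the constraint and should be flagged for review.
visits = pd.DataFrame({"admit": pd.to_datetime(["2017-01-02"]),
                       "discharge": pd.to_datetime(["2017-01-01"])})
bad_rows = visits[visits["discharge"] < visits["admit"]]
```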

Source

[ VIDEO OF THE WEEK]

@AngelaZutavern & @JoshDSullivan @BoozAllen discussed Mathematical Corporation #FutureOfData


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

What we have is a data glut. – Vernor Vinge

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Eloy Sasot, News Corp


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

140,000 to 190,000: the projected shortage of people with deep analytical skills available to fill Big Data jobs in the U.S. by 2018.

Sourced from: Analytics.CLUB #WEB Newsletter

The convoluted world of data scientist

Data scientists are not a dime a dozen, nor are they in abundance. The buzz around big data has produced a job category that is not only confusing but has cost companies dearly as they comb the talent pool for a so-called data scientist. So, what exactly is the problem, and why are we suddenly seeing so many data scientists emerge from nowhere with very different skill sets? To understand this, we need to understand the big data phenomenon.

With the emergence of big data companies like Google, Facebook, Yahoo, etc. and their amazing contributions to open source, new platforms have been developed to process huge amounts of data on commodity hardware in fast yet cost-efficient ways. With that, every company wants to get savvier about managing data to gain insights and ultimately build a competitive edge. But companies are used to understanding small pieces of data through their business analysts. Add more data and more tools, and who fits in? So, they went on the lookout for a special breed of professional with the capability to deal with big data and its hidden insights.

So, where is the problem? It lies in the fact that only one job title emerged from this phenomenon: data scientist. Professionals already practicing some data science via business analysis, data warehousing or data design jumped on the bandwagon, grabbing the title of data scientist. What is interesting is that the data scientist role, as explained above, cannot be captured by a single job description and should be handled accordingly. It was never a magical job title holding all the answers a data-curious organization needs to understand, develop and manage a data project.

Before we go into what companies should do, let’s reiterate what a data scientist is. As the name suggests, it has something to do with data and with science. The job description should therefore include some data engineering, data automation, and scientific computing, with a hint of business capability. If we extrapolate, we are looking for a professional with a computer science degree, a doctorate in statistical computing and an MBA, and by the way, they should have some industry domain expertise as well. What is the likelihood that such a talent exists? Rare. But even if such people were abundant, companies should tackle this problem at a more granular and sustainable scale. One more thing to note: no two data scientist job requirements are the same. Your data scientist requirement could be extremely different from what anyone else is looking for in a data scientist. So why should one title cater to such a diverse category?

So, what should companies do? First, it is important to understand that companies are building data science capabilities, not hiring a herd of data scientists. Companies and hiring managers should understand that they are looking not for a particular individual but for a team as a solution. It is important for businesses to clearly articulate the magic skillsets their so-called data scientist should carry. Following this drill, companies should split the skillset into categories: data analysts, business analysts, data warehousing professionals, software developers, and data engineers, to name a few. Finding a common island where business analysts, statistical modelers and data engineers work in harmony on a system that handles big data is a great start. Think of it as putting together a central data office. Huh! Another buzzword. Don’t worry; I will go into more detail in follow-up blogs. Think of it as a department where business, engineering and statisticians work together on a common objective. Data science is nothing but the art of finding value in lots of data, and big data capability means being able to parse and analyze lots of data. So, businesses should work through their laundry list of skillsets: first identify internal resources that could cover the list, then form a hard matrix structure so that a set of people working together acts as the data scientist. By the way, I am not saying you need one individual from each category, but together the team should have all the skills mentioned above.

One important takeaway for companies: the moment they come across a so-called data scientist, they should understand which side of data science that talent represents. Placing the talent in its respective silo provides a clearer vision of both the talent and the void that remains if the other roles are not filled. Living in this convoluted world of the data scientist is hard and tricky, but with some chops in understanding data science as a talent category, companies can play the big data talent game to their advantage, lure some cutting-edge people and grow sustainably.

Originally Posted at: The convoluted world of data scientist by v1shal

Important Strategies to Enhance Big Data Access

When you encounter apparently unmanageable and insurmountable Big Data sets, it seems practically impossible to get easy access to the correct data. The fact is, Big Data management could prove to be highly tricky and challenging and may come up with a few issues. However, effective data access could still be attained. Here are a few strategies for effectively achieving superlative data connectivity.

Understanding Hadoop

Hadoop is actually an ecosystem that has been designed for helping organizations in storing mammoth quantities of Big Data. It is important for you to have a sound understanding of ways to successfully bring your Big Data into and take it out of Hadoop so that you could effectively move ahead, as companies are involved in handling challenges of integrating Big Data within already existing data-infrastructure.

Integrating Cloud Data with Already Existing Reporting Applications

Integrating cloud data with existing reporting applications such as Salesforce DX has totally transformed the way you perceive and work with your customer data. These systems could, however, face certain complications in acquiring real-time reports. You rely on these reports for sound business decision-making, which generates demand for an effective solution that enables such real-time reporting.

Do Not Let the Sheer Scale of Big Data Get to You

Big Data could be hugely advantageous for businesses, but if your organization is not ready to handle it effectively, you may have to do without the business value Big Data actually has on offer. Not every organization has the scalable, flexible data infrastructure required to exploit Big Data for crucial business insight.

Access Salesforce Data via SQL

Salesforce data provides great value for numerous organizations; however, access issues can be a major obstacle to reaping the fullest possible advantage. Businesses can now get easy access to Salesforce data through ODBC and SQL. These smart drivers allow you to create a connection and start running your queries in just a few minutes.
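As a rough sketch of what that looks like in practice (the DSN name "SalesforceDSN" is a hypothetical placeholder, pyodbc plus a configured Salesforce ODBC driver are assumptions, and exact SQL dialect support depends on the driver; Opportunity and its Id/Name/Amount fields are standard Salesforce objects):

```python
def build_opportunity_query(limit=10):
    # Plain SQL against the Salesforce Opportunity object; which SQL
    # constructs are supported depends on the ODBC driver in use.
    return ("SELECT Id, Name, Amount FROM Opportunity "
            f"ORDER BY CloseDate DESC LIMIT {limit}")

def fetch_opportunities(dsn="SalesforceDSN", limit=10):
    # Assumes a Salesforce ODBC driver is installed and a DSN named
    # "SalesforceDSN" (hypothetical) is configured on the machine.
    import pyodbc
    with pyodbc.connect(f"DSN={dsn}") as conn:
        return conn.cursor().execute(build_opportunity_query(limit)).fetchall()
```

Separating query construction from execution keeps the SQL testable even without a live connection.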

Do Accurate Analysis of Big Data

You could get a greater accuracy of big data depending on the technology utilized. There are several Big Data platforms that could be chosen by a company such as Apache Spark and Hadoop could come up with unique and accurate analysis of Big Data sets. More cutting-edge Big Data technology would be successfully generating more state-of-the-art Big Data models. Many organizations would be opting for a reliable Big Data provider. There are a great variety of options open to them and so businesses today could easily locate a Big Data provider that is suitable for their specific requirements and that comes up with accurate results or precise outcomes.

Conclusion

Business organizations must take extra initiative in assessing and analyzing the data collected by them. They must make sure that the data is collected from an authentic and reliable source. They must identify the context behind data generation. Every move involved in the analysis process requires being observed carefully right from the proper data ingestion to its enrichment and preparation. Data protection from external interference is essential.

Author Bio:

Sujain Thomas is a Salesforce consultant and discusses the benefits of Salesforce DX in her blog posts. She is an avid blogger and has an impressive fan base.

 

Source: Important Strategies to Enhance Big Data Access

Nov 23, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Ethics  Source

[ AnalyticsWeek BYTES]

>> Apr 27, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..) by admin

>> How AI is hacking humanity! Lesson from #Brexit & #Election2016 by v1shal

>> Startup Movement Vs Momentum, a Classic Dilemma by v1shal

Wanna write? Click Here

[ NEWS BYTES]

>> Ditching Engagement in Favor of Blunt-Force Awareness Is a Temptation Marketers Must Avoid – Adweek Under Social Analytics

>> Big Data Set to Get Much Bigger by 2021 – Which-50 (blog) Under Big Data

>> Weak cyber-security protocols can rob companies off clients say experts – Exchange4Media Under cyber security

More NEWS ? Click Here

[ FEATURED COURSE]

Hadoop Starter Kit

image

Hadoop learning made easy and fun. Learn HDFS, MapReduce and introduction to Pig and Hive with FREE cluster access…. more

[ FEATURED READ]

On Intelligence

image

Jeff Hawkins, the man who created the PalmPilot, Treo smart phone, and other handheld devices, has reshaped our relationship to computers. Now he stands ready to revolutionize both neuroscience and computing in one strok… more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from zombie apocalypse from unscalable models
One living, breathing zombie in today’s analytical models is the glaring absence of error bars. Not every model is scalable or holds up as data grows. Error bars should be attached to almost every model and duly calibrated; as business models rake in more data, the error bars keep them sensible and in check. If error bars are not accounted for, our models become susceptible to failures that lead us to a Halloween we never want to see.

[ DATA SCIENCE Q&A]

Q:Is it better to design robust or accurate algorithms?
A: A. The ultimate goal is to design systems with good generalization capacity, that is, systems that correctly identify patterns in data instances not seen before
B. The generalization performance of a learning system strongly depends on the complexity of the model assumed
C. If the model is too simple, the system can only capture the actual data regularities in a rough manner. In this case, the system has poor generalization properties and is said to suffer from underfitting
D. By contrast, when the model is too complex, the system can identify accidental patterns in the training data that need not be present in the test set. These spurious patterns can be the result of random fluctuations or of measurement errors during the data collection process. In this case, the generalization capacity of the learning system is also poor. The learning system is said to be affected by overfitting
E. Spurious patterns, which are only present by accident in the data, tend to have complex forms. This is the idea behind the principle of Occam’s razor for avoiding overfitting: simpler models are preferred if more complex models do not significantly improve the quality of the description for the observations
Quick response: Occam’s Razor. It depends on the learning task. Choose the right balance
F. Ensemble learning can help balance bias and variance (several weak learners together = a strong learner)
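Points C–E can be made concrete with a small numpy sketch: fit polynomials of increasing complexity to noisy samples of a sine curve, then score them against held-out noise-free data (the degrees, sample sizes and noise level are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 20)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 20)  # noisy samples
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)  # true underlying pattern

def generalization_error(degree):
    # Fit a polynomial of the given complexity on the noisy training set,
    # then measure mean squared error against the noise-free curve.
    coeffs = np.polyfit(x_train, y_train, degree)
    return np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

underfit = generalization_error(1)    # too simple: misses the regularities
balanced = generalization_error(4)    # complexity roughly matches the data
overfit = generalization_error(15)    # complex enough to chase the noise
```

The degree-1 fit captures the sine curve only roughly (underfitting), while the degree-15 fit has enough freedom to track random fluctuations in the training noise (overfitting); the moderate model generalizes best, which is Occam’s razor in miniature.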
Source

[ VIDEO OF THE WEEK]

RShiny Tutorial: Turning Big Data into Business Applications


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

The data fabric is the next middleware. – Todd Papaioannou

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @DavidRose, @DittoLabs


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

According to Twitter’s own research in early 2012, it sees roughly 175 million tweets every day, and has more than 465 million accounts.

Sourced from: Analytics.CLUB #WEB Newsletter

Please share your thoughts about Steve Jobs

Everybody has an opinion about Steve Jobs. Please tell me what you think of him and how he has impacted your life in this brief survey.

Steve Jobs, co-founder of Apple, passed away earlier this week at the age of 56. In the process of writing about how he impacted my life in my blog, I created an image of him. To make this image, I collected quotes and articles that were written about him in the day following his passing. The quotes were from such notables like President Obama, Mark Zuckerberg, Guy Kawasaki, and Bill Gates, to name a few. Using these descriptive words of Steve Jobs, I created a word cloud in the form of his soon-to-be iconic image on Apple.com.

In the word cloud, the font size of a word reflects how frequently it was used to describe Steve Jobs; the larger the font, the more frequent the word. This picture essentially represents how these people define him and remember him.
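The frequency-to-size mapping behind such a word cloud can be sketched in a few lines (the quote text here is a made-up placeholder, not the actual tributes):

```python
from collections import Counter

# Placeholder text standing in for the collected quotes and articles.
quotes = "visionary genius visionary creative genius visionary"
counts = Counter(quotes.lower().split())

def font_size(word, min_pt=10, max_pt=48):
    # Scale font size linearly with word frequency: the most frequent
    # word gets max_pt, the least frequent gets min_pt.
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        return max_pt
    return min_pt + (counts[word] - lo) * (max_pt - min_pt) / (hi - lo)
```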

I now want to be more purposeful in creating the same image using words from people who never met him but whose lives may have been impacted by him. Could you please complete my one-minute survey about Steve Jobs? I am also going to conduct sentiment analysis on your comments to understand the sentiment behind them. So… your survey responses help to create art and advance science. In addition to feeling good about yourself, I will notify you when this project is completed (if you provide your email address in the survey).

The more people who complete the survey, the more interesting the image(s) become(s) (e.g., look at age differences in sentiment). Please consider sharing the page using your social media savvy.
Thanks,

Bob E. Hayes, Ph.D.

Source: Please share your thoughts about Steve Jobs by bobehayes

Nov 16, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Insights  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> IT jobs to shift to new tech, data analytics, cloud services by analyticsweekpick

>> The Increasing Influence of Cloud Computing by jelaniharper

>> Factoid to Give Big-Data a Perspective by v1shal

Wanna write? Click Here

[ NEWS BYTES]

>> Microsoft Azure customers now can run workloads on Cray supercomputers – ZDNet Under Data Scientist

>> ‘Cyber security a major challenge for govt organisations’ – Hindu Business Line Under cyber security

>> Master of machines: the rise of artificial intelligence calls for postgrad experts – The Guardian Under Artificial Intelligence

More NEWS ? Click Here

[ FEATURED COURSE]

Applied Data Science: An Introduction

image

As the world’s data grow exponentially, organizations across all sectors, including government and not-for-profit, need to understand, manage and use big, complex data sets—known as big data…. more

[ FEATURED READ]

Rise of the Robots: Technology and the Threat of a Jobless Future

image

What are the jobs of the future? How many will there be? And who will have them? As technology continues to accelerate and machines begin taking care of themselves, fewer people will be necessary. Artificial intelligence… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today a data-driven leader, data scientist or data-driven expert is constantly put to the test, helping the team solve problems with his or her skills and expertise. Believe it or not, part of that decision tree derives from intuition, which adds a bias that can taint the suggestions. Most skilled professionals understand and handle these biases well, but in a few cases we fall into tiny traps and find ourselves caught in biases that impair judgement. So, it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q:You have data on the durations of calls to a call center. Generate a plan for how you would code and analyze these data. Explain a plausible scenario for what the distribution of these durations might look like. How could you test, even graphically, whether your expectations are borne out?
A: 1. Exploratory data analysis
* Histogram of durations
* Histogram of durations per service type, per day of week, per hour of day (durations can be systematically longer from 10am to 1pm, for instance), per employee…
2. Distribution: lognormal?

3. Test graphically with a QQ plot: sample quantiles of log(durations) vs. normal quantiles
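That graphical check can be sketched with numpy and the standard library (the simulated durations and their parameters stand in for real call-center data):

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)
# Simulated call durations in seconds; lognormal is a plausible model
# because durations are positive and right-skewed.
durations = rng.lognormal(mean=5.0, sigma=0.6, size=1000)

# QQ-plot coordinates: sorted sample quantiles of log(durations) vs.
# standard normal quantiles at matching probability points.
sample_q = np.sort(np.log(durations))
probs = (np.arange(1, sample_q.size + 1) - 0.5) / sample_q.size
normal_q = np.array([NormalDist().inv_cdf(p) for p in probs])

# If the points lie close to a straight line, the lognormal assumption
# is plausible; a correlation near 1 is the numeric analogue of "straight".
r = np.corrcoef(normal_q, sample_q)[0, 1]
```

Plotting `normal_q` against `sample_q` gives the QQ plot itself; the correlation is a quick numeric summary of its straightness.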

Source

[ VIDEO OF THE WEEK]

@BrianHaugli @The_Hanover on Building a #Leadership #Security #Mindset #FutureOfData #Podcast


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

In God we trust. All others must bring data. – W. Edwards Deming

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Dr. Nipa Basu, @DnBUS


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Retailers who leverage the full power of big data could increase their operating margins by as much as 60%.

Sourced from: Analytics.CLUB #WEB Newsletter

Evaluating Hospital Quality using Patient Experience, Health Outcomes and Process of Care Measures

Patient experience (PX) has become an important topic for US hospitals. The Centers for Medicare & Medicaid Services (CMS) will be using patient feedback about their care as part of their reimbursement plan for acute care hospitals (see Hospital Value-Based Purchasing Program). Not surprisingly, hospitals are focusing on improving the patient experience to ensure they receive the maximum of their incentive payments. Additionally, US hospitals track other types of metrics (e.g., process of care and mortality rates) as measures of quality of care.

Given that hospitals have a variety of metrics at their disposal, it would be interesting to understand how these different metrics are related with each other. Do hospitals that receive higher PX ratings (e.g., more satisfied patients) also have better scores on other metrics (lower mortality rates, better process of care measures) than hospitals with lower PX ratings? In this week’s post, I will use the following hospital quality metrics:

  1. Patient Experience
  2. Health Outcomes (mortality rates, re-admission rates)
  3. Process of Care

I will briefly cover each of these metrics below.

Table 1. Descriptive Statistics for PX, Health Outcomes and Process of Care Metrics for US Hospitals (acute care hospitals only)

1. Patient Experience

Patient experience (PX) reflects the patients’ perceptions about their recent inpatient experience. PX is collected by a survey known as HCAHPS (Hospital Consumer Assessment of Healthcare Providers and Systems). HCAHPS (pronounced “H-caps“) is a national, standardized survey of hospital patients and was developed by a partnership of public and private organizations and was created to publicly report the patient’s perspective of hospital care.

The survey asks a random sample of recently discharged patients about important aspects of their hospital experience. The data set includes patient survey results for over 3800 US hospitals on ten measures of patients’ perspectives of care (e.g., nurse communication, pain well controlled). I combined two general questions (overall hospital rating and recommend) to create a patient advocacy metric. Thus, a total of 9 PX metrics were used. Across all 9 metrics, hospital scores can range from 0 (bad) to 100 (good). You can see the PX measures for different US hospitals here.

2. Process of Care

Process of care measures show, in percentage form or as a rate, how often a health care provider gives recommended care; that is, the treatment known to give the best results for most patients with a particular condition. The process of care metric is based on medical information from patient records that reflects the rate or percentage across 12 procedures related to surgical care.  Some of these procedures are related to antibiotics being given/stopped at the right times and treatments to prevent blood clots.  These percentages were translated into scores that ranged from 0 (worse) to 100 (best).  Higher scores indicate that the hospital has a higher rate of following best practices in surgical care. Details of how these metrics were calculated appear below the map.

I calculated an overall Process of Care metric by averaging each of the 12 process of care scores. This metric was used because it has good measurement properties (internal consistency was .75) and thus reflects a good overall measure of process of care. You can see the process of care measures for different US hospitals here.

3. Health Outcomes

Measures that tell what happened after patients with certain conditions received hospital care are called “Outcome Measures.” We use two general types of outcome measures: 1) 30-day Mortality Rate and 2) 30-day Readmission Rate. The 30-day risk-standardized mortality and 30-day risk-standardized readmission measures for heart attack, heart failure, and pneumonia are produced from Medicare claims and enrollment data using sophisticated statistical modeling techniques that adjust for patient-level risk factors and account for the clustering of patients within hospitals.

The death rates focus on whether patients died within 30 days of their hospitalization. The readmission rates focus on whether patients were hospitalized again within 30 days.

Three mortality rate and readmission rate measures were included in the healthcare dataset for each hospital. These were:

  1. 30-Day Mortality Rate / Readmission Rate from Heart Attack
  2. 30-Day Mortality Rate / Readmission Rate from Heart Failure
  3. 30-Day Mortality Rate / Readmission Rate from Pneumonia

Mortality/Readmission rate is measured per 1000 patients. So, if a hospital has a heart attack mortality rate of 15, that means that for every 1000 heart attack patients, 15 of them die (and analogously for the readmission rate). You can see the health outcome measures for different US hospitals here.

Table 2. Correlations of PX metrics with Health Outcome and Process of Care Metrics for US Hospitals (acute care hospitals only).

Results

The three types of metrics (PX, Health Outcomes, Process of Care) were housed in separate databases on the data.medicare.gov site. As explained elsewhere in my post on Big Data, I linked these three data sets together by hospital name. Basically, I federated the necessary metrics from their respective databases and combined them into a single data set.
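The linking step can be sketched with pandas (the hospital names and metric values below are toy placeholders, not the actual Medicare data):

```python
import pandas as pd

# Toy stand-ins for the three data.medicare.gov extracts.
px = pd.DataFrame({"hospital": ["A", "B", "C"], "advocacy": [80, 65, 72]})
outcomes = pd.DataFrame({"hospital": ["A", "B", "C"], "readmit_rate": [18, 24, 21]})
care = pd.DataFrame({"hospital": ["A", "B", "C"], "process_score": [90, 70, 82]})

# Federate the metrics into a single data set keyed on hospital name.
merged = px.merge(outcomes, on="hospital").merge(care, on="hospital")

# Correlate PX with the other metrics, in the spirit of Table 2.
corr = merged[["advocacy", "readmit_rate", "process_score"]].corr()
```

In practice, joining on free-text hospital names requires care (normalizing case, whitespace and punctuation) since the same hospital may be spelled differently across databases.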

Descriptive statistics for each variable are located in Table 1. The correlations of each of the PX measures with each of the Health Outcome and Process of Care measures are located in Table 2. As you can see, the correlations of PX with other hospital metrics are very low, suggesting that PX measures assess something quite different from the Health Outcome and Process of Care measures.

Patient Loyalty and Health Outcomes and Process of Care

Patient loyalty/advocacy (as measured by the Patient Advocacy Index) is logically correlated with the other measures (except for Death Rate from Heart Failure). Hospitals that have higher patient loyalty ratings have lower death rates, readmission rates and higher levels of process of care. The degree of relationship, however, is quite small (the percent of variance explained by patient advocacy is only 3%).

Patient Experience and Health Outcomes and Process of Care

Patient experience (PX) shows a complex relationship with health outcome and process of care measures. It appears that hospitals that have higher PX ratings also report higher death rates. However, as expected, hospitals that have higher PX ratings report lower readmission rates. Although statistically significant, all of the correlations of PX metrics with other hospitals metrics are low.

The PX dimension with the highest correlation with readmission rates and process of care measures was “Given Information about my Recovery upon discharge.” Hospitals that received high scores on this dimension also experienced lower readmission rates and higher process of care scores.

Summary

Hospitals track different types of quality metrics that are used to evaluate each hospital’s performance. Three such metrics for US hospitals were examined to understand how well they relate to each other (there are many other metrics on which hospitals can be compared). Results show that patient experience and patient loyalty are only weakly related to other hospital metrics, suggesting that improving the patient experience will have little impact on other hospital measures (health outcomes, process of care).

 

Originally Posted at: Evaluating Hospital Quality using Patient Experience, Health Outcomes and Process of Care Measures by bobehayes

Why Using the ‘Cloud’ Can Undermine Data Protections

By Jack Nicas

While the increasing use of encryption helps smartphone users protect their data, another sometime related technology, cloud computing, can undermine those protections.

The reason: encryption can keep certain smartphone data outside the reach of law enforcement. But once the data is uploaded to companies’ computers connected to the Internet–referred to as “the cloud”–it may be available to authorities with court orders.

“The safest place to keep your data is on a device that you have next to you,” said Marc Rotenberg, head of the Electronic Privacy Information Center. “You take a bit of a risk when you back up your device. Once you do that it’s on another server.”

Encryption and cloud computing “are two competing trends,” Mr. Rotenberg said. “The movement to the cloud has created new privacy risks for users and businesses. Encryption does offer the possibility of restoring those safeguards, but it has to be very strong and it has to be under the control of the user.”

Apple is fighting a government request that it help the Federal Bureau of Investigation unlock the iPhone of Syed Rizwan Farook, the shooter in the December terrorist attack in San Bernardino, Calif.

The FBI believes the phone could contain photos, videos and records of text messages that Mr. Farook generated in the final weeks of his life.

The data produced before then? Apple already provided it to investigators, under a court search warrant. Mr. Farook last backed up his phone to Apple’s cloud service, iCloud, on Oct. 19.

Encryption scrambles data to make it unreadable until accessed with the help of a unique key. The most recent iPhones and Android phones come encrypted by default, with a user’s passcode activating the unique encryption key stored on the device itself. That means a user’s contacts, photos, videos, calendars, notes and, in some cases, text messages are protected from anyone who doesn’t have the phone’s passcode. The list includes hackers, law enforcement and even the companies that make the phones’ software: Apple and Google.

However, Apple and Google software prompt users to back up their devices on the cloud. Doing so puts that data on the companies’ servers, where it is more accessible to law enforcement with court orders.

Apple says it encrypts data stored on its servers, though it holds the encryption key. The exception is so-called iCloud Keychain data that stores users’ passwords and credit-card information; Apple says it can’t access or read that data.

Officials appear to be asking for user data more often. Google said that it received nearly 35,000 government requests for data in 2014 and that it complies with the requests in about 65% of cases. Apple’s data doesn’t allow for a similar comparison since the company reported the number of requests from U.S. authorities in ranges in 2013.

Whether or not they back up their smartphones to the cloud, most users generate an enormous amount of data that is stored outside their devices, and thus more accessible to law enforcement.

“Your phone is an incredibly intricate surveillance device. It knows everyone you talk to, where you are, where you live and where you work,” said Bruce Schneier, chief technology officer at cybersecurity firm Resilient Systems Inc. “If you were required to carry one by law, you would rebel.”

Google, Yahoo Inc. and others store users’ emails on their servers. Telecom companies keep records of calls and some standard text messages. Facebook Inc. and Twitter Inc. store users’ posts, tweets and connections.

Even Snapchat Inc., the messaging service known for photo and video messages that quickly disappear, stores some messages. The company says in its privacy policy that “in many cases” it automatically deletes messages after they are viewed or expire. But it also says that “we may also retain certain information in backup for a limited period or as required by law” and that law enforcement sometimes requires it “to suspend our ordinary server-deletion practices for specific information.”

Snapchat didn’t respond to a request for comment.

Write to Jack Nicas at jack.nicas@wsj.com
(END) Dow Jones Newswires
02-18-16 1938ET
Copyright (c) 2016 Dow Jones & Company, Inc.

Source: Why Using the ‘Cloud’ Can Undermine Data Protections by analyticsweekpick

Nov 09, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Accuracy check  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Surviving the Internet of Things by v1shal

>> Map of US Hospitals and their Health Outcome Metrics by bobehayes

>> Eradicating Silos Forever with Linked Enterprise Data by jelaniharper

Wanna write? Click Here

[ NEWS BYTES]

>>
 The Importance of TSP Snapshot Statistics – FEDweek Under  Statistics

>>
 World’s largest data center to be built in Arctic Circle – CNBC Under  Data Center

>>
 Hybrid cloud and blockchain solutions will be the future for data … – Information Age Under  Hybrid Cloud

More NEWS ? Click Here

[ FEATURED COURSE]

A Course in Machine Learning

image

Machine learning is the study of algorithms that learn from data and experience. It is applied in a vast variety of application areas, from medicine to advertising, from military to pedestrian. Any area in which you need… more

[ FEATURED READ]

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking

image

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the “data-analytic thinking” necessary for e… more

[ TIPS & TRICKS OF THE WEEK]

Finding success in your data science? Find a mentor
Yes, most of us don’t feel the need, but most of us really could use one. Since most data science professionals work in isolation, getting an unbiased perspective is not easy. Often it is also not easy to see how a data science career will progress. A network of mentors addresses these issues: it gives data professionals an outside perspective and an unbiased ally. It’s extremely important for successful data science professionals to build a mentor network and use it throughout their careers.

[ DATA SCIENCE Q&A]

Q:What is statistical power?
A: * Sensitivity of a binary hypothesis test
* Probability that the test correctly rejects the null hypothesis H0 when the alternative H1 is true
* Ability of a test to detect an effect, if the effect actually exists
* Power = P(reject H0 | H1 is true)
* As power increases, the chance of a Type II error (false negative) decreases
* Used in the design of experiments to calculate the minimum sample size required so that one can reasonably detect an effect, e.g. ‘how many times do I need to flip a coin to conclude it is biased?’
* Used to compare tests, for example a parametric and a non-parametric test of the same hypothesis

Source
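The coin-flip question above can be answered directly. Under the usual normal approximation, the minimum sample size to detect a shift from proportion p0 to p1 is n = ((z_{1-α/2}·√(p0·(1-p0)) + z_{power}·√(p1·(1-p1))) / (p1 − p0))². A minimal sketch in Python, using only the standard library (the function name and the 60%-heads example are illustrative, not from the original):

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size(p0, p1, alpha=0.05, power=0.8):
    """Minimum n (normal approximation) to detect a shift in a
    proportion from p0 to p1 with the given significance and power."""
    z = NormalDist().inv_cdf
    z_a = z(1 - alpha / 2)   # critical value for a two-sided test
    z_b = z(power)           # quantile corresponding to the desired power
    num = z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))
    return ceil((num / (p1 - p0)) ** 2)

# How many flips to detect a coin biased to 60% heads,
# at alpha = 0.05 with 80% power?
n = sample_size(0.5, 0.6)
print(n)  # → 194
```

Raising the desired power (or shrinking the effect size p1 − p0) increases the required sample size, which is exactly the trade-off power analysis is used to quantify at design time.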

[ VIDEO OF THE WEEK]

Data-As-A-Service (#DAAS) to enable compliance reporting

 Data-As-A-Service (#DAAS) to enable compliance reporting

Subscribe to  Youtube

[ QUOTE OF THE WEEK]

You can use all the quantitative data you can get, but you still have to distrust it and use your own intelligence and judgment. – Alvin Toffler

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @MPFlowersNYC, @enigma_data

 #BigData @AnalyticsWeek #FutureOfData #Podcast with @MPFlowersNYC, @enigma_data

Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

571 new websites are created every minute of the day.

Sourced from: Analytics.CLUB #WEB Newsletter

Surge in real-time big data and IoT analytics is changing corporate thinking

Big data that can be immediately actionable in business decisions is transforming corporate thinking. One expert cautions that a mindset change is needed to get the most from these analytics.

Gartner reported in September 2014 that 73% of respondents in a third quarter 2014 survey had already invested or planned to invest in big data in the next 24 months. This was an increase from 64% in 2013.

The big data surge has fueled the adoption of Hadoop and other big data batch processing engines, but it is also moving beyond batch and into a real-time big data analytics approach.

Organizations want real-time big data and analytics capability because of an emerging need for big data that can be immediately actionable in business decisions. An example is the use of big data in online advertising, which immediately personalizes ads for viewers when they visit websites based on their customer profiles that big data analytics have captured.

“Customers now expect personalization when they visit websites,” said Jeff Kelly, a big data analytics analyst from Wikibon, a big data research and analytics company. “There are also other real-time big data needs in specific industry verticals that want real-time analytics capabilities.”

The financial services industry is a prime example. “Financial institutions want to cut down on fraud, and they also want to provide excellent service to their customers,” said Kelly. “Several years ago, if a customer tried to use his debit card in another country, he was often denied because of fears of fraud in the system processing the transaction. Now these systems better understand each customer’s habits and the places that he is likely to travel to, so they do a better job at preventing fraud, but also at enabling customers to use their debit cards without these cards being locked down for use when they travel abroad.”

Kelly believes that in the longer term this ability to apply real-time analytics to business problems will grow as the Internet of Things (IoT) becomes a bigger factor in daily life.

“The Internet of Things will enable sensor tracking of consumer-type products in businesses and homes,” he said. “You will be able to collect and analyze data from various pieces of equipment and appliances and optimize performance.”

The process of harnessing IoT data is highly complex, and companies like GE are now investigating the possibilities. If this IoT data can be captured in real time and acted upon, preventive maintenance analytics can be developed to preempt performance problems on equipment and appliances, and it might also be possible for companies to deliver more rigorous sets of service level agreements (SLAs) to their customers.

Kelly is excited at the prospects, but he also cautions that companies have to change the way they view themselves and their data to get the most out of IoT advancement.

“There is a fundamental change of mindset,” he explained, “and it will require different ways of approaching application development and how you look at the business. For example, a company might have to redefine itself from thinking that it only ‘makes trains’ to a company that also ‘services trains with data.'”

The service element, warranties, service contracts, how you interact with the customer, and what you learn from these customer interactions that could be forwarded into predictive selling are all areas that companies might need to rethink and realign in their business as more IoT analytics come online. The end result could be a reformation of customer relationship management (CRM) to a strictly customer-centric model that takes into account every aspect of the customer’s “life cycle” with the company — from initial product purchases, to servicing, to end of product life considerations and a new beginning of the sales cycle.

Originally posted via “Surge in real-time big data and IoT analytics is changing corporate thinking”

Originally Posted at: Surge in real-time big data and IoT analytics is changing corporate thinking by analyticsweekpick