Aug 27, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data security (Source)

[ AnalyticsWeek BYTES]

>> Transforming the Indian Public Delivery System with Big Data Analytics by administrator

>> High Bounce Rate? Here are the Reasons & What You Should Do by administrator

>> Voices in AI – Episode 90: A Conversation with Norman Sadeh by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Artificial Intelligence


This course includes interactive demonstrations which are intended to stimulate interest and to help students gain intuition about how artificial intelligence methods work under a variety of circumstances…. more

[ FEATURED READ]

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython


Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored f… more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from zombie apocalypse from unscalable models
One living and breathing zombie in today’s analytical models is the pulsating absence of error bars. Not every model is scalable or holds its ground as data grows. The error bars tagged to almost every model should be duly calibrated: as a business rakes in more data, the error bars keep the model sensible and in check. If error bars are not accounted for, our models become susceptible to failure, leading to a Halloween we never want to see.
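
As a minimal, hypothetical sketch of the tip above (the evaluation data and the metric are invented for illustration), a bootstrap confidence interval is one simple way to put an error bar on a model metric and to re-check it as the data volume grows:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical evaluation outcomes: 1 = correct prediction, 0 = incorrect.
correct = rng.binomial(n=1, p=0.8, size=500)

def bootstrap_ci(values, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    boot_means = [
        rng.choice(values, size=len(values), replace=True).mean()
        for _ in range(n_boot)
    ]
    low, high = np.percentile(boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return values.mean(), low, high

acc, low, high = bootstrap_ci(correct)
print(f"accuracy = {acc:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Re-running a check like this as the dataset grows shows whether the interval narrows as expected or whether the model’s error bars have stopped behaving.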

[ DATA SCIENCE Q&A]

Q: What is statistical power?
A: * The sensitivity of a binary hypothesis test
* The probability that the test correctly rejects the null hypothesis H0 when the alternative H1 is true
* The ability of a test to detect an effect, if the effect actually exists
* Power = P(reject H0 | H1 is true)
* As power increases, the chance of a Type II error (false negative) decreases
* Used in the design of experiments to calculate the minimum sample size required to reasonably detect an effect, e.g. “how many times do I need to flip a coin to conclude it is biased?”
* Used to compare tests, for example a parametric and a non-parametric test of the same hypothesis

Source
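
To make the coin-flip example concrete, here is a small sketch of my own (using the standard normal-approximation sample-size formula for a one-sample proportion test) that estimates how many flips are needed to detect a coin biased to 60% heads at 80% power and a 5% significance level:

```python
import numpy as np
from scipy.stats import norm

def min_flips(p0=0.5, p1=0.6, alpha=0.05, power=0.8):
    """Sample size for a two-sided one-sample proportion test (normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value under H0
    z_beta = norm.ppf(power)            # quantile matching the desired power
    numerator = z_alpha * np.sqrt(p0 * (1 - p0)) + z_beta * np.sqrt(p1 * (1 - p1))
    return int(np.ceil((numerator / (p1 - p0)) ** 2))

print(min_flips())  # about 194 flips to distinguish p = 0.6 from a fair coin
```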

[ VIDEO OF THE WEEK]

@DrewConway on creating socially responsible data science practice #FutureOfData #Podcast

Subscribe to YouTube

[ QUOTE OF THE WEEK]

Information is the oil of the 21st century, and analytics is the combustion engine. – Peter Sondergaard

[ PODCAST OF THE WEEK]

Future of HR is more Relationship than Data - Scott Kramer @ValpoU #JobsOfFuture #Podcast

Subscribe: iTunes | Google Play

[ FACT OF THE WEEK]

Facebook users send on average 31.25 million messages and view 2.77 million videos every minute.

Sourced from: Analytics.CLUB #WEB Newsletter

Egnyte Combines Data Governance and Collaboration in Streamlined Content Services Platform

Companies are constantly developing content and creating files. Once upon a time, that just meant adding more storage to a local server, but today the world is cloud-based and collaborative. Organizations need to be able to work together on content in real-time and securely store and manage the data in a way that meets industry […]

The post Egnyte Combines Data Governance and Collaboration in Streamlined Content Services Platform appeared first on TechSpective.

Source by administrator

Karen Harris (@kkharris33 @BainAlerts) discussing #FutureOfWork #Labor2030 #JobsOfFuture #Podcast

In this podcast Karen Harris (@kkharris33 @BainAlerts) sat with Vishal (@Vishaltx from @AnalyticsWeek) to discuss the Labor 2030 report. They shared some of the key trends that will play a significant role in shaping the future of work, worker and workplace. She sheds light on what businesses can take away from the trends at play today and how they can adjust their strategy to ensure they stay relevant in the future. This is a great session for anyone looking to understand future trends and how the future of work will take shape amid technology disruption.

You could access Bain’s report @ www.bain.com/publications/artic…nd-inequality.aspx

Karen’s Recommended Read:
The Proud Tower: A Portrait of the World Before the War, 1890-1914 by Barbara W. Tuchman amzn.to/2sTge2k
Man’s Search for Meaning by Viktor E. Frankl amzn.to/2JDOYzu
Pride and Prejudice by Jane Austen and Tony Tanner amzn.to/2MbHkeb

Podcast Link:
iTunes: math.im/itunes
GooglePlay: math.im/gplay

Here is Karen’s Bio:
Karen Harris is the Managing Director of Bain & Company’s Macro Trends Group. She is based out of the firm’s New York office.
Karen frequently works with institutional investors to embed macro strategy into their investment strategy and due diligence.
She is regularly featured in major media outlets including the Wall Street Journal, Financial Times, Forbes, Economic Times of India, Caijing China, CEO Forum Australia, Bloomberg Television and Global Entrepolis Singapore.
She is a member of the Council on Foreign Relations, the National Committee on US-China Relations and the Economics Club of New York. She also serves on the Board of Pencils of Promise, a non-profit that partners with local communities in developing countries to build schools, focusing on early education, high potential females and building young leadership at home and abroad.

Karen has an MBA with distinction from Harvard Business School and a JD from Columbia Law School. She graduated with honors from Stanford University, where she received a BA in Economics and a BA in International Relations.

About #Podcast:
The #JobsOfFuture podcast is a conversation starter that brings leaders, influencers and lead practitioners onto the show to discuss their journey in creating the work, worker and workplace of the future.

Want to sponsor?
Email us @ info@analyticsweek.com

Keywords:
#JobsOfFuture
JobsOfFuture
Jobs of future
Future of work
Leadership
Strategy

Source: Karen Harris (@kkharris33 @BainAlerts) discussing #FutureOfWork #Labor2030 #JobsOfFuture #Podcast by v1shal

Aug 20, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Fake data (Source)

[ AnalyticsWeek BYTES]

>> CAO Now! Why every CEO in America Needs a CAO. by tony

>> June 5, 2017 Health and Biotech analytics news roundup by pstein

>> @JohnNives on ways to demystify AI for enterprise #FutureOfData by admin

Wanna write? Click Here

[ FEATURED COURSE]

Data Mining


Data that has relevance for managerial decisions is accumulating at an incredible rate due to a host of technological advances. Electronic data capture has become inexpensive and ubiquitous as a by-product of innovations… more

[ FEATURED READ]

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking


Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the “data-analytic thinking” necessary for e… more

[ TIPS & TRICKS OF THE WEEK]

Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but not being able to handle that data can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, or The Ultimate Metric. This is the metric that matters the most to your startup. Some advantages of TUM: it answers the most important business question, it cleans up your goals, it inspires innovation and it helps you understand the entire quantified business.

[ DATA SCIENCE Q&A]

Q: How do you know if one algorithm is better than another?
A: * In terms of performance on a given data set?
* In terms of performance on several data sets?
* In terms of efficiency?
In terms of performance on several data sets:

– “Does learning algorithm A have a higher chance of producing a better predictor than learning algorithm B in the given context?”
– “Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets”, A. Lacoste and F. Laviolette
– “Statistical Comparisons of Classifiers over Multiple Data Sets”, Janez Demsar

In terms of performance on a given data set:
– One wants to choose between two learning algorithms
– Need to compare their performances and assess the statistical significance

One approach (Not preferred in the literature):
– Multiple k-fold cross validation: run CV multiple times and take the mean and sd
– You have: algorithm A (mean and sd) and algorithm B (mean and sd)
– Is the difference meaningful? (Paired t-test)

Sign-test (classification context):
Simply count the number of times A has a better metric than B and assume this count comes from a binomial distribution. Then we can obtain a p-value for the H0 hypothesis that A and B are equal in terms of performance.

Wilcoxon signed rank test (classification context):
Like the sign test, but the wins (A is better than B) are weighted and assumed to come from a symmetric distribution around a common median. Then we obtain a p-value for the H0 hypothesis.

Other (without hypothesis testing):
– AUC
– F-Score

Source
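
As a hedged illustration of the tests above (the per-fold accuracies are invented; in practice they would come from the same repeated k-fold CV splits for both algorithms), here is how the paired t-test, sign test and Wilcoxon signed-rank test can be run with SciPy:

```python
import numpy as np
from scipy import stats

# Hypothetical per-fold accuracies for algorithms A and B on identical CV folds.
acc_a = np.array([0.81, 0.79, 0.83, 0.80, 0.82, 0.78, 0.84, 0.80, 0.81, 0.79])
acc_b = np.array([0.78, 0.77, 0.80, 0.79, 0.80, 0.76, 0.81, 0.78, 0.80, 0.77])

# Paired t-test on the fold-wise differences.
t_stat, t_p = stats.ttest_rel(acc_a, acc_b)

# Sign test: count A's wins and test against a fair binomial (SciPy >= 1.7).
wins_a = int(np.sum(acc_a > acc_b))
n_informative = int(np.sum(acc_a != acc_b))  # drop exact ties
sign_p = stats.binomtest(wins_a, n_informative, p=0.5).pvalue

# Wilcoxon signed-rank test on the paired differences.
w_stat, w_p = stats.wilcoxon(acc_a, acc_b)

print(f"paired t-test p = {t_p:.4f}")
print(f"sign test     p = {sign_p:.4f}")
print(f"Wilcoxon      p = {w_p:.4f}")
```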

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Eloy Sasot, News Corp

Subscribe to YouTube

[ QUOTE OF THE WEEK]

Without big data, you are blind and deaf and in the middle of a freeway. – Geoffrey Moore

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData with Jon Gibs(@jonathangibs) @L2_Digital

Subscribe: iTunes | Google Play

[ FACT OF THE WEEK]

Bad data or poor data quality costs US businesses $600 billion annually.

Sourced from: Analytics.CLUB #WEB Newsletter

Leadership Profiles: New CIOs Take the Reins in 12 States


While the first few months of 2020 were notable to be sure, they were perhaps especially daunting for the new permanent state chief information officers who stepped up to their posts amid the turmoil of the novel coronavirus. But the business of government IT pressed on, and while these CIOs were tasked with challenges like quickly transitioning to telework, shoring up cybersecurity and tracking COVID-19 data, the nuts and bolts of gov tech remained: modernizing legacy systems, expanding broadband and coordinating with state leadership on long-term priorities.

Our editorial staff got to know this new class of state CIOs, checking in as they settled into their positions at an unprecedented moment in history.

[slideshow-break]

Tracy Barnes, Indiana

Transparency is a cornerstone of the leadership style Tracy Barnes brings to his position as the head of Indiana’s Office of Technology (IOT). He prefers to engage fully with staff at all levels of the organization, sharing what he knows and relying upon their expertise to make the best possible decisions. But he started the job at the end of March, the same month the state saw its first case of COVID-19. That reality has made the kind of communication he seeks a little harder.

“I want to figure out how I make sure I get that message out there in a comprehensive manner to the full footprint of the agency so everyone understands the value that they’re bringing, especially in this time where tensions are high, pressure is high and expectations are high,” he said. “The ability to walk up and down the aisleways and just say ‘hi’ or ‘good job’ or ‘attaboy’ are limited and probably won’t be available at any true capacity for a while.”

Barnes brings a multi-faceted background to the CIO job, having worked extensively in the ERP area in private industry and higher education both in the United States and abroad. He made the move to government a few years ago as IT director for the state auditor’s office, then as chief of staff for the lieutenant governor. As state CIO, Barnes acknowledges the strong foundation built through the expertise of his predecessors and looks forward to bringing his enterprise-level skills to bear in order to move all agencies forward using technology.

A proponent of as-a-service technologies, Barnes envisions that establishing a sustainable, supportable multi-cloud offering will be a critical part of meeting the state’s needs. He wants to solidify IOT’s role in guiding agencies toward secure solutions that fit within the broader operational IT framework, especially in areas like data integration. Noting that technology investments tend to outlive the agency leadership that was in place at the time of the purchase, he takes a longer-term view of IOT’s responsibility.

“We need to make sure that folks still at the state can continue supporting them and managing them and maintaining them for potential turnover and for succession planning down the road,” he said.

— Noelle Knell

[slideshow-break]

Annette Dunn, Iowa

Annette Dunn is no stranger to the inner workings of government. She was named by Gov. Kim Reynolds to the CIO role in July of 2019, following a four-year stint as IT division director and CIO of the state Department of Transportation. And her DOT post came on the heels of nearly a decade in other roles with the state — notably among those as a key player in a statewide project to equip snowplows with GPS and advanced vehicle location technology that has since been used in a number of other states.

Taking on the state CIO job comes with similar challenges as roles she’s previously held, Dunn said, just on a bigger scale. And rising costs and flat or declining budgets place even more pressure on IT resources.

“We must provide the innovation and access to Iowans that they expect and need,” she said. “This puts a larger burden on the use of data and technology to help us make more strategic decisions and think outside of the box to be able to deliver services in more convenient and customer-friendly ways.”

Getting a handle on the state’s data is a major priority for Dunn. She’s eyeing a robust data warehouse that can be relied upon as a resource to users across the state in order to inform the best possible business decisions. And there’s work to be done to get there: getting a clear picture of the data held by various state systems; deduplication and standardization; and establishing access controls.  

“The creation of a strong, reliable data warehouse that is easily utilized will make us a stronger state and help us make better business decisions now and well into the future,” she added.

Her approach to leading the broader IT organization is to look both outward and inward. She sees vendor partners as playing a critical role in helping agencies meet their technology needs, as they can move more quickly and efficiently, often at a lower cost. But pivoting away from internal development is a big cultural shift, which explains why internal communication is another huge area of focus.

“From a leadership standpoint, it comes down to changing the culture and helping people see the big picture,” she explained. “I spend a lot of time convincing employees that change is just a different way of doing things, and at the end of the day there will always be work for them to do that is important and necessary.”

— Noelle Knell

[slideshow-break]


Jeff Wann, Missouri

Jeff Wann was on the job two weeks when the coronavirus upended life in Missouri, and across the country, all but shutting down the state.

“And everything changed,” Wann remarked on the last day of April, as the state counted more than 7,500 confirmed cases of COVID-19, and 329 fatalities related to the disease. Missouri Gov. Mike Parson declared a state of emergency on March 13.

Earlier this year, Wann was named Missouri’s new CIO, bringing a long career spanning the public, private and nonprofit sectors. He was tapped, in part, for his leadership in the job of modernizing Missouri’s Office of Administrative IT Services.

An overarching goal was “to help modernize the IT systems, and to help transform processes and procedures, and to help further mature the IT organization,” said Wann.

“The COVID-19 situation has helped us to accelerate that,” he said. “It’s a silver lining in a very dark cloud, because obviously, COVID-19 is a terrible thing for folks. But on the other hand, it’s been a catalyst to help us to be able to help our citizens. And frankly, help other states.”

The crisis required quick action in a number of areas. Tools like chatbots, which can take months to develop, were being turned out in only weeks and days. The state teamed up with the Missouri Hospital Association to launch a new tool, developed by Google, to form a marketplace that matches state suppliers of personal protective equipment with health-care workers. Telephone, GIS and other systems were upgraded and improved to meet the new challenges the crisis called for.

When Missouri does return to more normal operations, Wann plans to return to his punch list for modernizing IT.

“Now, it’s going to be tough,” he added. “Because projected revenue in fiscal year ’21 is not looking good for any state.”

“But I expect to keep going,” Wann continued. “I expect that now that we are training our people in these new technologies, we can continue on doing those things with the budgets as they are.”

— Skip Descant

[slideshow-break]

Brom Stibitz, Michigan

When he started as Michigan CIO on March 4, Brom Stibitz was prepared. A lifelong resident of the state, aside from a few years overseas after college, Stibitz had been chief deputy director of the Michigan Department of Technology, Management and Budget for five years. Before that he had been director of executive operations for the state Department of Treasury, a senior policy adviser, and a legislative director at the state House of Representatives. He was ready to hit the ground running.

Then the pandemic hit.

Like most state CIOs, Stibitz had to set aside what he thought he’d be doing this spring and instead manage organization-wide emergency measures, including telework on a scale that Michigan had never attempted before. He spent much of those first weeks overseeing preparations for nearly 28,000 state employees: doubling VPN firewall capacity, finding laptops and organizing staff training on various tools for working from home.

He defines his broader priorities for Michigan as efficient and effective government, IT accountability, customer experience, and (of course) cybersecurity. He said those weren’t explicit directives from Gov. Gretchen Whitmer, but they appeared to be shared goals among state departments.

“There’s been a lot of focus on, how do we make sure that services of the state are accessible to people, not just that it’s there and people can use it, but how do you make it so people can understand it and it’s truly accessible?” he said. “The other area of focus has been efficiency … How do we make sure that we’re consolidating around solutions instead of expanding the state’s footprint?”

Asked what recent IT projects he’s most glad to have done, Stibitz mentioned a couple that weren’t flashy, but critical: developing a single sign-on solution, now used by more than 200 applications, to simplify security; and transitioning state employees to Microsoft Office 365, which reduced their reliance on network storage, revealed several practical and budgetary efficiencies, and unwittingly prepared the state to work from home.

In April, Stibitz was fairly sanguine about the results of the state’s telework operation, but under no illusions about the economic challenges to come.

“We’re looking at precipitous declines in revenue over the next six, 12, 18 months,” he said. “So there’s going to be more pressure than ever on IT to (a) find efficiencies within what we’re doing, and (b) to find solutions that can help agencies or customers save money.”

— Andrew Westrope

[slideshow-break]

John Salazar, New Mexico

When John Salazar became New Mexico’s IT secretary on March 2, he knew he’d have to address issues such as dated infrastructure and broadband access.

What he didn’t know was that his new role would soon revolve around responding to a pandemic that would infect more than 1 million Americans within two months. The workload has been gigantic.

“We’re working weekends. We’re working nights. It’s been a challenge for us,” Salazar said.

The first big hurdle was ensuring that roughly 20,000 government employees could work from home. Salazar thinks the mission was accomplished, but not without hiccups. The state filed an emergency order with a vendor for 1,000 laptops, but the machines came a month late, so Salazar’s team had to get creative in an IT structure where agencies manage their own networks and workstations.

“The first couple of weeks were chaos,” Salazar recalled. “We were all working in different directions.”

He spent a lot of time in April collaborating with the New Mexico Department of Health, which has two legacy systems that receive COVID-19 testing data from the Centers for Disease Control and Prevention. The goal was to create a dashboard with relevant information for Gov. Michelle Lujan Grisham, which required Salazar’s team to, among other steps, stand up a platform in the cloud and develop data interfaces between the legacy systems.

Having previously worked as a CIO in two different state agencies — Taxation and Revenue, and the Department of Workforce Solutions — Salazar was well prepared for his position as head of New Mexico IT. But no one could foresee the long-term impact of COVID-19. Salazar attempted to compare the situation to Y2K, but he pointed out that at least with Y2K, there was a “long climbing process” during which people knew what was potentially coming.

Now leaders like Salazar must react in ways that might forever change how states utilize technology. He sees plenty of opportunities to improve New Mexico’s systems by incorporating more cloud solutions, automating more processes and putting in place more procedures for better cybersecurity.

As for government meetings, New Mexico is holding a tremendous amount of virtual sessions — a new trend that perhaps should be a norm.

“All of our employees are doing this on a regular basis, and that’s something that probably needs to continue in the future,” Salazar said.

— Jed Pressgrove

[slideshow-break]

Tracy Doaks, North Carolina

Tracy Doaks is a self-proclaimed technologist at heart. “I love talking about it, translating that in business terms so that I can talk about it with different audiences, understanding the finance side of it,” she said.

That was essential in her last four years as deputy CIO in North Carolina, where she primarily focused on back-of-house operations like data centers, the state network, and cloud and identity management. Since she took the head post as CIO in March, Doaks has had to pivot to more outward-facing work, coordinating with the governor’s office and doing more public speaking. “Now my focus has expanded to all facets of the Department of Information Technology [DIT],” she explained, “so that includes cybersecurity, data analytics, rural broadband, 911, digital transformation.”

Doaks has spent 20 years in and out of government, in the North Carolina Department of Revenue, where she was CIO, as well as time in health care and with Accenture. That experience meant that when she became state CIO, just as COVID-19 was gaining ground in the U.S., she had a firm grasp on how government IT works and the role it would play in adapting to the pandemic. DIT quickly stood up a crisis response team that met twice daily, “so that we could understand obstacles and challenges around the state and knock them down quickly.”

Some of those challenges included putting together a coronavirus website in just a few days to help pull heavy Internet traffic away from Health and Human Services, as well as supporting similar issues at the Division of Employment Security.

And the pandemic of course heightened the need to expand broadband connectivity, particularly throughout the state’s rural areas, which Doaks had to quickly get up to speed on. “As the schoolchildren and college kids were sent home and employees were sent home,” she said, “that made it even more critical and a top priority for us.”

— Lauren Harrison

Editor’s note: After the July/August issue of Government Technology went to press, Doaks stepped down as CIO to helm a tech-related nonprofit in North Carolina.

[slideshow-break]

Jerry Moore, Oklahoma

Jerry Moore took the reins as Oklahoma’s new CIO in February, at a time when the state’s IT organization and direction was shifting.

Gov. Kevin Stitt, who appointed Moore, has made it known he is pursuing a new direction for Oklahoma IT, prioritizing digital transformation and modernization as two of the main efforts of his administration.

Part of this has involved a reorganization of the state’s Office of Management and Enterprise Services: OMES is in the process of conducting an audit meant to identify unnecessary expenditures, which was ongoing when Moore came on the scene.

Having spent a decade as the CIO for the Tulsa Technology Center — the educational IT hub affiliated with the Oklahoma Department of Career and Technology Education — Moore is no outsider to government work. Before becoming CIO, he also worked as the state’s director of IT application services.

At the same time, it is his private-sector experience that has likely given Moore the skill set that is most appealing in light of the governor’s effort: Having held IT leadership roles for large companies like ConocoPhillips and SiteTraxx, Moore has consistently shown an ability to take on IT restructuring projects that hew to long-term strategic goals.

Stitt has said this is what he hopes Moore will bring to the job: an ability to deliver high-performance, cost-effective solutions as the state navigates its modernization efforts.

“Jerry’s more than 20 years of experience in technology leadership in the public and private sectors will serve Oklahomans well as we continue our efforts in becoming a top ten state,” said OMES Director Stephen Harpe in a statement. “He has a proven record in identifying and executing new technologies to solve business problems.”

— Lucas Ropek

[slideshow-break]

Jeffrey Clines, South Dakota

A drive to help people, rather than increase margins and decrease bottom lines, is what brought Jeffrey Clines to public-sector work. He began his career in the private sector, then spent more than a decade in operations and enterprise applications for the American Heart Association. But even that nonprofit work didn’t fully get at Clines’ desire to impact real people’s lives. In 2018 he moved to government, as CIO for the Illinois Secretary of State’s office, before becoming head of the South Dakota Bureau of Information and Telecommunications this past April.

“I believe that in technology, you can’t stay in one place,” Clines wrote in an email to Government Technology. “We push forward, finding ways to leverage technology — especially emerging tech — to improve service, systems and processes.” He’s committed to working with state agencies to target where they want to go and see how tech will help get them there.

“There is no stable ground anymore,” he said. “The days of being able to stand up a system and have it work for 20-plus years are no longer here.”

That forward-looking approach to state IT has so far served him well during his tenure, which of course began amid the COVID-19 pandemic as South Dakota moved to nearly all remote work. When Clines thinks about when the U.S. will “return to normal,” he hopes it doesn’t. “If we push to go back to where we were, we may lose some of the valuable lessons we have learned in this process — things like the ability of our teams to work independently and remotely, or how we have really pushed to look at ways to be creative in delivering services while maintaining social distancing.”

One area where Clines sees this having the most impact is on rural communities as people are given the option to work away from city centers and be just as productive. South Dakota is home to many scenic, far-flung areas that Clines hopes to reinvigorate via telework, which he notes will need investment in broadband and other critical infrastructure to thrive.

— Lauren Harrison

[slideshow-break]

Bill Smith, Alaska

Bill Smith took up the role of Alaska’s state CIO in late 2019, a turbulent time for the state IT department. He became the fifth person to hold the job since 2018, after a series of interim leaders.

Why the turnover? The state had launched an effort to centralize its technology offerings, and the process “didn’t go as well as everybody wanted,” Smith said. The job nonetheless appealed to him because, as he saw it, the fundamentals had begun to fall into place.

“The state leadership is very supportive, from the governor to all the cabinet-level department leaders. We’ve also brought in external resources, where before the centralization effort was being done entirely in house,” he said. “I believe the conditions exist now for us to have long-term success with this effort.”

Those external resources included a third-party assessment of the overall IT ecosystem, which Smith is now leveraging as the basis for prioritization. He also is busy updating all the technology-position descriptions to align with a modernized IT environment, and he’s taking a fresh look at the service catalog.

“We want to build out the organizational chart more fully, to nail down what the services are, how we will provide those and with what resources,” he said. This will include an overhaul of IT governance, with an eye toward creating a more collaborative relationship between the Office of Information Technology and state agencies.

“I’d also like to be able to be the broker,” he said. “If the departments have an IT need, my team can figure out what services are available, so that the departments can focus on meeting their own business needs.”

— Adam Stone

[slideshow-break]

J.R. Sloan, Arizona

J.R. Sloan joined Arizona state government in 2013 as manager of the digital government program. He moved to the deputy CIO position, and then became interim CIO in July 2019. When he officially assumed the CIO role in March 2020, he set cybersecurity as a top goal.

“In Arizona we have a federated model, where agencies have a large degree of autonomy and independence,” he said. To counter the cyber-risk inherent in that model, Sloan has been implementing an enterprise approach to security, standardizing agencies on a set of 16 different security controls.

While legislative funding has helped drive the change, “we really needed to get the agencies involved in the process,” he said. “We set up a committee with security officers from each of the agencies, all bringing front-line knowledge that we can use to guide the process.”

Looking forward, he said the IT department will try to expand that cooperative mentality across a broader range of IT needs. “We can bring enterprise services to the agencies and save them having to execute on those tasks themselves, so that they can redirect those resources to actually work on the mission of the agency.”

To that end, the state already was engaged in a move to Google’s G Suite for mail and calendar, with 40,000 employees across 80 different agencies onboarded in two years, Sloan said. Now he is looking to expand that implementation to include things like document editors and chat features.

“All this stuff is included in the suite that we’ve already paid for,” he said. Beyond the financial pitch, he’s been wooing agencies on the basis of functionality. “Within the suite, everything is connected and it all works well together. When you can have the same document up with someone else and you can collaborate in real time — that’s when we really catch people’s attention.”

— Adam Stone

[slideshow-break]

DeAngela Burns-Wallace, Kansas

DeAngela Burns-Wallace was already heading up the cabinet-level Department of Administration in Kansas when the governor tapped her to take on an added role as leader of the independent Office of Information Technology Services (OITS).

In her new position, which she’s held since August 2019, Burns-Wallace has staked out a number of key goals. She’s looking to modernize legacy systems and putting a heavy emphasis on security. “Our security posture is solid, but I want us to not just be ‘in the moment,’” she said. “I want us to be in a more strategic stance, strengthening our overall security posture not just as individual agencies but in a coordinated way across state government.”

Data governance also is high on her agenda. “We have some data sharing that has come together out of necessity, but now we need to take a step back and put a real strategy and structure around that,” she said. “We need to put in place sustainable guidelines and policies that aren’t susceptible to changing leadership or changing political winds.”

Finally, Burns-Wallace said she is looking to elevate the perception of IT as a dependable partner across all levels of state government. “We have to have reliable, consistent high-quality IT services across all of our agencies. But over the years, that consistency and level of quality have been uneven,” she said. “Non-cabinet agencies for instance have not always gotten the same level of service, and yet their work is incredibly important. They have a significant impact for our state.”

Going forward, Burns-Wallace said OITS needs to establish a more level playing field, in order to change perceptions at the agency level. “There needs to be a reliably high level of service across all those entities,” she said. “That’s how we rebuild trust and confidence in what IT is delivering.”

— Adam Stone

[slideshow-break]

Ruth Day, Kentucky

Ruth Day took up the helm at Kentucky’s Commonwealth Office of Technology (COT) in December 2019 amid a flurry of bad news.

Two months earlier, a state auditor faulted COT’s inventory practice, saying the agency couldn’t account for some $715,000 worth of equipment. Then in November, a contract worker with access to COT’s storage rooms was indicted for allegedly stealing more than $1 million in laptops from the agency.

Despite the potentially fragile environment, Day rose to the challenge of helping state agencies respond to the COVID-19 outbreak just a few months after taking on her new role. In mid-March she issued a memo to guide officials in their use of online meeting platforms. When vulnerabilities appeared in the popular Zoom platform, she quickly followed up with further guidance.

Meanwhile, Day continues to lead COT in its efforts to address a number of key issues.  Supported by COT, state offices have begun to connect to KentuckyWired, a state-run project constructing high-speed fiber-optic infrastructure to every Kentucky county. Looking ahead, COT will be supporting the National Guard in mounting Cyber Protection Teams to secure the upcoming primary and general elections.

Prior to her appointment, Day served as the vice president for administrative services at Landstar System Inc., a transportation services company specializing in logistics. In a press conference at the time of her appointment, Day expressed enthusiasm for the work ahead. “I’m honored to join the [Gov. Andy] Beshear-[Lt. Gov. Jacqueline] Coleman administration and I think you can tell that the [governor] has laid out a very clear and concise mission for me …,” she said. “I’m very excited and ready to go to work for Kentucky.” 

— Adam Stone 

Originally Posted at: Leadership Profiles: New CIOs Take the Reins in 12 States by analyticsweekpick

The Ultimate Guide to Data Anonymization in Analytics

In the face of GDPR, many companies are looking for ways to process and utilize personal data without violating the new rules.

This is all quite difficult, as GDPR significantly limits the ways in which personal data can be collected and processed. One of the biggest challenges is the high bar the regulation sets for acquiring a visitor’s consent.

The two main obstacles are:

1) Under GDPR, consent has to be freely given, specific, informed and an unambiguous indication of the data subject’s agreement to the processing of personal data relating to him or her in order to serve as a valid basis for processing user data.

If you want to dig deeper into the details of GDPR consent, we advise you to read this blog post:
How Consent Manager Can Help You Obtain GDPR-Compliant Consents From Your Users

2) GDPR has no grandfather provision allowing for the continued use of data collected using non-compliant methods prior to the date of GDPR’s entry into force. In practice, this means that all data collected before GDPR should be removed from databases if it doesn’t meet all the requirements (and most probably it doesn’t).

What’s more, the definition of personal data has broadened drastically, and now includes cookies and many other online identifiers used in web analytics. You can read more about it here:
What Is PII, non-PII, and Personal Data?

Every company wanting to process analytics data has to adjust their approach to meet the demands of the new law. We tackle this topic on our blog here:
How Will GDPR Affect Your Web Analytics Tracking?

Another option is seeking other legal bases allowing us to process data and use historical analytics databases without going into a gray area.

One of the most favorable methods seems to be data anonymization. It may prove a good strategy for retaining the benefits while mitigating the risks involved in dealing with user data.

The key benefits of data anonymization

Companies that use this technique can benefit from one very important fact – anonymous data is not personal data for the purposes of GDPR.

According to Recital 26 of GDPR: The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.

Under the provision cited above, anonymous data doesn’t require any additional safeguards to ensure its security. Among other things, this means that:

  • you don’t need to get consent to process it
  • you can use it for other purposes than the ones it was originally collected for (you can even sell it!)
  • it can be stored for an indefinite period of time
  • it can be exported internationally

In other words, you can use it freely for virtually every purpose you want to.

PII vs Personal Data (Cheat Sheet INCLUDED!)

Learn how to recognize PII and Personal Data to stay away from privacy issues.

Download FREE Guide

What’s more, data anonymization is a great way to prove that you’re making all possible efforts to ensure the security of your users’ data. According to data privacy experts, this technique can be treated as:

  • part of a privacy by design strategy
  • part of a risk minimization strategy
  • a way to prevent personal data security breaches
  • part of a data minimization strategy

These advantages, however, come at a price: anonymization is a very complicated and demanding process. It requires a lot of preparation and the use of specialized techniques. The benefits you receive are more like a reward for your hard work than low-hanging fruit.

What exactly is data anonymization?

Data anonymization is the use of one or more techniques designed to make it impossible – or at least more difficult – to identify a particular individual from stored data related to them.

According to London’s Global University, Anonymisation is the process of removing personal identifiers, both direct and indirect, that may lead to an individual being identified.
An individual may be directly identified from their name, address, postcode, telephone number, photograph or image, or some other unique personal characteristic.
An individual may be indirectly identifiable when certain information is linked together with other sources of information, including, their place of work, job title, salary, their postcode or even the fact that they have a particular diagnosis or condition.

Which kinds of data should be anonymized?

In the case of anonymization performed to align with the demands of GDPR, that would mean anonymizing every piece of information that can be classified as personal data.

Since, as we’ve already mentioned, the definition of personal data in GDPR is very broad, that will include such information as:

  • login details
  • device IDs
  • IP addresses
  • cookies
  • browser type
  • device type
  • plug-in details
  • language preference
  • time zones
  • screen size, screen color depth, system fonts
  • … and much more

That’s quite a long list, isn’t it?
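
As a hypothetical sketch only (the record fields and helper names are invented), this is what a first masking pass over two of the identifiers listed above might look like. Note that truncating IPs and salt-hashing device IDs is, strictly speaking, pseudonymization; whether the result counts as truly anonymous under GDPR also depends on discarding the salt and on what other data remains:

```python
import hashlib
import secrets

# Hypothetical raw analytics hit containing identifiers from the list above.
record = {
    "ip": "203.0.113.42",
    "device_id": "A1B2-C3D4-E5F6",
    "page": "/pricing",
}

SALT = secrets.token_hex(16)  # per-deployment secret; discarding it strengthens the masking

def truncate_ip(ip: str) -> str:
    """Zero out the last octet of an IPv4 address (a common analytics practice)."""
    octets = ip.split(".")
    octets[-1] = "0"
    return ".".join(octets)

def hash_identifier(value: str) -> str:
    """Replace an identifier with a salted SHA-256 digest."""
    return hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()

masked = {
    "ip": truncate_ip(record["ip"]),
    "device_id": hash_identifier(record["device_id"]),
    "page": record["page"],  # non-identifying content can stay as-is
}
print(masked)
```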

The most popular anonymization techniques

What’s particularly important in the case of anonymization is that, according to the Article 29 Working Party’s Opinion 05/2014 on Anonymisation Techniques, it shouldn’t be treated as a single unified approach to data protection. It’s rather a set of different techniques and methods used to permanently mask the original content of the dataset.

There’s also a very limited list of techniques that could be considered as providing a sufficient level of security. Among the approved anonymization techniques, the Article 29 Working Party lists two types of procedures: randomization and generalization.

Here you can find a short description of techniques encompassed by their scope.

Randomization:

Noise Addition: where personal identifiers are expressed imprecisely (for instance, height is expressed inaccurately).

Substitution/Permutation: where personal identifiers are shuffled within a table or replaced with random values (for instance: a zip code is replaced with a word).

Differential Privacy: where personal identifiers of one data set are compared against an anonymized data set held by a third party, with instructions to employ a noise function and with an acceptable amount of data leakage defined.

Generalization:

Aggregation/K-Anonymity: where personal identifiers are generalized into a range or group (for instance: an age of 30 is generalized to 20-35).

L-Diversity: where personal identifiers are first generalized, then each attribute within an equivalence class is made to occur at least l times (for instance, properties are assigned to personal identifiers, and each property is made to occur with a dataset, or partition, a minimum number of times).
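
A minimal pandas sketch of two of the techniques above, noise addition (randomization) and generalization into ranges; the dataset and column names are made up, and a real anonymization pipeline would also need to verify the resulting k against the risks discussed in the next section:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Hypothetical quasi-identifiers from an analytics export.
df = pd.DataFrame({
    "age": [23, 31, 35, 44, 52, 29],
    "zip": ["90210", "90211", "10001", "10002", "60601", "60602"],
    "height_cm": [171.0, 180.0, 165.0, 177.0, 169.0, 174.0],
})

# Randomization / noise addition: express height imprecisely.
df["height_cm"] = (df["height_cm"] + rng.normal(0, 3, size=len(df))).round()

# Generalization / aggregation: replace exact ages with ranges.
df["age"] = pd.cut(df["age"], bins=[0, 20, 35, 50, 120],
                   labels=["<=20", "21-35", "36-50", "50+"])

# Generalization: keep only the first three digits of the ZIP code.
df["zip"] = df["zip"].str[:3] + "xx"

# Size of the smallest equivalence class = the k in k-anonymity for these columns.
k = df.groupby(["age", "zip"], observed=True).size().min()
print(df)
print("smallest equivalence class size k =", k)
```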

The most common threats in anonymization

However, each of the techniques described above has its own pitfalls, especially when tested against the three most common risks involved in anonymizing data. Those risks are:

  • Singling out: the possibility to isolate some or all records which identify an individual in the dataset

  • Linkability: the ability to link at least two records concerning the same data subject or a group of data subjects (either in the same database or in two different databases)

  • Inference: the possibility to deduce, with significant probability, the value of an attribute from the values of a set of other attributes

As you can see in the table below, every technique has its own set of strengths and weaknesses:

Technique                    Singling out still a risk?   Linkability still a risk?   Inference still a risk?
Noise Addition               Yes                          May not                     May not
Substitution                 Yes                          Yes                         May not
Aggregation or K-anonymity   No                           Yes                         Yes
L-diversity                  No                           Yes                         May not
Differential privacy         May not                      May not                     May not

Source: Article 29 Working Party, Opinion 05/2014 on Anonymisation Techniques

For these reasons, it’s highly advisable to use not one but a combination of several anonymization techniques in concert to prevent your data set from being re-identified. However, even that approach doesn’t necessarily translate into total data security.

Because there are now so many different public datasets available to cross-reference, any set of records with a decent amount of information on someone’s actions has a good chance of matching identifiable public records.

Latanya Sweeney demonstrated in 2000 that 87% of the American population can be uniquely identified by a combination of just their ZIP code, gender, and date of birth!

That’s why, even when applying anonymization processes, it’s important to limit the amount of anonymized data disclosed to the public and to stick to the data minimization approach. In this way you minimize the risk of this data set being matched with any kind of public records.

PII vs Personal Data (Cheat Sheet INCLUDED!)

Learn how to recognize PII and Personal Data to stay away from privacy issues.

Download FREE Guide

We’re aware that anonymization techniques and the threats involved in applying them to your data are a much broader topic, impossible to tackle in a single blog post. That’s why we’ve put together a list of valuable guides shedding some more light on the technical aspects of data anonymization:

We hope they’ll prove useful!

Disadvantages of data anonymization

Although data anonymization has some very strong advantages, don’t forget about its drawbacks.

It’s important to remember that if you want to anonymize new data collected from your website, then you’ll either need to obtain consent to collect personal data (like cookies, IP addresses, and device IDs) and then apply anonymization techniques, or only collect anonymous data from the start. In the latter case, this data would be limited to pageviews, as most other analytics metrics and reports require personal data like unique pageviews, unique visitors, user location, etc.

However safe this approach may sound, it also deprives you of all the valuable insights you can gain with more detailed information about your customers. Stripping every common identifier from your data makes it impossible to cultivate a more personalized approach towards your clients and visitors – for instance, by serving them with tailored messaging and dedicated offers or recommendations.

Statistics prove that personalization is an increasingly successful marketing tactic. What’s more, consumers are keen to share their personal data with companies if the data will be used for their own benefit:

  • 79% of consumers say they are only likely to engage with an offer if it has been personalized to reflect previous interactions the consumer has had with the brand. (Marketo)
  • More than half of consumers (57%) are okay with providing personal information (on a website) as long as it’s for their benefit and is being used in responsible ways. (Janrain)

That’s why it’s worth sacrificing your historical data set in some cases and going the extra mile to provide your users with enhanced levels of security and transparency. This will help them feel relaxed about sharing their personal details with you. Then you can use this data to provide them with the level of personalization and customer experience they desire.

First-party data is one of the biggest assets in every marketer’s arsenal. We’ve written a lot about it in these blog posts:

You can do it by asking your users for consent to process their data and storing all the information received in alignment with the new EU data privacy law – something we’ve written a lot about on our blog in the GDPR section. Be sure to check it out!

Anonymous analytics – final thoughts

Anonymization is definitely one of the greatest ways to ensure the safety of data you collect. This extra measure of security lets you freely exploit your data collection in ways that wouldn’t be legally allowed when it comes to non-anonymized data. However, there are also some considerable benefits of using personal data in its pure (original) form. That’s why you really need to think through the pros and cons of each option before making a final decision.

But no matter what method you choose, remember that storing your data in a safe environment is also of paramount importance.

For instance, Piwik PRO Analytics allows you to store your data at a location of your choice – using your own infrastructure, in a third-party database, or in our own secure private cloud with servers located in the EU and the USA. What’s more, our software enables you to apply additional security measures to your data, like SAML Authentication or Audit Log, and you can take advantage of professional data security advice and support.

If you’d like to learn more, feel free to contact us anytime!


The post The Ultimate Guide to Data Anonymization in Analytics appeared first on Piwik PRO.

Originally Posted at: The Ultimate Guide to Data Anonymization in Analytics

Aug 13, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Data Accuracy (Source)

[ AnalyticsWeek BYTES]

>> Big Data Analytics, Supercomputing Seed Growth in Plant Research by analyticsweekpick

>> Combine Oversampling and Undersampling for Imbalanced Classification by administrator

>> Could Big Data Be the New Gender Equality Tool? by analyticsweekpick

Wanna write? Click Here

[ FEATURED COURSE]

Applied Data Science: An Introduction


As the world’s data grow exponentially, organizations across all sectors, including government and not-for-profit, need to understand, manage and use big, complex data sets—known as big data…. more

[ FEATURED READ]

On Intelligence


Jeff Hawkins, the man who created the PalmPilot, Treo smart phone, and other handheld devices, has reshaped our relationship to computers. Now he stands ready to revolutionize both neuroscience and computing in one strok… more

[ TIPS & TRICKS OF THE WEEK]

Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck to achieving comparable enterprise adoption. One of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members need to step up to create awareness within the organization. An aware organization goes a long way toward quick buy-ins and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q:What is the life cycle of a data science project ?
A: 1. Data acquisition
Acquiring data from both internal and external sources, including social media or web scraping. In a steady state, data extraction routines should be in place, and new sources, once identified, are acquired following the established processes.

2. Data preparation
Also called data wrangling: cleaning the data and shaping it into a form suitable for later analyses. Involves exploratory data analysis and feature extraction.

3. Hypothesis & modelling
Like in data mining, but with all the data instead of samples. Applying machine learning techniques to all the data. A key sub-step is model selection: preparing a training set for the model candidates, and validation and test sets for comparing model performances, selecting the best-performing model, gauging model accuracy and preventing overfitting.

4. Evaluation & interpretation

Steps 2 to 4 are repeated a number of times as needed; as the understanding of the data and the business becomes clearer and results from initial models and hypotheses are evaluated, further tweaks are performed. These may sometimes include step 5 and be performed in a pre-production environment.

5. Deployment

6. Operations
Regular maintenance and operations. Includes performance tests to measure model performance, with alerts when performance falls below an acceptable threshold.

7. Optimization
Can be triggered by failing performance, or by the need to add new data sources, retrain the model, or even deploy new versions of an improved model.

Note: with increasing maturity and well-defined project goals, pre-defined performance criteria can help evaluate the feasibility of a data science project early in its life cycle. This early comparison helps the team refine hypotheses, discard the project if it is non-viable, or change approaches.


Source

[ VIDEO OF THE WEEK]

@AnalyticsWeek Keynote: The CMO isn't satisfied: Judah Phillips


Subscribe to YouTube

[ QUOTE OF THE WEEK]

The world is one big data problem. – Andrew McAfee

[ PODCAST OF THE WEEK]

#BigData #BigOpportunity in Big #HR by @MarcRind #JobsOfFuture #Podcast


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

In 2015, a staggering 1 trillion photos will be taken and billions of them will be shared online. By 2017, nearly 80% of photos will be taken on smart phones.

Sourced from: Analytics.CLUB #WEB Newsletter

Is a Three-Point Scale Good Enough?

Five-point scales are the best.

No, seven points.

Never use a ten-point scale.

Eleven points “pretend noise is science.”

You never need more than three points.

Few things seem to elicit more opinions (and misinformation) in measurement than the “right” number of scale points to use in a rating scale response option.

For example, here is a discussion on Twitter by Erika Hall making the case that eleven-point scales are difficult for humans to respond to and three points would be better.

This sentiment is echoed in another post, this time about ten-point scales, which apparently “don’t reflect reality.”

And this one from Tomer, who also advocates for the simplicity of two- and three-point scales:

Thumbs up, thumbs down is simple. 3-point scales are simple. If you feel strongly about a 4 or 5-point scale, go for it. Not more. If you measure very specific experiences, it is very easy for a user to decide if they are happy or not.

This advice gets repeated by others on the Internet:

Responses collected from a large, 11-point scale are extremely noisy, and meaningful changes in ratings are hard to detect … A better scale would use a 3-option Yes/Maybe/No system or a similar scale with 5 options.

Unfortunately, these articles offer no justification for the position. The references often just point to each other and offer, at best, a sort of folk wisdom, much like "all content should be reachable in three clicks" (because why make users click more?).

While it's good to start with theories of why one response scale would be better than another (for example, the theory that shorter scales are easier and better), it's important to verify or falsify those claims with data. A good place to start is the published literature, which contains a wealth of research.

Number of Scale Point Guidance

The number of scale points is a topic we address in our UX Measurement Bootcamp, and Jim Lewis and I discuss it more extensively in Chapter 9 of Quantifying the User Experience.

As a general rule, when measuring a construct that falls on a continuum from low to high (such as satisfaction, ease, and likelihood to recommend), the more points you have in your rating scale, the more reliable (consistent responses) and valid (reflects true attitudes) it generally is. The bluntness of scales with few points reduces correlations and causes the reduction in reliability and validity. We pull much of this from the work of Jum Nunnally, the noted psychometrician who wrote about the reliability of scales in 1967, 1978 and in 1994.

The increase in reliability and validity from adding more points to a scale doesn't go on forever, and it isn't a linear increase. There is both a diminishing return (just adding more points doesn't continue to help) and nuances in scale formats, such as the order, labels, and neutral points, that can interfere with improvements.

With more items in a questionnaire, the number of scale points matters less (although it still has an effect). For example, the SUMI questionnaire, with its 50 items, uses three-point response options. While responses to each of the items are rather coarse (agree, undecided, disagree), averaging across 50 items increases the fidelity of the overall score. In contrast, the SEQ is a single item with seven points, which we found superior to the original five-point version because it has more points of discrimination. The same is the case with the single Likelihood to Recommend item with 11 points (scaled from 0 to 10).
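
To make this concrete, here is a small simulation sketch (our illustration, not from the cited studies): it discretizes a noisy continuous attitude into 3-point and 7-point item responses and compares how well a single item and a 50-item average track the underlying attitude. The noise level and item count are assumptions chosen purely for illustration.

# Illustrative simulation: discretize a latent attitude into 3-point and
# 7-point item responses and compare how well a single coarse item vs. an
# average of 50 items correlates with the underlying attitude.
import numpy as np

rng = np.random.default_rng(0)
n_respondents, n_items = 1000, 50
latent = rng.normal(size=n_respondents)  # true underlying attitude

def item_responses(latent, n_points, noise_sd=1.0):
    """Noisy observations of the latent attitude, binned into n_points categories."""
    raw = latent[:, None] + rng.normal(scale=noise_sd, size=(len(latent), n_items))
    edges = np.quantile(raw, np.linspace(0, 1, n_points + 1)[1:-1])
    return np.digitize(raw, edges) + 1  # categories 1..n_points

for k in (3, 7):
    resp = item_responses(latent, k)
    single = np.corrcoef(latent, resp[:, 0])[0, 1]            # one coarse item
    averaged = np.corrcoef(latent, resp.mean(axis=1))[0, 1]   # 50-item average
    print(f"{k}-point scale: single item r={single:.2f}, 50-item average r={averaged:.2f}")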

But what do other published studies show about the number of scale points? Has there been any change since the 1970s?

History of Scale Points

For about as long as we’ve had rating scales, we’ve had questions (and concerns) about the “optimal” number of scale steps (see Boyce, 1915).

Over the last 100 years, much has been published on the number of scale points in general, along with more specific research on two- and three-point scales versus scales with more points.

Early Research: Two- and Three-Point Scales Are Inadequate

In one of the earliest studies, Ghiselli (1939) found four-point scales performed better than two-point scales. The study had 200 undergraduate students rate the sincerity of advertising pieces for 41 different brands across 12 commodity types. Half the respondents were given two options (yes or no) while the other half were given a four-point scale: very sincere, fairly sincere, fairly insincere, and very insincere. In both conditions, respondents had the choice of selecting "uncertain." More people responded favorably (top-two box) with the four-point scale than with the two-point scale, and fewer people selected the "uncertain" response when given four points.

Thus, when a four-step response was permitted, more people were willing to respond than when only a two-step response was permitted. They concluded

to cast grave doubt on the assumption implicit in the use of the 2-step response (as yes-no, favorable-unfavorable, etc.) questionnaire method as a measure of “average” opinion.

In the ensuing decades, the debate continued. Green and Rao summarized the debate on scale points in 1970 and described two factions: one advocates using fine-grained scales (11 or 21 points), while the other, based on opinions about respondents' ability to differentiate between points, advocates only two or three response options. The loss of fidelity from having only two or three points, that faction argued, is made up for by asking more questions. Green and Rao conducted a simulation study and showed that adding more questions did NOT make up for the loss in fidelity from using two or three points (at least in their specific study type). So, one point in the column for scales with more than two or three points.

Three-Point Scales May Be Good Enough When Using Many Items

In response to Green and Rao, in the rather bluntly titled "Three-Point Likert Scales Are Good Enough" article, Matell and Jacoby (1971) argued that three points are good enough in some cases. They had 360 undergraduate psych students answer one of eighteen versions of a 72-item questionnaire of values, with the response options varying in the number of scale points between two and nineteen (so there were twenty respondents per condition).

Respondents were then given the same questionnaire three weeks later. The authors found little difference in reliabilities and in their measure of validity and concluded that as few as two response categories may be adequate in practice. They suggested that both reliability and validity are independent of the number of response categories, and their results implied that collapsing data from longer scales into two- or three-point scales would not diminish the reliability or validity of the resulting scores.

Three-Point Scales Frustrate and Stifle

In another paper, Lehmann & Hulbert (1972) argued that the main problem with two- and three-point scales is that forcing respondents into so few choices introduces rounding error. In the also aptly named "Are Three-Point Scales Always Good Enough?" article, the authors conducted a simulation study with items with three, five, seven, and nine points. They concluded that two or three points are probably fine when averaging across people and across many items, but if the focus of the research is on individual scales, a minimum of five to six scale points is probably necessary to get an accurate measure of the variable. They found, for example, that even when 30 items are summed, using six or seven points instead of three cuts the error approximately in half (see their Figure 3).

Cox (1980) offers one of the more comprehensive reviews of the topic (spanning 80 years). His analysis concluded that “scales with two or three response alternatives are generally inadequate and … tend to frustrate and stifle respondents” and that the marginal return from using more than nine response alternatives is minimal.

Three-Point Scales Contain Less Information

Researchers often use rating scales as dependent variables in regression analysis (e.g., as part of a key driver analysis) to understand what drives brand attitude or satisfaction. Morrison (1972) showed that using discrete points for an underlying continuous variable (like satisfaction) reduces the information transmitted. Little is lost at eleven points, where 99% of the information is transmitted, but only 87.5% is transmitted with a three-point scale, illustrating the loss of discriminating ability with coarser scales.

Morrison builds on information theory (Shannon & Weaver, 1949), where the "bits" of information transmitted is the log of the number of response options. A two-point scale communicates only one piece of information (yes/no, agree/disagree). Three-point scales communicate two pieces of information (direction and neutrality). Four-point scales communicate intensity of direction, but no neutral opinion. Five-point scales are then better theoretically because they provide three pieces of information: direction (positive/negative), intensity of opinion, and a neutral point.
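
To make the Shannon framing concrete, here is a back-of-the-envelope sketch (not part of Morrison's analysis) that computes the nominal bits transmitted, log2 of the number of response options, for common scale lengths.

# Back-of-the-envelope Shannon framing: the nominal information capacity of a
# rating scale is log2(number of response options). This illustrates the general
# point (more options -> more bits), not Morrison's exact figures.
from math import log2

for n_points in (2, 3, 5, 7, 11):
    print(f"{n_points:>2}-point scale: {log2(n_points):.2f} bits per response")
# 2 points: 1.00 bit, 3: 1.58, 5: 2.32, 7: 2.81, 11: 3.46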

More Points Differentiate Better Between Groups

In another study, Loken et al. (1987) examined the criterion validity of various telephone-administered scales through their ability to differentiate between different population groups and found eleven-point scales to be superior to three-point or four-point scales.

Lozano et al. (2008), using a simulation study, analyzed between two and nine response options and concluded “the optimum number of alternatives is between four and seven. With fewer than four alternatives the reliability and validity decrease, and from seven alternatives onwards, psychometric properties of the scale scarcely increase further.” The authors didn’t test ten- or eleven-point scales, however.

Alwin (1997) compared seven- and eleven-point scales from a 1978 life satisfaction survey and found eleven-point scales were superior to seven points. He used a large probability sample of 3,692 U.S. respondents who rated 17 domains of satisfaction (e.g., residence, living, and life satisfaction) and used three scales in the same order: a seven-point satisfied to dissatisfied (endpoints and midpoint labeled); a seven-point delighted-terrible (fully labeled); and an eleven-point feeling thermometer (only end points labeled).

He found that eleven-point scales had higher reliability than seven-point scales and, in many cases, higher validity coefficients. He recommended eleven-point scales when measuring life satisfaction and rejected the idea that the eleven-point scale is more vulnerable to measurement errors.

In a study using a UX questionnaire, Lewis (In Press) found little difference between scales with five, seven, and eleven points but found three-point scales were inadequate. He had 242 participants randomly assigned to a three-, five-, seven-, or eleven-point version of the UMUX-Lite. He found little difference in reliabilities and correlations to other measures except for the three-point version, which had the lowest, and unacceptable, reliability (alpha = .61).

Three-Point Scales Can’t Identify Extreme Attitudes

In our earlier analysis of top-box scoring, we reviewed studies that showed that extreme responders (the most favorable and least favorable responses) tend to be better predictors of behavior. With three-point scales, as other studies have shown, there is no way to differentiate between extreme and tepid responses. The Net Promoter Score calculation doesn’t split the eleven-point scale into three equal intervals; it uses the difference between the most favorable and least favorable responses. We’ll cover what happens when you turn an eleven-point scale into a three-point scale in an upcoming article.
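
For reference, here is a minimal sketch of the conventional Net Promoter Score calculation on 0–10 responses (promoters 9–10, detractors 0–6, the standard cutoffs); the sample ratings are made up.

# Minimal sketch of the conventional NPS calculation on 0-10 ratings:
# promoters are 9-10, detractors are 0-6; the score is the percentage of
# promoters minus the percentage of detractors. Sample data is made up.

def net_promoter_score(ratings):
    n = len(ratings)
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / n

ratings = [10, 9, 9, 8, 7, 7, 6, 5, 10, 3]   # hypothetical responses
print(net_promoter_score(ratings))            # 40% promoters - 30% detractors = 10.0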

Three-Point Scales Rated as Quicker, but Five, Seven, and Ten Points Are Easier

One of the more relevant and compelling recent studies came from Preston & Colman (2000). In the study, 149 (mostly) students rated their experiences at a store or restaurant on five questions (e.g., promptness of service). The five questions were repeated 11 times, with the only thing changing being the response format: the number of points ranged from 2 to 11, plus a 101-point scale.

All scales were anchored with very poor on the left to very good on the right. Participants also rated the scale formats on ease of use, quickness of use, and allowing them to express feelings adequately. They then had respondents answer the same questionnaire again one to three weeks later to assess test-retest reliability; 129 (86%) completed the second one. The authors found:

  • Ease: 5, 7, and 10-point scales were rated as easiest to use and 11- and 101-point scales the least easy to use.
  • Quickness: 2, 3, and 4-point scales were rated as the quickest to use and 11 and 101 were rated the least “quick.”
  • Express feelings adequately: 2- and 3-point scales were rated “extremely” low on “allowed you to express your feelings adequately,” whereas scales with 9, 10, 11, and 101 points were rated the highest.
  • Reliability: 2, 3, and 4-point scales were the least reliable (test-retest and internal consistency), while scales with 7–10 points were the most reliable.

The authors concluded that scales with small numbers of response categories yield scores that are generally less valid and less discriminating than those with six or more response categories, and that rating scales with seven, nine, or ten response categories are generally to be preferred.

In a future article, we’ll look to corroborate or clarify the findings in the literature and apply it specifically to the Net Promoter Score using newly collected data.

Summary and Takeaways

A review of 12 studies on the differences between three-point scales and scales with more points revealed:

More scale points increase reliability. Not much has changed in the 40–50 years since Nunnally's work. Despite changes in survey formats (especially web surveys), the literature overwhelmingly shows that as you increase the number of scale points you increase reliability. Reliability increases most when going from three to five points, and less is gained beyond seven to eleven points.

Three-point scales are not reliable. Across multiple studies, three-point scales were shown not only to have lower reliability but reliability that wasn't even adequate; in some studies, even averaging across items couldn't recover what was lost. The low reliability is a consequence of using a coarse scale to represent a continuum. Forcing respondents to choose among too few categories introduces more error in responding and thus makes responses less consistent within the same study (internal reliability) and over time (test-retest reliability).

There's a loss of intensity and validity with three-point scales. Using only three points loses all information about intensity, or the strength of people's opinions. Not everyone feels equally favorably or unfavorably toward a brand, interface, or experience. All but one of the 12 studies we examined recommended using more than three points in a rating scale. The one study that did suggest three-point scales were enough averaged results across multiple items rather than relying on a single item.

Two- and three-point scales are perceived as quicker and easier. In one study, participants rated two- and three-point scales as quicker and easier to respond to. This lends credence to the idea that shorter scales require less mental effort. If speed and the perception of ease are paramount (especially when using multiple items), a researcher may decide that two or three points are good enough. But know that while respondents may save time, the format may stifle their ability to actually express their opinions. Or, if you look at the scale as a user interface, you're preferring a faster and easier but less effective scale.

Two and three points are insufficient to express feelings adequately. While two- and three-point scales may be perceived as faster and easier to respond to, participants overwhelmingly felt they were inadequate in allowing them to express their feelings. We’ll attempt to replicate these findings and measure actual response time (as opposed to perception of speed) in an upcoming article.

The three-point scale superiority is a myth. If you’re concerned about putting a burden on your respondent by including more points, this analysis suggests participants don’t find scales with more than three points necessarily more difficult. In fact, in one study, scales with five, seven, and ten points were rated as EASIER to use than two- and three-point scales.

In other words, two- and three-point scales may be getting you a slightly quicker but unreliable response, and respondents rate them as less easy to use. Along with the idea that all website content should be within three clicks, the idea that three-point scales are always (or even sometimes) better than scales with more points needs to go into the UX myth dustbin.

Thanks to Jim Lewis for providing comments on this article.


Sign-up to receive weekly updates.

 

 

Source: Is a Three-Point Scale Good Enough?

What’s the True Cost of a Data Breach?

The direct hard costs of a data breach are typically easy to calculate. An organization can assign a value to the human-hours and equipment costs it takes to recover a breached system. Those costs, however, are only a small part of the big picture.

Every organization that has experienced a significant data breach knows this firsthand. Beyond direct financial costs, there are also lost business, third-party liabilities, legal expenses, regulatory fines, and damaged goodwill. The true cost of a data breach encompasses much more than direct losses.

Forensic Analysis. Hackers have learned to disguise their activity in ways that make it difficult to determine the extent of a breach. An organization will often need forensic specialists to determine how deeply hackers have infiltrated a network. Those specialists charge between $200 and $2,000 per hour.

Customer Notifications. A company that has suffered a data breach has a legal and ethical obligation to send written notices to affected parties. Those notices can cost between $5 and $50 apiece.

Credit Monitoring. Many companies will offer credit monitoring and identity theft protection services to affected customers after a data breach. Those services cost between $10 and $30 per customer.

Legal Defense Costs. Customers will not hesitate to sue a company if they perceive that the company failed to protect their data. Legal costs between $500,000 and $1 million are typical for significant data breaches affecting large companies. Companies often mitigate these high costs with data breach insurance because it covers liability and notification costs, among others.

Regulatory Fines and Legal Judgments. Target paid $18.5 million after a 2013 data breach that exposed the personal information of more than 41 million customers. Advocate Health Care paid a record $5.5 million fine after thieves stole an unsecured hard drive containing patient records. Fines and judgments of this magnitude can be ruinous for a small or medium-sized business.

Reputational Losses. The value of lost goodwill and standing within an industry after a data breach is nearly impossible to quantify. That lost goodwill can translate into losing more than 20 percent of regular customers, plus revenue declines exceeding 30 percent. There's also the cost of missed new business opportunities.

The total losses that a company experiences following a data breach depend on the number of records lost. The average per-record loss in 2017 was $225. Thus, a small or medium-sized business that loses as few as 1,000 customer records can expect to realize a loss of $225,000. This explains why more than 60 percent of SMBs close their doors permanently within six months of experiencing a data breach.
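
To make the per-record arithmetic explicit, here is a tiny sketch based on the $225 average quoted above; the record counts are purely illustrative, and real costs vary widely by industry and breach type.

# Rough per-record breach cost estimate using the $225 average quoted above.
# Record counts are illustrative assumptions, not data from the article.
AVG_COST_PER_RECORD = 225  # USD, the 2017 average cited in the text

def estimated_breach_cost(records_lost, cost_per_record=AVG_COST_PER_RECORD):
    return records_lost * cost_per_record

for records in (1_000, 10_000, 100_000):
    print(f"{records:>7,} records -> ${estimated_breach_cost(records):,}")
# 1,000 records -> $225,000, matching the example in the text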

Knowing the risks, companies can focus their cyber security budgets on prevention and response. The first line of defense includes technological measures such as network firewalls, along with regular employee training. However, hackers can still slip through the cracks, as they're always devising new strategies for stealing data. A smart backup plan includes a savvy response and insurance to cover the steep costs if a breach occurs. After all, the total costs are far greater than business interruption and fines; your reputation is at stake, too.

Originally Posted at: What’s the True Cost of a Data Breach? by thomassujain

Aug 06, 20: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Extrapolating  Source

[ AnalyticsWeek BYTES]

>> New Study from FMI and Autodesk Finds Construction Organizations with the Highest Levels of Trust Perform Twice as Well on Crucial Business Metrics by analyticsweekpick

>> All the Big Data News You Need from 2019 and Major Trends to Watch in 2020 by daniel-jacob

>> Machine Learning Model to Quicken COVID-19 Vaccine Release and More – Weekly Guide by administrator

Wanna write? Click Here

[ FEATURED COURSE]

CS109 Data Science

image

Learning from data in order to gain useful predictions and insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data managem… more

[ FEATURED READ]

Data Science from Scratch: First Principles with Python

image

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn … more

[ TIPS & TRICKS OF THE WEEK]

Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards, and talent has long been the bottleneck to wider enterprise adoption. One of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members need to step up and create awareness within the organization. An aware organization goes a long way toward securing quick buy-ins and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q: Give examples of bad and good visualizations.
A: Bad visualization:
– Pie charts: difficult to make comparisons between items when area is used, especially when there are lots of items
– Color choice for classes: abundant use of red, orange, and blue. Readers can think the colors mean good (blue) versus bad (orange and red) when they are simply associated with a specific segment
– 3D charts: can distort perception and therefore skew the data
– Dashed and dotted lines in a line chart: they can be distracting; a solid line is easier to follow

Good visualization:
– Heat map with a single color: with multiple colors, some stand out more than others, giving more weight to that data; a single color with varying shades shows intensity better
– Adding a trend line (regression line) to a scatter plot helps the reader see the trend (see the sketch below)
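
As a concrete illustration of the two "good" patterns above (our sketch, using synthetic data), the snippet below draws a single-hue heat map and a scatter plot with a fitted trend line.

# Illustration of the two "good visualization" patterns above, on synthetic data:
# (1) a single-hue heat map so intensity is carried by shade, not by hue;
# (2) a scatter plot with a fitted least-squares trend line.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Single-color heat map: the "Blues" colormap varies only in shade of one hue.
grid = rng.random((8, 8))
im = ax1.imshow(grid, cmap="Blues")
fig.colorbar(im, ax=ax1)
ax1.set_title("Single-hue heat map")

# Scatter plot with a least-squares trend line.
x = rng.uniform(0, 10, 100)
y = 2.0 * x + rng.normal(scale=3.0, size=100)
slope, intercept = np.polyfit(x, y, 1)
ax2.scatter(x, y, alpha=0.6)
ax2.plot(np.sort(x), slope * np.sort(x) + intercept, color="black")
ax2.set_title("Scatter plot with trend line")

plt.tight_layout()
plt.show()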

Source

[ VIDEO OF THE WEEK]

@AnalyticsWeek Panel Discussion: Finance and Insurance Analytics


Subscribe to YouTube

[ QUOTE OF THE WEEK]

Data really powers everything that we do. – Jeff Weiner

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @Beena_Ammanath, @GE


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

73% of organizations have already invested or plan to invest in big data by 2016

Sourced from: Analytics.CLUB #WEB Newsletter