Glossary of terms

Top 5 Use Cases of Machine Learning in the Telecom Industry

Written by Cynthia Hoza on 17 June 2023. Posted in Glossary, Machine Learning.

Machine learning is revolutionizing the telecom industry by enabling data-driven decision-making, enhancing customer experiences, and optimizing operations. In this blog post, we will explore the top use cases of machine learning in telecom, highlighting how Calligo’s Machine Learning as a Service capability empowers telecom companies to leverage predictive models, optimization techniques, time-series analysis, and customer segmentation.

1. Optimize Call Center Staff

Efficient scheduling of call center staff is crucial for customer satisfaction and cost reduction. Calligo’s predictive models and optimization algorithms help telecom companies optimize call center staff scheduling based on call volumes and customer needs. By dynamically adjusting staff schedules, telecom companies can ensure efficient resource allocation, enhance customer experiences, and capture sales opportunities.
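As a rough illustration of the idea, the sketch below forecasts the next hour's call volume from recent history and converts it into a headcount. It is not Calligo's method: the moving-average forecast, the function names, the six-minute handle time, the 80% utilization target, and the sample call counts are all assumptions made for the example.

```python
import math
from statistics import mean


def forecast_next_hour(history, window=3):
    """Naive forecast: the mean of the last `window` hourly call counts."""
    return mean(history[-window:])


def agents_needed(expected_calls, avg_handle_minutes=6.0, target_utilization=0.8):
    """Convert an hourly call forecast into a number of agents.

    Workload in agent-hours is calls * handle time (in hours); dividing by
    the target utilization leaves slack for breaks and bursts of calls.
    """
    workload_hours = expected_calls * avg_handle_minutes / 60.0
    return max(1, math.ceil(workload_hours / target_utilization))


# Invented hourly call counts, for illustration only.
calls_per_hour = [120, 135, 150, 160, 170]
expected = forecast_next_hour(calls_per_hour)
staff = agents_needed(expected)
print(f"forecast: {expected} calls, agents to schedule: {staff}")
```

A production system would likely replace the moving average with a seasonal forecasting model and the simple workload division with a queueing or optimization model, but the forecast-then-staff structure stays the same.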

2. Market Penetration

Understanding market penetration and identifying high-potential markets are essential for telecom companies looking to expand their customer base. Calligo’s predictive models and time-series analysis help telecom companies assess market penetration and identify markets that offer the best return on investment. By leveraging data on customers, sales, and local market trends, telecom companies can focus their efforts on markets with high growth potential.

3. Store Location Optimization

Selecting optimal locations for new retail stores is critical for maximizing revenue potential and minimizing building costs. Calligo’s machine learning solutions analyze data on network capacity, finance, customer demographics, and market trends to identify the best locations for new telecom stores. By optimizing store locations, telecom companies can capture new customers, increase market share, and ensure the best network coverage for their customers.

4. Service Interruption Detection

Predicting and quickly responding to network problems is vital for maintaining revenue, customer retention, and satisfaction. Calligo’s predictive models, time-series analysis, and anomaly detection techniques enable telecom companies to detect and respond to service interruptions proactively. By identifying network anomalies and implementing efficient troubleshooting and repair strategies, telecom companies can minimize downtime and ensure uninterrupted service for their customers.
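To make anomaly detection on a time series concrete, here is a minimal sketch that flags readings far outside the recent baseline using a trailing-window z-score. The technique, the function name, the window size, the threshold, and the latency figures are illustrative assumptions, not a description of Calligo's models.

```python
from statistics import mean, stdev


def find_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: treat any change as anomalous.
            if series[i] != mu:
                anomalies.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Invented network latency readings in milliseconds; index 7 is a spike.
latency_ms = [20, 21, 19, 20, 22, 21, 20, 95, 21, 20]
print(find_anomalies(latency_ms))
```

One limitation of this simple approach is that a spike inflates the baseline for the readings that follow it, so a real deployment would usually use a more robust statistic (such as the median) or exclude flagged points from the window.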

5. Customer Segmentation

Understanding current and potential customers is crucial for targeted marketing and sales decisions. Calligo’s clustering and collaborative filtering techniques help telecom companies segment their customer base based on various attributes such as usage patterns, demographics, and preferences. By leveraging machine learning algorithms, telecom companies gain insights into customer behavior and preferences, enabling them to tailor marketing efforts, offer personalized services, and drive revenue growth.
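Clustering can be sketched in a few lines. The example below is a deliberately tiny k-means over two usage attributes, with invented customer data. The feature choice, the function name, and the initialization from the first k points are simplifying assumptions for illustration only.

```python
from statistics import mean


def kmeans(points, k, iterations=10):
    """Tiny k-means for 2-D points; centroids start at the first k points."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for idx, (x, y) in enumerate(points):
            labels[idx] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    mean(p[0] for p in members),
                    mean(p[1] for p in members),
                )
    return labels, centroids


# Invented customers: (monthly data use in GB, monthly call minutes).
customers = [(1, 300), (40, 30), (2, 320), (45, 25), (1.5, 310), (42, 20)]
labels, centroids = kmeans(customers, k=2)
print(labels)     # voice-heavy and data-heavy customers get different labels
print(centroids)
```

In practice the features would be standardized first so that no single attribute dominates the distance, and the initial centroids would be chosen more carefully.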

Machine learning is transforming the telecom industry, enabling telecom companies to leverage data-driven insights and make informed decisions. Calligo’s Machine Learning as a Service capability empowers telecom companies to optimize call center operations, improve market penetration, optimize store locations, detect service interruptions, and understand customer segments. By embracing machine learning, telecom companies can enhance customer experiences, drive revenue growth, and stay ahead in a competitive market.

Lie Machines – The global fight against misinformation

Written by Cynthia Hoza on 14 June 2023. Posted in Beyond Data Podcast, Data Ethics, Data Governance, Data Insights, Glossary, Machine Learning.

Exorcizing the ghost in the machine

In this latest podcast in our ‘Beyond Data’ series, Tessa Jones (Calligo’s Chief Data Scientist) and Peter Matson (Data Science Practice Lead) talk with Oxford University’s Professor Philip Howard about the threats posed to democracy by technology, specifically in the shape of Lie Machines.

Fact or fiction? Microtargeting with lie machines

In this age of social media, chatbots and AI, it’s never been easier for individuals to share their opinions. Instant communication to, and engagement with, a global audience is now commonplace, and it seems there’s no need to let facts get in the way of a good angle. As Mark Twain, or maybe Winston Churchill, or most probably Jonathan Swift famously said, “a lie can travel halfway around the world whilst the truth is still putting on its shoes.” A great example in itself of the ease with which misunderstandings and misappropriations can become canon.

In this vein, Professor Howard has spent years studying the mechanisms by which opinion, behavior and values can be manipulated and misdirected by lie machines:

“Lie machines are large, complex mechanisms made up of people, organizations, and social media algorithms that generate theories to fit a few facts, while leaving you with a crazy conclusion easily undermined by accurate information. By manipulating data and algorithms in the service of a political agenda, the best lie machines generate false explanations that seem to fit the facts.”

Lie Machines: How to Save Democracy from Troll Armies, Deceitful Robots, Junk News Operations, and Political Operatives

We find lie machines in all types of countries and governing structures. They share common elements – political actors produce the lies, social media firms distribute them, and paid consultants market them. High profile examples of the effectiveness of the lie machine include the UK’s Brexit campaign, and Trump’s electioneering – in both cases patently untrue ‘facts’ and arguments were targeted at key voters by disinformation networks, troll farms and lie machines. Algorithms direct individuals towards ever-more insular sources and extreme content:

“A healthy, public-facing algorithm might occasionally introduce another credible source… we know the platforms play around with this stuff, especially during elections in the US”

Controlled by bad actors and forming a global ecosystem of lie development and propagation, these lie machines spread their tendrils across every social media platform, moving out from Facebook as new outlets develop.

Computational propaganda

Lie machines have evolved and finessed themselves as technology advances. Instead of stealing the photos, social media handles and biographies of real people, AI now generates new pictures and personas and thus evades technology platforms’ troll-spotting software.

Spreading propaganda far and wide, with a convincing voice, the lie machine:

Has a profound effect on society, with a scale that is difficult to quantify
Is perfectly engineered to target human vulnerabilities, reducing critical thinking
Deliberately misrepresents and appeals to emotions and prejudices, using our cognitive biases to bypass rational thought and create echo chambers
Is vague and unknowable – what training data was used for large language models? (Professor Howard postulates that every Gmail sent over the last 25 years may have been scraped, along with content from junk news sites)

Doing better – where does the onus sit? User or developer?

When it comes to developing processes to combat the lie machine, there’s no single piece of legislation or guiding principle that works. We must always consider the regional and cultural context of both data and users. Research from different regions and countries can’t necessarily be amalgamated or directly compared – for example, we know that the placebo effect is always greater in US medical studies. To date, technology has not always accounted for cultural nuances in how people use words, with intent and meaning lost in translation – the majority of network takedown orders are for sites that are not in English.

Wherever there is human input, there are behavioral differences that make it much more difficult to apply common rules:

“People who manage cookies are above average in terms of their knowledge of technology, so these people are generally more purposeful in terms of how they set up their news feeds and where they go for information”

The huge amount of disinformation spread around Covid and the resulting vaccination campaign demonstrates how potent the lie machine is. It doesn’t need to convince people its argument is right; all that is required is to introduce enough doubt, to highlight that there is a chance of harm. After all:

“If everybody really understood probability, nobody would ever buy a lottery ticket”

Balance the field – breaking the lie machines

Professor Howard believes that whilst we are justified in our concern about the threats to democracy, the principles behind the lie machine can be harnessed for good – promoting topics that are in the public interest and generating democratic discourse:

“I am cynical, but not fatalistic”

He describes the steps we can take to break the lie machines:

Public policy oversight, founded in ongoing public data capture and analysis
Designing social media to highlight emerging consensus, rather than heated conflict – machine learning can amplify common ground
Setting election guidelines to create more opportunities for civic expression
Giving journalists, civic groups and researchers access to all the public opinion data that is currently in the hands of the technology firms
Ensuring that the big data collected by technology platforms is added to public archives

The answer is more social media, not less. But it needs to serve society much better.

IPIE – bringing down the lie machine

Professor Howard has recently launched a new program, creating an independent scientific body to foster global cooperation in safeguarding the online information environment. The International Panel for the Information Environment (IPIE) will assess the scope of the misinformation crisis, analyze its effects on our societies and the planet itself, and propose solutions. Featuring data scientists and engineers alongside neuroscientists and sociologists, IPIE hopes to be the beginning of a global effort to save our common information environment.

Watch the podcast for yourself below to hear more from Professor Philip Howard about the power of the lie machine, and crucially, to learn how we can use it for the collective good.

Professor Philip Howard is a social scientist with expertise in technology, public policy and international affairs. He is Director of Oxford University’s Programme on Democracy and Technology, a Statutory Professor at Balliol College, and he is affiliated with the Departments of Politics and Sociology. Currently, he is also a Visiting Fellow at the Carr Center for Human Rights at Harvard University’s Kennedy School.

Top 10 Use Cases of Machine Learning in the Healthcare Industry

Written by Cynthia Hoza on 9 June 2023. Posted in Data Visualization, Glossary, Machine Learning.

Machine learning is revolutionizing the healthcare industry by leveraging the power of data to improve patient outcomes, enhance operational efficiency, and drive cost savings. In this blog post, we will explore the top use cases of machine learning in healthcare, highlighting how Calligo’s Machine Learning as a Service capability can empower healthcare providers to transform their operations and deliver better care.

1. Improve STAR Rating

The STAR rating system is crucial for healthcare providers as it determines their quality of care and impacts financial incentives. Calligo’s predictive models can identify the key variables that influence STAR ratings and provide prescriptive solutions to improve them. By optimizing patient experience, lowering costs, and enhancing patient satisfaction, providers can achieve higher STAR ratings and increase their bonus payments.

2. Health Crisis Preparedness

Health crises, such as the COVID-19 pandemic, require proactive preparation to ensure the safety of workers and mitigate financial risks. Calligo’s predictive models and time-series analysis help healthcare organizations simulate and forecast the impact of unexpected economic shocks. By making data-driven decisions around layoffs, resource allocation, and innovation, providers can navigate health crises effectively and minimize long-term financial consequences.

3. Optimize Staff Scheduling

Efficient staff scheduling is essential to meet patient needs while minimizing unnecessary labor costs. Calligo’s predictive models enable healthcare leaders to optimize physician and facility resources based on patient demand. By aligning staffing levels with patient access expectations, providers can enhance patient experiences and remain competitive in the evolving healthcare landscape.

4. Medical Supply Logistics

Efficient supply chain management is critical for delivering timely and life-saving healthcare services. Calligo’s predictive models and time-series analysis optimize supply chain logistics by leveraging diverse data sources. By constantly monitoring and updating logistics channels, providers can ensure the availability of essential medical supplies, reduce costs, and mitigate the risk of inadequate supplies that could compromise patient safety.

5. Patient Insights

Understanding patient preferences and identifying high-value services are essential for improving patient satisfaction and achieving higher Medicare STAR ratings. Calligo’s predictive models and Monte-Carlo simulations enable healthcare providers to measure and analyze patient feedback, identifying the services that provide the most value. By tailoring care and service offerings to meet patient preferences, providers can enhance patient satisfaction and drive higher STAR ratings.

6. Reduce Patient Wait Time

Reducing patient wait times is crucial for delivering efficient and patient-centered care. Calligo’s predictive models and optimization techniques help healthcare organizations anticipate patient and staffing needs, enabling effective resource allocation and streamlined workflows. By reducing wait times, providers can improve patient satisfaction, increase revenue, and optimize staff utilization.

7. Reduce Readmission Rates

Reducing readmission rates is vital for improving patient outcomes and optimizing costs in value-based care models. Calligo’s predictive models identify indicators of readmission, allowing healthcare providers to allocate resources strategically and implement interventions that reduce readmissions. By maximizing shared savings payment models and focusing on patient-centric care, providers can improve outcomes, drive revenue, and enhance STAR ratings.

8. Improve ER Admittance

Enhancing emergency room (ER) admittance processes is crucial for managing complex patients and improving care outcomes. Calligo’s predictive models help healthcare organizations connect different health silos and optimize procedures to ensure appropriate patient-provider matches and levels of care. By leveraging machine learning algorithms, providers can target specific patients effectively, lower facility costs, and deliver better care experiences.

9. Improve Screening Frequency

Improving the frequency of routine screenings plays a vital role in preventive healthcare and early detection of illnesses. Calligo’s predictive models and time-series analysis help healthcare providers identify patients who would benefit from screenings and predict their compliance. By targeting the right patients and promoting routine screenings, providers can reduce the risk of costly illnesses, improve patient outcomes, and optimize resource allocation.

10. De-Identification of Data

Data de-identification is essential for expanding the usability of healthcare data while protecting patient privacy. Calligo employs advanced predictive models and time-series analysis techniques to safely de-identify data while retaining its value and richness. By leveraging anonymized data, healthcare organizations can drive additional revenue by utilizing data for research, population health management, and healthcare analytics while complying with privacy regulations.
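For readers who want a feel for what de-identification involves at the record level, the sketch below drops a direct identifier, replaces the patient ID with a keyed hash, and generalizes age and postcode. This is a swapped-in, rule-based illustration rather than the predictive approach described above. The field names, the key, and the banding choices are hypothetical, and steps like these are not on their own sufficient to satisfy any particular privacy regulation.

```python
import hashlib
import hmac

# Hypothetical key for the example; a real key would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(patient_id: str) -> str:
    """Replace an identifier with a keyed hash so records can still be linked."""
    digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def age_band(age: int, width: int = 10) -> str:
    """Generalize an exact age into a band such as '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(record: dict) -> dict:
    """Keep only generalized or pseudonymized fields; the name is dropped."""
    return {
        "patient_key": pseudonymize(record["patient_id"]),
        "age_band": age_band(record["age"]),
        "region": record["postcode"][:3],  # coarse location only
        "diagnosis": record["diagnosis"],
    }


# Invented record, for illustration only.
record = {
    "patient_id": "P-1001",
    "name": "Jane Doe",
    "age": 47,
    "postcode": "90210",
    "diagnosis": "asthma",
}
clean = deidentify(record)
print(clean)
```

Whether the remaining fields can still single out a person depends on the dataset as a whole, which is why re-identification risk is normally assessed across the full data rather than field by field.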

Machine learning is reshaping the healthcare industry, enabling providers to deliver better care, optimize operations, and improve patient outcomes. Calligo’s Machine Learning as a Service capability empowers healthcare organizations to leverage the power of predictive models, time-series analysis, and optimization techniques to drive tangible results. By embracing machine learning, healthcare providers can unlock new possibilities and create a future where data-driven decision-making revolutionizes the delivery of healthcare services.

Making complex data available for the benefit of society

Written by Cynthia Hoza on 15 May 2023. Posted in Beyond Data Podcast, Data Governance, Data Insights, Data Privacy, Data Strategy, Data Visualization, Glossary, Machine Learning.

In Calligo’s latest Beyond Data podcast, Tessa Jones (Chief Data Scientist) is joined by Dr Ellie Graeden, Research Professor (Center for Global Health Science and Security) at Georgetown University. Here we explore some of the episode’s highlights:

The inherent conflict of private data and the public good
Protecting individual rights within federated learning
The importance of effective communication and a common language
Designing systems and policies that work together
Focusing regulation on outcomes, not creating data siloes

At societal level, poor communication costs lives

Transitioning data across and between departments and data systems has historically been fraught with problems – who owns it? Who pays for it? Is it understandable and translatable into meaningful and actionable insights for the end user?

Having worked extensively in disaster response, Dr Graeden has seen first-hand the potentially life-threatening issues that can arise when government departments’ data platforms produce incompatible outputs:

If 20,000 people need water, how many pallets need to be shipped?
If 10,000 electricity meters have been knocked out by a hurricane, how many people need feeding?

In such scenarios, identifying individuals amongst population-level data is crucial if the help provided is to be sufficient.

“We have to be able to really effectively move and communicate and share data that are relevant, in ways that they can get used by people all across the system”

Of course, any data system design should ensure privacy and protection for personal data. ‘Big data’ is still relatively new, and as such more powerful and widespread regulatory controls are now being introduced, although the US still does not have consistent requirements for how data should be handled. Fundamentally, meeting a population’s needs today, and planning for them tomorrow, requires the data of individual people to be analysed. Personal data must be shared quickly and effectively, all while protecting individual rights. Data system design must therefore:

Include all players
Consider cultural constraints
Keep out bias
Ensure the right words and phrases are used
Focus on the ‘so what’, why does it matter?

“Every single thing we experience can be captured as data”

Even the most mundane moments in our daily lives leave a digital footprint; we shed data everywhere. But when does ‘my’ data become public, or the property of the software developer or the service provider? VR headsets collect ephemeral data that is analysed and applied for that one end user, but if that data is assumed to fall under GDPR, the potential to use it for positive outcomes is severely limited. For example, should authorities be notified if content viewed and generated is illegal or harmful? And what if the headset can detect that the user is having a stroke? Is that data classified as ‘health’ data? Can it be used to alert the individual to their medical emergency without contravening legislation? What if your mouse clicks can detect the early stages of Parkinson’s? Should you, could you, be told?

“If you’re treating this data as health data, then they have a very different set of regulatory constraints. HIPAA isn’t going to regulate those because it’s not a health care provider or a health insurer”

Piercing the veil

The conflict between personal protection and public good is everywhere, and Dr Graeden believes that some new data laws will create problems for federated learning. Legislation has clear boundaries (speed limits, blood alcohol levels) whereas science deals in spectrums, probabilities and unknowns.

Deleting an individual’s personal data from the model breaks the system, contradicting what regulators are trying to achieve. The solution is to prioritize outcomes, not processes – it doesn’t matter whether you write the rules with a pen and paper, or with AI, as long as you write the rules. Expanding the framework by setting gradients of data availability affords protection for individuals, whilst making data available that informs better decision making for public bodies.

“Data is nothing more, nothing less, than an abstract description of our world. A useful and powerful language that can tell us things that other languages don’t”

Data can no longer exist in siloes if it’s to be useful to society

There is now a healthy global appetite for the discussion around data, thanks in the main to two recent developments:

Covid gave us huge amounts of data about mortality levels, vaccination rates, hospitalisation trends – all of which were in the public consciousness every day
AI and ChatGPT – articles and debates about the pros and cons are everywhere, discussion is not just in the scientific community

The key challenges now for data scientists are expectation management and communication – we need to be clear about aims and specific about context, as well as knowing what to leave out to avoid overwhelm and misunderstanding. Unfortunately, scientists are not always great communicators (using complex terminology and detail, rather than common parlance and generalization) as Covid demonstrated:

Did having a vaccine mean you wouldn’t get sick? Or just less sick?
‘Everyone should wear a mask’ became ‘wear a mask if you can’. This was due to limited supply, but it appeared that the science was not clear

“The scientific approach means you never have an answer… we are trained as scientists to focus on the fact that we don’t know”

In fact, the only answer is that the right data, used consistently and communicated clearly, will always allow us to be prepared, not reactive. To make decisions for the public good that protect every individual.

You can find out more about the common language of privacy in our Rosetta Stone eBook.

You can also watch Tessa’s fascinating podcast with Dr Graeden below.

The benefits of outsourced Data Protection Officer as a Service

Written by Brendan Walsh on 21 February 2023. Posted in Data Privacy, Data Protection, Glossary.

As the world becomes increasingly digital and cloud based, the importance of data protection and privacy has become paramount for all organizations. One key aspect of ensuring compliance with data protection laws and regulations is the appointment of a Data Protection Officer (DPO).

However, appointing a DPO internally can present several challenges, including conflicts of interest and a lack of specialized skills. That is where Data Protection Officer as a Service (DPOaaS) comes in.

Sidestep potential conflict of interest

One of the main reasons organizations appoint external DPOs is to sidestep the potential conflict of interest that arises when a DPO is appointed internally. Supervisory Authorities are becoming increasingly strict about this issue, and a conflict of interest can be seen as a punishable breach. For example, CIOs and CISOs are responsible for the collection, storage, and protection of data, which can prevent them from objectively scrutinizing their own processes.

Similarly, Heads of Legal and In-House Counsel are tasked with defending the organization’s interests, while a DPO is required to represent the data subject. Heads of Compliance, who are responsible for determining how data is processed, may also be unable to impartially assess its adherence to legal obligations.

By outsourcing your DPO to a specialized service provider, such as Calligo, you can sidestep these conflicts of interest and ensure your organization’s compliance and data safety. Outsourcing your DPO is also faster and more cost-effective than hiring one internally.

10x as many DPO vacancies as there are qualified individuals

There are currently 10x as many DPO vacancies as there are qualified individuals, making hiring processes long and expensive. Outsourcing your DPO allows for flexible resourcing, as the role is often not a full-time position. Additionally, outsourcing your DPO gives you access to a wider set of skills, including technical, legal, and information security expertise, all at a far lower cost than recruiting each of these specialists separately.

The Calligo Privacy Team is a specialized team of experienced and qualified professionals with deliberately diverse career backgrounds and deep subject matter knowledge. They are committed to ensuring adherence to global data protection laws without compromising the ambitions and goals of your clients. The team is highly qualified, holding certifications such as those from the IAPP, which are the world’s most trusted and respected certifications in data privacy. These cover privacy laws and regulations and the practical operations to apply and deploy them successfully.

The Calligo Privacy Team also brings diversity in terms of industry experience. By operating in varied domains, the team’s expertise is sector-transferable, keeping your knowledge as relevant as possible. In an increasingly complex landscape, the team is uniquely placed to support you in the nuances of different data protection and privacy regulations, across any sector and jurisdiction. The team has supported industries such as global manufacturing, global franchise fast food brands, financial, software as a service platform providers, energy, government, charities, and service providers.

In summary, data protection and privacy are crucial for all organizations in the digital age. However, appointing an internal Data Protection Officer (DPO) can be challenging, due to potential conflicts of interest and lack of expertise. DPO as a Service (DPOaaS) provides a solution by outsourcing the role to a specialized service provider, avoiding conflicts of interest and providing access to a wider set of skills at a lower cost. The Calligo Privacy Team is a highly qualified team of experienced professionals with diverse backgrounds and certifications in data privacy, who are committed to ensuring global data protection compliance. The team has a proven track record of supporting various industries, keeping knowledge relevant and up-to-date.

Let the team help you fulfill your legal obligation to appoint a suitable Data Protection Officer, while also serving as an internal advisor, representative, and liaison for your organization.

Learn more about Calligo’s Data Protection Officer as a Service

Unlocking the power of AI and Natural Language Processing

Written by Cynthia Hoza on 17 February 2023. Posted in Beyond Data Podcast, Glossary, Machine Learning.

In Calligo’s latest Beyond Data podcast, co-hosts Sophie Chase Borthwick and Tessa Jones are joined by Alexander Visheratin, Artificial Intelligence Engineer at Beehive AI. Here we explore some of the episode’s highlights: the importance of Natural Language Processing (NLP) and the pros and cons of output produced by tools like OpenAI’s ChatGPT.

“It can do anything, because it was trained on everything”

NLP models like ChatGPT are changing the way we search for data online. But if you average everything, the output will necessarily be average. And we have questions:

How ethical is the learning data that feeds these models, and how ethical was the process of collecting it?
How can global models be policed and regulated within individual countries?
What is the potential for small and specific training datasets to be manipulated by humans in a way that will limit and create biases in the algorithms?
Is it a ‘bug’ when a prompt doesn’t give us what we wanted? What we ask for is rarely what we actually get.

Confidence or competence?

One major drawback of the NLP process is that a model only knows what was in its training data, and for many models that data ends around the turn of the decade. As Alexander highlights, this can easily lead to incorrect information being generated. “I asked one of the large models, ‘who is the president of the United States?’ and it answered very confidently, Barack Obama.” That confidence is interesting, because as humans we are predisposed to trust information that is given to us clearly and directly, with no hint of doubt.

Also, NLP models are built to prove or agree with the task given to them, and they sound so plausible. Alexander shares a specific example of ChatGPT providing convincing output that could easily persuade someone unfamiliar with the facts.

“Andrew Ng, who is an Adjunct Professor at Stanford University, asked ChatGPT to prove that CPU is better than GPU for deep learning. It was very confident and created a long paragraph of text proving it. Then he asked it to prove that some more primitive way of calculating is better than CPU, and it again provided a very confident paragraph of text. He ended up basically ‘proving’ that an abacus is better than GPU for deep learning.”

In this age of misinformation, there is huge potential for NLP to spread misleading (or downright false) information very quickly to large audiences. ‘Facts’ which then become accepted, magnified and transmitted further.

Taking liberties with artistic license

There are obvious intellectual property issues when it comes to NLP and art generation. Asking an AI tool to create a piece in the style of a named artist will generate convincingly similar work. But if this output contravenes the artist’s morals or political views for example, it is easy to see how discomfort (and possibly even legal challenges) could follow. Conversely, when original artwork is produced that has been generated from hundreds of command iterations to finesse exactly the output required, can it still be seen as ‘art’? Is it the work of the individual using the AI tool, or the tool itself? But is this any different to the great works credited to Michelangelo that we know were produced in part by his students? Is the value of NLP in art actually more as an idea generator, a source of inspiration for the artist rather than the end point?

Alexander believes that creatives shouldn’t be afraid of natural language processing. “I think NLP is more of a supplement, a good supplement, because it allows us to be more creative, pushing forward, advancing. It’s not like a replacement at all, it’s more like a co-worker or a supplemental ghost writer almost.”

Guard rails contain or keep out discriminatory language?

OpenAI were very upfront when ChatGPT first launched about the fact that the model would not allow misogynistic or racist material to be produced. Yet the very nature of the learning process saw AI models scraping huge amounts of learning data from the internet, much of which would inherently be of questionable bias and tone. Thus, what these models are drawing from as ‘normal’ is very much not.

“What Chat-GPT doesn’t allow, it feels like it doesn’t allow not because of how it was trained, but because of the huge amounts of guard rails that OpenAI built around it. So, they basically caged this model into all these sorts of limitations about stuff that it shouldn’t allow. But if you can get past these guard rails and into the model itself, it still has all these biases, like race, gender, all this stuff. It still has it, but they just try their very best to limit the way it can show it. Chat-GPT is essentially a celestial bureaucrat!”

NLPs provide assistance, not autonomy

Going forward, combining NLP output with factual SEO-sourced content feels like best practice when using AI tools. Alexander points out that this is also quicker than finding the information yourself, and gives us the opportunity to validate what the models generate. Ultimately, he believes that directed and federated learning have fantastic potential, as long as we remain mindful of the risk of reverse engineering and privacy breaches. The aim is to use NLP as part of the solution, not as the source of the only answer.

If you’d like to discuss the benefits of using Natural Language Processing in your organization, please contact Tessa Jones to find out more.

You can also watch the fascinating podcast in full below.

AI bias is frequently failing the LGBTQ+ community

Written by Brendan Walsh on 11 January 2023. Posted in Beyond Data Podcast, Data Ethics, Data Privacy, Glossary, Machine Learning.

In our latest Beyond Data podcast, co-hosts Sophie Chase-Borthwick (our Data Ethics & Governance Lead) and Tessa Jones (our Chief Data Scientist) invited Tomer Elias, Director of Product Management at BigID, to discuss how AI bias affects the LGBTQ+ community.

Here we explore some of the episode’s highlights – although you can also watch the full episode here.

Why is there bias?

When building an AI algorithm or AI solution, it is crucial to make sure it’s based on data sets that are both unbiased and diverse and, in terms of the LGBTQ+ community, this often falls short. Whatever the sector – work, health, entertainment – all will be subject to bias if the LGBTQ+ community is not taken into consideration when an AI solution is being created.

For Tessa Jones, one of the barriers to collecting sufficient data is that people might be reluctant to share information about their sexual orientation or their gender journey – particularly if they don’t know how this personal data will be used. Sophie Chase-Borthwick agrees that it quickly becomes a catch-22 situation:

“The biases that make you nervous of disclosing information are the very reason that you need to disclose said personal information in order to prevent bias and improve.”

Knock-on effects

Drawing on his experience as a board member of an organization that supports LGBTQ+ employees, Tomer Elias explains how candidates are being let down by recruitment AI solutions and that the consequences are significant.

“A lot of people in the LGBTQ+ community are unemployed and that’s not because they’re lacking the professionalism and passion.”

Meanwhile, medical advances in the LGBTQ+ community are constantly evolving, and many algorithms do not take these changes into account.

“People who are transitioning are not getting the right treatments because the treatment providers are not well educated about it and the data is not diverse enough,” explains Tomer.

Tessa also raises the issue of health apps that require a user to state whether they are male or female.

“Even though the equations could be written differently to how you use different input, they’re just not and that means, you either have to pretend you’re something different or just not use that tool.”

Potential of AI to help overcome bias

While AI bias is clearly affecting the LGBTQ+ community, there are innovative ways it can be used to overcome it, too, such as in recruitment.

“At the initial interview stage, AI could be used to scramble the voice so you would not know if the candidate was male or female or someone who has transitioned,” says Tomer.

He also poses the possibility for AI to help with the retention of LGBTQ+ employees.

“Technology could help employers know that the employee is happy and feels a part of the organization.”

Time to step it up…

There are already many AI forces for good – including recommendation systems which can help LGBTQ+ people feel more emotionally supported, and The Trevor Project, which uses AI to identify callers at higher risk of suicide so that they get help.

Much more needs to be done. But the fact that people are starting to think about AI bias and the LGBTQ+ community is a step in the right direction.

“Now we’re talking about it and people are realizing the actual real-world implications, hopefully more people will feel comfortable expressing themselves and we can close some of that data gap so there is more information for the models to work off,” according to our Data Ethics & Governance Lead, Sophie Chase-Borthwick.

“It’s also super critical that we have diverse AI developers who are knowledgeable about people and bias,” adds Calligo’s Tessa Jones.

To hear more of our fascinating discussion on AI bias and how it affects the LGBTQ+ community, tune in to our latest Beyond Data podcast episode below.

Global Food Waste – Can AI Offer a Solution?

Written by Brendan Walsh on 29 November 2022. Posted in Beyond Data Podcast, Glossary, Machine Learning.

In our latest Beyond Data podcast, ‘Global Food Waste – Can AI Offer a Solution?’, we invited Data Science leader Shawn Ramirez to help us explore the global issue of food waste and discuss how AI has the power to make a difference. Co-hosts Sophie Chase-Borthwick (our Data Ethics & Governance Lead) and Tessa Jones (our VP of Data Science, Research & Development) steered Shawn to share her insight and examples of where AI is helping combat this prevalent ‘human’ problem. Here we explore some of the episode’s highlights.

What a waste…

To say global food waste is a huge problem seems like an understatement. Nearly one third of all food around the world is currently being wasted. Estimates also suggest that 8-10% of global greenhouse gas emissions are associated with food that’s not consumed. In Shawn’s own words, there are some stark facts that are hard to ignore:

“In the United States about 40% of food is wasted while, at the same time, 40 million people in the US are suffering from hunger, including 12 million children. If we could reduce and redistribute food waste by 15%, we’d actually feed more than half of those hungry people. And it’s a similar story in Europe where 153 million tons of food is being wasted.”

Not just about hunger…

In addition to hunger, food waste is directly connected to all kinds of additional concerns – such as resource conservation, carbon emissions, and climate change. Clearly hugely passionate about the subject, Shawn explains why we all need to commit to change.

“With our rising population, the situation is only going to get worse and, if we could reduce or redistribute that food waste, we could have a massive global impact.”

Where is it happening?

In the US, 60% of food is wasted before it even reaches the consumer. In Europe, 55% to 60% occurs in consumer households, whereas in developing countries most waste happens during agricultural production. Where you are in the world’s supply chains can make a big difference.

What part can AI play?

It’s increasingly clear that AI can play, and is starting to play, an important role in turning the tide on food wastage. Exciting hi-tech innovations are enabling more sustainable farming – such as AI-enabled monitors, computer vision, remote sensing, as well as robots. Shawn highlights how this technology is revolutionizing vertical farming.

“We are now seeing single vertical farms that produce the same amount of fruit and vegetables as an 80-acre farm and they are using 97% less water.”

A Swedish company is transforming disused office buildings into autonomously controlled greenhouses and a company in Singapore has created the world’s first low carbon hydraulic water-driven vertical farming system.

Throughout the supply chain, AI is becoming an indispensable planning tool. And Shawn has seen this first-hand, thanks to her time at Shelf Engine – an end-to-end grocery ordering solution using advanced AI.

“We worked with grocery stores, using inventory simulations to optimize the freshness of food by predicting customer demand…Connecting data across the supply chain facilitates better informed decisions.”

Knowledge is power

Then there’s a need for efficient and effective monitoring of what’s actually being wasted – something AI now has the capability of doing in granular detail.

“AI-powered garbage cans equipped with weight sensors, cameras and computer vision have the ability to recognize and track the amount and type of food we’re throwing away.”

As well as in the home, these can be used in restaurants, hotels, and other businesses – enabling people to think carefully about their waste, while helping companies effectively monitor and understand what’s being thrown away and when.

You may have heard of Ikea partnering with Winnow Vision AI to track kitchen waste using computer vision technologies. Ikea then used this data to implement changes resulting in a saving of 20 million meals, equivalent to 40,000 tons of carbon dioxide.

Food for thought

The US Department of Agriculture has set a target of reducing food waste by 10% within the next decade. To achieve this, Shawn believes education in the capability of AI is the next vital step.

“We want to see more organizations thinking about the food that they waste and realizing how they can make a massive difference by adopting different AI technologies.”

To hear more of our valuable discussion on how AI has the power to reduce food waste, tune in to our latest Beyond Data podcast episode now.

The dark side of AI energy consumption – and what to do about it

Written by Brendan Walsh on 3 October 2022. Posted in Beyond Data Podcast, Data Ethics, Data Privacy, Data Strategy, Glossary, Machine Learning.

Artificial Intelligence’s ability to augment and support progress and development over the past few decades is inarguable. However, when does it become damaging, contradictory even? In our latest Beyond Data podcast, ‘AI’s Climate Jekyll & Hyde – friend and foe’, Tessa Jones (our VP of Data Science, Research & Development) and Sophie Chase-Borthwick (our Data Ethics & Governance Lead) discuss exactly this with Joe Baguley, Vice President and Chief Technology Officer, EMEA, VMware.

Our speakers explore the multifaceted topic of energy consumption and AI – from whether all applications are equal when it comes to energy consumption (or whether some are ‘better’ than others), to creating visibility of, and responsibility for, energy consumption among all stakeholders. Here we try to give clarity to some of the grey areas that were discussed.

Should we consider all applications equal?

“AI and machine learning are about huge things, huge data sets, huge computation actions … all of those have huge implications in terms of energy,” Joe observes, before dropping in hugely sobering stats such as the total annual energy consumption of bitcoin being the same as that of Norway. Even when considering the often-touted argument that 57% of the energy for bitcoin mining comes from renewables, Joe counters: “But those renewables could have been used for something else, right? Those solar panels… and those hydropower stations and those wind turbines, we could be using them for something else.”

This raises the ethical question of whether there should be stricter governance, standards, and precedent set on more ‘moral’ applications for their energy consumption. Should we be more closely considering the difference in energy consumption between server farms that support minimizing food waste versus those that are focused on mining digital currency, for example?

“Is there an opportunity for [greater] regulation?” Tessa ponders. Would this regulation help challenge the current status quo for all applications’ energy consumption being considered equal? While Sophie observes: “We’ve had certain European nations start to put rules around data center expansion, where you’re allowed and not allowed to build because of the capacity there, which isn’t regulating the use of it. But it does have that knock-on effect that if you literally can’t build the data center support, you have to start thinking about other ways to build [models].”

When considering Sophie’s point on alternative ways to build models, Joe notes: “We’re using AI to deal with the symptoms, but maybe there’s some better ways we could be using AI to deal with the cause as well”.

And this all raises the next question – who should ultimately be making these ongoing moral calls for the environment and energy usage?

Embedding Environmental, Social, and Governance (ESG) by design

Environmental, Social, and Governance (ESG) is shorthand for a framework that helps stakeholders understand how an organization is managing risks and opportunities related to environmental, social, and governance criteria. Our speakers untangle the idea of ESG and how companies could use it to help triage the different applications they use.

Joe asks: “Is there an ESG-led marketing opportunity here? Your AI might be the same as my AI, but my AI is better from an ESG perspective. They both get the same results at the same time for the same cost, but this one’s better from an ESG perspective, in terms of sustainability, in terms of social good, in terms of environmental.”

By placing more emphasis on ESG as the criterion for measuring impact and success, it could help with embedding sustainability in the heart of the application’s deployment, rather than a siloed approach. Sophie agrees: “We have privacy by design, we have security by design. Why not have ESG by design?”

Following on from this thought, our speakers consider the cost implications of AI and ESG with Joe observing, “There’s a lot of businesses right now that can’t afford AI because it’s expensive…but I believe they will come to a tipping point where they can’t afford not to”.

Are we over-prioritizing accuracy?

“There’s a hyper-focus on the accuracy,” according to Tessa. “It ends up not even being about the motivation for green, it’s a motivation for fast training, fast tuning. Unfortunately, it’s how most data scientists are motivated; be faster without having to compromise their accuracy.”

Often, the increase in accuracy follows a logarithmic curve: good gains at first, quickly tapering off to minimal increases. Is it useful to be that much more accurate, often only in the decimal places? “Some are good, more must be better … people just keep going, as opposed to saying actually good enough is good enough,” Joe summarizes.

Instead of chasing marginally better accuracy each time, we should be taking a holistic view of the application. The increase in accuracy might be 0.01%, but come at a heavy cost in energy consumption – is it worth it? Should we be exposing these costs more clearly across the team, so everyone has the visibility and feels empowered to question them more closely?
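To make the diminishing-returns point concrete, here is a minimal sketch in Python. It is an illustration only: the logarithmic accuracy curve, the starting accuracy of 0.80, the gain of 0.05 per tenfold increase in compute, and the use of compute hours as a stand-in for energy are all assumptions made for this example, not figures from the podcast.

```python
import math

def accuracy(compute_hours: float, base: float = 0.80, gain_per_log10: float = 0.05) -> float:
    """Hypothetical accuracy curve: grows with log10 of compute, capped at 1.0."""
    return min(1.0, base + gain_per_log10 * math.log10(compute_hours))

def marginal_gain_per_hour(compute_hours: float, extra_hours: float) -> float:
    """Accuracy gained per additional compute hour (a rough proxy for energy spent)."""
    gained = accuracy(compute_hours + extra_hours) - accuracy(compute_hours)
    return gained / extra_hours

# Same absolute accuracy gain in both cases, but at very different cost.
early = marginal_gain_per_hour(1, 9)      # going from 1 to 10 compute hours
late = marginal_gain_per_hour(100, 900)   # going from 100 to 1,000 compute hours

print(f"early gain per hour: {early:.6f}")
print(f"late gain per hour:  {late:.6f}")
```

Under these assumed numbers, each extra hour of compute buys far less accuracy late in training than early on, which is the trade-off a team would want visible before deciding to keep going.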

To hear about how our speakers untangle these controversial questions and more, tune in now to Beyond Data podcast episode 3: AI’s Climate Jekyll & Hyde – friend and foe.

Vehicle Autonomy; the good, the bad, and the complicated

Written by Brendan Walsh on 7 September 2022. Posted in Beyond Data Podcast, Data Ethics, Data Insights, Data Privacy, Data Warehouse, Glossary, Machine Learning.

In our second Beyond Data podcast episode, ‘Autonomous mass transportation and its impact on citizen privacy’, we sit down with Beep’s Chief Technology Officer, Clayton Tino, to explore the current landscape of autonomous vehicles (AVs), whether AVs truly can replace the human factor in public transportation, and how AV ethics can be holistically measured. Here we give you a snapshot of that fascinating discussion by digging into a few of the explored topics.

You can watch episode 1 here

When looking at AV ethics, there are two strands to consider:

1: The ethics programmed into the AV itself (e.g., how the AV ‘decides’ which course to take when it identifies a hazard, otherwise known as the ‘trolley car’ scenario).
2: The ethics surrounding embedding AVs into society (e.g., whether we can truly replace the human factor in AVs, or what level of surveillance AVs should have).

Going beyond the trolley car scenario

Often touted as the litmus test for AV ethics, the ‘trolley car’ or ‘trolley problem’ is a thought experiment in which someone chooses whether to save five people in danger of being hit by a runaway trolley by diverting the trolley to hit one person. This is extrapolated to AVs by using a scenario such as an AV traveling down the street when suddenly a group of pedestrians runs out. The AV must ‘choose’ between hitting the group or altering its course but, by doing so, hitting a lone pedestrian.

The ‘Moral Machine’ experiment was an online survey of 2.3 million people worldwide that investigated the moral dilemmas faced by autonomous vehicles. The study found that moral principles guiding drivers’ decisions varied from country to country, and also women and men viewed ethical and moral situations differently. This made something like the trolley problem difficult to quantify and standardize worldwide.

Far from a simple ethics exercise…

On the surface, it seems a simple ethics exercise. But as Clayton Tino summarizes: “People like to think they have a preconceived notion of how they would behave, but I just don’t buy that. [A near miss] is a purely reactive response. We’re setting unrealistic expectations on the machine because we need to blame something when something goes wrong.” Tessa Jones (podcast co-host) agrees, observing: “AVs need some decision-making process, but I don’t have a decision-making process myself.”

As Sophie Chase-Borthwick (podcast co-host) explains: “We expect our AVs to be guaranteed safe. But we know that any other vehicles are not 100% safe with a human behind them. So we have a higher expectation of what ‘safe’ looks like when it’s autonomous [as opposed to] when it’s a human.”

In our opinion, the disproportionate emphasis placed on the trolley problem to solve the lion’s share of AV ethics is reductive and dangerous to advancing AV technology. It’s a useful piece of the puzzle but it’s a symptom when we should be focusing on fixing the cause.

In our podcast, we also explore the importance of accurate and timely hazard perception (both in humans and AVs). Improving hazard perception not only makes AVs safer but can also reduce, or remove entirely, the need for AVs to make the trolley problem decision in the first place.

Can we ever truly replicate the human factor?

Vehicle autonomy is commonly graded on the SAE scale, which runs from Level 0 (no automation) to Level 5 (a vehicle without a driver safely taking you to where you want to go).

For Clayton, Tessa and Sophie, the debate centers on where the application of AVs could work best with the fewest blockers. They wonder whether public transportation is an ideal choice, given how it could be geo-fenced, fixed route and hyper-local.

However, when considering AVs in the context of public transportation, they realize it’s important to look at the holistic service of public transportation, beyond just the driving. As Clayton pithily observes when considering AVs for school buses, “[Bus drivers] do a heck of a lot more than just drive the bus … they need to be aware of passenger safety and security, assistance…”.

For example, in London, there have been disputes between wheelchair users and pram users about who has first access to the space. Bus drivers (and others in charge of public transportation) are expected to act as mediators to settle these disputes. How would this be replicated in an AV with no human factor?

The answer could lie in more secure and closely governed surveillance. Having surveillance on public transport AVs could add a safety layer to minimize vandalism, protect the users and ensure the AVs remain a reliable and safe choice. Our podcasters observe the marked differences between privacy in the US and Europe, but with the introduction of GDPR-style laws such as the California Consumer Privacy Act (CCPA), there will inevitably be more scrutiny on how the surveillance data is used and stored.

However, as is often the case with autonomy, when it comes to public transport there’s no easy decision. When the human factor is removed, other allowances need to be made to fill the gap. Companies and governments need to work hard to make sure both the users and their data are protected and that these allowances do not harm the end-users or misuse their data for commercial purposes.

Our podcast delves more into the nuances and pitfalls when considering the commoditization of a public service, such as public transportation. Generally, the people who need it most are vulnerable, and unless there’s a significant level of transparency, can users be fully aware and able to consent to the wider implications of being surveilled?

To hear more about how we untangle this and much more, watch our episode on ‘Autonomous mass transportation and its impact on citizen privacy’.

How intelligent are AI tea-making robots?

Written by Cynthia Hoza on 28 July 2022. Posted in Beyond Data Podcast, Glossary, Machine Learning, Managed Cloud.

When it comes to how truly intelligent Artificial Intelligence (AI) is, it’s a polarizing debate. Either AI will solve the world’s woes or robots will rule us all – Matrix-style. But it’s all a little more complicated than Hollywood makes it seem…

Watch podcast episode 2 here

For a deep dive, do listen to our Beyond Data podcast hosted by Sophie Chase-Borthwick (Calligo’s Global Data & Governance Lead) and Tessa Jones (VP of Data Science Research & Development).

Meanwhile, in this blog we look at tea-making and social care robots to illustrate an otherwise very nuanced and arguably never-ending narrative on the ‘intelligence’ part of the AI equation.

It’s important first to consider the different types of AI:

The majority of AI is ‘narrow AI’ – a system built to perform a single, particular task. You can build lots of narrow AI systems to work together.
General AI, in comparison, is much broader – intelligent machines that can learn, perform, and comprehend intellectual tasks much like a human. This is the territory where it’s a lot less clear-cut.

Let’s unpick the gray area of ‘general AI’, by looking at what robots are capable of – and whether this makes them truly intelligent, yet…

Tea-making as a success criterion for intelligence?

Asking a robot to make a cup of tea is something few of us would think twice about, and it is hardly the first test of intelligence that comes to mind. However, scientists are doing just this, typically by:
1. Coding in the tasks a robot has to complete first (boil kettle, get cup, put the teabag in and so on).

2. Using experience-based learning to demonstrate how to make a cup of tea. When the robot doesn’t do it well or something is not done correctly, then the robot is given more examples of how to do that task.

To successfully have the robot make a cup of tea, scientists are having to build in and prescribe a lot of the parameters and tasks a robot has to complete. However, if the environment changes (for example a robot has to make a cup of tea in a different room) it would likely struggle because it isn’t familiar with the environment and the parameters.
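The brittleness described above can be sketched in a few lines of Python. This is a toy illustration, not how any real tea-making robot is programmed: the step list, the object names and the two example rooms are all hypothetical.

```python
# Hypothetical, hand-coded routine: each step depends on an object whose
# location must already have been prescribed to the robot.
TEA_STEPS = [
    ("boil kettle", "kettle"),
    ("get cup", "cup"),
    ("put teabag in cup", "teabag"),
    ("pour water", "kettle"),
]

def make_tea(environment: dict) -> list:
    """Run each scripted step; stop at the first object the robot cannot locate."""
    log = []
    for action, needed_object in TEA_STEPS:
        if needed_object not in environment:
            log.append(f"FAILED: {action} (no known location for {needed_object})")
            break
        log.append(f"{action} at {environment[needed_object]}")
    return log

familiar_kitchen = {"kettle": "counter", "cup": "shelf", "teabag": "tin"}
new_room = {"kettle": "desk", "cup": "drawer"}  # teabag location was never prescribed

print(make_tea(familiar_kitchen))
print(make_tea(new_room))
```

In the familiar kitchen every scripted step completes. In the new room the routine stops at the teabag step, because nothing in the script lets the robot infer where a teabag might be.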

Intelligence can’t just be about managing to do a task correctly; it’s being able to use inference to adapt in a new environment and navigate unfamiliar parameters to complete a task.

However, this adaptation and re-learning is a lot slower for robots than it is for humans. As Tessa Jones highlights, it’s referred to as Moravec’s paradox and essentially means it’s easy to train robots to do things that humans find hard, like chess and logic-driven tasks. However, it’s hard to train robots to do things humans find easy, like walking and image recognition.

In the podcast Sophie Chase-Borthwick observes: “Playing a game of chess is very rule-based [and easy to code into a robot] whereas making a decent cup of tea is definitely an art”.

Using a Japanese concept to make robots more human

When looking at robots comprehending tasks much like a human, what could be more human than caring for one another? Japan is leading the exploration of the use of social robotics for assisted care. However, rather than the robot just serving a functional task, Japanese scientists are building one step further…

“There’s a concept coming out of Japan – a concept called ‘kokoro’,” says Tessa. “For robots to actually be effective and useful, there needs to be a heart-to-heart connection between the human and the robot.” There are typically three kinds of kokoro you can achieve:

1. How the robot affects the human. If the human is feeling sick, whether the robot can interact in a way that lifts their spirits – for example Paro, a soft baby seal robot designed for use in hospitals and nursing homes as a therapeutic tool.

2. Whether the robot understands a human’s emotions. The robot can conceptualize when the human is feeling sad or angry. But getting this right is very difficult, as it’s hard to distinguish between anger and happiness based on imagery and voice. Microsoft has even recently stopped a lot of its programs around emotion detection, as differences in facial and vocal features open the door to racial bias.

3. When the robot itself feels and has its own ‘kokoro’. Currently, this remains confined to science fiction as it maps to ‘super intelligence.’

However, it’s worth considering the spectrum of human diversity. For example, neurodiverse people don’t always recognise what some emotions are but they are still intelligent. So recognising emotions and responding to them on its own isn’t a demonstration of intelligence.

As Sophie poignantly puts it: “Are we re-defining intelligence to suit the machines – and in doing so, carving out some humans?”.

Why data-ambitious organizations need more than a Chief Data Officer (CDO)

Written by Brendan Walsh on 4 February 2022. Posted in Data Governance, Data Privacy, Data Protection, Glossary.

The rise of the CDO

The potential value of data – if used optimally – is unquestioned.

In recent years, there has been a clear acceleration in the number of organizations keen to not only better understand their data’s potential, but also govern it more rigorously, structure it more usefully and use it more creatively.

And so, they appoint a Chief Data Officer (CDO) to drive this change.

This person – the business hopes – will “take hold of the data problem”, pulling sources and siloes together to create clarity, drive automation, place data and insights into the hands of the front line, and improve business performance and customer satisfaction.

Discussing Client Ambition

When discussing these ambitions with our clients, the excitement and optimism is clear. But what is often missed, or at best over-simplified, is the need to execute safely.

Managing the security risk to the organization is a fundamental part of a CDO’s remit. Depending on the organizational structure, it is usually shared with or delegated to a dedicated CISO or equivalent.

Similarly, compliance with industry regulations and certifications such as ISO and SOC comes under the governance aspect of the CDO role (again, often shared with or delegated to the CISO).

But what about Data Privacy?

CDOs and data privacy

In the pursuit of these ambitious data goals, while the CDO and/or CISO handle security and compliance, who will manage the privacy-related risks to the organization? And the risk to the data subjects?

What data is personally-identifiable, and therefore subject to data privacy laws?
Where is this data received from and held?
How retrievable is it?
How is it used?
Will personal data be exposed to machine learning or automated decision-making?
When and how is personal data shared?
Or disposed of?

In tackling these questions, some organisations believe the CDO can also perform the Data Protection Officer (DPO) role, or have one report into them or the CISO. Others appoint a Chief Privacy Officer, thinking they are the same as a DPO, or a “DPO+”. Others ignore the need for privacy oversight altogether.

None of these answers are wise. Some are even illegal and can result in penalties.

The truth is, most data-ambitious organizations require all three roles. Without them, data safety is jeopardised and the company is at risk of non-compliance, breaches, inefficiency and missed opportunity.

But how the remits are best defined and structured is often a mystery.

Below is a guide to the three pertinent roles – Chief Data Officer (CDO), Chief Privacy Officer (CPO) and Data Protection Officer (DPO) – outlining why each role is essential for every data-ambitious organization, plus their differences, inter-relationships, boundaries and overlaps.

Who you need

The Chief Data Officer (CDO)

Responsible for using data to best effect. The basis of this is data governance – its stewardship, consolidation, structure, management and distribution, but also the security and compliance risk it presents. On top of this lies innovation and how it can be most profitably exploited, whether through automation, analysis or data science.

The Chief Privacy Officer (CPO)

This role sits within the overall CDO responsibility and adds the perspective of privacy compliance to the CDO function, specifically in terms of any action’s risk to the company. As such, the CPO will lead on the construction of the privacy programme, its roll-out and training, and any necessary assessments.

The Data Protection Officer (DPO)

Represents the data subject within the organization. They oversee activities from data processing, assessments and employee training to ensure that none of them conflict with data subjects’ privacy rights, and as such must maintain independence from activities and reporting lines. While perhaps not technically required within your organization (for instance if you are not a public body, do not systematically process personal data as a core activity, or are not processing ‘large volumes’ of sensitive data), it is nonetheless a firmly recommended role for any data-ambitious organization with any degree of use of personal data.

Can these roles be combined into single individuals?

The CDO and CPO can be the same person, and arguably should be. This ensures that the entirety of data safety – security and privacy – is the foundation of all data use and governance, and reduces the risk of accidental non-compliance or painful retrofitting of compliance requirements.

The DPO and CDO (and/or CPO) must never be the same person, as it would create a punishable conflict of interest. They should not even be in the same reporting structure. The DPO’s role is to independently monitor and question all activities, strategic policies and objectives, which means they need the platform to challenge every level of the organization.
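As a rough illustration, the combination rules above can be expressed as a simple check over an organization chart. This is a minimal sketch, not a compliance tool: the `RoleAssignment` type, the `reports_to` mapping and the function names are hypothetical, and "same reporting structure" is interpreted here, as an assumption, to mean that one role holder sits somewhere in the other's management chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    """Who holds each role (hypothetical structure, for illustration only)."""
    cdo: str
    cpo: str
    dpo: str


def management_chain(person, reports_to):
    """Return the managers above `person`, nearest first."""
    chain = []
    seen = {person}
    current = reports_to.get(person)
    # Walk upwards until the top is reached; `seen` guards against cycles.
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = reports_to.get(current)
    return chain


def independence_issues(roles, reports_to):
    """List the ways the DPO's independence is compromised.

    The CDO and CPO may be the same person, so that case is never flagged.
    Assumption: "same reporting structure" means one role holder appears
    in the other's management chain.
    """
    issues = []
    for title, holder in (("CDO", roles.cdo), ("CPO", roles.cpo)):
        if holder == roles.dpo:
            issues.append(f"DPO and {title} are the same person")
        elif holder in management_chain(roles.dpo, reports_to):
            issues.append(f"DPO reports into the {title}")
        elif roles.dpo in management_chain(holder, reports_to):
            issues.append(f"{title} reports into the DPO")
    return issues


if __name__ == "__main__":
    # Combined CDO/CPO with a DPO on a separate reporting line: no issues.
    roles = RoleAssignment(cdo="alice", cpo="alice", dpo="dana")
    print(independence_issues(roles, {"alice": "ceo", "dana": "board"}))
```

An empty list means the assignment satisfies both rules as interpreted here. A real organization would need a richer definition of "reporting structure" than a single chain of managers.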

The risk of getting this wrong

Risk of unethical / non-compliant data processing

Our data privacy experts have often seen overenthusiasm and ambition innocently leading to personal data being misused. When nobody oversees the privacy risk to the data subject (DPO) or even to the business (CPO), and the focus is only on security, organizations can easily overstep.

Missed opportunity

DPOs and CPOs are often mistaken for naysayers, as they too often focus on limiting what can be done with data and curtailing the ambition of the CDO. In fact, the best DPOs and CPOs will support the CDO’s objectives, by suggesting innovative approaches to data use that balance ambition with risk.

Delays

If privacy is not a foundation on which data ambitions are built, then it will either be forgotten or retrofitted. The former creates risk of breaches, while the latter creates delays. Projects that lay privacy on top, rather than being designed with it in mind from the outset, risk needing costly redesign and rebuilding.

Conflict of interest

A DPO has to be independent of the day-to-day processes of data management, including its receipt, use, treatment and security. This rules out the job titles that are classically given the DPO role as a second responsibility, such as CIOs and Heads of Compliance, a practice that regulators are now punishing.

The Chief Data Officer (CDO)

Remit unique to this position:

Data governance

Ranging from data’s structure and architecture to its management and ongoing quality assurance. Accurate and efficient data governance is the foundation stone of all data initiatives. Data siloes, untidy or incomplete data and inconsistent data structures are the principal barriers to data ambitions.

Security-related risk to company

Clearly overlapping with the above, the CDO is required to identify where the ambitions for data’s structure, storage and use will create security and regulatory compliance risk. Working with the CISO – who may be alongside or within the CDO’s team – these risks then need to be mitigated comprehensively, and without obstructing operations.

Innovation / Data Science & Insights

This is the principal reason for the appointment of a CDO: using data creatively to further the aims of the organization as a whole. Building on the groundwork of data governance and security, this may be through automation, analytics, visualizations, machine learning or other forms of AI. Projects may be intended for internal efficiency, or the development of new products and services, but one truth remains at every initiative’s core: using data more intelligently.

The Chief Privacy Officer (CPO)

Remit unique to this position:

Privacy-related risk to company

While the CDO handles the security-related risk, the CPO looks specifically at personally-identifiable data, how well protected it is and how ethically / compliantly it is used. This will include determining how all the organization’s activities affect the regulations whose scope they fall under, and ensuring the various obligations are all addressed.

Clearly, this responsibility overlaps with the CDO’s security-related remit, and requires the cooperation of the CISO, as a lot (though not all) of a privacy-focused risk assessment is based in typical security technical and organizational measures (TOMs). As such, the CPO role may well be part of the CDO’s, if the individual has the relevant privacy skills.

Devise & deploy the privacy programme

This is the tactical implementation of the above. It involves the creation of policies and processes that will protect personal data in every department, by every user and with every data interaction, and specifically on an ongoing basis.

Unlike many other areas of compliance, data privacy requires continuous management and oversight. A breach of ISO compliance requirements on a given day is unlikely to jeopardise completing the next audit’s requirements and maintaining certification. In contrast, a single breach of data privacy requirements could result in customer dissatisfaction, being reported to regulators and potentially fines and irreparable brand damage. As such, the deployment of the privacy programme must ensure continuous protection.

Data Protection Officer

Remit unique to this position:

Privacy-related risk to data subjects

This is the crux of the DPO role. A Data Protection Officer is one of the few senior roles that categorically serve not the interests of the organization but those of third parties – arguably the only one. It is this unusual perspective that requires them to be independent of the mechanics of the organization, and that underpins all other responsibilities.

Oversight

The DPO is responsible for continuously monitoring all data processing activities and independently assessing their adherence to the GDPR and any other relevant legislation. Any faults or risks found are then the responsibility of the CPO and/or CDO to remedy, working alongside any relevant departmental head.

Internal audit

Part of the Oversight role above will include regular internal audits of data processing activities. An initial gap analysis will show a baseline of compliance, while subsequent periodic audits will showcase the evolving privacy maturity of the organization, plus any persistent weaknesses.

Liaison with authorities and data subjects

DPOs also act as a conduit for all communications with supervisory authorities and data subjects. They may do this proactively, for example securing approval from authorities on the legitimacy of any new and unusual data processing initiatives. DPOs will also handle the communications with any data subjects in the case of Data Subject Requests.

The Shared Remits

Shared remit: CDO & DPO

Automated decision-making

This is a crucial overlap. For many data-ambitious organizations, especially those in consumer services such as banking, telecoms or utilities, there will be a drive to use automation or machine learning to systematize interactions with customers based on the data on them as individuals. These may include the pricing and terms offered to them, which would mean that automated decisions are being made that have a legal or similarly significant effect – which is specifically limited by the GDPR and many other privacy regulations that followed in its footsteps.

This is therefore a classic example of a situation where the CDO and the DPO would have to work together to ensure that the project is legitimately designed and executed, and is highly indicative of why the DPO cannot be the same person or even be in the same reporting structure as the CDO. The CDO’s project needs to be able to be objectively critiqued and perhaps stopped by an independent DPO.

Shared remit: CDO & CPO

Ethical Data Impact Assessments (EDIAs)

EDIAs are modern supplements to the pre-existing Data Protection Impact Assessment (DPIA), and are effectively documented evidence of the scrutiny required above in instances of Automated Decision-making.

While not specifically required by privacy legislation or guidance – as a DPIA is – the sort of rigour they encompass is. As mentioned above, references are found in the GDPR and many other pursuant regulations. The extra scrutiny is recommended because of the deliberate removal of human oversight from processes, and therefore the risk of the inadvertent removal of understanding, proportionality, fairness and even values.

For a DPIA, a DPO and a CPO (see below) will collaborate on mitigating the risks to data subjects – hence the DPO’s involvement.

An EDIA’s extra considerations beyond a DPIA focus on accountability, transparency, necessity and sustainability. These are more technical, strategic and concerned with personal rights including but also beyond privacy, such as the right to not be discriminated against.

The CDO’s input will therefore cover the technical and strategic sides, while the CPO is best placed to review the technology’s ethical use. In truth, this is not a perfect fit. But there are few alternatives. A DPO’s role is to monitor activity through a strict lens of protecting data subjects’ privacy rights – and arguably their independence means their role can never be to perform assessments, only to review. Legal counsel is concerned with the application of the codified law, not the wider topic of ethics. Compliance roles are similarly used to implement specific rules and standards.

Upholding ethics is different by its nature, and not typically a nominated role within organizations, but a CPO is arguably the closest fit, not least because they lead the completion of DPIAs, on which EDIAs are based.

Shared remit: CPO & DPO

Training employees

This is part of the CPO’s deployment of the overall privacy programme, but requires the involvement of the DPO because of their responsibility for monitoring internal compliance. Acting on behalf of data subjects, the DPO will check the suitability and comprehensiveness of the training programme, in essence confirming that should the training be satisfactorily completed (the CPO’s responsibility to ensure), then data subjects’ rights are protected.

Data Protection Impact Assessments (DPIAs)

These tools identify any potential risks that may arise from processing personal data, allowing the organization to minimise and negate them in advance. They are a key requirement for demonstrating adherence to GDPR and most other privacy regulations, and should be completed for every way in which an organization processes data.

They are the CPO’s responsibility to perform, though as with the Training above, the DPO is required to provide an oversight role to ensure data subjects’ rights are protected. The DPO will advise the CPO on whether a DPIA is necessary in any given situation, how it should be performed, what measures can be legitimately put in place to negate any risks identified, and whether the ultimate decision to permit or refuse the processing is correct.

This process and shared responsibility applies equally to other privacy adherence tools such as Legitimate Interest Assessments (LIAs), where the CPO is responsible for performing the duty, while the DPO ensures their completion and verifies their outcomes.

Data Subject Access Requests (DSARs)

Some of the most common instances of CPOs and DPOs having to collaborate are on DSARs. In some industries, these are rather common, especially those with high volumes of consumer interaction such as retail, utilities, telecoms and retail banking. A CPO will be responsible for the performance of the DSAR – for example, verifying the identity of the data subject and collecting relevant data – while the DPO will be responsible for overseeing the process, approving the data to be shared, ensuring deadlines are met and handling communications with the data subject.

The Universal Responsibilities

Data Quality

All three Data Officers have a responsibility for – or at least a vested interest in – maintaining the continuous quality of all the organization’s data.

For a CDO, this is of course a principal strategic objective. Better use of data relies on data sources being cleansed for interrogation, and probably integrated under common data models to allow for deeper insights. But without continuous data governance – the process by which data quality is preserved – then interrogation becomes impossible, and integrations fall apart.
Data quality requires common rules – defined and upheld ultimately by the CDO – for how data is collected and stored; agreed responsibilities for how it is maintained and kept complete, credible, useful and clean; and a clear vision for how it may be used.
The CPO and DPO will also have involvement in this, and vested interests in its performance. How and where the CDO decides to store data will need to adhere to data residency and sovereignty requirements. Data privacy regulations routinely give data subjects a Right to Accuracy, where every reasonable step must be taken to rectify data inaccuracies or erase data if no longer correct. And of course, without complete, clean and credible data, then DSARs cannot be accurately performed, and DPIAs and other typical processes cannot be conducted or verified easily.

DPIAs in fact even have a specific question of:

“Are you satisfied that the personal data processed is of good enough quality for the purposes proposed? If not, why not?”

Of course, the easiest way for Data Quality to serve all three Data Officers’ needs is to base the organization’s Data Quality framework on the principles of Privacy by Design & Default.

Contracts

While the above is a strategic imperative that requires all three Data Officers’ involvement, this is a tactical overlap.

Contracts with new suppliers, partners, and potentially customers that inherently involve the processing of personal data create responsibilities for CDOs, DPOs and CPOs alike.
A CDO needs to ensure that the contract and the mechanics of the engagement will not undermine or contradict any element of data governance. For example, if the new contract is with a new cloud services provider, can the provider support any ISO, SOC or PCI obligations? If the contract is with a new CRM, is the data structure consistent with any pre-existing common data model and how will data quality and accuracy be maintained? And in all cases, what security measures are in place to protect data from internal and external threats?
Meanwhile, a CPO will be concerned with whether the contract is in line with the organization’s privacy obligations. To use the example of the new cloud provider again, will data residency obligations be met? Or for new SaaS platforms, where will data be stored and are the correct cross-border data transfer mechanisms such as Standard Contractual Clauses (SCCs) in place?
Finally, a DPO’s role in a contract scenario is to review the legitimacy of the decisions made above, and verify that the privacy of data subjects’ personal data will not be jeopardised – regardless of whether the organization is a controller or a processor in the given scenario.

The Core Lessons

All three roles – CDO, CPO, DPO – are probably required in your organization. Even if a DPO is not strictly required, it is nonetheless advisable.
The CDO can also be the CPO, but the DPO must be independent.
The CDO defines the strategy and is responsible for the vision of what is to be accomplished with your organization’s data. This will include its structure, security, governance, maintenance and creation of value.
The CPO is responsible for ensuring that the implementation of this strategy will not put the organization at any privacy-related risk, and is tasked with mitigating any risk with a defined and well-executed privacy programme.
The DPO is the representative of the data subject within the organization, and is primarily responsible for overseeing the activities and ensuring no rights are or could be infringed.
The more fundamental or complex the operation (such as data quality or intelligent data use), the more likely it is to require all three roles.
Putting privacy – and better yet, total data safety – at the heart of every data initiative and interaction will make it more likely that every role’s agenda is equally met.
