Data Sovereignty Unveiled – Balancing Rights, Privacy, and Innovation
Written by Cynthia Hoza on . Posted in Beyond Data Podcast, Data Ethics, Data Insights, Data Privacy, Data Protection, Machine Learning.
In this episode of the Beyond Data podcast series, Tessa Jones (Calligo’s Chief Data Scientist) and Peter Matson (ML Solution Architect) are joined by Martin Hoskin, Chief Technologist at VMware and Advisory Board Member for the Centre for Data Ethics & Innovation. In this enlightening discussion, we delve into the concept of data sovereignty and its implications for ethical data use, as well as explore how federated learning offers a promising solution to the challenges we face.
Understanding Data Sovereignty
Data sovereignty encompasses the notion of data residency, access control, and governance. The dominance of American cloud providers, subject to U.S. laws, raises concerns about data privacy and security, particularly in the European context. For certain organizations, like government agencies and defense suppliers, data sovereignty becomes a critical factor. VMware has introduced a program to certify partners as Sovereign, ensuring data storage, processing, and governance are specified, differentiating them from major hyperscale cloud providers.
The Challenge of Data Sharing
Data sovereignty also touches upon the ethical dilemma of sharing data for legitimate purposes like law enforcement investigations. Striking a balance between data privacy and the greater good is complex. For instance, the case of Apple’s cloud security raises questions about when governments should access personal data to combat serious crimes.
Federated learning emerges as a promising solution to data sharing challenges. This approach enables entities to collaboratively train machine learning models without sharing raw data. Instead, local models are trained on separate datasets, and only aggregated model updates are shared with a central server. This preserves privacy and protects sensitive data, making it suitable for applications like fraud detection in the banking industry.
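To make the idea concrete, here is a minimal sketch of one common aggregation scheme, federated averaging, in plain Python. It is an illustration under stated assumptions, not the approach discussed by the podcast guests: the one-parameter model, the two "banks" and their data, and the function names are all hypothetical. Each participant trains on its own records, and only a model weight and an example count reach the server.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one participant


def local_update(weight: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Train a one-parameter model y = weight * x using local data only."""
    for _ in range(epochs):
        # Mean gradient of the squared error 0.5 * (weight * x - y)^2 over local data
        grad = sum((weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_average(updates: List[Tuple[float, int]]) -> float:
    """Combine local weights, weighting each by its number of local examples."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def run_rounds(clients: List[Dataset], rounds: int = 50) -> float:
    global_weight = 0.0
    for _ in range(rounds):
        # Each client starts from the shared model and trains on its own data
        updates = [(local_update(global_weight, data), len(data)) for data in clients]
        # Only (weight, count) pairs reach the server; raw (x, y) rows never do
        global_weight = federated_average(updates)
    return global_weight


# Hypothetical data: two "banks" whose private records both follow y = 3x
bank_a = [(1.0, 3.0), (2.0, 6.0)]
bank_b = [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]
learned_weight = run_rounds([bank_a, bank_b])
```

In this toy setup the shared weight converges towards 3 even though neither bank ever sees the other's records. Real deployments typically layer further protections on top, such as secure aggregation or differential privacy, because model updates themselves can leak information.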
Experimenting with Federated Learning
The Centre for Data Ethics & Innovation (CDEI) conducted an experiment using federated learning for government-provided services. The CDEI set up two data sets: one for detecting fraud in financial transactions using SWIFT data and another for studying the spread of COVID-19. The experiment highlighted the complexities of sharing data, including obtaining government buy-in and ensuring data anonymization to protect privacy.
While federated learning is ingenious, it comes with its own set of challenges. Concerns arise about the aggregator potentially being reverse engineered to extract sensitive information. That said, the scale of data involved in real-world applications may make such reverse engineering more difficult.
As data continues to play a critical role in various industries, addressing data sovereignty and privacy concerns remains paramount. Federated learning offers a way to enable collaboration without compromising data privacy. However, continuous innovation is necessary to tackle challenges like reverse engineering and fully realize the potential benefits of this approach.
Ethical Considerations in AI and Data Technology
The conversation takes a broader turn, exploring the intersection of AI, data, and ethics. AI development should consider risks, probabilities, and potential biases to build robust and ethical systems. Ethical implications of sharing genetic data and the responsibility of pharmaceutical companies in handling such information are discussed.
Regulating AI Ethics and the Divide between Academia and Industry
The need for clear regulations to define and enforce ethical standards in AI and data technology is acknowledged. Balancing philosophical academic perspectives with industry practicality becomes essential as AI progresses toward stronger AI with self-learning capabilities.
Navigating Legal Frameworks and Data Sharing in Healthcare
Enforcing ethical standards and regulations on a global scale, especially with rogue states, poses challenges. Collaboration through global forums, like Gaia-X, can facilitate trust and data security while allowing for individual interpretations of frameworks. Standardized data-sharing frameworks and data portability regulations can address data sharing challenges in healthcare.
Autonomous Weapons and the Role of Global Forums
The ethical challenges of deploying AI in autonomous weapons, especially in making life and death decisions, raise profound moral dilemmas. The hosts stress the importance of engaging in public discourse and involving the global community to shape AI and robotics’ future.
The Impact of Social Media on Data Privacy
The podcast concludes with a discussion on the influence of social media on data privacy and the ethical considerations surrounding its use. The hosts highlight its impact on young minds and the potential implications for decision-making, including voting rights for 16- and 17-year-olds.
In conclusion, data sovereignty, AI ethics, and federated learning are crucial components of an evolving data landscape. Ethical considerations must be at the forefront of AI development and data sharing to ensure responsible and equitable data-driven futures. By embracing ethical practices and fostering interdisciplinary collaboration, we can harness the potential of AI while respecting individual rights and privacy. Establishing global forums and transparent public discussions will play a pivotal role in shaping the future of AI and robotics in a manner that benefits humanity as a whole.
Powering up ESG through digital transformation
Written by Cynthia Hoza on . Posted in Data Ethics, Data Insights, Data Strategy, Glossary.
The term ‘ESG’ (Environmental, Social and Governance) is everywhere. In its own right, the potential impact is important enough, but it can so often be viewed as a standalone initiative. At its worst it becomes a tick box exercise, when in fact its real benefit is in informing and driving fundamental changes in your organization’s wider actions and endeavors.
ESG – good for the planet, good for business
In January 2023, the EU’s Corporate Sustainability Reporting Directive came into effect. Under its terms, all large companies and all listed companies (except micro-enterprises) must disclose information on the risks and opportunities arising from social and environmental issues, and their impact on people and the environment.
Set against this we have an AI revolution taking place – witness the activity on LinkedIn, with almost every other post lauding the benefits of some ChatGPT derivative or similar, leading to something of an AI feeding frenzy.
Looking through an ESG lens, the environmental impact of AI is huge. According to calculations by the specialist in sustainable data science, Kasper Groes Albin Ludvigsen, published on Medium at the end of 2022, ChatGPT could have consumed as much electricity as 175,000 people in the month of January 2023 alone. Equally, there are numerous articles that reference AI’s huge water impact.
One thing is clear. Whilst there can be many positive outcomes and by-products from AI on ESG, the true end-to-end cost of this next wave of Digital Transformation is not yet well understood.
Given we are still trying to get to grips with the effects of the Industrial Revolution from an environmental perspective, how good is humankind’s track record of not repeating the mistakes of the past? How can we exploit opportunity without understanding the true cost and impact?
Wider business benefits of ESG
Developing an ESG strategy that is in harmony with your Digital Transformation yields multiple advantages. And whilst ESG reporting is now mandatory for large and listed companies in the EU, the reporting itself helps quantify the benefits that exist for every party:
- Investors. Many investors place great importance on ESG reporting and an overall strategy
- Customers. Consumers are increasingly concerned about the companies they place business with, and ESG is becoming far more important in their decision making
- Suppliers / Supply Chain. Companies are receiving more requests for information on their ESG credentials, capabilities and response. They must be able to demonstrate their end-to-end position when reporting, driving positive change throughout the supply chain
- Employees. Recruiting and retaining talent can be difficult, expensive and disruptive when there are issues with ESG policies. Research indicates that as many as 47% of employees would look for new roles if their organization is not proactive here
- Market reputation. Creating a strong reputation and a positive view of a company takes time and effort. Negative disclosures around ESG will quickly damage reputations, whereas positive ones will confer competitive advantage
Balancing potential conflicts between digital transformation and ESG
Detractors of ESG will point to the irony that a robust ESG process itself has an environmental impact: data centers in the EU consume more than 2.7% of the bloc’s electricity. And the Ukraine war has highlighted that the geopolitics of power supply will increasingly affect decisions on data processes and sovereignty – when Cloud storage and transfer require so many terawatts of electricity, securing a good price must be balanced against political and geographic risk.
Digital transformation is, by its very definition, a process of huge change. Done right it unlocks competitive advantage, delivers cost savings, drives productivity, opens up new opportunities and delivers compliance with ESG obligations. But done half-heartedly or implemented sporadically it will almost certainly be a huge waste of time, effort and resources.
Deloitte calculates that digital transformation could unlock as much as US$1.25 trillion in additional market capitalization across all Fortune 500 companies. However, done incorrectly, market value could actually be eroded, putting more than US$1.5 trillion at risk.
Prior preparation prevents poor performance
When it comes down to it, successful digital transformation requires only three things:
- An agreed plan
- The right tech platforms
- A joined-up approach
And whilst that sounds simple, it involves significant planning and project management resources. It’s not possible to retrofit a digital solution onto your existing processes – a successful digital transformation requires a center-out approach, incorporating data privacy and protection and considering ESG objectives at the very heart of policy and technology.
When digital transformation is done correctly, “it’s like a caterpillar turning into a butterfly,” but when done wrong, “all you have is a really fast caterpillar.”
MIT Sloan Professor George Westerman
ESG at the heart of the digital transformation process
The comprehensive and insightful data analysis and management required to power your digital transformation needs a huge team of business experts, platform designers and technology specialists, all following a clear process:
- Develop an agreed, business-wide strategy
- Create and share a roadmap
- Define the metrics of success, and measure them
- Build user-friendly dashboards and data analytics
- Use optimal data platforms and cloud services
- Ensure data privacy and protection
- Set and track ESG targets. Not only does ESG need to be considered, it needs to sit right at the heart of digital transformation, informing and guiding the entire organization
Simply ‘ESG washing’ operations with fancy reports is both ineffective and expensive. That’s why Calligo ensures that every digital transformation we drive is engineered with careful attention to its environmental impact. Future-proofing your data use in a way that protects everyone’s future.
To help you navigate the expansive topic of digital transformation, we’ve put together a comprehensive eBook, outlining all the key considerations for your organization. And if all this sounds daunting, don’t worry – we’ve seen plenty of similar challenges. Data privacy, for example. Once seen as a vague afterthought or something for someone else, today it takes center stage – the concept of Privacy by Design even has its own ISO standard (31700). Understanding the end-to-end ESG impact of Digital Transformation is heading the same way.
If you want to learn some more, or if you want specific advice, consultancy support or technical implementation, why not talk to our experts, who can get your digital transformation journey underway?
Security SOS: It’s dangerous to view cloud and data separately
Written by Cynthia Hoza on . Posted in Cloud, Cloud Migration, Cloud Strategy, Data Insights, Glossary.
Security risks within the IT infrastructure of global businesses are increasingly prevalent – and damaging. When swathes of data are separated in the hybrid or multi-cloud, it can leave big open doorways for malware to walk right in.
The message I want businesses to hear is that cloud and data are not separate. IT only exists to service the needs of a business’ data. Securing cloud services – and therefore your data – is a business-critical issue.
Read on to understand:
- The limitations of AV
- The dangers of remote networks
- The cost of getting security wrong
1. Blind faith in AV
Businesses are too often putting their faith in antivirus (AV) software. This is unintentional blind faith, in my opinion. The problem with AV software alone is that it does not go far enough to protect businesses’ data assets; it only detects known threats and is not reliable against new variants. We speak to a lot of businesses that assume their security box is ticked, thanks to AV software alone.
But what about the zero-day attacks that make up most data breaches these days? A zero-day vulnerability is a computer security vulnerability unknown to antivirus software creators; they’ve had ‘0’ days to work on a security patch or an update to fix the issue. Zero-day attacks leverage innovative multi-layered approaches – like the misuse of BitLocker encryption – that haven’t been seen before; anomalies that business software can’t easily detect and protect against without human intervention.
The need for human- and AI-based security operations centers (SOCs) is increasing, but the cost of implementing one internally is high and the skills are in short supply. Going without can cause complications when trying to get pay-outs from cyber security insurers, because the business hasn’t invested in a higher level of threat protection.
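The gap between known threats and new variants can be illustrated with a deliberately simplified sketch of exact signature matching. This is an assumption-laden toy, not how any particular AV product works (real products also use heuristics and behavioural analysis); the payload bytes and the names `SIGNATURE_DB` and `is_flagged` are hypothetical.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 fingerprint of a payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


# Hypothetical "known bad" sample whose fingerprint is already in the database
KNOWN_SAMPLE = b"example-malicious-payload-v1"
SIGNATURE_DB = {sha256_hex(KNOWN_SAMPLE)}


def is_flagged(payload: bytes) -> bool:
    """Flag a payload only if its exact fingerprint has been seen before."""
    return sha256_hex(payload) in SIGNATURE_DB


# A variant that differs from the known sample by a single trailing byte
variant = KNOWN_SAMPLE + b" "

known_detected = is_flagged(KNOWN_SAMPLE)  # fingerprint is in the database
variant_detected = is_flagged(variant)     # fingerprint is new, so it slips through
```

Because any change to the payload produces a different fingerprint, a detector that relies on exact signatures alone has nothing to match a never-before-seen variant against.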
Against this backdrop, AV is like wearing chain mail with a gaping hole in the front.
2. Leaving doors open in our remote working world
Unsurprisingly, zero-day vulnerability is greater in our remote working world. Weaker control systems, attacks on remote working infrastructure, sensitive data accessed through unsecured Wi-Fi networks, expanded attack surfaces, the use of personal devices… The list goes on. SaaS in one corner, Office 365 and Dynamics CRM in the other. Servers, software and data – here, there and everywhere. Not to mention outdated legacy operating systems.
Businesses have previously relied on remote access virtual private networks (VPN) for users – but this creates a tunnel between devices and company networks that’s hard to secure adequately. It also means a laptop or personal device can easily become a conduit. A virus or malware can scan for open communication channels – and find its way easily into a corporate environment. If your business IT environment has modern applications, your security must also be modernised. And fast.
This is where Zero Trust Network Access (ZTNA) can come into play to secure access to internal applications for remote users. ZTNA gives remote users connectivity to private apps without placing them on the corporate network or exposing the apps directly to the internet.
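As a rough sketch of the principle, the snippet below checks every request against a per-application policy and denies by default, instead of trusting a device because it sits inside a network tunnel. It does not represent any vendor's product; the `AccessRequest` fields, the `APP_POLICY` table and the user names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class AccessRequest:
    user: str
    device_compliant: bool  # e.g. patched OS and disk encryption (assumed checks)
    mfa_passed: bool
    app: str


# Hypothetical per-application allow lists: access is granted app by app,
# never to the network as a whole
APP_POLICY: Dict[str, FrozenSet[str]] = {
    "crm": frozenset({"alice", "bob"}),
    "payroll": frozenset({"alice"}),
}


def authorize(request: AccessRequest) -> bool:
    """Evaluate one request; anything not explicitly allowed is denied."""
    allowed_users = APP_POLICY.get(request.app)
    if allowed_users is None:
        return False  # unknown application: default deny
    return (
        request.user in allowed_users
        and request.device_compliant
        and request.mfa_passed
    )
```

For example, `authorize(AccessRequest("bob", True, True, "payroll"))` is denied even though the same user and device are allowed into the CRM, which is the point: a compromised laptop does not automatically become a conduit to everything else.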
It’s about changing the architecture to be as secure as possible for the modern way we work.
3. The financial – and reputational – costs
Under British data protection laws, for example, a company could face a fine of up to £17.5 million or 4% of its global turnover, whichever is higher, if the Information Commissioner’s Office (ICO) finds that it has failed to meet its data protection duties. This is not news. But despite the serious risk this poses to a business, many organisations still have an ‘it won’t happen to me’ attitude.
Zero-day attacks – or any type of data breach – can be hugely costly for a company. We know, because we’ve had big business customers who’ve been in this predicament (not on our watch, I hasten to add!). Add GDPR into the mix, and uninformed reliance on AV and cyber insurance, together with a lack of control over remote networks, has landed many in trouble with the regulators. Hefty fines – and reputational damage.
Businesses that value their data need to value security, first and foremost. And that starts in the cloud.
Lie Machines – The global fight against misinformation
Written by Cynthia Hoza on . Posted in Beyond Data Podcast, Data Ethics, Data Governance, Data Insights, Glossary, Machine Learning.
Exorcizing the ghost in the machine
In this latest podcast in our ‘Beyond Data’ series, Tessa Jones (Calligo’s Chief Data Scientist) and Peter Matson (Data Science Practice Lead) talk with Oxford University’s Professor Philip Howard about the threats posed to democracy by technology, specifically in the shape of Lie Machines.
Fact or fiction? Microtargeting with lie machines
In this age of social media, chatbots and AI it’s never been easier for individuals to share their opinions. Instant communication to, and engagement with, a global audience is now commonplace, and it seems there’s no need to let facts get in the way of a good angle. As Mark Twain, or maybe Winston Churchill, or most probably Jonathan Swift famously said, “a lie can travel halfway around the world whilst the truth is still putting on its shoes.” A great example in itself of the ease with which misunderstandings and misappropriations can become canon.
In this vein, Professor Howard has spent years studying the mechanisms by which opinion, behavior and values can be manipulated and misdirected by lie machines:
“Lie machines are large, complex mechanisms made up of people, organizations, and social media algorithms that generate theories to fit a few facts, while leaving you with a crazy conclusion easily undermined by accurate information. By manipulating data and algorithms in the service of a political agenda, the best lie machines generate false explanations that seem to fit the facts.”
Lie Machines: How to Save Democracy from Troll Armies, Deceitful Robots, Junk News Operations, and Political Operatives
We find lie machines in all types of countries and governing structures. They share common elements – political actors produce the lies, social media firms distribute them, and paid consultants market them. High profile examples of the effectiveness of the lie machine include the UK’s Brexit campaign, and Trump’s electioneering – in both cases patently untrue ‘facts’ and arguments were targeted at key voters by disinformation networks, troll farms and lie machines. Algorithms direct individuals towards ever-more insular sources and extreme content:
“A healthy, public-facing algorithm might occasionally introduce another credible source… we know the platforms play around with this stuff, especially during elections in the US”
Controlled by bad actors and forming a global ecosystem of lie development and propagation, these lie machines spread their tendrils across every social media platform, moving out from Facebook as new outlets develop.
Computational propaganda
Lie machines have evolved and finessed themselves as technology advances. Instead of stealing the photos, social media handles and biographies of real people, AI now generates new pictures and personas and thus evades technology platforms’ troll-spotting software.
Spreading propaganda far and wide, with a convincing voice, the lie machine:
- Has a profound effect on society, with a scale that is difficult to quantify
- Is perfectly engineered to target human vulnerabilities, reducing critical thinking
- Deliberately misrepresents and appeals to emotions and prejudices, using our cognitive biases to bypass rational thought and create echo chambers
- Is vague and unknowable – what training data was used for large language models? (Professor Howard postulates that every Gmail sent over the last 25 years may have been scraped, along with content from junk news sites)
Doing better – where does the onus sit? User or developer?
When it comes to developing processes to combat the lie machine, there’s no single piece of legislation or guiding principle that works. We must always consider the regional and cultural context of both data and users. Research from different regions and countries can’t necessarily be amalgamated or directly compared – for example, we know that the placebo effect is always greater in US medical studies. To date, technology has not always accounted for cultural nuances in how people use words, so intent and meaning are lost in translation – the majority of network takedown orders are for sites that are not in English.
Wherever there is human input, there are behavioral differences that make it much more difficult to apply common rules:
“People who manage cookies are above average in terms of their knowledge of technology, so these people are generally more purposeful in terms of how they set up their news feeds and where they go for information”
The huge amount of disinformation spread around Covid and the resulting vaccination campaign demonstrates how potent the lie machine is. It doesn’t need to convince people its argument is right; all that is required is to introduce enough doubt, to highlight that there is a chance of harm. After all:
“If everybody really understood probability, nobody would ever buy a lottery ticket”
Balance the field – breaking the lie machines
Professor Howard believes that whilst we are justified in our concern about the threats to democracy, the principles behind the lie machine can be harnessed for good – promoting topics that are in the public interest and generating democratic discourse:
“I am cynical, but not fatalistic”
He describes the steps we can take to break the lie machines:
- Public policy oversight, founded in ongoing public data capture and analysis
- Designing social media to highlight emerging consensus, rather than heated conflict – machine learning can amplify common ground
- Setting election guidelines to create more opportunities for civic expression
- Giving journalists, civic groups and researchers access to all the public opinion data that is currently in the hands of the technology firms
- Ensuring that the big data collected by technology platforms is added to public archives
The answer is more social media, not less. But it needs to serve society much better.
IPIE – bringing down the lie machine
Professor Howard has recently launched a new program, creating an independent scientific body to foster global cooperation in safeguarding the online information environment. The International Panel for the Information Environment (IPIE) will assess the scope of the misinformation crisis, analyze its effects on our societies and the planet itself, and propose solutions. Featuring data scientists and engineers alongside neuroscientists and sociologists, IPIE hopes to be the beginning of a global effort to save our common information environment.
Watch the podcast for yourself below to hear more from Professor Philip Howard about the power of the lie machine, and crucially, to learn how we can use it for the collective good.
Professor Philip Howard is a social scientist with expertise in technology, public policy and international affairs. He is Director of Oxford University’s Programme on Democracy and Technology, a Statutory Professor at Balliol College, and he is affiliated with the Departments of Politics and Sociology. Currently, he is also a Visiting Fellow at the Carr Center for Human Rights at Harvard University’s Kennedy School.
Making complex data available for the benefit of society
Written by Cynthia Hoza on . Posted in Beyond Data Podcast, Data Governance, Data Insights, Data Privacy, Data Strategy, Data Visualization, Glossary, Machine Learning.
In Calligo’s latest Beyond Data podcast, Tessa Jones (Chief Data Scientist) is joined by Dr Ellie Graeden, Research Professor (Center for Global Health Science and Security) at Georgetown University. Here we explore some of the episode’s highlights:
- The inherent conflict of private data and the public good
- Protecting individual rights within federated learning
- The importance of effective communication and a common language
- Designing systems and policies that work together
- Focusing regulation on outcomes, not creating data siloes
At societal level, poor communication costs lives
Transitioning data across and between departments and data systems has historically been fraught with problems – who owns it? Who pays for it? Is it understandable and translatable into meaningful and actionable insights for the end user?
Having worked extensively in disaster response, Dr Graeden has seen first-hand the potentially life-threatening issues that can arise when government departments’ data platforms produce incompatible outputs:
- If 20,000 people need water, how many pallets need to be shipped?
- If 10,000 electricity meters have been knocked out by a hurricane, how many people need feeding?
In such scenarios, identifying individuals amongst population-level data is crucial if the help provided is to be sufficient.
“We have to be able to really effectively move and communicate and share data that are relevant, in ways that they can get used by people all across the system”
Of course, any data system design should ensure privacy and protection for personal data. ‘Big data’ is still relatively new, and as such more powerful and widespread regulatory controls are now being introduced, although the US still does not have consistent requirements for how data should be handled. Fundamentally, meeting a population’s needs today, and planning for them tomorrow, requires the data of individual people to be analysed. Personal data must be shared quickly, effectively and all the while protecting individual rights. Data system design must therefore:
- Include all players
- Consider cultural constraints
- Keep out bias
- Ensure the right words and phrases are used
- Focus on the ‘so what’, why does it matter?
“Every single thing we experience can be captured as data”
Even the most mundane moments in our daily lives leave a digital footprint; we shed data everywhere. But when does ‘my’ data become public, or the property of the software developer or the service provider? VR headsets collect ephemeral data that is analysed and applied for that one end user, but if that data is assumed to fall under GDPR the potential to use it for positive outcomes is severely limited. For example, should authorities be notified if content viewed and generated is illegal or harmful? And what if the headset’s chip can detect that the user is having a stroke? Is that data classified as ‘health’ data? Can it be used to alert the individual to their medical emergency without contravening legislation? What if your mouse clicks can detect the early stages of Parkinson’s? Should you, could you, be told?
“If you’re treating this data as health data, then they have a very different set of regulatory constraints. HIPAA isn’t going to regulate those because it’s not a health care provider or a health insurer”
Piercing the veil
The conflict between personal protection and public good is everywhere, and Dr Graeden believes that some new data laws will create problems for federated learning. Legislation has clear boundaries (speed limits, blood alcohol levels) whereas science deals in spectrums, probabilities and unknowns.
Deleting an individual’s personal data from the model breaks the system, contradicting what regulators are trying to achieve. The solution is to prioritize outcomes, not processes – it doesn’t matter whether you write the rules with a pen and paper, or with AI, as long as you write the rules. Expanding the framework by setting gradients of data availability affords protection for individuals, whilst making data available that informs better decision making for public bodies.
“Data is nothing more, nothing less, than an abstract description of our world. A useful and powerful language that can tell us things that other languages don’t”
Data can no longer exist in siloes if it’s to be useful to society
There is now a healthy global appetite for the discussion around data, thanks in the main to two recent developments:
- Covid gave us huge amounts of data about mortality levels, vaccination rates, hospitalisation trends – all of which were in the public consciousness every day
- AI and ChatGPT – articles and debates about the pros and cons are everywhere, discussion is not just in the scientific community
The key challenges now for data scientists are expectation management and communication – we need to be clear about aims and specific about context, as well as knowing what to leave out to avoid overwhelm and misunderstanding. Unfortunately, scientists are not always great communicators (using complex terminology and detail, rather than common parlance and generalization) as Covid demonstrated:
- Did having a vaccine mean you wouldn’t get sick? Or just less sick?
- ‘Everyone should wear a mask’ became ‘wear a mask if you can’. This was due to limited supply, but it appeared that the science was not clear
“The scientific approach means you never have an answer… we are trained as scientists to focus on the fact that we don’t know”
In fact, the only answer is that the right data, used consistently and communicated clearly, will always allow us to be prepared, not reactive. To make decisions for the public good that protect every individual.
You can find out more about the common language of privacy in our Rosetta Stone eBook.
You can also watch Tessa’s fascinating podcast with Dr Graeden below.
“A data viz expert is like a language translator.”
Written by Brendan Walsh on . Posted in Data Insights.
Timerie Bahler is no stranger to digging deep into the data of organizations – from telecommunications to trucking and finance companies. Many different industries, with many different challenges. What they all have in common is that somewhere in the data there’s always something new to discover that has the power to enhance operations and bottom lines. And that keeps Timerie motivated, professionally, as she turns ostensibly hidden information into actionable insight.
Interpreting the hidden language
She likens being a data visualization expert who interprets ‘hidden’ information to being a “language translator” – with a special ability to communicate and uncover what’s not in a client’s line of sight. Calligo’s clients know they have the data and that the insights are in there somewhere, but they need help to see them – and then act on them. At other times, she says she feels more like a chef.
“If someone goes to the grocery store and buys a ton of ingredients – fresh vegetables, herbs, rice, and meat – and they then munch on the raw food, they will be fed and kept alive,” explains Timerie. “But, add a chef into the equation and they can be served up an amazing meal. The pieces can be turned into something meaningful. Whatever level of data literacy they have, if clients aren’t utilizing visualization, then they’re lacking the chef – who will transform many different bits into something people will connect with and remember.”
Skill and passion
It’s important to point out that being data visualization ‘chefs’ takes extraordinary skill and is not an easy task. Not satisfied with creating game-changing dashboards for their clients, Calligo’s data viz team recently took on the challenge of an internal competition mimicking the rules of Tableau’s ‘Iron Viz’ (the world’s largest data visualization competition). Each person had to submit a piece of storytelling, based on data sets on the theme of ‘Health and Wellness’. Timerie – then only in her first year as a professional Visual Analytics Consultant – picked the topic of men’s mental health. You can see the end result, ‘Tough Guys’, here.
“I am proud of it from a design and meaningful content perspective. The data came from a survey run by a government department. I got a lot of codified numbers – including a 500-page codebook on how to interpret the answers. This was not just an exercise in storytelling or data visualization; it was a deep dive into information that I had to clean, interpret, find stories in – and then communicate.”
On the front foot
And she does this for her clients, too. A logistics company she’s worked with recently needed to track the safety risks of its operators and pinpoint any issues before they arose – like whether operators were on top of their training. All this tied directly into the company’s safety rating – and therefore, its funding.
“On many projects, the ROI of data visualization is obvious,” explains Calligo’s Visual Analytics Consultant, Timerie Bahler. “Providing insight for this logistics company with easy to interpret data dashboards was a critical step forward. And the ROI was clear cut. We could demonstrate things like man hours saved and incident reduction. There are other times when data visualization makes it easier for people to do their jobs well. They can see trends and make informed decisions. This doesn’t always have an obvious price tag; it can, however, save organisations a huge amount of money in the long run.”
Timerie’s work ranges from communicating corporate and human stories to animals: her first ever dashboard was about fatal bear attacks, demonstrating that there really is no topic that data visualization experts can’t work their magic on.
In her own words: “If something is memorable, that’s how you create value – in business and in life.”
A picture speaks a thousand words
Written by Brendan Walsh. Posted in Data Insights.
Deep within data lie stories that can help businesses of all shapes and sizes see hidden detail – and act on it. Take a US healthcare provider, for example, who came to us with a pressing issue: the greatest cause of patient dissatisfaction was waiting times. When were the longest peaks? Where was the epicenter of the backlog? And once this was known, what targeted processes could be introduced to speed things up? Our data visualization experts revealed all this, and much more, in a real-time dashboard that reduced wait times by 13%.
So talented are our data viz specialists that they can turn pretty much anything into a dashboard that packs a punch and educates. Like Yash Shah, our Data Analytics Consultant – whose ‘World of Tea’ was named Viz of the Day (VOTD) by Tableau recently.
“Tea is always central to any meaningful conversations which I have in my life,” explains Yash. “And so, when I was back in India chatting to some friends, with a cup in hand, the idea of creating a data visualization about tea struck me suddenly, as it’s such an integral part of my every day. I wanted to visually display all the information I could find out about it – and share that with others. There was so much to do with the data: 1,500 flavors of tea just for starters – 200 of which come from India.”
We’re biased, but to us, it’s clear why Yash won the VOTD accolade for his ‘World of Tea’. It’s certainly no mean feat. Tableau’s expert team scrutinizes many data visualizations every day; competition is stiff. The winners are added to the Tableau Public website, shared via social media streams, and sent out via email to VOTD subscribers. Put simply, they’re seen by thousands of people worldwide, providing inspiration and learning. It’s this immediate knowledge-sharing impact of data visualizations that Yash believes is so powerful – whether in the creative or corporate world.
“For businesses, data that is translated into these kinds of visuals gives great clarity to weighty subjects,” says Yash. “Images can give a huge boost to concepts and information; complex narratives can be conveyed cleverly in an instant. The impact is great.”
Originally from Mumbai, Yash started using Tableau when he moved to Canada to study a graduate program in Data Analytics and Data Science at Durham College. He’d taken a few courses on it back in school. Then, when he landed an internship, he decided to hone his skills on the platform – and use it to create ‘out of the box’ visuals in his spare time. Now that he’s a fully-fledged member of the Calligo team, he’s determined not to lose any of the expertise he’s taken so long to fine-tune, and to continue to tell stories through the power of data visualization.
“Storytelling is important to me, whether I am creating a visualization for a client, or as a hobby outside of work. I want viewers to be pulled towards the images – glued to them and absorbed in the content. For me, a picture speaks a thousand words.”
Back to Yash’s day job, and this is exactly why clients come to our data visualization team – to decipher ostensibly complex data sets. Instead of hidden insight, these (often beautiful) creations bring collective understanding and clarity – enhancing business performance across a huge variety of sectors.
You can download our data visualization portfolio here – and discover how we’ve helped global clients cut through the noise with actionable visual analytics.
The Jersey Transform 2022 Event
Written by Brendan Walsh. Posted in Cloud, Data Ethics, Data Insights, Data Privacy, Machine Learning, News.
The Channel Islands’ Premier Data & Cloud Strategy Event
Join The Channel Islands’ Premier Data & Cloud Strategy Event – Transform 2022
Our speaker line-up includes Professor Hannah Fry, a Professor in the Mathematics of Cities, science broadcaster, and winner of the prestigious Zeeman Medal.
- Venue: The Royal Yacht Hotel
- Location: Weighbridge Pl, St Helier, Jersey
- Date: 30th November 2022
- Timings: Conference from 1.30 pm-5 pm, cocktails and canapes from 5.30-7 pm
Please Note: This event is for business leaders, and spaces are therefore limited.
To secure your exclusive place, register here.
Join your peers from across the Channel Islands and get past the buzzwords to learn what business intelligence (BI) really means for today’s businesses, and why it is a strategic imperative for leadership teams and not just your IT teams.
You will learn about Data and how to unlock its true power, covering:
- The trends in Cloud technology and the business advantages to be gained from them
- Why the Cloud is the foundation to becoming a truly data-driven business
- Best Data practices to help organisations make better decisions
- Using accurate Data to drive change and grasp opportunities quicker
- Eliminating risk and inefficiencies
- Adapting quicker to market challenges
Putting the ping into office pong: Meet our table tennis score predictor – powered by Tableau
Written by Brendan Walsh. Posted in Data Insights, Machine Learning.
A game of table tennis is always a good way to unwind – over a few drinks with colleagues after the to-do list is done or perhaps during lunch break. But, if you’re looking for a way to rev up competitive spirit and even take on teams in other ZIP codes or countries, we have created the perfect solution.
If your office has access to a ping pong table (essential) you’re eligible to climb the global ranks soon to see if you can beat other companies – and even the data.
Let us introduce you to our new Ping Pong leaderboard & Predictor dashboard, powered by Tableau. Our data analysts, data engineers and data scientists have been working hard behind the scenes to bring this new product to offices across the world – soon.
How does it work?
Put simply, you play a game of ping pong, input the scores – and then you can start seeing where you rank on the dashboard.
“Companies can sign up and get their own dashboards for their employees,” explains James Faure, Calligo Data Scientist and Ping Pong Predictor project champion. “It’s the perfect way to add some additional energy into work get-togethers. You can keep the competition internal, and just play against your colleagues. Or you can expand your competitive horizons to within your country, or even the whole world.”
The more you play, the better – the predictive model is hungry for data, to get an accurate understanding of your state of play. Through the app you’ll be able to see who’s predicted to win the next match. If you’ve decided to take on a rival business across the street, for example, previous data fed into our Ping Pong Predictor might lead it to say you’re going to lose by 18-21. But…will that ring true? Can you not only beat your human competitors, but also take on data analytics, engineering, and science?
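To make the idea concrete, here is a minimal sketch of how a leaderboard and win predictor could work. It assumes a simple Elo-style rating, which is our illustration only and not necessarily the model behind the Ping Pong Predictor. The player names, the starting rating and the `K` step size are all hypothetical, and the sketch predicts a win probability rather than a scoreline such as 18-21.

```python
# Hypothetical sketch: Elo-style ratings, not Calligo's actual model.
from collections import defaultdict

K = 32            # assumed update step size
START = 1000.0    # assumed starting rating for every player


def win_probability(rating_a: float, rating_b: float) -> float:
    """Expected probability that player A beats player B under Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def record_match(ratings, winner: str, loser: str) -> None:
    """Update both ratings in place after one finished match."""
    expected = win_probability(ratings[winner], ratings[loser])
    delta = K * (1.0 - expected)  # bigger gain for an unexpected win
    ratings[winner] += delta
    ratings[loser] -= delta


def leaderboard(ratings):
    """Return (player, rating) pairs, highest rating first."""
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Made-up match results as (winner, loser) pairs.
ratings = defaultdict(lambda: START)
for winner, loser in [("ana", "ben"), ("ana", "cy"), ("ben", "cy")]:
    record_match(ratings, winner, loser)

print(leaderboard(ratings))
print(round(win_probability(ratings["ana"], ratings["cy"]), 2))
```

Each recorded match nudges both ratings, which is one way to see why more games give a steadier prediction.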
A bit of geekery
On that note, while this no doubt sounds intriguing, you’re probably wondering how it’s even possible. Well, the brains of our data analysts, data engineers, and data scientists are constantly whirring, devising creations that are not yet out there – and not just for our clients. This is the latest example of their combined, extraordinary skills; the perfect trinity of Calligo’s disciplines – with (more than) a little help from React, Python, Snowflake and openFaaS. And, of course, Tableau – that not only powers the dashboards, but makes them highly intuitive to use and pleasing to the eye. A mighty partner, indeed.
Analytics + Engineering + Science
Firstly, data visualization and analytics expertise combined to create the dashboard in Tableau. (On a quick side-note, now seems an opportune moment to mention that a piece (Human Disease Network) by one of our data visualization experts – Anjushree B V – was chosen last year by Tableau as its Viz of the Day (VOTD). You can read more about this here.)
Then came the data engineering – SQL database programming and designing the system. This ascertained how data would flow between the dashboards. As we get more match data, we will start to take it across to train the models – aka Machine Learning, that will be deployed back into the app.
“As well as being interactive entertainment for companies, this demonstrates the Calligo team’s exceptional skills,” says James. “Our Ping Pong Predictor is an awesome playground where we can test new technologies that we want to implement for our clients. I also love to teach people about data – and this is a diverse way to illustrate what it involves for younger professionals considering a career in this field.”
Coming soon…
Back to what are likely to be hotly contested corporate competitions… Ping Pong Leaderboard and Score Predictor Dashboard powered by Tableau is coming soon to offices near you. We are currently beta testing the platform, and by signing up to Calligo’s newsletter, we will provide an update as soon as the platform is released to the public. It’s time to put the ping back into office pong…

A shout out to Calligo’s Ping Pong Predictor team
And those exceptional skills deserve a special mention. Thanks to Nick Mischko – our Senior Data Analytics Team Lead who built the Tableau dashboard. Supporting James Faure with engineering insight was John Jackson, Calligo’s Director of Data Integration & Engineering, and internal tech help – such as deploying pieces of code – from Gary Bright. The Machine Learning element of this app will become more prevalent, thanks to Peter Matson, our Data Science Practice Lead, and Tessa Jones, our VP of Data Science Research & Development. And, last but not least, software engineer expertise came from Artur Kruell, showing how software engineering is still very important in the data field as the glue that sticks different pieces together.
Vehicle Autonomy; the good, the bad, and the complicated
Written by Brendan Walsh. Posted in Beyond Data Podcast, Data Ethics, Data Insights, Data Privacy, Data Warehouse, Glossary, Machine Learning.
In our second Beyond Data podcast episode, ‘Autonomous mass transportation and its impact on citizen privacy’, we sit down with Beep’s Chief Technology Officer, Clayton Tino, to explore the current landscape of autonomous vehicles (AVs), whether AVs truly can replace the human factor in public transportation, and how AV ethics can be holistically measured. Here we give you a snapshot of that fascinating discussion by digging into a few of the explored topics.
You can watch episode 1 here
When looking at AV ethics, there are two strands to consider:
1: The ethics programmed into the AV itself (e.g., how the AV ‘decides’ which course to take when it identifies a hazard, otherwise known as the ‘trolley car’ scenario).
2: The ethics surrounding embedding AVs into society (e.g., whether we can truly replace the human factor in AVs, or what level of surveillance AVs should have).
Going beyond the trolley car scenario
Often touted as the litmus test for AV ethics, the ‘trolley car’ or ‘trolley problem’ is a thought experiment in which someone must choose between letting a runaway trolley hit five people and diverting it so that it hits one person instead. This is extrapolated to AVs by using a scenario such as an AV traveling down the street when suddenly a group of pedestrians runs out. The AV must ‘choose’ between hitting the group and altering its course, thereby hitting a lone pedestrian.
The ‘Moral Machine’ experiment was an online survey of 2.3 million people worldwide that investigated the moral dilemmas faced by autonomous vehicles. The study found that the moral principles guiding drivers’ decisions varied from country to country, and that women and men viewed ethical and moral situations differently. This made something like the trolley problem difficult to quantify and standardize worldwide.
Far from a simple ethics exercise…
On the surface, it seems a simple ethics exercise. But as Clayton Tino summarizes: “People like to think they have a preconceived notion of how they would behave, but I just don’t buy that. [A near miss] is a purely reactive response. We’re setting unrealistic expectations on the machine because we need to blame something when something goes wrong.” Tessa Jones (podcast co-host) agrees, observing: “AVs need some decision-making process, but I don’t have a decision making process myself.”
As Sophie Chase-Borthwick (podcast co-host) explains: “We expect our AVs to be guaranteed safe. But we know that any other vehicles are not 100% safe with a human behind them. So we have a higher expectation of what ‘safe’ looks like when it’s autonomous [as opposed to] when it’s a human.”
In our opinion, the disproportionate emphasis placed on the trolley problem to solve the lion’s share of AV ethics is reductive and dangerous to advancing AV technology. It’s a useful piece of the puzzle but it’s a symptom when we should be focusing on fixing the cause.
In our podcast, we also explore the importance of accurate and timely hazard perception (both in humans and AVs). Improving hazard perception not only makes AVs safer but can also reduce, or remove entirely, the need for an AV to make the trolley problem decision in the first place.
Can we ever truly replicate the human factor?
There are six levels in the maturity of autonomy of AVs, from Level 0 (no autonomy) to Level 5 (a vehicle without a driver safely taking you to where you want to go).
For Clayton, Tessa and Sophie the debate centers on where the application of AVs could work best with the fewest blockers. They wonder whether public transportation is an ideal choice, given how it could be geo-fenced, fixed route and hyper-local.
However, when considering AVs in the context of public transportation, they realize it’s important to look at the holistic service of public transportation, beyond just the driving. As Clayton pithily observes when considering AVs for school buses, “[Bus drivers] do a heck of a lot more than just drive the bus … they need to be aware of passenger safety and security, assistance…”.
For example, in London, there have been disputes between wheelchair users and pram users about who has first access to the space. Bus drivers (and others in charge of public transportation) are expected to act as mediators to settle these disputes. How would this be replicated in an AV with no human factor?
The answer could lie in more secure and closely governed surveillance. Having surveillance on public transport AVs could add a safety layer to minimize vandalism, protect the users and ensure the AVs remain a reliable and safe choice. Our podcasters observe the marked differences between privacy in the US and Europe, but with the introduction of GDPR-style laws such as the California Consumer Privacy Act (CCPA), there will inevitably be more scrutiny of how the surveillance data is used and stored.
However, as is often the case with autonomy, when it comes to public transport there’s no easy decision. When the human factor is removed, other allowances need to be made to fill the gap. Companies and governments need to work hard to make sure both the users and their data are protected, and that these allowances do not harm the end-users or misuse their data for commercial purposes.
Our podcast delves more into the nuances and pitfalls when considering the commoditization of a public service, such as public transportation. Generally, the people who need it most are vulnerable, and unless there’s a significant level of transparency, can users be fully aware and able to consent to the wider implications of being surveilled?
To hear how we untangle this and much more, watch our episode on ‘Autonomous mass transportation and its impact on citizen privacy’.
UPDATE 8: The Data Privacy Periodic Table
Written by Brendan Walsh. Posted in Data Ethics, Data Insights, Data Privacy.
By Sophie Chase-Borthwick, Calligo’s Global Data & Governance Lead
From the increasing importance of ethical AI principles, to the EU’s all-encompassing data strategy – including the first law on AI by a major regulator, anywhere – and US President Joe Biden’s new transatlantic data agreement, much has been bubbling away in the world since my previous revision of The Data Privacy Periodic Table.
Here I delve into the whys and wherefores of any changes I’ve made since then – in the form of Update 8.

Fundamental Principles of Data Protection – some newcomers…
Acting FAST on ethical AI
6-9

The Alan Turing Institute developed the ‘FAST Track Principles’ to support a responsible environment for data innovation, in particular when understanding Artificial Intelligence ethics and safety. To reflect the importance of ‘ethical AI’ (as demonstrated by the ICO’s collaboration with the Institute) I have added Accountability and Sustainability for the first time.
While Sustainability is the only element that’s really unique to AI, Fairness and Transparency (moved, but not new) have always been, and always will be, fundamental to data privacy. I had considered Accountability to be almost too obvious and intrinsic a component of privacy to have its own place. But, as a nod to my opinion that the FAST Track Principles should become industry standards, here it is. After all, FST certainly doesn’t have the same ring to it.
While I can’t go into huge detail here about each one, I urge anyone who hasn’t read up on FAST to do so now – and embed the principles into every aspect of AI project delivery.
“As inert and program-based machinery, AI systems are not morally accountable agents. This has created an ethical breach in the sphere of the applied science of AI that the growing number of frameworks for AI ethics are currently trying to fill. Targeted principles such as fairness, accountability, sustainability, and transparency are meant to ‘fill the gap’ between the new ‘smart agency’ of machines and their fundamental lack of moral responsibility.”
The Alan Turing Institute: Understanding Artificial Intelligence Ethics and Safety
Moved, but not downgraded
34 & 35

Lawfulness and Necessity have made way for FAST. Far from downgraded, they’ve merely moved a little within the same elemental area. But, Relevancy has been removed altogether. In my opinion, this is more than covered by Necessity and there’s no need to double up on similar principles.
Retention becomes the industry norm…
53

We welcome Retention to the table – this echoes the fact that this has become more of an industry standard term.
Highly unstable, yet fascinating Future Developments…
And now for the fast-moving, highly unstable elements: the future developments that are shaping the world’s data privacy parameters and legislation.
US legislation limbo…
112

To the United States and various US Bills – starting with President Joe Biden’s new transatlantic data agreement in principle with the European Union.
We’ve been here twice before – with similar proposals previously thrown out. Although it doesn’t seem to be going anywhere fast, this is hugely important, due to the rocky recent history of EU-US data flows – following the invalidation of Safe Harbor and the subsequent Privacy Shield framework.
Above all, greater certainty is needed for the vast number of companies that regularly exchange data between Europe and the US.
Then there’s the ADPPA – the American Data Privacy and Protection Act – a bill designed to regulate how organizations collect, process, manage, and securely store personal information or “covered data.” The US does not yet have a comprehensive privacy law that creates such safeguards. The ADPPA has bipartisan support, but also faces opposition from privacy advocates and business groups.
After an initial flurry of excitement, how and when these laws will pass is up in the air. In the meantime, individual states are focusing on their own data laws.
“We have agreed to unprecedented protections for data privacy and security for our citizens. This new arrangement will enhance the Privacy Shield framework, promote growth and innovation in Europe and in the United States and help companies, both small and large, compete in the digital economy.”
Joe Biden, US President, March 25, 2022
Retroactively enforceable California Privacy Rights Act
Staying with US legislation, but moving specifically to California, the CPRA comes into effect in January 2023, technically speaking. But – and there’s a big but – companies need to be compliant retroactively. The second the law goes live, businesses can be fined for any non-compliance issues dating back to January 2022. Forewarned is definitely forearmed in this case.
Across the Atlantic…
118

To Europe and the EU Data Strategy. Its tagline is: ‘Making the EU a role model for a society empowered by data’. But this is so much more than the EU’s General Data Protection Regulation. It’s about the entire data landscape; a large regulatory umbrella under which the future of Europe’s data protection sits. Having said that, policymakers are far from finished in creating this broader regulation.
The new laws that will be incorporated into this holistic strategy will include, among others: The Data Act – aiming to create rights and responsibilities on how valuable forms of data are shared; The Data Governance Act – to create a “common European data space” and “single market for data” – boosting innovation while respecting the values of privacy; and the AI Act – the first law on AI by a major regulator, anywhere.
Importantly, none of these acts should be viewed in isolation. It’s a positive development that the EU is treating data as an asset (like physical infrastructure). Sewing all the various initiatives together in this way – data protection, governance, AI and also fair markets – is a savvy, cohesive approach, in my opinion.
However, it’s hard to know how effective this strategy will be when it comes to improving data development, given the EU currently lags behind on AI / ML. It remains to be seen if this will level the playing field, or create yet more red tape.
“People, businesses and organisations should be empowered to make better decisions based on insights from non-personal data, which should be available to all.”
European Commission
Source: https://commission.europa.eu/strategy-and-policy/priorities-2019-2024/europe-fit-digital-age/european-data-strategy_en
State of flux
113

In post-Brexit UK, the new UK-GDPR is nearly identical to the EU-GDPR. However, it is UK legislation independent of the EU. The UK has already performed a consultation process to see what data protection in the UK should look like in the future – and therefore new developments need to be monitored closely as they unfold.
114

First it was Apple’s move to block third-party cookies that conduct cross-site tracking on Safari, then Google announced they will do the same in 2023. But, with these changes making things difficult for advertisers and small publishers, what will adtech look like in the future?
Ever-changing laws…
109

India passed the latest draft of its Personal Data Protection Bill over to parliament in November 2021. The bill, now referred to as the Data Protection Bill or DPB because it contains several provisions on non-personal data, has since been pulled from consideration so that entirely fresh language can be drafted.
111

The Personal Data Protection Law (PDPL) is the first of its kind to be passed in Saudi Arabia. The protection rules were first published in September 2021 and they are due to come into effect in March 2023.
The Data Privacy Periodic Table is entirely unique to Calligo and is an ongoing project, contributed to by the entire industry. We encourage anyone who’s interested to get involved. I consider all comments when creating the next update. If you have any thoughts you’d like to share or want to discuss anything featured in more detail, you can contact me here.











