Requirements Gathering for Data Analytics Projects

Requirements Gathering for Analytics Projects

Introduction 

Behind every impactful dashboard you’ve ever seen is a well-constructed plan and a clear understanding of the objective behind the tool. It does not matter if the dashboard is being used in a business context or to tell an interesting data story – any successful dashboard requires meticulous planning and an understanding of the underlying data, paired with foundational data literacy skills. Regardless of what you’re building and who it is being built for, effective requirements gathering will generally lead to greater efficiency in development and greater user satisfaction.

Requirements gathering is not a novel concept, but it’s not common either. Individuals and organizations often forgo the process under the misconception that they’re saving time. The reality is that not investing this time upfront leads to inefficiencies down the road, and those inefficiencies can cost many times more than the time it would have taken to gather requirements effectively before starting development. Multiple rounds of feedback, unsatisfied end users, and dashboards that receive little to no usage after going live are, unfortunately, staples of the analytics world today.

The good news is that it doesn’t have to be this way. With an understanding of how to gather requirements plus the knowledge of why it is such a crucial step, you’ll be able to gain buy-in from stakeholders and deliver excellent, meaningful analytics products to end users.  

What is Requirements Gathering? 

When aiming to understand what requirements gathering is, it’s important to start by understanding what it is not. There are numerous strategies that attempt to mimic the process of gathering requirements, so let’s address a few of the most common: 

  • A single ticket submitted to an IT/analytics team – teams that service multiple parts of the organization often set up a ticketing system to elicit requirements and requests from end users. The problem is that teams frequently begin development immediately after receiving a ticket that contains limited information. The result is many hours or days sunk into a tool that is built on assumptions and has little chance of exciting users, or even meeting their needs, when it’s released. This is not an indictment of ticketing systems – in fact, ticketing systems are a great way to organize workstreams and manage a high volume of stakeholders – but it’s essential to move into a structured requirements gathering exercise as the first step after receiving a request.
  • Technical requirements without business context – requirements gathering is meant to be a thorough, holistic approach to understanding what is being built and why. A common trap that developers fall into is believing that they only need to elicit the technical aspects of what they’re building. They see themselves as strictly technical resources, their stakeholders as strictly business-focused contributors, and they don’t bridge the gap between the two parties. A collaborative approach with buy-in from both sides to solve the business problem is key in requirements gathering. Empathy, curiosity, and an ability to step outside of the technical development world are essential skills to practice. 
  • Defining requirements for others – it should be stated that imagining what others need and developing products for them based on those gut feelings is not an effective way to work with end users. 

So, we know what does not constitute effective requirements gathering, but the real question is: how do we ensure that we go through this process and come out with the necessary information? Requirements gathering can be messy; it can and should result in jumbles of notes, ink-smeared whiteboards, and a feeling of renewed energy for the design and development phase of the project. A lot of information will come up during these one- to two-hour sessions, but the following four focus areas will help structure your time:

  • Determine the objective – drill into the “why” behind what is being built. It’s perfectly acceptable to start a requirements gathering session by asking the question, “why are we building this?” The idea is to drill into the business problem that we’re hoping to solve, and to understand how we plan to solve that. Ideally, we can create an objective statement that is measurable, then work backwards to understand how a specific tool or technology will accomplish that goal. 
  • Define the audience – aside from the overall objective, the most important consideration is the audience. Understanding who will use the tool, how and when they will use it, and their general ability to use analytics products are essential pieces of knowledge when considering design and deployment. The end goal is the ability to create specific user stories that can be used to guide the design and development of the tool in order to drive the greatest adoption. 
  • Outline priority questions – this is the time to dive into specific metrics and dimensions related to the data that will be utilized for the project. This information will be used to guide the design of individual visualizations and it is likely that you’ll map specific questions to specific visuals as you move into the design phase. Questions such as “How do each of my sales regions and the salespeople within them rank amongst one another by total volume sold?” represent the level of detail desired when drilling into priority questions. 
  • Document dashboard features/other details – lastly, we want to document any special functionality requests. Oftentimes, users will expect certain features that they have seen in other tools, or that they have imagined as valuable for the tool you are building. Think about filters, sorting, data exports, and printer-friendly layouts for those paper lovers. These items can be make-or-break for users; spend time identifying those needs so you can plan to integrate them into the product.

Final Thoughts 

Remember that requirements gathering is inherently social and requires a deep level of curiosity. It should be collaborative – stakeholders need to be involved and to feel that they’re involved. This isn’t just about soliciting requirements – it’s about gaining buy-in from end users and having them know that they played a crucial role in developing the end product. By defining the objective, audience, priority questions, and key functionality in collaboration with your stakeholders, you will be well equipped to move into the design phase of your project.  

Lastly, it is important to understand when requirements gathering has concluded. To make that point clear, pass formal documentation to stakeholders and request their sign-off. By seeing the requirements formalized and delivered, you and your stakeholders will know that the milestone has been completed, and that the content will now be used to guide the design of the product. See below for a template that you can use next time you engage in requirements gathering.


REQUIREMENTS DOCUMENT TEMPLATE 

<Dashboard Name> 

<Date> 

PROBLEM STATEMENT 

<What are the client’s pain points? Why do they need this dashboard? Why have they come to us?> 

OBJECTIVE OF DASHBOARD 

<What is the business value of the dashboard? Does it help improve revenue? Does it highlight costs that can be reduced? Does it tell the story of an organization’s efforts? Does it increase employee retention?> 

AUDIENCE & USAGE 

<Description of section> 

  • Who will use the new dashboard 
  • Outline permissions 
  • When it will be used 
  • How often it will be updated 
  • Any subscriptions should be mentioned here 
  • Security features 
  • How it will be distributed & shared 

BUSINESS QUESTIONS & ACTIONS 

<Description of section> 

Question | Action / Purpose
Business Question 1 | What will the answer to this question drive/result in?
Ex. [Which customers have purchased one product, but not the other? What are these customers’ phone numbers?] | Ex. [This allows our sales reps to telephone the customers who are most likely to purchase additional products]

DASHBOARD SPECIFICS 

  • Date range of the data 
  • Filters 
  • Etc. 

QUESTIONS 

  • List any outstanding questions here 
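
For teams that prefer to track the template in code rather than in a document, here is a minimal Python sketch of the same structure. The class names, field names, and the sign-off rule (every section filled in and no outstanding questions) are illustrative assumptions, not part of the template itself, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BusinessQuestion:
    question: str
    action: str  # what the answer to the question will drive or result in


@dataclass
class RequirementsDocument:
    # Field names mirror the template headings; the names are assumptions.
    dashboard_name: str
    problem_statement: str = ""
    objective: str = ""
    audience_and_usage: List[str] = field(default_factory=list)
    business_questions: List[BusinessQuestion] = field(default_factory=list)
    dashboard_specifics: List[str] = field(default_factory=list)
    outstanding_questions: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the template sections that are still empty."""
        sections = {
            "PROBLEM STATEMENT": self.problem_statement.strip(),
            "OBJECTIVE OF DASHBOARD": self.objective.strip(),
            "AUDIENCE & USAGE": self.audience_and_usage,
            "BUSINESS QUESTIONS & ACTIONS": self.business_questions,
            "DASHBOARD SPECIFICS": self.dashboard_specifics,
        }
        return [name for name, value in sections.items() if not value]

    def ready_for_sign_off(self) -> bool:
        """Hypothetical rule: all sections filled, no open questions."""
        return not self.missing_sections() and not self.outstanding_questions


# Hypothetical example values, for illustration only.
doc = RequirementsDocument(
    dashboard_name="Regional Sales Ranking",
    problem_statement="Managers compare regions by hand in spreadsheets.",
    objective="Reduce the time spent preparing the weekly sales review.",
    audience_and_usage=["Regional sales managers, used weekly"],
    business_questions=[
        BusinessQuestion(
            question="How do sales regions rank by total volume sold?",
            action="Directs coaching toward the lowest-ranked regions.",
        )
    ],
)
print(doc.missing_sections())
print(doc.ready_for_sign_off())
```

A check like this only confirms that each section has content; whether the content is good enough still depends on the conversation with stakeholders.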

Unlocking Property Management Insights: Extracting and Analyzing Yardi Data


 

Join Nick Mishko, Senior Data Analytics Team Lead at Calligo, as he delves into the world of property management analytics and Yardi data.

Discover how Calligo’s data analytics practice transforms Yardi data into powerful tools, enhancing operational efficiency for property management firms globally. From data extraction challenges to creating dynamic dashboards, explore the strategies and solutions that propel businesses forward.

If you’re navigating Yardi complexities or seeking to leverage analytics for your property management endeavours, this insightful discussion is a must-watch. Stay tuned for more insights from Calligo Shorts!


Data Transformation Predictions for 2024 – Calligo Data Leaders Roundtable

 

In this lively debate you will hear from Calligo’s Practice Leads as they discuss their key takeaways from 2023 and their data predictions for 2024 and beyond.

Topics discussed include:

Regulation of AI including the EU AI act

AI hallucinations & AI bias

Data governance and data fines

Dashboard fatigue

Data ROI

Trends in Data Visualization Proliferation and Consolidation

Introduction 

When I started my first project with Microsoft back in 2019, I was tasked with creating a report for a sales team. It showed when clients had licenses up for renewal, along with detailed information about each client’s license usage, so the team could better optimize agreements with their customer base. The tool was revolutionary for the sales team, which used to pull data from several sources and spend hours making sure it was right. Reports done right can lead to huge efficiencies and make everyone’s jobs smoother, letting us focus on the decisions that truly matter.

The problem 

With that project complete, I moved on to a new project with a different team, and then another. Two years later, an email popped up from a random employee at Microsoft. He’d found the report I’d built and was asking if I could update it for him. I did a little digging and found that my old team had moved on to a new report, but the old one was still available in their portal, and employees could still search for and find it if they had the proper access. Reports and dashboards across the org had proliferated and no one was taking the time to consolidate them. As a result, people were finding old, not-quite-deprecated reports and trying to use outdated data to make decisions.

The details 

This problem isn’t unique to Microsoft. If you’ve been working with data for long enough, this problem almost certainly applies to you. As people who love data, we want to see insights that are relevant to us and tailored specifically to the way we want to see the data. With multiple teams or levels viewing the same data, this can lead to custom reports for each group that all slice the data slightly differently. When metrics change, these changes don’t always make their way to every report, especially if Dave in accounting (sorry Dave!) created a copy of a report to do his own work. As time goes on, the number of reports keeps expanding and when new team members onboard they don’t know which reports have the right data. This can lead to muddy reporting environments, with reports from years ago that we keep around because we might want to see that data or that visual again someday. 

The Solution 

Are we doomed to drown in an unending deluge of reports or is there something we can do about it? 

  1. Create report documentation. 

Whenever you create a report, you should create documentation that outlines the data sources, the intended audience, and how the report is intended to be used. Documentation for a report overall should be supplemented by a data dictionary that covers the measures or calculations in the report and gives everyone clarity on what is being reported. We often add these as readme tabs or store them in a company wiki. This not only helps with keeping our environments clean, but also helps new users onboard. You will never have to answer the question – “What did we use this report for?” 

  2. Utilize report usage metrics. 

Power BI has built-in reports that let you see which reports have been viewed and by whom. Tableau has similar features for Tableau Server. We think these reports are so useful that we built our own custom report that lets you see usage across workspaces or servers to help you decide which reports to deprecate. We deployed this in our own environments and for multiple customers.



  3. Archive reports offline. 

Sometimes we need to keep reports, or simply don’t want to get rid of them, but we don’t want them to be available to the organization. In this case, we recommend creating an archive for reports to be kept offline or at least off the workspace or server. These reports should also have accompanying documentation and a data dictionary (thank you, readme tab!).
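
As a rough illustration of how usage metrics can feed the archive decision, the Python sketch below flags reports that nobody has opened recently. It assumes the usage data has already been exported into simple records; the `ReportUsage` fields, the thresholds, and the sample report names are all hypothetical, and this is not the Power BI or Tableau API, nor the custom usage report described above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class ReportUsage:
    # Hypothetical shape of one exported usage record.
    name: str
    last_viewed: Optional[date]  # None means no recorded views
    views_last_90_days: int


def archive_candidates(
    reports: List[ReportUsage],
    today: date,
    max_idle_days: int = 180,  # assumed threshold, tune per organization
    min_views: int = 5,        # assumed threshold, tune per organization
) -> List[str]:
    """Return names of reports that look stale enough to review for archiving."""
    cutoff = today - timedelta(days=max_idle_days)
    candidates = []
    for report in reports:
        idle = report.last_viewed is None or report.last_viewed < cutoff
        rarely_used = report.views_last_90_days < min_views
        if idle or rarely_used:
            candidates.append(report.name)
    return candidates


# Hypothetical sample data, for illustration only.
usage = [
    ReportUsage("License Renewals (legacy)", date(2022, 3, 1), 0),
    ReportUsage("License Renewals v2", date(2024, 1, 12), 140),
    ReportUsage("Accounting copy", None, 0),
]
print(archive_candidates(usage, today=date(2024, 1, 15)))
```

The output is a review list, not a delete list: a flagged report may still need to be kept in the offline archive with its documentation and data dictionary.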

Closing Remarks 

Maintaining your reporting environment hygiene pays dividends in the future and reduces confusion and wasted time. In fact, we saw this as one of our trends for 2023. Curious about the other trends we saw or our predictions about 2024? Watch our Data Transformation Predictions video to see them. We take our reporting work very seriously and our team has the tools and experience to help you with your environment.

For more comprehensive insights into data analytics and visualization, visit https://www.calligo.io


Top 10 Use Cases of Machine Learning in the Healthcare Industry

Machine learning is revolutionizing the healthcare industry by leveraging the power of data to improve patient outcomes, enhance operational efficiency, and drive cost savings. In this blog post, we will explore the top use cases of machine learning in healthcare, highlighting how Calligo’s Machine Learning as a Service capability can empower healthcare providers to transform their operations and deliver better care. 

1. Improve STAR Rating

The STAR rating system is crucial for healthcare providers as it determines their quality of care and impacts financial incentives. Calligo’s predictive models can identify the key variables that influence STAR ratings and provide prescriptive solutions to improve them. By optimizing patient experience, lowering costs, and enhancing patient satisfaction, providers can achieve higher STAR ratings and increase their bonus payments. 

2. Health Crisis Preparedness

Health crises, such as the COVID-19 pandemic, require proactive preparation to ensure the safety of workers and mitigate financial risks. Calligo’s predictive models and time-series analysis help healthcare organizations simulate and forecast the impact of unexpected economic shocks. By making data-driven decisions around layoffs, resource allocation, and innovation, providers can navigate health crises effectively and minimize long-term financial consequences. 

3. Optimize Staff Scheduling

Efficient staff scheduling is essential to meet patient needs while minimizing unnecessary labor costs. Calligo’s predictive models enable healthcare leaders to optimize physician and facility resources based on patient demand. By aligning staffing levels with patient access expectations, providers can enhance patient experiences and remain competitive in the evolving healthcare landscape. 

4. Medical Supply Logistics

Efficient supply chain management is critical for delivering timely and life-saving healthcare services. Calligo’s predictive models and time-series analysis optimize supply chain logistics by leveraging diverse data sources. By constantly monitoring and updating logistics channels, providers can ensure the availability of essential medical supplies, reduce costs, and mitigate the risk of inadequate supplies that could compromise patient safety. 

5. Patient Insights

Understanding patient preferences and identifying high-value services are essential for improving patient satisfaction and achieving higher Medicare STAR ratings. Calligo’s predictive models and Monte-Carlo simulations enable healthcare providers to measure and analyze patient feedback, identifying the services that provide the most value. By tailoring care and service offerings to meet patient preferences, providers can enhance patient satisfaction and drive higher STAR ratings. 

6. Reduce Patient Wait Time

Reducing patient wait times is crucial for delivering efficient and patient-centered care. Calligo’s predictive models and optimization techniques help healthcare organizations anticipate patient and staffing needs, enabling effective resource allocation and streamlined workflows. By reducing wait times, providers can improve patient satisfaction, increase revenue, and optimize staff utilization. 

7. Reduce Readmission Rates

Reducing readmission rates is vital for improving patient outcomes and optimizing costs in value-based care models. Calligo’s predictive models identify indicators of readmission, allowing healthcare providers to allocate resources strategically and implement interventions that reduce readmissions. By maximizing shared savings payment models and focusing on patient-centric care, providers can improve outcomes, drive revenue, and enhance STAR ratings. 

8. Improve ER Admittance

Enhancing emergency room (ER) admittance processes is crucial for managing complex patients and improving care outcomes. Calligo’s predictive models help healthcare organizations connect different health silos and optimize procedures to ensure appropriate patient-provider matches and levels of care. By leveraging machine learning algorithms, providers can target specific patients effectively, lower facility costs, and deliver better care experiences. 

9. Improve Screening Frequency

Improving the frequency of routine screenings plays a vital role in preventive healthcare and early detection of illnesses. Calligo’s predictive models and time-series analysis help healthcare providers identify patients who would benefit from screenings and predict their compliance. By targeting the right patients and promoting routine screenings, providers can reduce the risk of costly illnesses, improve patient outcomes, and optimize resource allocation. 

10. De-Identification of Data

Data de-identification is essential for expanding the usability of healthcare data while protecting patient privacy. Calligo employs advanced predictive models and time-series analysis techniques to safely de-identify data while retaining its value and richness. By leveraging anonymized data, healthcare organizations can drive additional revenue by utilizing data for research, population health management, and healthcare analytics while complying with privacy regulations. 

Machine learning is reshaping the healthcare industry, enabling providers to deliver better care, optimize operations, and improve patient outcomes. Calligo’s Machine Learning as a Service capability empowers healthcare organizations to leverage the power of predictive models, time-series analysis, and optimization techniques to drive tangible results. By embracing machine learning, healthcare providers can unlock new possibilities and create a future where data-driven decision-making revolutionizes the delivery of healthcare services.


Making complex data available for the benefit of society

In Calligo’s latest Beyond Data podcast, Tessa Jones (Chief Data Scientist) is joined by Dr Ellie Graeden, Research Professor (Center for Global Health Science and Security) at Georgetown University. Here we explore some of the episode’s highlights:

  • The inherent conflict of private data and the public good
  • Protecting individual rights within federated learning
  • The importance of effective communication and a common language
  • Designing systems and policies that work together
  • Focusing regulation on outcomes, not creating data siloes

At societal level, poor communication costs lives

Transitioning data across and between departments and data systems has historically been fraught with problems – who owns it? Who pays for it? Is it understandable and translatable into meaningful and actionable insights for the end user? 

Having worked extensively in disaster response, Dr Graeden has seen first-hand the potentially life-threatening issues that can arise when government departments’ data platforms produce incompatible outputs:

  • If 20,000 people need water, how many pallets need to be shipped?
  • If 10,000 electricity meters have been knocked out by a hurricane, how many people need feeding?

In such scenarios, identifying individuals amongst population-level data is crucial if the help provided is to be sufficient.

“We have to be able to really effectively move and communicate and share data that are relevant, in ways that they can get used by people all across the system”

Of course, any data system design should ensure privacy and protection for personal data. ‘Big data’ is still relatively new, and as such more powerful and widespread regulatory controls are now being introduced, although the US still does not have consistent requirements for how data should be handled. Fundamentally, meeting a population’s needs today, and planning for them tomorrow, requires the data of individual people to be analysed. Personal data must be shared quickly and effectively, all while protecting individual rights. Data system design must therefore:

  • Include all players
  • Consider cultural constraints
  • Keep out bias
  • Ensure the right words and phrases are used
  • Focus on the ‘so what’, why does it matter?

“Every single thing we experience can be captured as data”

Even the most mundane moments in our daily lives leave a digital footprint; we shed data everywhere. But when does ‘my’ data become public, or the property of the software developer or the service provider? VR headsets collect ephemeral data that is analysed and applied for that one end user, but if that data is assumed to fall under GDPR, the potential to use it for positive outcomes is severely limited. For example, should authorities be notified if content viewed and generated is illegal or harmful? And what if the headset’s chip can detect that the user is having a stroke: is that data classified as ‘health’ data? Can it be used to alert the individual to their medical emergency without contravening legislation? What if your mouse clicks can detect the early stages of Parkinson’s? Should you, could you, be told?

“If you’re treating this data as health data, then they have a very different set of regulatory constraints. HIPAA isn’t going to regulate those because it’s not a health care provider or a health insurer”

Piercing the veil

The conflict between personal protection and public good is everywhere, and Dr Graeden believes that some new data laws will create problems for federated learning. Legislation has clear boundaries (speed limits, blood alcohol levels) whereas science deals in spectrums, probabilities and unknowns.

Deleting an individual’s personal data from the model breaks the system, contradicting what regulators are trying to achieve. The solution is to prioritize outcomes, not processes – it doesn’t matter whether you write the rules with a pen and paper, or with AI, as long as you write the rules. Expanding the framework by setting gradients of data availability affords protection for individuals, whilst making data available that informs better decision making for public bodies.

“Data is nothing more, nothing less, than an abstract description of our world. A useful and powerful language that can tell us things that other languages don’t”

Data can no longer exist in siloes if it’s to be useful to society

There is now a healthy global appetite for the discussion around data, thanks in the main to two recent developments:

  • Covid gave us huge amounts of data about mortality levels, vaccination rates, hospitalisation trends – all of which were in the public consciousness every day
  • AI and ChatGPT – articles and debates about the pros and cons are everywhere, discussion is not just in the scientific community

The key challenges now for data scientists are expectation management and communication – we need to be clear about aims and specific about context, as well as knowing what to leave out to avoid overwhelm and misunderstanding. Unfortunately, scientists are not always great communicators (using complex terminology and detail, rather than common parlance and generalization) as Covid demonstrated:

  • Did having a vaccine mean you wouldn’t get sick? Or just less sick?
  • ‘Everyone should wear a mask’ became ‘wear a mask if you can’. This was due to limited supply, but it appeared that the science was not clear

“The scientific approach means you never have an answer… we are trained as scientists to focus on the fact that we don’t know”

In fact, the only answer is that the right data, used consistently and communicated clearly, will always allow us to be prepared rather than reactive, and to make decisions for the public good that protect every individual.

You can find out more about the common language of privacy in our Rosetta Stone eBook.

You can also watch Tessa’s fascinating podcast with Dr Graeden below.