The 14 most important questions to ask when evaluating next-gen data solutions
Are you thinking of making the switch to an AI-powered analytics platform, but don’t know where to start or what factors to consider? We’ve got you covered. Based on our conversations with our customers and on our own experiences using and developing data platforms for the past 10 years, we’ve put together this checklist that highlights some key areas and functionalities to look for.
We’re focusing on solutions that let you interact with data conversationally, by asking questions as you would of a data scientist. Most solutions in this field already cover the basic functionality well, so it is not part of this checklist. Instead, we focus on features and limitations that vendors advertise less but that have an outsized impact on a company’s ability to get value from the product, e.g. how much ongoing data engineering work is required or what happens when there’s a wrong answer.
How to use this checklist
We recommend that you use this checklist (you can duplicate it and make your own copy) to:
Highlight questions you should be asking vendors when talking to them
Get a better understanding of which areas are a priority for your business
Identify the strengths and weaknesses of different vendors
In the end, trust is a key component when choosing a data solution and you should understand as fully as possible what you’re buying.
Data Compatibility and Import
It’s important to make sure that you can easily get your data in without needing to dedicate a lot of expensive data science / engineering resources. There are often upfront as well as recurring maintenance costs to consider.
Real Time Data Updates
If the underlying data updates, will generated charts and tables update automatically?
Data Shape Compatibility
Can the application work with the data in the format you already have? Is some of your relevant data in JSON format, and if so, does the platform support working with it directly?
Low Recurrent Data Engineering Requirements
If the structure of your underlying data changes, or you need to ask novel exploratory questions, how much data engineering is required?
Cost and Performance
Copying your entire dataset is often expensive and slow. Can the platform work with the data where it is already stored, or does it create its own copy? If it copies the data, how much does that cost?
Answering Questions
Ultimately, you need a platform that gives correct answers. We tested a number of solutions for correctness (you can find the benchmark here) but the best way to understand if something will work for your company specifically is to connect your data and test it. It’s also important to understand what happens when wrong answers are returned and what limitations a product has in answering questions.
Can answer complicated questions
Can it answer questions that require a join? Can it answer something that requires two aggregations? (e.g. “Between March and May, how many users signed up?”)
Refusal of bad questions
If you ask it a question about something not in the data, it should either refuse to answer, ask for clarification, or at least make it very clear which question it answered.
Transforming data
An easy way to test this is to ask the AI to show a value in all uppercase or lowercase, to show only its first five letters, or to access a field of a JSON column.
Business-Specific Info / Data Dictionaries / Hints
Some of the knowledge needed to interpret your data is specific to your business and can’t be inferred from the data itself. Is there a place where you can easily add the necessary contextual information about your business and your data?
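The join, aggregation, and transformation tests described above are easier to judge when you already know the right answers. The following is a minimal sketch of how reference answers could be computed by hand on a small slice of data, so that a vendor’s responses can be compared against them. The tables, columns, and values (`users`, `orders`, `profile`, and so on) are hypothetical and only serve as an illustration.

```python
import json
import sqlite3

# Hypothetical toy data; replace with a small, hand-checkable slice of your own.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT,
                        signup_date TEXT, profile TEXT);
    CREATE TABLE orders (user_id INTEGER, amount REAL);
    INSERT INTO users VALUES
        (1, 'Alexandra', '2024-02-10', '{"plan": "free"}'),
        (2, 'Benjamin',  '2024-03-05', '{"plan": "pro"}'),
        (3, 'Chloe',     '2024-05-20', '{"plan": "pro"}');
    INSERT INTO orders VALUES (1, 10.0), (1, 30.0), (2, 20.0);
""")

# Join plus aggregation: total order amount per user name.
totals = dict(conn.execute("""
    SELECT u.name, SUM(o.amount)
    FROM users u JOIN orders o ON o.user_id = u.id
    GROUP BY u.name
"""))

# Two aggregations: the average of the per-user totals.
avg_total = conn.execute("""
    SELECT AVG(total) FROM (
        SELECT SUM(amount) AS total FROM orders GROUP BY user_id
    )
""").fetchone()[0]

# Date-range count: "Between March and May, how many users signed up?"
spring_signups = conn.execute("""
    SELECT COUNT(*) FROM users
    WHERE signup_date >= '2024-03-01' AND signup_date < '2024-06-01'
""").fetchone()[0]

# Transformations: uppercase, first five letters, and a field of a JSON column.
rows = conn.execute("SELECT name, profile FROM users ORDER BY id").fetchall()
upper_names = [name.upper() for name, _ in rows]
first_five = [name[:5] for name, _ in rows]
plans = [json.loads(profile)["plan"] for _, profile in rows]
```

Asking the platform the same questions in plain language and comparing its answers with these values gives a quick, repeatable correctness check. A question about a column that does not exist in the toy data is a simple way to probe the refusal behavior as well.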
Interpretability
All next-gen data solutions use LLMs in some way to help you analyze data. However, LLMs are not perfect and frequently make mistakes. It’s important to be able to understand what the LLM is doing in order to identify bad results and understand where they come from (e.g. the LLM misunderstood the question, or the data has errors in it).
See the underlying data
Can you easily see what data the AI based the decision on?
See the AI’s transformations
Can you easily figure out how the AI processed your data to get to the answer?
Note: Some applications might have a button that asks the LLM to explain the query it just generated. However, this is extremely unreliable, and LLMs are bad at explaining their reasoning post-hoc.
See the data an answer or chart is based on
If the AI answers in text form, can you see the final result of its transformations?
Data Governance
Not everyone at your company should be able to access / ask questions about all your customers’ most sensitive data. It’s important that an analytics application helps you govern access to data in a sane, understandable way.
Can your data governance model be enforced in the data platform?
You probably already have rules in place about who gets to see what data. Does the platform allow you to enforce those rules?
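One way to test this during a trial is to log in under different roles, ask the same question, and check whether the returned columns stay within what each role is allowed to see. The sketch below assumes a simple column-level policy; the role names, column names, and the `POLICY` mapping are hypothetical, and a real governance model may also restrict rows or whole tables.

```python
# Hypothetical policy: which columns each role may see.
POLICY = {
    "analyst": {"customer_id", "country", "order_total"},
    "support": {"customer_id", "email", "country"},
}


def leaked_columns(rows, role):
    """Return columns present in a result that the role should not have seen."""
    allowed = POLICY.get(role, set())  # unknown roles are allowed nothing
    return {column for row in rows for column in row} - allowed


def redact(rows, role):
    """Return copies of the rows containing only the columns the role may see."""
    allowed = POLICY.get(role, set())
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


# Example: a result exported from the platform while logged in as an analyst.
result = [
    {"customer_id": 1, "email": "a@example.com", "country": "DE", "order_total": 99.0},
]
analyst_leaks = leaked_columns(result, "analyst")
analyst_view = redact(result, "analyst")
```

A non-empty set from `leaked_columns` means the platform exposed something your existing rules forbid for that role.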
Collaboration
Companies are more than the sum of their employees; working together is what truly generates value.
Data-Science Checkup
If a non-technical user has asked a question, is there a good workflow for a data scientist to inspect or debug a particular result?
Collaborative Q&A
Can you follow up on a colleague’s question?