Can an AI replace a data analyst? In recent years, we have heard about breakthroughs in artificial intelligence and how robots are replacing office workers, especially in areas where processes can be algorithmized.
Data work is highly technical, and more and more services are emerging that can independently find insights, build models, and create dashboards. So, we decided to conduct a crash test to see how regular AI models handle the daily tasks of a data analyst.
Alex Kolokolov, a data expert, bestselling author of “Data Visualization with Microsoft Power BI,” and founder of Data2Speak Inc, together with his team, actively trains students and professionals in AI skills. They run courses and week-long marathons, regularly performing crash tests and studying the capabilities of artificial intelligence.
In this article, you will learn about the challenges and pitfalls at different stages:
- Processing the dataset, creating a data model
- Identifying and calculating key metrics for analysis
- Data visualization and summarizing findings
Initial Task for the Analyst:
“Analyze the table of webinars and courses and create a text report for management on conversion rates, success, and popularity of webinars and instructors. Calculate the ROI of each webinar.”
Let’s see how the AI handles it.
Step 1: Processing the Dataset and Creating a Model
The CRM export with webinar data consists of two sheets within a single Excel workbook:
- One sheet contains data on promo webinars, including topics, authors, audience size, and the cost of attracting attendees.
- The other sheet records course sales among people who attended the webinars.
As shown in the image below, the data can be easily matched using the Webinar ID field.
Sheet 1 “Webinars Data” + Sheet 2 “Course Details”:
*All names in this article have been changed to maintain confidentiality and protect financial information. Any resemblance to real individuals is purely coincidental.
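The join the AI was asked to perform can be sketched in pandas. This is a hedged illustration only: apart from the Webinar ID key, the column names and values below are assumptions, not the real CRM export.

```python
import pandas as pd

# Illustrative stand-ins for the two Excel sheets (column names are assumptions)
webinars = pd.DataFrame({
    "Webinar ID": [101, 102],
    "Topic": ["Intro to BI", "Advanced Dashboards"],
    "Attendees": [120, 80],
    "Ad Spend": [500.0, 300.0],
})
sales = pd.DataFrame({
    "Webinar ID": [101, 101, 102],           # one row per course purchase
    "Course Price": [200.0, 150.0, 350.0],
})

# Left join on the shared key keeps webinars even if they produced no sales
merged = webinars.merge(sales, on="Webinar ID", how="left")
print(merged)
```

Note that a webinar with several purchases produces several rows after the join, which is exactly the behavior that tripped up the AI later.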
First Challenges Encountered
- Paradoxically, Microsoft Copilot refused to work with the .xlsx file—the most common format for Excel documents. Instead, it suggested using .pdf and .png, which are completely unacceptable formats for final reports.
- ai does not support .xlsx files but handles CSV files well, though it struggles with analyzing two tables simultaneously.
- DeepSeek inconsistently read the data, possibly due to temporary issues.
- GPT-4 handled the upload perfectly, so we decided to proceed with it.
Next, we followed a simple approach—asking the AI questions as if it were a regular analyst on the team.
The AI performed reasonably well:
- It understood the document structure.
- It generated answers for all questions.
However, during data validation, we noticed some miscalculations in ROI and conversion rates. The errors stemmed from the AI needing to repeatedly reference both data sheets and merge the dataset each time.
At this stage, we reached the data analysis limit for the free version.
So, we continued our work in the paid version of ChatGPT-4o. Yes, even an AI intern needs to be paid!
While GPT does not have a built-in function to “build a data model,” it can determine which field to use for linking two tables and merge them into one. Given the AI’s previous confusion when handling multiple tables, we assigned it this task.
After downloading the AI-generated file, we noticed that it had created multiple rows for almost every webinar. This happened because course purchases occurred on different days after the webinar. Although we expected a single dataset, it became clear that in this format, the AI would continue to make errors in calculations.
We refined our prompt and achieved the desired result: 1 webinar = 1 data row. This step highlighted AI’s limitations in handling tabular data. Here are the key issues we identified:
- 1 task = 1 table: AI models may lose context and become confused when handling larger datasets, such as multiple sheets within a single workbook.
- Internal formulas = a mystery: If tables contain logical relationships, AI does not interpret them correctly.
- The quality of your prompt determines the quality of the result.
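The “1 webinar = 1 data row” shape we eventually prompted for corresponds to a group-by aggregation. Again, the column names here are illustrative assumptions, not the actual dataset.

```python
import pandas as pd

# Merged table: one row per purchase, so webinar-level fields repeat
merged = pd.DataFrame({
    "Webinar ID": [101, 101, 102],
    "Attendees": [120, 120, 80],
    "Ad Spend": [500.0, 500.0, 300.0],
    "Course Price": [200.0, 150.0, 350.0],
})

per_webinar = (
    merged.groupby("Webinar ID")
    .agg(
        Attendees=("Attendees", "first"),   # repeated per sale, take one copy
        Ad_Spend=("Ad Spend", "first"),
        Buyers=("Course Price", "count"),   # number of purchase rows
        Revenue=("Course Price", "sum"),    # total sales per webinar
    )
    .reset_index()
)
print(per_webinar)
```

Collapsing to one row per webinar before any metric calculation removes the repeated-row ambiguity that caused the AI's arithmetic errors.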
⚠️ Warning! Always verify AI-generated results as you would with a human assistant—AI can be just as inattentive!
We must communicate with AI differently than with a human analyst, who would naturally account for details. Instead, we need to describe requests in detail, as the machine’s response is entirely dependent on the precision of the input prompt.
Step 2: Identifying and Calculating Key Metrics
Now we are ready to fully work with the data. We asked the AI to suggest relevant metrics, selected the best options, and attempted to calculate them. Here is the final list of key metrics:
- Webinar conversion rate: Identifying which webinars most effectively convert attendees into course buyers.
- Top instructors by conversion: Determining which instructors have the highest conversion rate from webinars to course purchases.
- Most popular webinar topics: Identifying the most attended topics without linking to specific instructors.
- ROI per webinar: Calculating the return on investment for each webinar, independent of instructors.
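The conversion-rate and ROI formulas behind these metrics are straightforward; a minimal sketch with assumed, illustrative numbers:

```python
import pandas as pd

# One row per webinar (values are illustrative, not the real data)
per_webinar = pd.DataFrame({
    "Webinar ID": [101, 102],
    "Attendees": [120, 80],
    "Ad_Spend": [500.0, 300.0],
    "Buyers": [12, 4],
    "Revenue": [2400.0, 1400.0],
})

# Conversion: share of attendees who bought a course
per_webinar["Conversion %"] = 100 * per_webinar["Buyers"] / per_webinar["Attendees"]
# ROI: profit relative to the cost of attracting attendees
per_webinar["ROI %"] = (
    100 * (per_webinar["Revenue"] - per_webinar["Ad_Spend"]) / per_webinar["Ad_Spend"]
)
print(per_webinar[["Webinar ID", "Conversion %", "ROI %"]])
```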
Each step requires verification since AI has a short memory—by the end of a conversation, it may forget earlier instructions and recommendations.
For the first three metrics, the AI provided perfect responses. However, when calculating ROI per webinar, the AI generated a list of 50 rows instead of the expected 10 unique webinars.
We pointed out the mistake. The AI did not immediately acknowledge the error, but after we pressed the issue, it apologized politely and produced the correct result.
The Insight: Trust, but Verify!
Test complex queries on small data samples first to identify logical weaknesses in calculations. Keep reference values for comparison—column totals, unique record counts—using a reliable BI tool (even Excel).
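Such reference checks are easy to automate. A minimal sketch, with assumed data, of comparing an AI-produced summary against totals and unique counts computed independently from the source:

```python
import pandas as pd

# Source purchase rows and an AI-produced per-webinar summary (illustrative)
source = pd.DataFrame({
    "Webinar ID": [101, 101, 102],
    "Course Price": [200.0, 150.0, 350.0],
})
ai_result = pd.DataFrame({
    "Webinar ID": [101, 102],
    "Revenue": [350.0, 350.0],
})

# Unique record counts must match: one output row per webinar
assert ai_result["Webinar ID"].nunique() == source["Webinar ID"].nunique()
# Column totals must match: no revenue lost or double-counted in the merge
assert ai_result["Revenue"].sum() == source["Course Price"].sum()
print("reference checks passed")
```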
Step 3: Data Visualization and Summarizing Findings
For each question, we received a separate table. Now, we add visualizations to make the results easier to interpret. The AI handled this without major errors (see the picture on the left). However, for those who are meticulous or familiar with “Data Visualization with Microsoft Power BI,” some obvious refinements were needed: adding data labels and removing gridlines and axes, which were unnecessary in this case.
Before and after:
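The “after” cleanup we applied by hand maps to a few lines of matplotlib: data labels on, gridlines and the now-redundant value axis off. The topic names and values are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Illustrative conversion rates per webinar topic
topics = ["Topic A", "Topic B", "Topic C"]
conversion = [10.0, 5.0, 7.5]

fig, ax = plt.subplots()
bars = ax.bar(topics, conversion)
ax.bar_label(bars, fmt="%.1f%%")     # add data labels to each bar
ax.grid(False)                        # remove gridlines
ax.yaxis.set_visible(False)           # labels make the value axis redundant
for spine in ax.spines.values():      # drop the chart border
    spine.set_visible(False)
fig.savefig("conversion.png")
```

With labels directly on the bars, the axis and gridlines carry no extra information, which is why removing them makes the chart easier to read.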
Working with textual data in AI is significantly easier, which often results in more interesting responses. However, AI can sometimes hallucinate and generate information beyond the given data. For instance, when analyzing the chart for the best-converting webinar, it decided to provide generic marketing recommendations instead of basing its conclusion on actual data.
This is a common issue—it’s important to specify that we need not just general insights and recommendations but concrete tasks based on our real data. When formulated this way, the AI’s response becomes much more valuable.
Conclusion: It’s great that AI can not only create but also improve charts; without human oversight, however, the results still end up difficult to read.
Plan ahead which key information each chart should convey, and check it for visual clarity. The book “Data Visualization with Microsoft Power BI“ can help with this.
Crash Test Results
It’s still too early to replace human analysts—AI is more of a timid intern than a full-fledged expert.
Working with AI in data analytics has demonstrated that success depends on clearly defining tasks and continuous oversight. AI handled calculations and visualizations well but required detailed instructions and an iterative approach. This highlights the need not only to understand algorithms but also to craft precise queries.
Key Takeaways from This AI Experiment:
- AI does not support all file formats (each model has its preferred ones).
- Constant verification is necessary—AI miscalculates when numerical relationships span multiple tables.
- Multiple iterations, refinements, and corrections are required; full automation is not yet possible.
- Difficulty in generating structured reports—manual intervention is still needed.
- Inaccuracy increases with data volume.
Keep thinking critically! This skill is essential not only for working with AI but also in daily life. AI, like humans, can make mistakes and fail to recognize them. This serves as a reminder that all information, even from the most reliable sources, should be double-checked. As long as you continue to think critically, machines won’t replace you.
Author: Alexey Kolokolov, data analysis expert, bestselling author of “Data Visualization with Microsoft Power BI,” and founder of Data2Speak Inc. He trains students and professionals in AI applications, organizes courses and intensive programs, and regularly tests neural network technologies to assess their effectiveness and potential in data analytics.