The software testing space has evolved significantly over the past few decades, with huge advancements in automated testing, incredible testing tools that make deeper testing possible, and the shift to a holistic, whole-team view of quality. But in my eighteen years in software testing, I’ve never seen anything generate as much hype and excitement, balanced by equal amounts of uncertainty and reluctance, as artificial intelligence, or as we hear thousands of times each day, AI.
Whatever camp you are in, we can’t ignore the presence of AI: it’s here and it’s definitely not going away. But what is AI within software testing truly capable of? Where does it support testing, and where does it hinder us? I’ve been fortunate enough to get hands-on with a lot of AI recently, and in some cases it’s significantly helped me, saving me hours. In other cases I’ve lost hours to something I could probably have done myself in thirty minutes. These experiences have helped me form some opinions on the current offering and what it means for testing and those in testing roles.
Understanding AI’s Role in the Software Development Lifecycle (SDLC)
Before diving into the specifics of testing, it’s important to define what we mean by AI for this article. AI is an incredibly broad topic, encompassing everything from natural language processing and machine learning to large language models (LLMs). In this article we’re going to focus specifically on Generative AI and Agentic AI. It’s these kinds of tools that I’ve been able to explore, and it’s these kinds of tools I’m seeing more and more of each day in the context of software testing and software development.
While we are going to focus more on utilising these tools to assist with aspects of software testing, it would be an incomplete article without briefly exploring how the use of AI in the wider software development life cycle is impacting testing.
How much of the code you are testing has been generated by AI? Did the engineers review the code the AI produced, or just see that it works and commit it? Was the code solely written by agents? Did those same agents make additional unnoticed/unwanted changes in the process? Did the agents write unit tests? Does the code meet the same quality standards as the existing code? I’m not saying this code is worse, that’s very contextual, but the process that led to generating that code is very different from one that doesn’t utilise AI.
What about your user stories? Are they written solely by a human? The same human that’s been involved in all the customer conversations? Or were the stories written by an AI based on meeting notes and selective information the product owner provided? How well were they reviewed?
Therein lies one of my early observations about using AI: it will compound a lack of quality standards. If your team has poor quality controls in place throughout the SDLC, bringing AI into the mix will likely lead to major disruption and a drop in quality. Exactly the same happened when teams started bringing automation into their process. Teams need to have an established understanding of what good looks like, and how to measure that. That will allow them to bring in AI in a controlled manner while maintaining, and hopefully improving, their processes. My experience thus far, where teams haven’t got these foundations in place, is more disruption that gets pinned on AI, when really they were never ready to start using AI. This then upsets those leaders who believed AI was going to solve all their problems.
There’s been significant focus on quality in recent years, with the rise of the quality engineer, and the introduction of AI highlights the importance of this work. While not every team needs a quality engineer, every team will need to put quality controls in place, and that requires identifying what good looks like. We can’t get the value possible from AI if we don’t agree what good looks like. I think testers and quality engineers can add a huge amount of value here.
AI vs. Traditional Automation: A Paradigm Shift
AI is accelerating a paradigm shift that I’m fortunate to be ten years ahead on: automation doesn’t equal test automation. This is the exact reason I coined the term Automation in Testing back in 2014. Test Automation should mean the use of software to assist software testing, but unfortunately it doesn’t; it means automated testing. I’ve accepted this now; it’s not a fight I want to take on anymore. Automation is simply using software to assist with a process, and there are many different types of software we can utilise to assist with software testing beyond test automation tools. I’ve built hundreds of tools that have made my testing easier: data creation tools, test automation frameworks, log parsers, complex bash scripts and a whole host more. Now we can add AI to that list, mostly via chat tools utilising an LLM such as ChatGPT, Claude and Copilot.
We have more options than ever right now, but we need to learn when to use them and how to use them. What are the differentiating characteristics that will expand, or more commonly limit, our options?
A Look at Automated Testing
It feels funny to talk about traditional automation, but that’s the impact AI is already having on our industry. Traditional automation in this context means rule-based, deterministic automated scripts: scripts that are written by a human and contain the codified information they collated from interacting with the system. These scripts are predictable and repeatable; they never deviate from the exact steps provided. We create them because we’ve seen the system behave in a certain way, and we deem that information so important that if it were ever to stop working that exact way, we need to know.
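To make that concrete, here’s a minimal sketch of a deterministic check in Python; the endpoint, credentials and assertions are all hypothetical, invented purely to illustrate that the steps and expectations are fixed by the human who wrote them.

```python
# A deterministic automated check: the steps and the expectations never change.
# The endpoint and payload below are hypothetical, for illustration only.
import requests


def test_existing_user_can_log_in():
    response = requests.post(
        "https://example.test/api/login",
        json={"username": "known_user", "password": "known_password"},
    )

    # Codified expectations: if the system ever stops behaving this exact way,
    # we want to know about it.
    assert response.status_code == 200
    assert response.json()["authenticated"] is True
```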
We’ve seen two strong additions to this space in recent years: code/test generation, and testing agents. Notice I said additions. I don’t see these as replacements; they are different, and they offer us alternatives that in some contexts could be a huge value add. But ultimately, for me, they are in the same category as automated testing, or part of the process of creating automated tests.
Test Code Generation
Let’s start with test code generation. These are tools that take an input, commonly a prompt in a chat, written by a human to provide important context to the model. This context could include user stories, test data, quality standards, the number of tests, what to focus on, or it could simply say ‘create me some tests!’. The output in both cases will be some test code, but the quality will be significantly different; put garbage in and you’ll get garbage out. This is exactly the same as code generation in general: the less specific you are, the more generic the code generated.
However, in the right context, with the right engineer, using these tools could significantly speed things up. What can we do to get better results? A few things, with a short sketch of a context-rich prompt after this list:
- Provide context - The more context we can provide a model with, the better I’ve found the results. Explain the domain of the app, explain the risk(s) you are trying to mitigate, explain the patterns you’d like the code to use, explain the assertions you’d like to see. As mentioned in the automation paradox article, you are almost writing a test case as a prompt. You need to do this if you want to get the test you would have written as the output.
- Be specific - Provide examples where you can, and be clear about the tool versions you are using. While these models claim to be intelligent, their knowledge is broad and generative, so being specific will help tailor the output.
- Test the output - Test the output, please test the output. Don’t just copy and paste it, run it, see it work and declare yourself done. How did it do it? What variable names did it use? Does this code meet the same quality standards as the existing code? Apply the same rigour, if not more, that you would if you’d written the code yourself.
- Know what good looks like - Put in the groundwork yourself to learn what good looks like; don’t trust that the AI knows. Absolutely use AI to help you on this journey: ask it to explain its decisions and expand on the code, but complement that with your own learning from docs, blogs, videos and so forth. It’s your work at the end of the day, even if you used AI to create it all, so you have a responsibility to make sure it’s of a high standard.
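To show what that context looks like in practice, here’s a minimal sketch assuming the OpenAI Python SDK’s chat completions interface; the bookshop domain, the risk, the conventions and the model name are all invented for illustration. The point is the amount of context you hand over, not the specific tool or model.

```python
# A minimal sketch of a context-rich prompt for test code generation.
# The domain, risk, conventions and model name are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = """
You are helping write automated API checks for an online bookshop.
Risk to mitigate: customers being charged for out-of-stock titles.
Use Python with pytest and the requests library.
Follow our conventions: one behaviour per test, descriptive test names,
arrange-act-assert layout, no hard-coded sleeps.
Assert on both the HTTP status code and the response body.
Write three tests for POST /basket/checkout covering: an in-stock title,
an out-of-stock title, and an empty basket.
"""

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

# The output is a starting point, not a finished artefact: review it, run it,
# and hold it to the same standards as hand-written test code.
print(response.choices[0].message.content)
```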
Testing Agents and Autonomous Testing
A more recent addition to our testing toolboxes is testing agents, and specifically for this section, agents capable of doing autonomous testing. Autonomous testing is the term being used to describe a type of testing where AI agents are trained to autonomously explore an application; it’s like a hybrid between exploratory testing and test automation. Or at least that’s how it’s being pitched. It’s definitely an area with huge potential, but from my early experiences it is still a way off human testers. They are slightly intelligent web crawlers to me. Unlike with a traditional automated test, while we can tell them to complete a specific behaviour, we have no idea how they are going to do it. That’s a disadvantage compared with traditional automated testing, where we want to know that the exact flow works. But on the other hand it’s also an advantage that can be utilised.
Why so cautious, Richard? These agents often act without the full system context we as humans possess, and it’s very difficult to provide them with it. We can currently only give them pieces of the puzzle, but we all know the real value comes from the human interactions we have on our teams, where those pieces are connected. Without all the context they barely scratch the surface of our applications.
Now, that doesn’t mean they are not valuable. They will detect some problems, but they will struggle to put them into context because they lack that insight. At their best, these agents are like energetic juniors: capable, fast, and curious. However, they are in need of guidance and mentorship. Left unsupervised, they will generate noise, but they will keep exploring; they don’t need breaks and time off (please give your juniors that). If integrated thoughtfully into a team’s testing approach, they can highlight potential areas of concern that warrant a deeper look. They can serve as signposts to tell you that there might be something to discover over here. They won’t replace a human exploratory tester anytime soon, or even traditional test automation, but they might become part of their toolkit.
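For illustration only, here’s a purely hypothetical sketch of that interaction; `ExploratoryAgent` is not a real tool, it just shows the pattern of handing an agent a goal and then having a human review the log of what it actually did.

```python
# A purely hypothetical sketch; 'ExploratoryAgent' does not refer to any real tool.
# The pattern it illustrates: give the agent a goal, capture what it did,
# and have a human review the findings as signposts for deeper testing.
from dataclasses import dataclass, field


@dataclass
class Finding:
    location: str
    note: str


@dataclass
class ExploratoryAgent:
    base_url: str
    findings: list[Finding] = field(default_factory=list)

    def explore(self, goal: str) -> list[Finding]:
        # A real agent would drive a browser here and decide its own steps;
        # this stub only records the goal so the review loop below is clear.
        self.findings.append(Finding(self.base_url, f"Started exploring with goal: {goal}"))
        return self.findings


agent = ExploratoryAgent(base_url="https://staging.example.test")
findings = agent.explore(goal="Complete a checkout as a returning customer")

# The human stays in charge: each finding is triaged, and only the ones that
# look interesting get followed up with exploratory testing or a new check.
for finding in findings:
    print(f"{finding.location}: {finding.note}")
```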
Using AI as your Testing Assistant
While there is definitely progress in the above tools and techniques, they only work when used as an assistant to the human, complementing the existing testing that is taking place. I chose those examples first as they are the ones most closely aligned with the industry’s view on test automation. However, AI is now opening the door to an almost infinite number of tools that have always been possible; it’s just never been easier to make them.
Testing is made up of lots of interconnected processes, and we can now use AI to assist us with many of them, and even connect them for us.
Test Idea Generation
One of the earlier processes in software testing is to think of test ideas. There are often no right or wrong answers in the initial stages, so why not get AI to help? Within five minutes it could have generated hundreds of test ideas, and even evaluated your own ideas. You may proceed with the ones you’d already thought of, and that’s fine; you’ll be proceeding feeling a bit more confident in your selection. Or you may add a few more that otherwise may never have been thought of. It’s important to focus on being efficient here: AI will never create the perfect test ideas, and it would be a futile process to try and make it do so. Instead, focus on using it to help trigger your own thought process.
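As a hedged example, here’s a minimal sketch of that sparring-partner use, again assuming the OpenAI Python SDK; the user story, the existing ideas and the model name are invented for illustration.

```python
# A minimal sketch of using an LLM as a test-idea sparring partner.
# The story, existing ideas and model name are assumptions for illustration.
from openai import OpenAI

client = OpenAI()

story = "As a returning customer, I can redeem a gift card at checkout."
my_ideas = [
    "Redeem a gift card with a balance larger than the order total",
    "Redeem an expired gift card",
]

existing = "\n".join(f"- {idea}" for idea in my_ideas)
prompt = (
    f"User story: {story}\n"
    f"Test ideas I already have:\n{existing}\n"
    "Suggest 15 further test ideas I might have missed, one per line, "
    "covering functional, data, and negative scenarios. Do not repeat mine."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

# The list is a thinking trigger, not a test plan: the human still selects,
# rejects, and reshapes the ideas before any of them are acted on.
for idea in response.choices[0].message.content.splitlines():
    print(idea)
```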
Data / Information Processing
Many testing processes rely on gathering and organising data so we can then utilise it. Think of processes like gathering all the sprint stories, mapping them to pull requests, mapping those changes to tests and guesstimating where risk may be. AI can do that process for us. We could utilise a set of AI agents and design a workflow to gather and interpret that data in a matter of minutes. Again, it will make mistakes, but if the processes are well designed, and also provide us with the raw data, it will make a solid enough start that we can build on from there. Getting the raw data is a critical step here; without it, we run the risk of AI blinding or biasing us. If we capture it, we are able to validate that its decisions make sense, and defend or correct those decisions when mistakes are made.
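Here’s a deliberately hard-coded, minimal sketch of that kind of mapping workflow; the stories, pull requests and tests are invented so the shape of the output, and the raw data kept for validation, is easy to see. In practice an agent or a few scripts would pull this data from your issue tracker, version control and test results.

```python
# A minimal sketch of a story -> pull request -> test mapping workflow.
# All of the data below is hypothetical, hard-coded for illustration.
import json

stories = [{"id": "STORY-101", "title": "Gift card redemption at checkout"}]
pull_requests = [{"id": 42, "story_id": "STORY-101", "files": ["checkout/payment.py"]}]
tests = [{"name": "test_redeem_gift_card", "covers": ["checkout/payment.py"]}]

report = []
for story in stories:
    prs = [pr for pr in pull_requests if pr["story_id"] == story["id"]]
    changed_files = {path for pr in prs for path in pr["files"]}
    related_tests = [t["name"] for t in tests if changed_files & set(t["covers"])]
    report.append(
        {
            "story": story["id"],
            "pull_requests": [pr["id"] for pr in prs],
            "related_tests": related_tests,
            # A crude risk signal: changes with little or no test coverage.
            "risk": "high" if not related_tests else "low",
        }
    )

# Keeping the raw data alongside the summary lets a human validate the
# decisions, or correct them when mistakes are made.
output = {
    "summary": report,
    "raw": {"stories": stories, "pull_requests": pull_requests, "tests": tests},
}
print(json.dumps(output, indent=2))
```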
Conclusion on AI Assistants
There are also things like test data generation, failed test investigations, failed build investigations, and automated test reviews/appraisals; the opportunities are endless. Now is the time to be experimenting and seeing what works for you, and what doesn’t.
It’s important to remember these things though:
- AI is never right: it doesn’t know what right is; it’s generative. So, always apply your own judgement.
- It’s heavily biased by the data it was trained on, and this could mislead you. So, always cross-reference with your own learning.
- It’s often lacking the whole context.
- It’s commonly outdated, as it may have been trained on older information.
Adapting to the AI-Assisted Testing World
So what does the future hold for testers as AI becomes more embedded in our tools, software and processes?
1. Shifting Skills, Not Displacement
The testers of the future will not be replaced by AI, but they may be replaced by someone who understands how to work with AI. The same was said about test automation, but the mistake there was pitching it as a replacement for testers. We need to avoid that with AI, because its value is in assistance.
2. AI Literacy
Just as knowing test techniques, or a specific test automation tool, became essential for teams a decade ago, familiarity with AI-assisted testing platforms will become expected. Understanding their assumptions, biases, boundaries and possibilities will be just as important as understanding their syntax. People like to talk about the art of Googling, and how some people are significantly better at getting results; the same is true for AI assistants.
3. Testers as Ethical Stewards
As AI is increasingly embedded into software products, testing has a growing role in ensuring ethical outcomes. That means testing not just functionality but fairness, bias, explainability, and safety.
4. Increased Focus on Human-in-the-Loop Testing
Rather than replacing humans, the most effective AI testing strategies will focus on collaboration. The human should always be in the driving seat to validate, challenge, and refine what AI generates. Just like a spell checker doesn’t replace editing, AI doesn’t replace testing; it supports it and, when used correctly, makes the job more enjoyable.
5. Expanding the Scope
We have to get better at talking about testing, and stop viewing testing as a simple few-step process. We need to be expanding those processes even more, and bringing AI into them at multiple stages.
Final Thoughts: The Quality Mindset in an AI World
In a landscape increasingly shaped by speed, scale, and complexity, AI offers powerful tools to speed us up. But testing and quality have never been about tools alone. It’s about the mindset: curiosity, scepticism, empathy, and systems thinking. AI doesn’t reduce the need for testing. It increases the need for thoughtful testing.
The future of software testing is not AI or human, just like it was never manual or automated. It’s skilled humans with tools, and our toolboxes just got a whole lot bigger.
I also feel AI will be a catalyst for teams to start taking quality engineering more seriously. With AI now helping to write code, generate requirements, produce and execute tests, and investigate broken builds, the possibilities really are endless, and the focus of quality work shifts. We need to be asking more questions like, “Does this make sense?”, “Is this the right thing to do?”, “Is it fair?”, “Has this made the process better?”.
The testers (or those who wear the testing hat in their teams) of the future are not just verifying functionality—they're verifying AI-assisted decisions. They’re ensuring that the glue holding these autonomous components together is secure, sensible, and serving the user. They’re identifying where AI is creating value, and where it’s creating risk.
So where does AI fit in the future of software testing?
Right alongside us. Not as a replacement, but as a powerful co-pilot, your on-demand pair engineer, one that still needs a human doing all the driving.
About the Automated Testing series
You're at the final part of our Automated Testing series - a collaboration with Richard Bradshaw. The series explores many facets of Test Automation and its implementations.
- In the first part, we explored the Manual and Automated Testing paradox. Check it out here.
- In the second part, we explored the different types of UI Automated Testing. Check it out here.
- In the third part, we looked at the essential types of Automated API Testing. Check it out here.
- In the fourth part, we explored how knowing your system’s architecture, data flow, and integrations is key to building test automation that truly delivers. Check it out here.
- In the fifth part, we explored the SACRED model to building reliable automated tests. Check it out here.