Automated software testing explained for managers
It’s hard for any manager to make decisions on something they don’t know anything about. It is common for a manager (Product Manager or CEO) to have their dev team come to them and say: “we need 20% of each sprint to add tests to our code, cool?”. The manager will then naturally wonder about the ROI for this effort, asking about the expected benefit, only to get some vague answer such as “it will help us have fewer bugs”, “it will speed up development in the long run”, or “it will move us closer to a CI/CD process”. How is one to make such a call?
It’s even harder for managers to see the value of automated testing, because it’s something most startups can live without for the first year or so. In the early days of a startup, all you care about is releasing something as fast as humanly possible, and to do so, you might pass on some best practices of writing code. When there are bugs, you can often catch them in time with manual testing, or just get away with it because your customers know you’re just 5–6 people crammed in a WeWork cubicle.
Because the ROI of software testing is difficult to measure, it is often added to a company’s development process only when bugs start sprouting every other release, and manual regression testing starts taking more and more time and resources. By then, it might take months to catch up.
The goal of this post is not to teach you all about software testing, or to drill down to which type of testing you should use in your company. First, I’m not an expert on the matter. Second, and more importantly, you should leave that to your VP R&D, CTO, or whoever you have in charge of the R&D team. After all, that’s what they’re there for.
The goal of this post is to familiarize yourself with what software testing actually is, what it is useful for, and give you a feel of how it works. I will also introduce some basic concepts, so that when they’re tossed around, you won’t be left staring wide-eyed at your developers.
To do that, I’ll use an example of a basic feature. I’ll keep code to a minimum and rely mostly on technical logic, as the goal is to give you a deeper understanding and intuition of software testing.
Say we have a task app, and we’d like to add a new feature: tags. Tags (sometimes called labels) are helpful in organizing your tasks into groups. Example tags for tasks might be: Urgent, Important, Ongoing, etc. Whenever you add a new task, you can attach an existing tag to it, or create a new tag and make this task the first to use it.
In software development, even the simplest of features often has multiple steps. Let’s break our very basic tagging feature into steps we can code:
Tag architecture (simplified):
- Create a database table for distinct tag names, each tag with its own ID
- In the tasks database table, add to each task a field called “tags”, which contains a list of tag IDs linked to that task
Adding a new tag:
- A user inputs a tag string in a task’s page
- The system then gets a list of all existing distinct tags from the database
- The system checks if the input tag already exists in the fetched list
- If the tag already exists, add the tag ID to the task in the tasks table
- If the tag is a new one, first add it to the database, and then add its ID to the task
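The steps above can be sketched in code. This is a minimal, in-memory illustration, not a real implementation: the "tables" are plain arrays standing in for the database, and the helper names (fetchAllTags, createNewTag, addTagToTask, addTagInput) are hypothetical.

```javascript
// In-memory stand-ins for the two database tables described above.
const tagsTable = [];                                     // distinct tags: { id, name }
const tasksTable = [{ id: 1, title: "Write report", tags: [] }];

let nextTagId = 1; // a real database would generate IDs for us

function fetchAllTags() {
  return tagsTable; // step 2: get all existing distinct tags
}

function createNewTag(name) {
  const tag = { id: nextTagId++, name };
  tagsTable.push(tag);
  return tag.id;
}

function addTagToTask(taskId, tagId) {
  const task = tasksTable.find((t) => t.id === taskId);
  if (!task.tags.includes(tagId)) task.tags.push(tagId);
}

// Steps 1-5 end to end: a user types a tag on a task's page.
function addTagInput(taskId, tagName) {
  const existing = fetchAllTags().find((t) => t.name === tagName); // step 3
  const tagId = existing ? existing.id : createNewTag(tagName);    // steps 4-5
  addTagToTask(taskId, tagId);
  return tagId;
}
```

Even this toy version shows why the feature has several moving parts: a lookup, a conditional creation, and an update, each of which can break independently.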
Why code breaks
There are many reasons for code to break. Your company’s code probably spans thousands of files and hundreds of thousands of lines, with countless variables and functions. Your developers don’t just add new code — they continuously modify and remove existing code.
Take step 3 above — checking if a tag already exists. It’s actually made of two sub-steps:
3a. Create a function “checkIfTagExists” that receives a list of tags with IDs and a new tag, and outputs either the matching tag ID or a “Tag does not exist” response.
3b. Call the above function with the user’s new tag and the list we fetched from the DB.
These 2 steps, while related, might be located in different, distant files in the code. 3a might in fact be generalized to “checkIfItemExists”, and be used in other similar situations other than tags, such as assignee (each assignee having a name and an ID). If assignees are being worked on, the developer, not realizing the “checkIfItemExists” function is used by the tags functionality, might decide to change its behavior, returning a simple “Item exists” response instead of the item’s ID. But tags still expect to get a tag ID, and oops—a bug is born.
Types of software tests
A test is basically running a piece of code, giving it an input, and checking that we get the desired output. There are three main types of tests:
Unit tests check one isolated component.
For our function checkIfItemExists(newItem, existingItems), our test might be:
- for checkIfItemExists(“tag1”, [[“tag1”, 1234],[“tag2”, 5678]]) we expect to get the response “1234”.
- for checkIfItemExists(“tag3”, [[“tag1”, 1234],[“tag2”, 5678]]) we expect to get the response “Item does not exist”.
- for checkIfItemExists(“tag1”, [“tag1”, “tag2”]) we expect to get the response “invalid input”.
Let’s go back to our bug from earlier. When a tag was found, we expected to get its ID, but instead, all we get now is “Item found”. As you can see, the first test scenario would have failed.
Integration tests check how multiple components work together.
After checkIfItemExists runs, we’ll need to execute other functions, according to the response we get. If it returns an ID, we might apply the addTagToTask function. If checkIfItemExists returns “Item does not exist”, we will first need to apply the createNewTag function. The entire process might all be a part of a larger function, updateTag, which includes all the above logic.
An integration test might be (I’ll spare you the detailed example this time, I think you get the point):
- If updateTag gets a known tag, just add its ID to the task
- If updateTag gets an unknown tag, add it to the list, return the new tag ID, then add it to the task
- If updateTag gets a blank tag, return “invalid input”.
As you can see, integration tests are similar to unit tests, but they include multiple components working together. An error in either of those components will break the integration test (without us knowing which part caused the test to fail).
In our case, the first test will fail here as well, as it can’t add the tag’s ID, which was not received.
End-to-end (E2E) tests check the entire workflow. For example, they will simulate a browser page, type the new tag into the input box, and expect a certain result, such as a yellow box above the input field with the new tag name.
In our example, an end-to-end test scenario might be:
- Create a task and add the tag “tag1”.
- Check that “tag1” was added in a yellow box above the input
- Go to the tag list page
- See that “tag1” was added
- Create another task and add the tag “tag1” to it
- See that it was added in yellow
- Go back to the tag list page
- See that “tag1” appears only once (indicating it wasn’t created again)
Once again, the test will fail with our new bug.
The difference between the tests
Unit tests take a lot of time to create, as we usually have a great many components. If one fails, we’ll know right away what the issue is, as the test is highly focused. However, it is possible for all unit tests to pass and still have a bug, if the issue is in the way the components interact.
Integration tests are less time-consuming and might catch more bugs (a failing unit is likely to fail an integration test too), but we will still need to dig in to find out exactly why the test failed.
End-to-end tests, if done correctly, are great at catching bugs, but are even less indicative of exactly where the issue is. They are the equivalent of a QA person writing a comment in Jira saying “so I’ve tested this new assignee feature, and it seems to have broken tags”.
One important thing to note is that software tests are not an “all or nothing” sort of deal. They are time-consuming, and there’s no need to get to 100% coverage. Follow the 80/20 rule. Try to cover those parts of the code that, if broken, might cause real damage to your customers.
To sum up
There are a number of technical concepts I’ve seen managers without a technical background struggle with. I still struggle with some of them. What does refactoring mean? What’s CI/CD, and how do you get there (hint: automated tests are a big part of it)? Why does releasing to production sometimes take so long, and why can’t we remove that one feature that’s still in QA from the release, and deliver the other features?
This post originated with me trying to get a clearer understanding of why software testing is important, as we’re about to start adding tests to our workflow at the startup I recently joined, which is just under a year in the making. In a previous company, we didn’t start adding tests until 2–3 years into the company’s life, and we faced some unrelenting issues because of it. I’m really looking forward to doing better this time :)
If this post helped you, go ahead and give it some claps, so that more people get to see it!