If you’ve ever written API tests, you know how tedious it can be to repeat the same test logic for different input data sets. Whether you're verifying an API’s behavior for various user roles, product configurations, or environmental conditions, manually creating multiple test cases can quickly become unmanageable and time-consuming. This is where data-driven API automation testing swoops in to save the day. By separating test logic from test data, this approach enables testers to execute the same test with multiple data inputs efficiently. It not only reduces redundancy but also improves coverage and makes your test suite more scalable and maintainable.
The Challenge of API Testing
While working with data-driven testing, you’ve probably encountered formats like Excel, CSV, JSON, or even property files to feed test data into your scripts. Each has its pros and cons, and choosing the right one can significantly impact your testing workflow.
● Excel/CSV Files:
- Pros: User-friendly for non-technical users; anyone can open a spreadsheet, add rows, and instantly create more tests.
- Cons: Parsing them adds complexity to your automation framework, and maintaining these files can get messy. Collaborating on a shared Excel file can also be a nightmare.
● JSON Files:
- Pros: Great for complex, structured data; lightweight and mirrors API structures.
- Cons: Larger JSON files become harder to maintain, and deserializing them requires extra effort.
● Property Files:
- Pros: Commonly used for storing configuration and key-value pairs in automation scripts.
- Cons: Not ideal for complex datasets; limitations arise when managing large or nested test data.
I’ve worked with Excel sheets and JSON files to store and manage data in my projects, but they often become tricky to maintain, especially with large datasets, and updating these files is cumbersome. Another option is to access the application’s database directly, but that brings its own challenges, such as security barriers and the need to write complex SQL queries, which isn’t ideal when the focus should be on testing APIs.
A New Approach with Spring Boot and MongoDB
To simplify the process, I built a Spring Boot application in Kotlin (though it could be done in Java too—I just wanted to pick up Kotlin!). This application, called data-provider, uses MongoDB for data storage and makes managing customer data super easy with basic CRUD (Create, Read, Update, Delete) endpoints. The best part? You can automate the entire process!
Even though we're talking about customer data here, this method can be used for managing any type of data just as easily!
Data-Provider Application
This Spring Boot application, written in Kotlin and backed by MongoDB, offers a practical solution for API automation. With data-provider, you can:
- Create customer data.
- Retrieve the data.
- Update or delete customer data.
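For a concrete picture of the data involved, here is a minimal Kotlin sketch of what the customer document might look like, assuming Spring Data MongoDB; the actual class in the repository may differ, and the fields simply mirror the sample request shown later.

import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.index.Indexed
import org.springframework.data.mongodb.core.mapping.Document

// Hypothetical sketch of the customer document; fields mirror the sample request below.
@Document("customer")
data class Customer(
    @Id val id: String? = null,
    val name: String,
    val preferredName: String,
    // Unique index on email (the setup section also shows how to create it via mongosh).
    @Indexed(unique = true) val email: String,
    val mobile: String,
    val pincode: Int,
    val metadata: Metadata
)

data class Metadata(
    val env: String,                       // environment the data belongs to
    val inUse: Boolean = false,            // whether a test is currently using this record
    val tags: List<String> = emptyList()   // tags used to group customers per test
)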
Why MongoDB?
I decided to use MongoDB for this application because it’s a NoSQL database that handles large, unstructured, or semi-structured data well. In a dynamic testing environment, MongoDB’s document model fits perfectly for our customer data. You’re not bound by strict schema rules, meaning you can modify or add fields without breaking your test framework.
● Flexible Data Structure: You don’t need to worry about predefined table schemas, making it easier to change your data model as your API evolves.
● Fast and Scalable: MongoDB is fast and scales easily, which is crucial when running large volumes of automated tests or dealing with lots of test data.
Automating Test Data Generation
With the data-provider app, you can fully automate test data management. Instead of relying on Excel or JSON files, test data is generated on the fly during script execution and stored directly in the MongoDB database. For example, in the application I'm working on, a customer is created as part of the signup flow. I store the data generated during this test in MongoDB by making a simple POST request to the data-provider app, along with other relevant data. This approach simplifies test data handling and eliminates manual effort.
Sample Request
curl --location 'localhost:8080/v1/customer' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "Jack Dawson",
  "preferredName": "Jack",
  "email": "jack.nicolson@example.com",
  "mobile": "3234567890",
  "pincode": 123456,
  "metadata": {
    "env": "staging",
    "inUse": true,
    "tags": [ "customerTest" ]
  }
}'
Here, everything except the metadata is generated by the automated test. The metadata is created prior to saving the data in the database.
● env: Indicates the environment to which the data belongs.
● inUse: A boolean field that shows whether the data is currently in use by a test.
● tags: A list of strings used to categorize customers for different tests.
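To make this concrete, here is a rough Kotlin sketch of how a test could push freshly generated data to the data-provider right after the signup flow. It uses Java's built-in HttpClient, the payload mirrors the sample request above, and the function name and hard-coded values are placeholders rather than code from the actual framework.

import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Hypothetical helper: store a customer created during the signup test in the data-provider.
fun saveCustomerInDataProvider(name: String, email: String, mobile: String) {
    // Payload mirrors the sample request; metadata carries the environment, usage flag, and tags.
    val body = """
        {
          "name": "$name",
          "preferredName": "${name.substringBefore(" ")}",
          "email": "$email",
          "mobile": "$mobile",
          "pincode": 123456,
          "metadata": { "env": "staging", "inUse": true, "tags": ["customerTest"] }
        }
    """.trimIndent()

    val request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8080/v1/customer"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()

    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
    check(response.statusCode() in 200..299) { "Failed to save customer: ${response.body()}" }
}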
Using Test Data
Once this data is saved in MongoDB, I can retrieve it in the test's setup method using the mobile number, email, or tags. For instance, if I have a test that updates a customer's email address, I first fetch the data using the GET API exposed by the data-provider application, then use it to log in and complete the test. After the test finishes, I update the record in the data-provider application as well.
If you want to run the same test with different data, you can use tags. For instance, when creating customers you can assign them a tag such as "email test"; say you've created 50 customers with this tag. You can then call the GET API to retrieve those 50 customers and run the test against each of them. The "get customer by tag" endpoint returns a list of customers:
curl --location 'http://localhost:8080/v1/customer/tag?tag=customerTest'
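As a rough illustration of how that list can drive a test, the sketch below generates one JUnit 5 dynamic test per customer returned by the tag endpoint. It assumes jackson-module-kotlin for parsing the response, and the class name, test body, and login steps are placeholders rather than the actual framework code.

import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import com.fasterxml.jackson.module.kotlin.readValue
import org.junit.jupiter.api.DynamicTest
import org.junit.jupiter.api.TestFactory
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

class UpdateEmailTest {

    private val mapper = jacksonObjectMapper()

    // Fetch all customers carrying the given tag from the data-provider.
    private fun customersByTag(tag: String): List<Map<String, Any?>> {
        val request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8080/v1/customer/tag?tag=$tag"))
            .build()
        val json = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString())
            .body()
        return mapper.readValue(json)
    }

    // One dynamic test per customer, so the same test logic runs against every data record.
    @TestFactory
    fun `update email for every tagged customer`(): List<DynamicTest> =
        customersByTag("customerTest").map { customer ->
            DynamicTest.dynamicTest("update email for ${customer["email"]}") {
                // Placeholder: log in as this customer and run the email-update flow.
            }
        }
}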
You can handle concurrency by using the inUse field: before executing a test, check whether inUse is false, and if it is, set it to true and proceed; once the test finishes, set it back to false. As long as this check-and-set happens atomically (for example, via MongoDB's findAndModify), no two tests will run on the same data at the same time.
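If the data-provider handles this flag server-side, one way to make the check-and-set atomic is MongoDB's findAndModify. The snippet below is a hypothetical service method using Spring Data's MongoTemplate and the Customer sketch from earlier, not code from the actual repository.

import org.springframework.data.mongodb.core.FindAndModifyOptions
import org.springframework.data.mongodb.core.MongoTemplate
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query
import org.springframework.data.mongodb.core.query.Update

// Hypothetical helper: atomically claim one unused customer with the given tag.
// Returns null when every matching customer is already in use.
fun claimCustomer(mongoTemplate: MongoTemplate, tag: String): Customer? {
    val query = Query(
        Criteria.where("metadata.tags").`is`(tag)
            .and("metadata.inUse").`is`(false)
    )
    val markInUse = Update().set("metadata.inUse", true)
    return mongoTemplate.findAndModify(
        query,
        markInUse,
        FindAndModifyOptions.options().returnNew(true), // return the freshly claimed document
        Customer::class.java
    )
}

After the test finishes, the test can call the update endpoint to set inUse back to false, releasing the record for the next run.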
Drawbacks
While the data-provider application offers significant advantages for managing test data, it also comes with some challenges. Hosting the application and MongoDB can introduce additional costs, and increased API calls during test execution may slow down the process. There's also the risk of network latency or downtime, adding a layer of dependency on service availability. The setup and maintenance of this service can become complex, requiring updates, backups, and scalability management. Security concerns arise if proper measures like authentication aren’t in place. For smaller projects, this solution might even be over-engineering compared to simpler file-based methods.
Setting up the data-provider application
1. Clone the repository.
2. Run MongoDB using Docker:
docker run -d -p 27017:27017 --name=mongo mongo:latest
3. Install mongosh to work with MongoDB.
4. Make sure unique indexes are created wherever required.
Check the existing indexes with db.customer.getIndexes()
Create a unique index with db.customer.createIndex( { "email": 1 }, { unique: true } )
5. Install Postman (or any API client) for executing the APIs.
6. Run the application and execute the APIs.
In Summary
The data-provider application is a powerful tool for automating your API testing in a data-driven way. Instead of relying on outdated Excel, CSV, or JSON files, you can use this Spring Boot and MongoDB-powered app to generate, store, and manage your test data in real time.
By dynamically creating data with POST APIs and fetching it during the tests, your test cases become more flexible and resilient to changes. Plus, MongoDB makes it all so much easier by providing the flexibility and performance needed for a test-heavy environment.
So next time you're setting up your API automation, think about moving away from static test files and embracing a more dynamic and scalable approach like this!
That’s all for today—hope this gives you some fresh ideas for your API testing projects!
Happy testing! 🚀