Python Coding Challenge: Filtering Pokémon from a Public API
Recently, I encountered an interesting Python coding challenge during an interview. The task involved working with the PokéAPI — a free and open RESTful API for Pokémon data.
Here’s what the challenge asked:
- Query the endpoint: https://pokeapi.co/api/v2/pokemon
- From the results, extract and print a list of Pokémon that satisfy both of the following conditions:
  - The Pokémon's `base_experience` is greater than 200.
  - The Pokémon has `fire` listed among its `types`.
- For each matching Pokémon, print the following details:
  - `name`
  - `height`
  - `sprite_url` (which should come from `sprites.front_default`)
Prerequisites: The requirements.txt File
I have used Python version 3.12.2.
All the required Python dependencies for running the following scripts are listed in the `requirements.txt` file. To set up your environment with these modules, simply run the following command after creating your virtual environment:
pip3 install -r requirements.txt
This will ensure that all the necessary libraries are installed before you execute the script.
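The exact contents of the file depend on which scripts you run, but assuming the synchronous version relies on `requests` and the asynchronous ones on `aiohttp`, a minimal `requirements.txt` would contain at least:

```text
requests
aiohttp
```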
Initial Approach: The Quick and Simple Method
The first solution is straightforward, though not the most efficient. It involves querying the main Pokémon API endpoint to fetch a list of Pokémon. Then, for each Pokémon in that list, we make an additional request to retrieve its detailed data.
From there, we filter the results based on the given conditions. Finally, we return only the Pokémon that meet these criteria.
While this method works, it's worth noting that it's a slow and inefficient process, since we are working with a large dataset. The program's execution time averages around 170 seconds on my machine.
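As a rough sketch (not necessarily the exact script, and assuming the synchronous version uses the `requests` library), the approach looks like this:

```python
import requests

BASE_URL = "https://pokeapi.co/api/v2/pokemon"

def fetch_all_entries():
    """Follow the paginated list endpoint and collect every Pokémon entry."""
    entries, url = [], BASE_URL
    while url:
        page = requests.get(url).json()
        entries.extend(page["results"])
        url = page["next"]  # None once the last page is reached
    return entries

def matches(detail):
    """Apply both filter conditions to a Pokémon's detail payload."""
    types = {t["type"]["name"] for t in detail["types"]}
    return (detail.get("base_experience") or 0) > 200 and "fire" in types

def main():
    for entry in fetch_all_entries():
        detail = requests.get(entry["url"]).json()  # one extra request per Pokémon
        if matches(detail):
            print(detail["name"], detail["height"], detail["sprites"]["front_default"])

if __name__ == "__main__":
    main()
```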
Boosting Performance with Coroutines: asyncio + aiohttp
The next logical step is to parallelize the API calls to improve performance, especially since we need to make individual requests for each Pokémon’s detailed data.
By leveraging Python's `asyncio` along with the `aiohttp` library, we can send multiple API requests concurrently instead of waiting for each one to finish before starting the next. This significantly reduces the overall runtime, making the solution much faster and more scalable.
This approach is ideal for I/O-bound tasks like querying external APIs, and it turns a slow, sequential loop into an efficient, coroutine-powered fetch operation. The program's execution time averages around 152 seconds on my machine.
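A minimal sketch of this concurrent version, assuming the detail requests are fanned out with `asyncio.gather`, might look like this:

```python
import asyncio
import aiohttp

BASE_URL = "https://pokeapi.co/api/v2/pokemon"

async def fetch_json(session, url):
    """Fetch a single URL and decode its JSON body."""
    async with session.get(url) as resp:
        return await resp.json()

async def list_all_entries(session):
    """Walk the paginated list endpoint and return every Pokémon entry."""
    entries, url = [], BASE_URL
    while url:
        page = await fetch_json(session, url)
        entries.extend(page["results"])
        url = page["next"]
    return entries

async def main():
    async with aiohttp.ClientSession() as session:
        entries = await list_all_entries(session)
        # Fire off every detail request concurrently instead of one at a time.
        details = await asyncio.gather(
            *(fetch_json(session, entry["url"]) for entry in entries)
        )
        for detail in details:
            types = {t["type"]["name"] for t in detail["types"]}
            if (detail.get("base_experience") or 0) > 200 and "fire" in types:
                print(detail["name"], detail["height"], detail["sprites"]["front_default"])

if __name__ == "__main__":
    asyncio.run(main())
```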
Optimized and Scalable: Structured Concurrency with asyncio.Queue
While firing off all requests in parallel can work for small datasets, it quickly becomes a bottleneck when working with hundreds of items, especially when dealing with API rate limits or system memory constraints.
To overcome this, we can adopt a producer-consumer pattern using `asyncio.Queue`. This method lets us strike a perfect balance between concurrency and control:
- A producer coroutine fetches the initial list of Pokémon and pushes individual detail URLs into a queue.
- Multiple consumer coroutines then pick items from the queue and process them concurrently, but in a controlled manner, defined by the number of workers (`NUM_CONSUMERS`).
By limiting the number of simultaneous API calls, this approach avoids common pitfalls like connection timeouts, rate limiting, and memory spikes, while still being highly performant due to asynchronous I/O.
Here’s why this structure shines:
- Scalable: Easily control concurrency by tweaking the number of consumers.
- Resilient: Gracefully handles large datasets without overwhelming the network or system.
- Organized: The code is modular and easier to extend (e.g., add retry logic, logging, or error handling per task).
This solution combines the best of both worlds: the speed of `aiohttp` with the reliability of structured task management. The program's execution time averages around 29 seconds on my machine, roughly a 6x speed improvement over the initial approach.
Bonus Thought: Dual Queues for Full Parallelism
An advanced variation of this approach is to introduce a separate queue for page-level tasks (i.e., producers). This would allow you to process paginated API responses and enqueue the `next` pages in parallel while still processing the `results` concurrently.
I’ll leave this as an exercise for curious readers who are interested in a setup that handles both pagination and item processing concurrently. It’s a great way to push the boundaries of structured concurrency in Python.
Wrapping up
Using `asyncio` and `aiohttp` with thoughtful concurrency patterns like queues can significantly improve performance and scalability when working with external APIs. Whether you're dealing with a small dataset or a paginated API, understanding when and how to parallelize is key to writing efficient and elegant asynchronous code.
Happy coding and may your Pokémon always be fire-type with a high base experience!