The Factory Pattern: Plug-and-Play Architecture for Your Data Pipeline
A practical example from my MarketPipe project
Introduction
In today's fast-moving tech world, the ability to adapt and extend systems quickly and efficiently is a real advantage.
One powerful design pattern that allows just this is the Factory Pattern.
In this post, I dive into how you can leverage the Factory Pattern to create a plug-and-play architecture for your data pipeline, using a practical example from the MarketPipe project.
What is the Factory Pattern?
The Factory Pattern is a creational design pattern that provides an interface for creating objects in a superclass, but allows subclasses to alter the type of objects that will be created. It's like having a production line that can switch between creating different products without changing the production process itself.
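To make the idea concrete before getting to MarketPipe, here is a minimal sketch of a factory in its simplest form: a lookup from a key to a class. The names `CsvWriter`, `TsvWriter`, `WRITERS`, and `make_writer` are made up for this illustration and are not part of the project.

```python
class CsvWriter:
    """Hypothetical product: joins values with commas."""

    def write(self, values: list[str]) -> str:
        return ",".join(values)


class TsvWriter:
    """Hypothetical product: joins values with tabs."""

    def write(self, values: list[str]) -> str:
        return "\t".join(values)


# The factory maps a key to a class, so callers never name a concrete class.
WRITERS = {"csv": CsvWriter, "tsv": TsvWriter}


def make_writer(kind: str):
    """Return a writer instance for the requested kind."""
    try:
        return WRITERS[kind]()
    except KeyError:
        raise ValueError(f"Unknown writer kind: {kind}") from None


# The calling code only knows the key and the shared write() method.
writer = make_writer("csv")
line = writer.write(["AAPL", "189.5"])
```

Supporting another format here would mean adding one class and one dictionary entry, while the code that calls `make_writer` stays the same.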
Why Use the Factory Pattern?
Decoupling: The Factory Pattern decouples the creation of objects from their usage, making your code more modular and easier to maintain.
Flexibility: It allows you to introduce new types of objects without changing the existing code.
Scalability: As your application grows, you can add new functionalities without major changes to your codebase.
Practical Example: MarketPipe Project
In the MarketPipe project, the Factory Pattern is used to create a flexible API client system. This allows for easy addition of new API clients without modifying the core logic of the pipeline.
Let's look at a simplified version of this implementation.
Step-by-Step Implementation
Step 1: Define a Base API Client Class
First, create an abstract base class for all API clients. This class will define a common interface that all concrete API clients must implement.
import logging
from typing import Optional


# Base class for all API clients, defining a common interface
class BaseApiClient:
    def __init__(self, logger: Optional[logging.Logger] = None):
        # Fall back to a module-level logger when none is supplied
        self.logger = logger or logging.getLogger(__name__)

    # Method to be implemented by all subclasses
    def get_data(self):
        raise NotImplementedError("Each API client must implement this method.")
This base class ensures that all API clients conform to a common interface, making it easier to manage and extend the pipeline.
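The base class above signals a missing implementation only when get_data is called. If you would rather have the error surface as soon as an incomplete client is instantiated, one option is Python's standard abc module. The following is a sketch of that variant, not the MarketPipe code; `AbstractApiClient` and `IncompleteClient` are hypothetical names used only here.

```python
import logging
from abc import ABC, abstractmethod
from typing import Optional


class AbstractApiClient(ABC):
    """Variant of the base client that uses abc to enforce the interface."""

    def __init__(self, logger: Optional[logging.Logger] = None):
        self.logger = logger or logging.getLogger(__name__)

    @abstractmethod
    def get_data(self) -> dict:
        """Fetch data from the API; every subclass must override this."""


class IncompleteClient(AbstractApiClient):
    """Hypothetical subclass that forgets to implement get_data."""


try:
    IncompleteClient()
    instantiated = True
except TypeError:
    # ABCs refuse to instantiate while abstract methods remain unimplemented
    instantiated = False
```

The trade-off is small: the abc version catches the mistake earlier, while the NotImplementedError version keeps the base class free of extra machinery.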
Step 2: Create Concrete API Client Classes
Next, create concrete classes for each API client. Each class will implement the get_data method.
class StockApiClient(BaseApiClient):
    def get_data(self):
        self.logger.info("Fetching data from Stock API...")
        # Implement the logic to fetch data from a Stock API
        return {"data": "stock_data"}


class CryptoApiClient(BaseApiClient):
    def get_data(self):
        self.logger.info("Fetching data from Crypto API...")
        # Implement the logic to fetch data from a Crypto API
        return {"data": "crypto_data"}
These concrete classes implement the get_data method, making it easy to fetch data from different APIs without changing the core logic.
Step 3: Implement the Factory Class
Now, create the factory class that will dynamically load and instantiate the appropriate API client based on configuration.
import logging
from importlib import import_module
from typing import Dict, Type

# Mocking the CONFIG data structure
CONFIG = {
    "clients": {
        "stock": {"module": "path.to.stock_module", "class": "StockApiClient"},
        "crypto": {"module": "path.to.crypto_module", "class": "CryptoApiClient"},
    }
}


# Factory class to dynamically load and instantiate API clients
class ApiClientFactory:
    def __init__(self, logger: logging.Logger):
        self.logger = logger
        self.clients = self.load_clients()

    @staticmethod
    def load_clients() -> Dict[str, Type[BaseApiClient]]:
        clients = {}
        for client_name, settings in CONFIG["clients"].items():
            module_name, class_name = settings["module"], settings["class"]
            module = import_module(module_name)  # Dynamically import the module
            clients[client_name] = getattr(module, class_name)  # Get the class
        return clients

    # Method to get the appropriate client based on client_type
    def get_client(self, client_type: str) -> BaseApiClient:
        client_class = self.clients.get(client_type)
        if client_class is None:
            raise ValueError(f"Invalid client type: {client_type}")
        return client_class(logger=self.logger)
The factory class dynamically loads and instantiates API clients based on configuration (mocked here as the CONFIG dictionary, with placeholder module paths), making it easy to add new clients by simply updating the configuration.
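If the configuration lives in a JSON file rather than a Python dictionary, the same lookup can be driven by json.loads. The sketch below assumes a JSON layout that mirrors the mocked CONFIG. To keep it runnable on its own, it resolves two standard-library classes in place of real API clients; `CONFIG_JSON` and `load_classes` are hypothetical names.

```python
import json
from importlib import import_module

# Hypothetical config text; in practice this might be read from a file.
# Standard-library classes stand in for real API client classes.
CONFIG_JSON = """
{
  "clients": {
    "ordered": {"module": "collections", "class": "OrderedDict"},
    "counter": {"module": "collections", "class": "Counter"}
  }
}
"""


def load_classes(config_text: str) -> dict:
    """Resolve each configured module/class pair to a class object."""
    config = json.loads(config_text)
    classes = {}
    for name, settings in config["clients"].items():
        module = import_module(settings["module"])  # import by dotted name
        classes[name] = getattr(module, settings["class"])  # look up the class
    return classes


registry = load_classes(CONFIG_JSON)
# Instantiate whichever class the config named under "counter"
counter = registry["counter"]("aab")
```

Note that import_module raises ModuleNotFoundError for an unknown module and getattr raises AttributeError for a missing class, so a typo in the configuration fails loudly when the factory is built.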
Step 4: Using the Factory in Your Pipeline
Finally, use the factory to get the appropriate API client in your data processing pipeline.
# Data processor class that uses the factory to get API clients
class DataProcessor:
    def __init__(self, asset_type: str, api_client_factory: ApiClientFactory, logger: logging.Logger):
        self.asset_type = asset_type
        self.api_client = api_client_factory.get_client(asset_type)
        self.logger = logger

    # Method to process data
    def process_data(self):
        try:
            data = self.api_client.get_data()
            # Further processing of data
            print(f"Processed {self.asset_type} data: {data}")
        except Exception as e:
            # Use the instance logger rather than relying on a global one
            self.logger.error(f"Error processing {self.asset_type} data: {e}")


# Example usage of the DataProcessor with the factory
if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("DataPipeline")

    # Create an instance of ApiClientFactory
    api_client_factory = ApiClientFactory(logger)

    # Creating a DataProcessor for 'stock' client and processing data
    processor = DataProcessor("stock", api_client_factory, logger)
    processor.process_data()

    # Creating a DataProcessor for 'crypto' client and processing data
    processor = DataProcessor("crypto", api_client_factory, logger)
    processor.process_data()
This setup allows for seamless integration of new API clients. To add a new client, simply define the client class, update the configuration, and the factory takes care of the rest!
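As a rough sketch of those steps, here is what adding a foreign-exchange client could look like. `ForexApiClient` and `demo_forex_module` are hypothetical names, and the base class and lookup function are trimmed stand-ins so the snippet runs by itself. The throwaway module registered in sys.modules exists only so import_module has something to find in a single-file demo.

```python
import sys
import types
from importlib import import_module


class BaseApiClient:
    """Stand-in for the post's base class, trimmed for this sketch."""

    def get_data(self):
        raise NotImplementedError


# 1. Define the new client (ForexApiClient is a hypothetical example).
class ForexApiClient(BaseApiClient):
    def get_data(self):
        return {"data": "forex_data"}


# Register a throwaway module so import_module can find the class here;
# in a real project the class would simply live in its own .py file.
demo_module = types.ModuleType("demo_forex_module")
demo_module.ForexApiClient = ForexApiClient
sys.modules["demo_forex_module"] = demo_module

# 2. Add one entry to the configuration.
CONFIG = {
    "clients": {
        "forex": {"module": "demo_forex_module", "class": "ForexApiClient"},
    }
}


def get_client(client_type: str) -> BaseApiClient:
    """Resolve and instantiate a client the same way the factory does."""
    settings = CONFIG["clients"].get(client_type)
    if settings is None:
        raise ValueError(f"Invalid client type: {client_type}")
    module = import_module(settings["module"])
    return getattr(module, settings["class"])()


# 3. The lookup and calling code need no changes, only the new key.
result = get_client("forex").get_data()
```

Nothing in `get_client` mentions forex; the new behavior comes entirely from the new class and the new configuration entry.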
Conclusion
The Factory Pattern is a powerful tool for building a flexible, maintainable, and scalable data pipeline.
By decoupling the creation of objects from their usage, it lets you integrate new components easily and adapt quickly to changing requirements.
The MarketPipe project is a great example of how to implement this pattern in a real-world scenario, providing a robust and adaptable architecture for your data processing needs.
Take your project to the next level with the Factory Pattern!