PatternAI: WhatsApp Node
The WhatsApp Integration module extends the PatternAI Agent framework to support seamless, multi-modal conversational interactions through WhatsApp. It uses FastAPI to handle incoming webhook events from the Meta WhatsApp Business API, integrating tightly with the agent’s LLM, STT, and TTS services.
System Architecture
The module is structured for cloud-native deployment and includes:
Development Environment
- Source Code: A FastAPI-based server to process WhatsApp messages via
/webhook
. - Dockerfile: Defines the container definition for packaging the application.
- Configuration: Environment variables and Meta API credentials for secure operations.
Deployment Workflow
- Docker Build – Container image is created from the source.
- Push to Registry – Stores the image in Google Container Registry.
- Deploy – Creates or updates the Cloud Run service.
Google Cloud Platform Infrastructure
-
Container Registry:
Stores Docker images at gcr. Provides versioned image storage for rollbacks. Integrates with Cloud Run for automatic deployments. -
Cloud Run:
Container runs the WhatsApp webhook service. Exposes/webhook
endpoint for Meta events. Environment variables store Meta tokens and configuration. -
Cloud Logging:
Centralized logging for observability.
Message Processing Flow
Incoming WhatsApp messages go through a structured pipeline:
Components:
demo_whatsapp.py
: Entry point for handling FastAPI webhook logic.whatsapp_base.py
: Base class that contains AI integration logic (LLM, STT, TTS).
Sequence of Operations
Step | Operation |
---|---|
1 | User sends a message (text or audio) to the registered WhatsApp Business number. |
2 | Meta sends a webhook event with the message data via an HTTP POST request to /webhook on our FastAPI server. |
3 | Webhook Handler (handle_incoming_message ) in WhatsApp class (from demo_whatsapp.py , inheriting WhatsAppBase ) receives and logs the incoming data, ignoring non-user status updates. |
4 | Parse and Validate message using parse_webhook_event() from WhatsAppBase . |
5 | Deduplication check against Redis using is_duplicate_message(message_id) . If duplicate: {"status": "duplicate", "message_id": message_id} . If new: mark as processed. |
6 | Message Type Detection to route message to either process_text_message (for text) or process_audio_message (for voice). |
Message Handling
Text Message Flow
_call_llm()
processes the text via LLM._extract_message()
retrieves the LLM’s reply._call_tts()
synthesizes the reply to audio.send_audio_message()
delivers audio back to the user.
Audio Message Flow
download_media()
retrieves the audio file._call_stt()
transcribes the audio._call_llm()
generates a response._call_tts()
converts it to audio.send_audio_message()
returns the audio reply.
API Endpoint
Endpoint | Method | Description |
---|---|---|
/webhook | POST | Receives and processes WhatsApp events |
On failure, a fallback message is sent:
Sorry, we couldn't process your request. Please try again later
.
This ensures the system degrades gracefully under failure conditions.
Logging & Observability
- Centralized logging via Cloud Logging
Tracks: - Incoming webhook events
- AI processing
- Outgoing replies
- Deduplication logic
- Processing errors