We built a solution that lets traders set alerts on any stock, bond, commodity, F&O contract, or currency across exchanges, based on user-defined conditions expressed with simple, easy-to-use operators.
Traders are then notified via email, SMS, and/or mobile push notification whenever a feed value meets the alert criteria. The client has around 200,000 to 250,000 daily active users out of a 700,000+ total user base, with at least 5,000-10,000 new accounts created per month.
If we assume each user creates 5 custom alerts on average, that leads to ~3,000,000 user-defined trigger (alert) conditions residing in the database. We have to scan all the trigger conditions each time we receive a real-time stock market feed. So, for each feed across 25,000 stock symbols (call it 'n'), we need to traverse all 3 million triggers (call it 'm'); that is, we need to process n*m comparisons per unit time.
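One way to avoid the full n*m scan is to index triggers by symbol, so each incoming tick only evaluates the triggers registered against that one symbol. A minimal sketch of the idea, not the actual implementation; class and field names here are hypothetical:

```python
from collections import defaultdict
import operator

# Assumed set of simple comparison operators for user-defined conditions.
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}

class TriggerIndex:
    """Index triggers by symbol so a tick touches only its own triggers."""

    def __init__(self):
        # symbol -> list of (alert_id, compare_fn, threshold)
        self._by_symbol = defaultdict(list)

    def add(self, alert_id, symbol, op, threshold):
        self._by_symbol[symbol].append((alert_id, OPS[op], threshold))

    def match(self, symbol, price):
        """Return ids of alerts whose condition this tick satisfies."""
        return [aid for aid, fn, th in self._by_symbol[symbol] if fn(price, th)]

# With 3 million triggers spread over 25,000 symbols, each tick now scans
# ~120 triggers on average instead of all 3 million.
idx = TriggerIndex()
idx.add(1, "INFY", ">", 1500.0)
idx.add(2, "INFY", "<", 1400.0)
idx.add(3, "TCS", ">=", 3200.0)
print(idx.match("INFY", 1525.0))  # → [1]
```

The same layout maps naturally onto a Redis hash or sorted set keyed by symbol, which is relevant to the storage approaches discussed later.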
For the PoC stage, we divided the solution into 3 separate modules:
- UI Agent : An independent module that stores user-requested alert data in persistent storage. We are using a Node.js/Express-based web app that stores the data in a PostgreSQL database.
- Processing Engine : An independent module that takes feed input (via a TCP port), compares it in memory against the stored alert data (from the database), and, when a trigger condition is matched, stores the result in a staging area, say a messaging queue.
- Notification Worker : An independent worker that reads from the MQ and calls the notification engine's APIs to send the data to the end user via the channels the user has chosen.
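The handoff between the Processing Engine and the Notification Worker can be sketched with an in-memory queue standing in for the real MQ (the message shape and function names below are illustrative assumptions, not the production interface):

```python
import queue

mq = queue.Queue()  # stand-in for the actual messaging queue

def processing_engine(tick, triggers, mq):
    """Compare one feed tick against stored triggers; enqueue matches."""
    symbol, price = tick
    for alert_id, trig_symbol, threshold in triggers:
        # Simplified: every trigger here is a "price above threshold" alert.
        if trig_symbol == symbol and price > threshold:
            mq.put({"alert_id": alert_id, "symbol": symbol, "price": price})

def notification_worker(mq, send):
    """Drain the queue and hand each matched alert to a delivery channel."""
    while not mq.empty():
        send(mq.get())

triggers = [(1, "INFY", 1500.0), (2, "TCS", 3300.0)]
sent = []
processing_engine(("INFY", 1525.0), triggers, mq)
notification_worker(mq, sent.append)
print(sent)  # → [{'alert_id': 1, 'symbol': 'INFY', 'price': 1525.0}]
```

Keeping the queue between the two modules means a slow email/SMS provider back-pressures the worker, not the latency-sensitive feed-processing path.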
We tried multiple approaches for the processing engine:
- Store data directly in a Redis cluster
- Store data in persistent storage + a Redis cluster
- Store data in, and query directly from, persistent storage
- Switch from multithreading to multiprocessing with horizontal scaling, sharding the processes by business logic
- Explore Apache Flink (stateful computation over data streams)
Currently, we can achieve an under-15-second turnaround at a scale of 3 million stored triggers and 25,000 symbols, without using Apache Flink or horizontal scaling. We aim to improve performance further, toward a system capable of handling 100 million stored triggers.