The client is a diversified and well-integrated financial services enterprise providing full range of top quality Trading Services, Asset Management, Wealth Management, Investment Banking, Investment, Advisory Services, Proprietary Trading, Life Insurance and Brokerage to a diverse range of institutional investors.
The client was using an Excel based reporting system as visualization tool for their business analytics purpose. The system had several limitations, including:
The client’s reporting system relied on IT teams to manually Extract, Transform and Load (ETL) data from various sources into a SQL Server database. The data was then queried and processed to generate Excel based reports. This process was time consuming and prone to errors, leading to delays in data availability for decision making.
Client's existing ETL pipeline was as shown below:
The client recognized that their reporting system needed to be modernized to enable faster, more efficient data processing and analysis. Oneture helped client in assessment of existing system and proposing recommendation on implementation plan. Post assessment of existing ETL pipeline, Oneture discovered that the problem goes beyond replacing Excel based reporting with BI dashboards. The pipeline also has manual intervention in pulling data from data sources, multiple points of failure, limited automation and no capabilities of auto restart or self healing in event of failure.
Oneture proposed a modernization strategy that involved automating and optimizing current ETL process, developing BI dashboard and migrating the client’s onprem MIS reporting system to cloud based Data Lakehouse with a real-time interactive dashboard. The Data Lakehouse approach will enable the client to store all their data in one place, providing a single source of truth for all data queries. It will also provide scalability and flexibility to handle the client’s future data needs.
To maintain business continuity during modernization journey and prioritizing business outcome, the proposed modernization strategy involved the following steps:
The modernization strategy will also leverage Databricks and AWS cloud services to enable scalability and reduce infrastructure costs.
Step1: Optimizing current ETL process
This involved multiple tasks including
This process includes resolving performance issues, optimizing data transformations, improving data quality checks, and ensuring scalability and fault-tolerance. Oneture team followed standard data engineering principles, avoided redundant storage, and identified potential issues to make the ETL process more efficient. Best practices for data modeling, storage, and processing were used to handle large volumes of data. We helped client by migrating failed ADO.NET based Exe loader jobs to standard and reliable tool using SSIS.
After consulting with business stakeholders, our developers identified and corrected existing data quality issues. The data quality issues were analyzed using data profiling tools. The root causes of the data quality issues were identified, and data quality controls were implemented to prevent data quality issues from recurring. Moreover, the data quality controls should be scalable and able to handle large data volumes.
This involved using RPA tools to automate data retrieval and updating processes, ensuring that the data in excel sheets is up-to-date and accurate leveraging multiple techniques including scraping from websites using selenium, API integration with data source provider, JDBC based database intergration etc. RPA bots also helped in fetching data from the DB using SQL Query and populate the excel sheet with the latest data to avoid manual intervention.
This involves working with SQL and database management systems to optimize queries and improve data retrieval performance. We also need to understand the client's data model and business requirements to design effective and efficient queries. Oneture team worked with client to make complex queries joining multiple tables to create final View queries that give a unified view of data, which has significant business value with one click using power query feature of excel to tackle business problem for short term by providing required reporting.
Step2: Design and develop a real-time interactive dashboard to enable data visualization and analysis
Oneture is working with client in providing an alternative approach of current Excel based visualization to support near real-time updates. Client also requires role and access management with AD connector support. Oneture team divided the work with following tasks:
Step3: Migrate the existing data to a cloud-based Data Lakehouse
As part of long term plan, Oneture team will be working with client which involves moving the existing data warehouse and reporting components to the cloud and adopting a lakehouse architecture, which combines the benefits of a data lake and a data warehouse. This migration is aimed at achieving benefits such as better scalability, cost-efficiency, improved data access, and more flexibility in data processing.
Future Plans
The modernization initiative is an ongoing process, and our company continues to work with the client to identify areas for improvement and optimization. Our team is currently working on optimizing the ETL process and creating realtime BI dashboard to ensure smooth data flow processing and visualization. Later we will be moving on migrating to data lakehouse architecture. We are also exploring the use of machine learning algorithms to enable predictive analytics and automate data processing.