Building a Data Lake and Data Science Environment for a Full-Stack Financial Services Provider
Setting up data pipelines and building a data lake and Data Science environment to take full advantage of fast-growing data


Eko India Financial Services is a full-stack financial services provider to gig economy workers. With 12+ years of experience in payments and money transfer, they have served 17 mn+ customers and partnered with 2 lakh+ MSMEs.

Problem Statement / Opportunity

Eko’s biggest challenge was to consolidate its disparate datasets, build systems and processes that seamlessly manage quick and secure access to fast-growing data, and upgrade to a big data architecture supporting multiple analytics use cases such as customer retention, risk, and fraud analytics.

Oneture's Role

The overall project scope was to set up a robust data pipeline that takes full advantage of the available data, making Eko's operations more efficient and enabling the company to benefit from advanced analytics and data science techniques.

Oneture quickly assembled the team, understood the legacy data architecture, schema, and data quality, and recommended the best way to ingest, store, and clean the data. We created a detailed implementation roadmap to ingest all the data into the data lake and enabled Eko to develop customized real-time BI dashboards.
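The case study does not describe the lake layout in detail; a common convention for S3-based data lakes (and the one Athena and Glue discover partitions from) is Hive-style `dt=` date partitioning. A minimal sketch of such a key builder, with hypothetical zone and table names:

```python
from datetime import date

def lake_key(zone: str, table: str, dt: date, filename: str) -> str:
    """Build a Hive-style partitioned S3 object key for the data lake.

    Example layout: <zone>/<table>/dt=<YYYY-MM-DD>/<filename>
    """
    return f"{zone}/{table}/dt={dt.isoformat()}/{filename}"

# Hypothetical example: a raw-zone transactions file for 1 April 2021.
key = lake_key("raw", "transactions", date(2021, 4, 1), "part-0000.parquet")
print(key)  # raw/transactions/dt=2021-04-01/part-0000.parquet
```

Partitioning by date like this lets query engines such as Athena prune partitions, scanning only the days a query actually needs.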

Despite a strict three-month timeline and the challenges of transforming a 13+ year old data architecture, we built a unified view of diverse data sources (the data lake), established stable data pipelines that led to marked improvements in operational performance across dashboards and reporting, and delivered an environment ready for advanced analytics.

We also designed a hybrid cloud platform for multi-cloud connectivity, connecting to Eko’s data center so that the data lake components could communicate with it. In addition, the Oneture team recommended best practices and requirements for data encryption, access control, and other governance concerns.
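The write-up mentions encryption recommendations (the stack includes AWS KMS) without giving specifics. One standard governance pattern is an S3 bucket policy that denies any upload not requesting SSE-KMS server-side encryption. A sketch of such a policy document, with a hypothetical bucket name:

```python
import json

BUCKET = "eko-datalake-raw"  # hypothetical bucket name

# Bucket policy: deny any PutObject that does not request SSE-KMS encryption.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
            "Condition": {
                # Rejects requests missing the SSE-KMS encryption header.
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```

The policy would be attached via the S3 console or `put_bucket_policy`; pairing it with a KMS customer-managed key keeps key access auditable through IAM and CloudTrail.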

Oneture’s prior experience handling big data projects in the BFSI industry, along with its technical expertise, played an important role in developing a customized solution for Eko’s requirements.

Proposed Solution & Architecture

The entire solution was built under strict budget constraints while ensuring that robustness, performance, and other SLAs were met. To make this happen, we had to re-engineer a few aspects, e.g., rebuilding the near-real-time components to save cost while still meeting SLAs.
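The case study does not say how the near-real-time components were rebuilt; one common cost-saving pattern is replacing an always-on streaming job with micro-batches sized against the latency SLA, flushing on whichever comes first: a size cap or a time budget. A minimal sketch of that idea (all names and thresholds hypothetical):

```python
import time

def micro_batches(events, max_size=500, max_wait_s=30.0, clock=time.monotonic):
    """Group an event stream into micro-batches.

    A batch is flushed when either the size cap is hit or the latency
    budget (the near-real-time SLA) for its oldest event expires.
    """
    batch, deadline = [], None
    for ev in events:
        if deadline is None:
            deadline = clock() + max_wait_s  # SLA clock starts at first event
        batch.append(ev)
        if len(batch) >= max_size or clock() >= deadline:
            yield batch
            batch, deadline = [], None
    if batch:  # flush any trailing partial batch
        yield batch

# With a generous time budget, only the size cap triggers flushes.
batches = list(micro_batches(range(7), max_size=3, max_wait_s=60))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Processing a few hundred events per invocation instead of one event at a time amortizes per-record overhead, which is where the cost saving comes from.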

The above high-level architecture can serve various kinds of analytical use-cases be it batch or real-time, with end-to-end pipelines from ingestion to visualization.

The network diagram above shows an overview of the multi-cloud connectivity established to connect to Eko’s data center and allow the data lake components to communicate with it.

Tools and Technologies Used
Technology Domain       Tools
Amazon Web Services     Amazon S3, Amazon EMR, AWS Glue, Amazon Athena, Amazon EC2, Amazon VPC, AWS KMS, AWS IAM
Big Data Tools          Apache Spark, Apache Hive, Apache Sqoop, Apache NiFi, Apache Airflow

Value Delivered
  • Helped Eko implement a data lake that gives a unified view across its various datasets
  • Established stable data pipelines, resulting in improvements in operational performance across dashboards and reporting
  • Delivered a data platform ready for advanced analytics and data science use cases

Lessons Learned
  • Always take a customer-first approach; it is difficult to grasp a customer's needs unless you stay flexible and ready to adapt to unaccounted-for challenges during project implementation.
  • Set the right expectations and be consistent in your service delivery. Make sure you have a complete understanding of who the client is and what their needs are from day one.
  • Being an AWS Select partner certainly helps when delivering expertise on AWS native services, but in the interest of time to market, cost, and the talent pool available at the client, be open to using the client's existing stack and open-source tools, services, and platforms as needed.
  • Build quality solutions that are also economical; this mindset creates opportunities to utilize capital appropriately.