

Client
A leading North American bank
Goal
Migrate credit risk data and SAS-based analytics models from an on-premises data warehouse to AWS to enhance functionality
Tools and Technologies
AWS Glue, Redshift, DataSync, Athena, CloudWatch, SageMaker; Apache Airflow; Delta Lake; Power BI
Business Challenge
The credit risk unit of a major bank wanted to migrate the SAS-based analytics models it used for financial forecasting and sensitivity analysis to Amazon SageMaker.
The aim was to gain better scalability, easier maintenance for MLOps engineers, and a better developer experience. To support the model migration, the unit also sought to move credit risk data from a Netezza-based on-premises data warehouse to AWS, with a data lake on Amazon S3 and a data warehouse on Redshift.

Solution
- Decoupled data workload processing from relational systems using a phased approach that accounted for historical migration, transformation complexity, data volumes, and the ingestion frequency of incremental loads
- Developed a flexible ETL framework that uses DataSync to extract data from Netezza to AWS as flat files
- Transformed data in S3 layers using Glue ETL and moved it to the Redshift data warehouse
- Enabled Glue integration with Delta Lake for incremental data workloads
- Built ETL workflows using Step Functions and orchestrated concurrent workflow runs using Apache Airflow
- Architected the data migration from Netezza to AWS, leveraging the flexible ETL framework
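
The incremental-load step in the list above can be illustrated with a minimal sketch. It is not the project's code and does not call the Delta Lake or Glue APIs. It only mimics, in plain Python, the upsert behaviour that a Delta Lake merge is typically used for: rows with an existing key are updated when the incoming version is newer, and rows with new keys are inserted. The function name, the column names (account_id, exposure, updated_at), and the sample rows are all hypothetical.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def merge_incremental(
    target: List[Row],
    batch: Iterable[Row],
    key: str = "account_id",      # hypothetical business key
    version: str = "updated_at",  # hypothetical change-tracking column
) -> List[Row]:
    """Upsert `batch` into `target`: newest version per key wins, new keys are inserted."""
    merged = {row[key]: row for row in target}
    for row in batch:
        current = merged.get(row[key])
        # Insert unseen keys; overwrite only if the incoming row is at least as recent.
        if current is None or row[version] >= current[version]:
            merged[row[key]] = row
    return sorted(merged.values(), key=lambda r: r[key])


# Hypothetical sample data: an existing table and one incremental batch.
target = [
    {"account_id": 1, "exposure": 100.0, "updated_at": "2023-01-01"},
    {"account_id": 2, "exposure": 200.0, "updated_at": "2023-01-01"},
]
batch = [
    {"account_id": 2, "exposure": 250.0, "updated_at": "2023-02-01"},  # update
    {"account_id": 3, "exposure": 50.0, "updated_at": "2023-02-01"},   # insert
    {"account_id": 1, "exposure": 999.0, "updated_at": "2022-12-01"},  # stale, ignored
]

result = merge_incremental(target, batch)
for row in result:
    print(row)
```

Comparing ISO-formatted date strings works here because they sort lexicographically in date order; a real pipeline would more likely compare timestamps or sequence numbers.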

Outcomes
- Enhanced financial forecasting and sensitivity analysis operations with analytical models and data migrated to the AWS public cloud
- Expedited time-to-market for the client’s downstream consumption needs, served through Power BI and Amazon SageMaker
