Azure SQL Server (Managed Instance) to AWS PostgreSQL(RDS) Journey
We were given a humongous task to enable a Best Global Platform as a Product making High business impact and used at least in 10 countries with key features such as transactional (OLTP) & analytics (DataMart) both enabled at the same time also meeting a goal of “Differentiate through data” by integrating with an enterprise data warehouse. The biggest challenge tasked was to refactor an existing legacy POC product that was used by very few users in a specific country.
Like any other company were supposed to enable MVP and A/B test first before operationalizing and qualifying with internal 15 Quality Gates. Keeping aside Experience (Web/Mobile), Micro Services, and a few other backend components, our main challenge was refactoring the existing legacy database SQL server (in Azure) built with rigidity and unscalable data model
The team took the challenge of making all integrations possible and correcting data structures also moving to Postgresql (RDS) in AWS .
As a first step, we approached MVP with migrating SQL Server as is to AWS SQL Server RDS & laid down a migration plan towards AWS PostgreSQL RDS
The Situation with SQL Server (AWS RDS) in MVP:
A. As the platform is expected to grow and expand to 10 countries with ~200,000 users scalability, performance, and maintenance were a concern
B. Cost: SQL server on RDS does include expensive license cost with single AZ, Multi-AZ was even higher.
C. Complex maintenance, especially for Transaction logs and TempDB. Since we don’t have control over the physical files, there is a lot of dependency on AWS for any unforeseen issues.
D. Many times encountered Performance issues due to failed stored procedures to use optimized execution plans especially for tables that participated in batch jobs for data processing.
Plan & Action:
Established a phased approach based on the complexity of Services. Logically grouped data in each phase with a total of 3 phases.
Migrated Micro Data services carefully with each release 7 days incubation to invoke Postgresql (primary) and other to SQL server (for any blackout).
Post assurance of zero negative results enabled consumption of Postgresql respective services and stop consuming SQL server
This step was important to avoid any data discrepancies as it was OLTP, connected to DataMart and Enterprise data warehouse.
Move existing operations(DDL/DML/DCL/DQL) on SQL Server to Postgresql.
Scale up the existing Postgresql server (from t3.medium to m5.large) enabled Multi-AZ
Usage of AWS Schema conversion tool to convert the Data storage objects( tables, indexes, etc) & Data code objects( Stored Procedures, Views, Functions, etc). Choosing between auto vs manual conversion based on the type of objects eg: for data code objects we approached 70% manual conversion
Usage of AWS DMS for Data migration (Full load + ongoing replication till data is live on Postgresql)
Usage of Flyway version control and ADO for deploying Postgresql objects
Keep warm SQL server for at least a month before decommissioning
Challenges and solutions:
Schema conversion (Storage + Code objects) — Huge manual efforts and less accurate tools for AWS RDS: First-time usage of AWS Schema conversion tool, a bit of a learning curve. 90% of Storage objects were converted by tool and the remaining 10 % converted manually following tool's suggestion. 30% of Code objects converted by tool, 40% of objects manually updated with minimal efforts based on tools suggestion & 30% planned for manual efforts.
AWS Schema conversion tool challenges — SQL Columns of Type “Identity” converted to Postgres Type “GENERATED ALWAYS AS IDENTITY”. Replication is not possible for these Tables. SQL Columns of Data Type nvarchar(max) are converted to VARCHAR(1) at PostgreSQL Side when using AWS SCT Tool: 1) After Schema Conversion, convert ALWAYS to DEFAULT Type Identity. 2) Identity All The Tables(Columns) which are having data type as nvarchar(max) at SQL Server and convert them to text type
Data migration with minimum downtime: Usage of AWS DMS with Full Load + ongoing replication for migration. Continue till Production is up on PostgreSQL and stop replication
AWS DMS Challenges — Replication fails if CDC Not Enabled at Source SQL Server: Enabling Change data capture at Database Level & Table Level in SQL Server
1) ~$3000 monthly saves on RDS instance (both Prod & Non-Prod combined)
2) Moving towards Line of the business goal of using Postgresql as Datastore align with technology standards
3) Provide flexibility to use JSONs considering a lot of operational data in place.
4) Reduce overhead of maintaining two different databases engines and using different version control