Relational Database Service (RDS)

  • RDS is a Database Server as a Service (DBSaaS). A single DB server can run multiple databases
  • RDS is a managed service. There is no OS or SSH access, except in RDS Custom, which provides some low-level access

RDS Cost Parameters

  • Instance Size & Type
  • Multi AZ or not
  • Storage type & amount
  • Data transferred
  • Backups & Snapshots
  • Licensing (applicable on using commercial DB types)

Multi AZ - Instance deployments

  • Primary instance is configured to replicate data synchronously to a standby replica in another AZ
  • Replication is at the storage level
  • Access to data is provided through a database CNAME which points to the primary instance
  • All reads and writes go through the primary instance
  • On failover (which can be triggered manually for testing) the database CNAME is repointed to the standby instance (failover takes 60-120s)
  • Instance deployments provide 1 standby replica only
  • Multi AZ operates within a single region only
  • Database backups can be taken from the standby to improve performance
  • Failover scenarios: AZ outage, primary failure, manual failover, instance type change and software patching
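
The CNAME behaviour described above can be pictured with a small toy model. This is only a sketch: the class, the instance names and the placeholder CNAME are hypothetical, and it assumes the old primary simply swaps roles with the standby.

```python
from dataclasses import dataclass


@dataclass
class MultiAzInstance:
    """Toy model of a Multi AZ instance deployment (all names are hypothetical)."""
    primary: str
    standby: str
    cname: str = "mydb.example.internal"  # placeholder, not a real endpoint

    def resolve(self) -> str:
        # Clients only know the CNAME; it always resolves to the current primary.
        return self.primary

    def failover(self) -> None:
        # The standby takes over and the CNAME is repointed to it.
        self.primary, self.standby = self.standby, self.primary


db = MultiAzInstance(primary="instance-az-a", standby="instance-az-b")
before = db.resolve()
db.failover()
after = db.resolve()
```

Because clients connect through the CNAME rather than an instance address, they reconnect to the new primary once the failover completes, without a configuration change.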

Multi AZ - Cluster deployments

  • 1 writer instance can replicate to 2 reader instances (different AZs)
  • Runs on much faster hardware: Graviton + local NVME SSD (Fast writes to local storage --> Flushed to EBS)
  • Replication is done via transaction logs which is more efficient
  • Reader instances can be utilized, but only for read transactions
  • Cluster Endpoint: Points at the writer instance. Used for reads, writes & administration
  • Reader Endpoint: Directs reads to an available reader instance
  • Instance Endpoint: Points at a specific instance. Generally used for testing/fault finding
  • Failover is faster: ~35s plus the time to apply transaction logs
  • Writes are "committed" when at least 1 reader instance has confirmed
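
As a rough illustration of the endpoint roles and the commit rule above, the following sketch may help. Both function names are made up for this example, and the logic is a simplification of what the service does internally.

```python
from typing import Sequence


def pick_endpoint(operation: str) -> str:
    """Choose which endpoint type a client would use for an operation."""
    if operation == "read":
        return "reader"   # reads can go to any available reader instance
    if operation in ("write", "admin"):
        return "cluster"  # the cluster endpoint points at the writer
    raise ValueError(f"unknown operation: {operation}")


def is_committed(reader_acks: Sequence[bool]) -> bool:
    """A write counts as committed once at least one reader has confirmed it."""
    return any(reader_acks)
```

For example, `is_committed([True, False])` is true because one of the two readers has acknowledged the write, while `is_committed([False, False])` is not.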

RDS Backups

RDS has 2 types of backups: Automated Backups and Snapshots. Both are stored in S3, but in AWS managed buckets that are not visible in your own S3 console.

Note

There is an I/O pause & performance impact when backups & snapshots are taken in a single AZ deployment. In multi AZ deployments the operation is performed from the standby instance.

Snapshot

  • First Snap is FULL size of consumed data then it is incremental
  • Snapshots don't expire (Even when RDS instance is deleted)

Automated Backups

  • Occur once per day
  • First backup is FULL size then it is incremental
  • Transaction logs are also written to S3 every 5 minutes
  • Backups are automatically cleared by AWS. This can be configured from 0-35 days (0 means disabled, maximum is 35 days)
  • When deleting a database, automated backups can be retained, but they still expire based on the retention period setting.
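
The retention rule above (0 disables automated backups, 35 days is the maximum) can be sketched as a small helper. The function name and the idea of computing an expiry timestamp per backup are illustrative assumptions, not how AWS exposes it.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_RETENTION_DAYS = 35  # maximum retention period stated in these notes


def backup_expiry(taken_at: datetime, retention_days: int) -> Optional[datetime]:
    """Return when an automated backup would expire, or None if backups are disabled."""
    if not 0 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention must be between 0 and 35 days")
    if retention_days == 0:
        return None  # a retention period of 0 means automated backups are disabled
    return taken_at + timedelta(days=retention_days)
```

With a 7 day retention period, a backup taken on 1 January would be cleared on 8 January in this model.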

Cross-Region backups

  • RDS can replicate backups (both snapshots & transaction logs) to another region
  • Charges apply for cross-region data copy & storage used in the destination region
  • NOT DEFAULT: Cross-Region backups must be configured in automated backups

Restores

  • Restoring an automated backup or snapshot creates a new RDS instance with a NEW DNS address
  • Restoring a manual snapshot means restoring to a single point in time (the snapshot creation time)
  • Restoring an automated backup means restoring to any point in time at 5 minute granularity
  • Backups are restored & transaction logs are replayed to bring the DB to the desired point in time (provides a low RPO)
  • Restores aren't fast - long RTO for a large database
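
A point in time restore as described above boils down to "pick the latest backup before the target, then replay the logs up to the target". The sketch below models that with plain timestamps; `plan_restore` and the sample data are hypothetical and only illustrate the selection logic.

```python
from datetime import datetime
from typing import List, Tuple


def plan_restore(
    backups: List[datetime], logs: List[datetime], target: datetime
) -> Tuple[datetime, List[datetime]]:
    """Pick the base backup and the transaction logs to replay for a target time."""
    candidates = [b for b in backups if b <= target]
    if not candidates:
        raise ValueError("no backup exists at or before the target time")
    base = max(candidates)  # most recent backup that is not after the target
    replay = sorted(t for t in logs if base < t <= target)
    return base, replay


# Daily backups plus transaction logs written every 5 minutes (sample data).
backups = [datetime(2024, 1, 1), datetime(2024, 1, 2)]
logs = [datetime(2024, 1, 2, 0, m) for m in (5, 10, 15)]
base, replay = plan_restore(backups, logs, datetime(2024, 1, 2, 0, 12))
```

Here the restore starts from the 2 January backup and replays the 00:05 and 00:10 logs, so the restored state is at most 5 minutes behind the requested time.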

Read Replica

  • Read only replicas of an RDS instance.
  • Read Replicas are not part of the main database. They have a separate endpoint address
  • Kept in sync using asynchronous replication so reads are eventually consistent
  • Replicas can exist in same region as primary or in other region (Cross region read replicas)
  • Can create up to 5 Read Replicas per database instance
  • Read Replicas can have their own Read Replicas - but lag starts to be a problem
  • Read Replicas provide global performance improvements for reads - great for global availability improvement & resilience
  • Read Replicas offer near 0 RPO
  • Replicas can be promoted quickly to their own DB - low RTO (Replicas are read only until promoted)
  • Should be used to recover from failures only, not from data corruption, because corrupted data is replicated to the Read Replica as well.

Network Cost - Read Replica

In AWS there is normally a network cost when data moves from one AZ to another. For RDS Read Replicas within the same region there is no fee, since replication is part of the managed service. Cross region replicas, however, incur a replication fee.

RDS Security

  • SSL/TLS in transit is available for RDS (can be made mandatory)
  • Encryption at rest (EBS volume) is supported through KMS
  • AWS or Customer Managed Key (CMK) generates data keys
  • Data keys are used for encryption operations
  • Storage, Logs, Snapshots & Replicas are encrypted
  • Encryption cannot be removed once added
  • RDS MSSQL & RDS Oracle support TDE (Transparent Data Encryption) --> Encryption is handled within the DB engine
  • RDS Oracle supports TDE using CloudHSM (much stronger key controls)

IAM Authentication

  • RDS can be configured to use IAM User authentication against a database
  • Start with an RDS instance & create a local database user account configured to allow authentication using an AWS authentication token
  • IAM Users & Roles (e.g. an EC2 instance role) have policies attached that map the IAM identity onto the local database user
  • Based on the policies a token with a 15 minute validity is generated. This token can be used to log in as the database user within RDS without requiring a password

Warning

This is only authentication & not authorization. Permissions over the RDS database are still controlled by the permissions on the local database user.
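
A minimal sketch of requesting such a token, assuming the client passed in is a boto3 RDS client (boto3 exposes `generate_db_auth_token` on it). The hostname, port and user name are placeholders, and `FakeRdsClient` is a hypothetical stand-in so the example runs without AWS credentials.

```python
def build_auth_token(rds_client, host: str, port: int, db_user: str) -> str:
    """Request a short-lived IAM authentication token for a local database user.

    `rds_client` is assumed to be a boto3 RDS client, e.g. boto3.client("rds").
    """
    return rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=db_user
    )


class FakeRdsClient:
    """Hypothetical stand-in for the boto3 client so the sketch runs offline."""

    def generate_db_auth_token(self, DBHostname, Port, DBUsername, Region=None):
        return f"token-for-{DBUsername}@{DBHostname}:{Port}"


token = build_auth_token(FakeRdsClient(), "mydb.example.com", 3306, "app_user")
```

The returned token is then supplied where the password would normally go when connecting. What the user can do once connected is still decided by the grants on the local database user.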

RDS Custom

  • Provides access to the operating system & database engine underlying RDS
  • Can connect to the database host using SSH, RDP, Session Manager
  • Supported for MSSQL & Oracle
  • EC2 instances, EBS volumes and S3 buckets are visible in the AWS account for RDS Custom

Amazon Aurora

Architecture

  • Proprietary database from AWS
  • Aurora uses a cluster (A single primary instance + 0 or more replicas)
  • Aurora does not use local storage. It uses shared cluster volume that is available to all compute instances within the cluster
  • Faster provisioning, improved availability & performance
  • Secondary replicas in Aurora can serve as failover targets if the primary instance fails, and they can also be used for read operations during normal functioning of the cluster (secondary instances are read replicas)
  • Storage grows automatically in increments of 10GB up to 128 TB (storage is billed on what's used)
  • Data written to the primary instance is synchronously replicated across all 6 storage nodes across AZs
  • Replication is at the storage level i.e. no resources consumed on the instances/replicas
  • Aurora can have 15 replicas (any can be failover targets) while MySQL has 5, and replication is faster with sub 10 ms lag
  • Failover in Aurora is instantaneous. It's HA by default.
  • Replicas can be added & removed without requiring storage provisioning

Cost

  • No free tier option
  • Aurora doesn't support micro instances
  • Beyond a single AZ RDS micro instance, Aurora offers much better value
  • Compute is charged hourly, billing is per second with a 10 minute minimum
  • Storage is billed GB/month consumed, high watermark (maximum storage consumed in a month) and IO cost per request
  • Backups of up to 100% of the DB size are included
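
The per second billing with a 10 minute minimum can be worked through with a short calculation. This is a sketch only: the hourly rate is a made-up parameter, not an actual AWS price.

```python
MINIMUM_BILLED_SECONDS = 10 * 60  # the 10 minute minimum from the notes above


def billed_seconds(runtime_seconds: int) -> int:
    """Seconds charged for a run: actual runtime, but never less than the minimum."""
    return max(runtime_seconds, MINIMUM_BILLED_SECONDS)


def compute_cost(runtime_seconds: int, hourly_rate: float) -> float:
    """Compute charge using a hypothetical hourly rate, prorated per second."""
    return billed_seconds(runtime_seconds) * hourly_rate / 3600
```

An instance that runs for 2 minutes is billed for 600 seconds, while one that runs for 30 minutes at a rate of 1.0 per hour costs 0.5.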

Restore, Clone & Backtrack

  • Backups in Aurora work in the same way as RDS
  • Restores create a new cluster
  • Backtrack (enabled at the cluster level) allows in-place rewinds to a previous point in time
  • Fast clones create a new database much faster than copying all the data (copy on write)
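
Copy on write, the idea behind fast clones, can be illustrated with a toy volume where the clone shares the source's pages and only stores pages that change afterwards. The `Volume` class is a hypothetical model and not Aurora's storage format; merging the dictionaries copies references only, standing in for shared storage.

```python
class Volume:
    """Toy copy-on-write volume: a shared read-only base plus private changes."""

    def __init__(self, base=None):
        self._base = base if base is not None else {}  # shared, never mutated
        self._overlay = {}                             # pages changed by this volume

    def read(self, page_id):
        if page_id in self._overlay:
            return self._overlay[page_id]
        return self._base[page_id]

    def write(self, page_id, data):
        # Only the changed page is stored privately; the shared base is untouched.
        self._overlay[page_id] = data

    def clone(self):
        # Freeze the current view; source and clone both reference it from now on.
        frozen = {**self._base, **self._overlay}
        self._base, self._overlay = frozen, {}
        return Volume(base=frozen)

    def private_pages(self):
        return len(self._overlay)


source = Volume(base={1: "a", 2: "b"})
clone = source.clone()
clone.write(2, "B")   # diverges only in the clone
source.write(1, "A")  # diverges only in the source
```

After these writes the clone holds a single private page, which is why a clone is quick to create and initially consumes almost no extra storage.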

Aurora Serverless

  • Supports scalable ACUs (Aurora Capacity Units)
  • Aurora Serverless cluster has a min & max ACU
  • Cluster adjusts based on load and can even go to 0 & be paused
  • Consumption billing per-second basis
  • Same resilience as Aurora (6 copies across AZs)
  • ACUs are allocated from a shared pool managed by AWS
  • A connection initiated by a user to Aurora Serverless goes through the Aurora proxy fleet (the proxy fleet brokers the connection between client & ACU)
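
The min/max ACU behaviour can be sketched as a simple clamp. The function and its `active` flag are assumptions made for illustration; the real scaling decisions are made by AWS and are more involved.

```python
def target_acu(required_acu: float, min_acu: float, max_acu: float,
               active: bool = True) -> float:
    """Capacity the cluster would settle on for a given load in this toy model."""
    if not active:
        return 0.0  # no activity: the cluster can go to 0 and be paused
    # Otherwise capacity follows the load but stays within the configured range.
    return max(min_acu, min(required_acu, max_acu))
```

With a range of 1 to 8 ACUs, a load needing 20 ACUs is capped at 8 and a load needing 0.5 is raised to 1.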

Use cases for Aurora Serverless:

- Infrequently used applications
- New applications (unsure of the load & size of database instances)
- Variable workloads
- Unpredictable workloads
- Development & test databases (DB can be paused when not in use)
- Multi-tenant applications (where scaling is proportional to customer revenue)

Aurora Global

  • Replication can take ~1s (1 way) and happens at the storage layer
  • Replication has no impact on DB performance
  • Secondary regions can have 16 replicas
  • Promoting another region (for disaster recovery) has an RTO (Recovery Time Objective) of < 1 minute
  • Currently MAX 5 secondary regions are supported

Use case:

  • Cross-Region DR & Business Continuity
  • Global read scaling - low latency performance improvements for read

Multi-Master Write

Default Aurora

  • Default Aurora mode is Single-Master (one read/write instance + 0 or more read only replicas)
  • Cluster endpoint is used for read/write and reader endpoint is used for load balancing reads
  • Failover takes time - a replica has to be promoted to read/write

Multi-Master

  • All instances are capable of read/write by default
  • No load balanced cluster endpoint. The application can connect to one or more of the instances directly
  • Use case: Fault Tolerance
  • Failover events can happen inside the application itself
  • No disruption of traffic between application & DB in case of failover

RDS Proxy

  • Fully managed proxy for RDS - serverless, autoscaling, highly available (Multi-AZ) by default
  • Allows apps to pool & share DB connections established with the DB (Proxy maintains a long term connection pool)
  • Improve database efficiency by reducing the stress on DB resources (CPU/RAM). Also minimizes open connections & timeouts
  • Reduces RDS & Aurora failover time by up to 66%
  • Accessed via Proxy endpoint - no app changes in most cases
  • Proxy can enforce SSL/TLS
  • Can enforce IAM authentication for the DB & securely store credentials in AWS Secrets Manager
  • RDS Proxy is never publicly accessible (Must be accessed from a VPC)
  • Lambda can use RDS Proxy
  • Abstracts failure away from your applications
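
The connection pooling idea behind RDS Proxy can be shown with a toy pool that reuses database connections across many short-lived client requests. `ProxyPool` is hypothetical; in particular, raising an error when the pool is exhausted is a simplification chosen for this sketch.

```python
class ProxyPool:
    """Toy connection pool: many client requests share a few DB connections."""

    def __init__(self, max_db_connections: int):
        self.max_db_connections = max_db_connections
        self.opened = 0   # connections actually opened against the database
        self._idle = []   # open connections waiting to be reused

    def acquire(self) -> str:
        if self._idle:
            return self._idle.pop()  # reuse instead of opening a new connection
        if self.opened >= self.max_db_connections:
            raise RuntimeError("pool exhausted")
        self.opened += 1
        return f"db-conn-{self.opened}"

    def release(self, conn: str) -> None:
        self._idle.append(conn)


pool = ProxyPool(max_db_connections=2)
for _ in range(100):  # e.g. 100 short-lived Lambda invocations
    conn = pool.acquire()
    pool.release(conn)
```

All 100 requests are served over a single database connection here, which is the effect that relieves CPU/RAM pressure on the DB and avoids too many connection errors.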

When to use RDS proxy:

- Application connection failing with too many connection errors
- DB instances using T2/T3 (small/burst) instances
- AWS Lambda - saves connection setup time per invocation, allows connection reuse & IAM auth
- Long running connections (SaaS apps) - low latency
- Where resilience to database failure is a priority
- RDS proxy can reduce time for failover and make it transparent to the application

Database Migration Service (DMS)

  • A managed database migration service
  • Runs a replication instance
  • Source and destination endpoints point at source and target DBs
  • At least one of the endpoints must be running on AWS

Job Types:

  1. Full Load: One off migration of all data
  2. Full Load + CDC: Migrates existing data & replicates any ongoing changes
  3. CDC Only: Replicate only data changes
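
The three job types differ only in which phases they run, which the following sketch spells out. The enum and the phase descriptions are illustrative; the string values follow the migration type names used by the DMS API as far as I am aware, so treat them as an assumption.

```python
from enum import Enum
from typing import List


class JobType(Enum):
    FULL_LOAD = "full-load"
    FULL_LOAD_AND_CDC = "full-load-and-cdc"
    CDC_ONLY = "cdc"


def phases(job: JobType) -> List[str]:
    """List the phases a migration job of the given type goes through."""
    steps = []
    if job in (JobType.FULL_LOAD, JobType.FULL_LOAD_AND_CDC):
        steps.append("migrate existing data")       # one-off bulk copy
    if job in (JobType.FULL_LOAD_AND_CDC, JobType.CDC_ONLY):
        steps.append("replicate ongoing changes")   # change data capture
    return steps
```

Full Load + CDC is the only type that runs both phases, which is what allows a migration with minimal downtime.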

Schema Conversion Tool (SCT)

  • DMS does not support schema conversion, but the Schema Conversion Tool (SCT) can assist with it
  • Converts from one database engine to another
  • SCT is not used for moving databases between compatible database engines
  • Works with OLTP DBs (MySQL, MSSQL, Oracle) and OLAP DBs (Teradata, Oracle, Vertica, Greenplum)

Info

DMS can utilize Snowball for large scale DB migrations with SCT.

  1. Use SCT to extract data locally & move it to a Snowball device
  2. Ship the device back to AWS. They load the data onto an S3 bucket
  3. DMS migrates from S3 into the target store
  4. Change Data Capture (CDC) can capture changes & via S3 intermediary they are also written to the target database