On-Demand Backup and Restore in DynamoDB
Welcome to CloudAffaire, and this is Debjeet.
In the last blog post, we created a table in the AWS DynamoDB console.
In this blog post we are going to discuss On-Demand Backup and Restore in DynamoDB. We will also create a backup of a DynamoDB table and then restore it.
On-Demand Backup and Restore in DynamoDB
Amazon DynamoDB provides on-demand backup capability. It allows you to create full backups of your tables for long-term retention and archival for regulatory compliance needs. You can back up and restore your DynamoDB table data anytime with a single click in the AWS Management Console or with a single API call. Backup and restore actions execute with zero impact on table performance or availability.
When you create an on-demand backup, a time marker of the request is cataloged. The backup is created asynchronously by applying all changes until the time of the request to the last full table snapshot. Backup requests are processed instantaneously and become available for restore within minutes.
Note: Each time you create an on-demand backup, the entire table data is backed up. There is no limit to the number of on-demand backups that can be taken.
All backups in DynamoDB work without consuming any provisioned throughput on the table. DynamoDB backups do not guarantee causal consistency across items; however, the skew between updates in a backup is usually much less than a second.
Included in the backups:
- Global secondary indexes (GSIs)
- Local secondary indexes (LSIs)
- Provisioned read and write capacity
Excluded from backups; these must be set up manually after restoration:
- Auto scaling policies
- AWS Identity and Access Management (IAM) policies
- Amazon CloudWatch metrics and alarms
- Stream settings
- Time To Live (TTL) settings
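As a rough illustration of setting up two of the excluded items again after a restore, the sketch below wraps the `update-time-to-live` and `update-table` AWS CLI commands in a shell function. It is written for a POSIX shell such as bash rather than the Windows command prompt used for the commands in this post. The function name `reapply_table_settings`, the TTL attribute name passed to it, and the `NEW_AND_OLD_IMAGES` stream view type are assumptions chosen for the example; substitute whatever your source table used.

```shell
# Hypothetical helper: re-enable TTL and Streams on a restored table.
# $1 = table name, $2 = TTL attribute name (assumed; use your own attribute).
reapply_table_settings() {
  table_name="$1"
  ttl_attribute="$2"

  # Turn TTL back on for the given attribute.
  aws dynamodb update-time-to-live \
    --table-name "$table_name" \
    --time-to-live-specification "Enabled=true,AttributeName=$ttl_attribute" || return 1

  # Turn the stream back on; the view type here is only an example value.
  aws dynamodb update-table \
    --table-name "$table_name" \
    --stream-specification "StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES" || return 1
}
```

A call might look like `reapply_table_settings CloudAffaire expires_at`, where `expires_at` is a made-up attribute name. Auto scaling policies, IAM policies and CloudWatch alarms would need their own commands and are not covered here.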
A table is restored without consuming any provisioned throughput on the table. The destination table is set with the same provisioned read capacity units and write capacity units as the source table, as recorded at the time the backup was requested. The restore process also restores the local secondary indexes and the global secondary indexes. You can only restore the entire table data to a new table from a backup. You can write to the restored table only after it becomes active.
Note: You can’t overwrite an existing table during a restore operation.
The time it takes you to restore a table will vary based on multiple factors, and the restore times are not always correlated directly to the size of the table. For example, because of parallelization, it is possible that restoring a 300 GB table could take the same amount of time as restoring a 3 GB table.
Here are some of the considerations for restore times:
- You restore backups to a new table. It can take up to 20 minutes (even if the table is empty) to perform all the actions to create the new table and initiate the restore process.
- For tables with even data distribution across your primary keys, the restore time is proportional to the largest single partition by item count and not the overall table size. For the largest partitions with billions of items, a restore could take less than 10 hours.
- If your source table contains data with significant skew, the time to restore may increase. For example, if your table’s primary key is using the month of the year for partitioning and all your data is from the month of December, you have skewed data.
Next, we are going to create a table, insert some data and then take a backup of the table.
Step 1: Create a table
aws dynamodb create-table ^
  --table-name CloudAffaire ^
  --attribute-definitions AttributeName=company,AttributeType=S AttributeName=id,AttributeType=N ^
  --key-schema AttributeName=company,KeyType=HASH AttributeName=id,KeyType=RANGE ^
  --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 ^
  --output table
Note: We have not specified a local DynamoDB endpoint, so the table will be created in AWS.
Step 2: Insert some data in the CloudAffaire table
aws dynamodb batch-write-item ^
  --request-items file://employee-short.json
Note: You can download the employee-short.json file from GitHub.
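The contents of employee-short.json are not shown in this post, so here is a minimal sketch of what a request file for `batch-write-item` could look like for this table. It is written for a POSIX shell such as bash. Only the `company` (string) and `id` (number) keys come from the table definition in Step 1; the function name `write_sample_items`, the `employee_name` attribute and all values are invented for illustration and may not match the real file.

```shell
# Hypothetical helper: write a small request file for batch-write-item.
# Only "company" (S) and "id" (N) come from the table's key schema;
# "employee_name" and every value below are made up for this example.
write_sample_items() {
  output_file="$1"
  cat > "$output_file" <<'EOF'
{
  "CloudAffaire": [
    {
      "PutRequest": {
        "Item": {
          "company": {"S": "CloudAffaire"},
          "id": {"N": "1"},
          "employee_name": {"S": "Example Employee One"}
        }
      }
    },
    {
      "PutRequest": {
        "Item": {
          "company": {"S": "CloudAffaire"},
          "id": {"N": "2"},
          "employee_name": {"S": "Example Employee Two"}
        }
      }
    }
  ]
}
EOF
}
```

Running `write_sample_items sample-items.json` produces a file that can be passed as `--request-items file://sample-items.json`. Note that number attributes are written as quoted strings in this JSON format, and a single batch-write-item call accepts at most 25 put or delete requests.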
Step 3: Create an On-Demand backup of CloudAffaire table
aws dynamodb create-backup ^
  --table-name CloudAffaire ^
  --backup-name CloudAffaireBackup
Note down the backup ARN from the output; it is needed in the following steps.
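Instead of copying the ARN by hand, it can also be captured in a variable. The sketch below assumes a POSIX shell such as bash and uses the CLI's `--query` option to pull `BackupDetails.BackupArn` out of the create-backup response; the function name `create_backup_and_get_arn` is a hypothetical helper, not part of the AWS CLI.

```shell
# Hypothetical helper: create a backup and print only its ARN.
# $1 = table name, $2 = backup name.
create_backup_and_get_arn() {
  table_name="$1"
  backup_name="$2"
  # --query selects the ARN field; --output text prints it without JSON quoting.
  aws dynamodb create-backup \
    --table-name "$table_name" \
    --backup-name "$backup_name" \
    --query "BackupDetails.BackupArn" \
    --output text
}
```

A possible use is `BACKUP_ARN=$(create_backup_and_get_arn CloudAffaire CloudAffaireBackup)`, after which `"$BACKUP_ARN"` can be passed to the `--backup-arn` option of later commands.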
Step 4: Get the backup details using the describe-backup AWS CLI command
aws dynamodb describe-backup ^
  --backup-arn arn:aws:dynamodb:ap-south-1:###########:table/CloudAffaire/backup/01546250917568-e8b9f684
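Because the backup is created asynchronously, its status may still be CREATING right after the request. A minimal polling sketch, assuming a POSIX shell such as bash, is shown here: it calls describe-backup repeatedly until `BackupStatus` reports AVAILABLE. The function name `wait_for_backup` and the default interval and attempt count are assumptions for the example, not recommended values.

```shell
# Hypothetical helper: poll describe-backup until the backup is AVAILABLE.
# $1 = backup ARN, $2 = seconds between polls (default 10),
# $3 = maximum number of polls (default 30).
wait_for_backup() {
  backup_arn="$1"
  interval="${2:-10}"
  max_attempts="${3:-30}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    backup_status=$(aws dynamodb describe-backup \
      --backup-arn "$backup_arn" \
      --query "BackupDescription.BackupDetails.BackupStatus" \
      --output text)
    if [ "$backup_status" = "AVAILABLE" ]; then
      echo "Backup is AVAILABLE"
      return 0
    fi
    echo "Backup status: $backup_status (attempt $attempt of $max_attempts)"
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "Timed out waiting for the backup" >&2
  return 1
}
```

The function returns a non-zero exit code when it gives up, so a script can stop before attempting a restore from a backup that is not ready.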
You can also view the backups in the AWS console under Backups.
You can also use the AWS CLI list-backups command to list all the available backups in your region.
aws dynamodb list-backups
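When an account holds backups of many tables, the listing can be narrowed down. The sketch below, written for a POSIX shell such as bash, passes the `--table-name` filter to list-backups and uses `--query` to show only the name, status and ARN of each backup. The function name `list_table_backups` is a hypothetical helper.

```shell
# Hypothetical helper: list the backups of a single table as a compact table.
# $1 = table name.
list_table_backups() {
  table_name="$1"
  aws dynamodb list-backups \
    --table-name "$table_name" \
    --query "BackupSummaries[].[BackupName,BackupStatus,BackupArn]" \
    --output table
}
```

For this walkthrough the call would be `list_table_backups CloudAffaire`.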
Next, we are going to delete the table and then restore it using this backup.
Step 5: Delete the CloudAffaire table
aws dynamodb delete-table --table-name CloudAffaire
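The delete-table call returns while the table is still in the DELETING state, and a restore cannot overwrite an existing table, so it helps to wait until the name is free before restoring to the same table name. A minimal sketch for a POSIX shell such as bash, using the CLI's built-in `wait table-not-exists` waiter, could look like this; the function name `wait_until_table_deleted` is a hypothetical helper.

```shell
# Hypothetical helper: block until a deleted table is really gone.
# $1 = table name.
wait_until_table_deleted() {
  table_name="$1"
  # The waiter polls describe-table until the table can no longer be found.
  if aws dynamodb wait table-not-exists --table-name "$table_name"; then
    echo "Table $table_name no longer exists"
  else
    echo "Gave up waiting for $table_name to be deleted" >&2
    return 1
  fi
}
```

Calling `wait_until_table_deleted CloudAffaire` between Step 5 and Step 6 makes the sequence safe to run as a script.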
Step 6: Restore the table using the previous backup
aws dynamodb restore-table-from-backup ^
  --target-table-name CloudAffaire ^
  --backup-arn arn:aws:dynamodb:ap-south-1:###########:table/CloudAffaire/backup/01546250917568-e8b9f684
You can check the restore progress in the console as well.
Note: Restoring a DynamoDB table takes time. The table can be used only after the restore completes and its status becomes ACTIVE.
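To check from the command line when the restored table is ready, one option is to poll describe-table until `TableStatus` is ACTIVE. The sketch below assumes a POSIX shell such as bash; the function name `wait_for_table_active` and its default interval and attempt count are assumptions for the example, chosen loosely because a restore can take a while.

```shell
# Hypothetical helper: poll describe-table until the table is ACTIVE.
# $1 = table name, $2 = seconds between polls (default 30),
# $3 = maximum number of polls (default 60).
wait_for_table_active() {
  table_name="$1"
  interval="${2:-30}"
  max_attempts="${3:-60}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    table_status=$(aws dynamodb describe-table \
      --table-name "$table_name" \
      --query "Table.TableStatus" \
      --output text)
    if [ "$table_status" = "ACTIVE" ]; then
      echo "Table $table_name is ACTIVE"
      return 0
    fi
    echo "Table status: $table_status (attempt $attempt of $max_attempts)"
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "Timed out waiting for $table_name" >&2
  return 1
}
```

The AWS CLI also ships a `wait table-exists` waiter, but it stops after a fixed number of polls, so an explicit loop with an adjustable limit may suit long restores better.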
Step 7: Delete the table
aws dynamodb delete-table ^
  --table-name CloudAffaire
Hope you have enjoyed this article. In the next blog post, we will discuss Point-in-Time Recovery in DynamoDB.
For more details on DynamoDB, please refer to the AWS documentation below.