Amazon DynamoDB and Amazon S3 zero-ETL integration using AWS Glue

Amazon DynamoDB → AWS Glue → Amazon S3

Create an Amazon DynamoDB table and an Amazon S3 bucket, and integrate them using an AWS Glue job for zero-ETL data transfer.

This pattern sets up Amazon DynamoDB and Amazon S3, and integrates them with an AWS Glue job. With this setup, you can move data from Amazon DynamoDB to Amazon S3 and use it for analytics or long-term storage. Triggers are not implemented in the pattern; add them as required. The pattern is useful when large amounts of data in Amazon DynamoDB need to be moved to a data lake as part of a data strategy, or when data must be moved to long-term storage for regulatory reasons. The AWS Glue job copies the data into an encrypted Amazon S3 bucket and stores it in the specified format. In this pattern, the format is set to Parquet.
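The copy step described above can be sketched as a Glue PySpark script. This is a minimal illustration, not the script shipped in the repository: the job argument names (source_table, target_bucket), the "export" prefix, the default read percentage, and the helper function names are all assumptions made for the example. The awsglue and pyspark modules are imported lazily because they are only available inside the AWS Glue runtime.

```python
def dynamodb_source_options(table_name, read_percent="0.5"):
    """Build connection options for Glue's DynamoDB reader.

    read_percent caps the share of the table's read capacity the job may use;
    the 0.5 default here is an assumption, not a value from the pattern.
    """
    return {
        "dynamodb.input.tableName": table_name,
        "dynamodb.throughput.read.percent": read_percent,
    }


def s3_sink_options(bucket, prefix):
    """Build connection options for Glue's S3 writer."""
    prefix = prefix.strip("/")
    path = f"s3://{bucket}/{prefix}/" if prefix else f"s3://{bucket}/"
    return {"path": path}


def run_job(argv):
    """Read the whole DynamoDB table and write it to S3 as Parquet."""
    # Imported lazily: these modules exist only inside the AWS Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    # Hypothetical job arguments, passed as --source_table and --target_bucket.
    args = getResolvedOptions(argv, ["JOB_NAME", "source_table", "target_bucket"])

    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Scan the source table into a DynamicFrame.
    frame = glue_context.create_dynamic_frame.from_options(
        connection_type="dynamodb",
        connection_options=dynamodb_source_options(args["source_table"]),
    )

    # Write the frame to the target bucket in Parquet format.
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options=s3_sink_options(args["target_bucket"], "export"),
        format="parquet",
    )
    job.commit()
```

Inside a Glue job, the entry point would be a call such as run_job(sys.argv). Encryption is not handled in the script; the sketch assumes it is enforced by the bucket's own configuration.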

This pattern also creates the roles and policies the services require, with the appropriate level of permissions. Following the principle of least privilege, the roles and policies can be expanded if additional services come into play.
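To illustrate what a least-privilege policy for such a job might look like, the sketch below builds an IAM policy document as a Python dictionary. The action list and the function name are assumptions chosen for the example, not the policy defined in the repository's Terraform code; a real job typically needs further permissions (for example for logging, for reading the job script, or for an encryption key).

```python
def glue_job_policy(table_arn, bucket_arn):
    """Return an illustrative IAM policy scoped to one table and one bucket.

    The actions below are an assumed minimum for a scan-and-write job,
    not the permissions granted by the pattern itself.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access limited to the single source table.
                "Sid": "ReadSourceTable",
                "Effect": "Allow",
                "Action": ["dynamodb:DescribeTable", "dynamodb:Scan"],
                "Resource": table_arn,
            },
            {
                # Write access limited to objects in the target bucket.
                "Sid": "WriteTargetObjects",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }
```

Passing the result to json.dumps produces a document that can be reviewed before it is adapted. Each statement names specific actions and a specific resource rather than wildcards, which is the property to preserve when the policy is expanded for additional services.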

Language: Python
Framework: Terraform

Clone repo

git clone https://github.com/aws-samples/serverless-patterns/
cd serverless-patterns/dynamodb-glue-s3-terraform

Deploy

terraform init
terraform plan
terraform apply

Testing

See the GitHub repo for testing instructions.

Cleanup

terraform destroy

Additional resources

AWS Glue

Amazon DynamoDB

Amazon S3

Created by:

Kiran Ramamurthy

I am a Senior Partner Solutions Architect for Enterprise Transformation. I work predominantly with partners and specialize in migrations and modernization.
