Amazon DynamoDB and Amazon S3 zero-ETL integration using AWS Glue

Amazon DynamoDB → AWS Glue → Amazon S3

Create an Amazon DynamoDB table and an Amazon S3 bucket, and integrate them using an AWS Glue job for zero-ETL data transfer.

This pattern creates an Amazon DynamoDB table and an Amazon S3 bucket, and connects them with an AWS Glue job. With this setup, you can move data from DynamoDB to S3 and use it for analytics or long-term storage. The pattern does not implement triggers for the job; add them as your use case requires.

The pattern is useful when you hold large amounts of data in DynamoDB and want to move it to a data lake as part of your data strategy, or when data must be moved to long-term storage for regulatory reasons. The AWS Glue job copies the data into an encrypted S3 bucket and stores it in the specified format, which this pattern sets to Parquet.

The pattern also creates the roles and policies the services need, scoped to the permissions required. Following the principle of least privilege, expand the roles and policies only as additional services come into play.
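
Because the pattern leaves triggering to you, one simple option is to start the Glue job on demand from a script. The following is a minimal sketch, not part of the pattern itself. It assumes a boto3-style Glue client is passed in; the function name `run_glue_job` and the polling interval are illustrative. The job name is whatever the Terraform deployment created.

```python
import time


def run_glue_job(glue_client, job_name, poll_seconds=30, sleep=time.sleep):
    """Start one run of a Glue job and wait until it reaches a final state.

    glue_client is assumed to behave like boto3.client("glue").
    Returns a (run_id, final_state) tuple.
    """
    # Kick off a single run of the job and remember its run ID.
    run_id = glue_client.start_job_run(JobName=job_name)["JobRunId"]

    # States after which the run will not change any further.
    final_states = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}

    while True:
        run = glue_client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in final_states:
            return run_id, state
        # Still starting or running; wait before polling again.
        sleep(poll_seconds)
```

With boto3 installed and AWS credentials configured, you could call `run_glue_job(boto3.client("glue"), "<your job name>")`. For recurring exports, a scheduled Glue trigger or an event-driven mechanism is likely a better fit than polling from a script.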

Clone repo

git clone https://github.com/aws-samples/serverless-patterns/
cd serverless-patterns/dynamodb-glue-s3-terraform

Deploy

terraform init
terraform plan
terraform apply


Testing

See the GitHub repo for testing instructions.
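
The repository remains the reference for testing. As a quick sanity check after a job run, you could also list the Parquet objects that landed in the bucket. This is a minimal sketch under stated assumptions: a boto3-style S3 client is passed in, the helper name `list_parquet_keys` is illustrative, and the job is assumed to write objects whose keys end in `.parquet`.

```python
def list_parquet_keys(s3_client, bucket, prefix=""):
    """Return the keys of all objects under prefix that end in .parquet.

    s3_client is assumed to behave like boto3.client("s3").
    """
    # A paginator is used because a single list call returns at most 1,000 keys.
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                keys.append(obj["Key"])
    return keys
```

An empty result after a successful job run suggests checking the output path configured for the job and whether the source table holds any items.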

Cleanup

terraform destroy

Created by:

Kiran Ramamurthy

I am a Senior Partner Solutions Architect for Enterprise Transformation. I work predominantly with partners and specialize in migrations and modernization.
