Data Processing Pattern

Processes images from S3 and extracts metadata and labels, storing them in DynamoDB

This application creates a DynamoDB table, an S3 bucket, and a state machine. The state machine processes a list of images and, in parallel, retrieves each object's metadata and uses machine learning to detect image labels, then stores the combined result as one entry in the DynamoDB table. This pattern demonstrates how Step Functions can work with multiple data stores and manipulate, merge, and store data for later processing.
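The merge step described above can be illustrated in plain Python. This is only a sketch under stated assumptions, not the workflow's actual definition: it assumes the two parallel branches return dictionaries shaped like an S3 HeadObject response and an Amazon Rekognition DetectLabels response (the source does not name the labeling service), and the function name, the `id` key, and the sample values are hypothetical.

```python
from typing import Any


def merge_image_results(
    image_key: str, metadata: dict[str, Any], labels: dict[str, Any]
) -> dict[str, Any]:
    """Combine the outputs of the two parallel branches into one table entry."""
    return {
        "id": image_key,  # hypothetical partition key name
        "content_type": metadata.get("ContentType"),
        "size_bytes": metadata.get("ContentLength"),
        # Keep only the label names, sorted for a stable result
        "labels": sorted(label["Name"] for label in labels.get("Labels", [])),
    }


# Sample branch outputs (illustrative values, not real results)
metadata_branch = {"ContentType": "image/jpeg", "ContentLength": 20480}
labels_branch = {
    "Labels": [
        {"Name": "Dog", "Confidence": 98.1},
        {"Name": "Animal", "Confidence": 99.0},
    ]
}

item = merge_image_results("images/dog.jpg", metadata_branch, labels_branch)
print(item)
```

In the deployed pattern this merging is done by the state machine itself rather than by application code; the function only shows the shape of the combined entry.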


View this workflow on GitHub


Clone repo

git clone https://github.com/aws-samples/step-functions-workflows-collection
cd step-functions-workflows-collection/data-processing/

Deploy

1. Navigate to the cdk directory and run cdk deploy
2. Navigate to the shared directory and run python scripts/uploadImagesToS3.py
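The repository's upload script is not reproduced here. As a rough idea of what an image upload step involves, the following is a minimal sketch, assuming a boto3-style S3 client is passed in; the function name, the `images/` key prefix, and the list of file suffixes are hypothetical and may differ from what the real script does.

```python
from pathlib import Path

# Hypothetical set of file types treated as images
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def upload_images(s3_client, bucket: str, directory: str) -> list[str]:
    """Upload every image file in `directory` to `bucket` and return the keys used."""
    uploaded = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            key = f"images/{path.name}"  # hypothetical key prefix
            # upload_file(Filename, Bucket, Key) is the boto3 S3 client call
            s3_client.upload_file(str(path), bucket, key)
            uploaded.append(key)
    return uploaded
```

With boto3 installed and AWS credentials configured, this could be called as upload_images(boto3.client("s3"), "my-bucket", "images"), where the bucket name and directory are placeholders.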


Testing

See the GitHub repo for detailed testing instructions.

Cleanup

1. Navigate to the cdk directory and run cdk destroy

Created by:

Kurt Tometich

Kurt is a Sr. Solutions Architect based in Colorado who enjoys building lean, mean serverless solutions.
