Knowledge base refresh pipeline with AWS Step Functions & Amazon S3 Vectors

Amazon S3 → AWS Step Functions → AWS Lambda → Amazon S3 Vectors

Automate ingestion of new documents into an Amazon S3 Vectors knowledge base using AWS Step Functions Distributed Map, with post-ingestion validation and rollback

When new documents land in an S3 bucket, a Step Functions workflow fans out via Distributed Map to process each document in parallel.
For each document, a Lambda function reads the content, generates vector embeddings using Amazon Bedrock, and stores them with PutVectors in the S3 Vectors vector bucket.
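
As a rough illustration of the per-document step, the sketch below shows how a PutVectors request might be assembled once the embeddings are available. It is not taken from the pattern's Lambda code: the fixed-size chunking, the `documentKey#index` key scheme, the metadata fields, and the helper names `chunkText` and `buildPutVectorsInput` are all assumptions made for this example. The request field names (`vectorBucketName`, `indexName`, `vectors[].key`, `vectors[].data.float32`, `vectors[].metadata`) follow the S3 Vectors API as I understand it and should be checked against the SDK. The Bedrock embedding call itself is left out.

```typescript
// Hypothetical types for illustration. Field names mirror the S3 Vectors
// PutVectors request shape as understood here; verify against the SDK.
interface VectorRecord {
  key: string;
  data: { float32: number[] };
  metadata: Record<string, string>;
}

interface PutVectorsInput {
  vectorBucketName: string;
  indexName: string;
  vectors: VectorRecord[];
}

// Split a document into fixed-size character chunks.
// Assumption: the real pattern may chunk differently or not at all.
function chunkText(text: string, size: number): string[] {
  if (size <= 0) throw new Error("size must be positive");
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

// Pair each chunk with its embedding and derive a deterministic key per chunk,
// so a later rollback can delete exactly the vectors this document added.
function buildPutVectorsInput(
  vectorBucketName: string,
  indexName: string,
  documentKey: string,
  chunks: string[],
  embeddings: number[][],
): PutVectorsInput {
  if (chunks.length !== embeddings.length) {
    throw new Error("each chunk needs exactly one embedding");
  }
  return {
    vectorBucketName,
    indexName,
    vectors: chunks.map((_chunk, i) => ({
      key: `${documentKey}#${i}`,
      data: { float32: embeddings[i] },
      metadata: { source: documentKey, chunkIndex: String(i) },
    })),
  };
}
```

Deterministic keys are a design choice worth considering here: if the Lambda function returns the keys it wrote, the rollback step does not need to search the index to find what to delete.
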
After ingestion completes, a validation step uses QueryVectors to confirm the new content is searchable. A Choice state then either ends the workflow successfully or, if validation fails, rolls back by deleting the newly added vectors.
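
The workflow described above could be expressed in Amazon States Language roughly as follows. This is a minimal sketch, not the state machine the pattern's CDK stack actually generates: the state names, the `PipelineConfig` fields, the `MaxConcurrency` value, and the assumption that the validation function returns `{ passed: boolean }` are all hypothetical. The Distributed Map, S3 `listObjectsV2` item reader, Lambda invoke integration, and Choice rule use standard ASL fields.

```typescript
// Hypothetical configuration; bucket name and ARNs are placeholders.
interface PipelineConfig {
  sourceBucket: string;
  ingestFunctionArn: string;
  validateFunctionArn: string;
  rollbackFunctionArn: string;
}

// Common shape of a Task state that invokes a Lambda function with the
// state's full input as the payload.
function invokeLambda(functionArn: string) {
  return {
    Type: "Task",
    Resource: "arn:aws:states:::lambda:invoke",
    Parameters: { FunctionName: functionArn, "Payload.$": "$" },
  };
}

function buildDefinition(cfg: PipelineConfig) {
  return {
    Comment: "Sketch: ingest documents, validate search, roll back on failure",
    StartAt: "IngestDocuments",
    States: {
      // Distributed Map: one child execution per object listed in the bucket.
      IngestDocuments: {
        Type: "Map",
        ItemReader: {
          Resource: "arn:aws:states:::s3:listObjectsV2",
          Parameters: { Bucket: cfg.sourceBucket },
        },
        ItemProcessor: {
          ProcessorConfig: { Mode: "DISTRIBUTED", ExecutionType: "STANDARD" },
          StartAt: "EmbedAndStore",
          States: {
            EmbedAndStore: {
              ...invokeLambda(cfg.ingestFunctionArn),
              OutputPath: "$.Payload",
              End: true,
            },
          },
        },
        MaxConcurrency: 10, // assumed value
        ResultPath: "$.ingest",
        Next: "ValidateSearch",
      },
      // Assumed to run QueryVectors and return { passed: boolean }.
      ValidateSearch: {
        ...invokeLambda(cfg.validateFunctionArn),
        ResultSelector: { "passed.$": "$.Payload.passed" },
        ResultPath: "$.validation",
        Next: "ValidationPassed",
      },
      ValidationPassed: {
        Type: "Choice",
        Choices: [
          { Variable: "$.validation.passed", BooleanEquals: true, Next: "RefreshSucceeded" },
        ],
        Default: "RollbackVectors",
      },
      // Assumed to delete the vector keys recorded under $.ingest.
      RollbackVectors: {
        ...invokeLambda(cfg.rollbackFunctionArn),
        Next: "RefreshFailed",
      },
      RefreshSucceeded: { Type: "Succeed" },
      RefreshFailed: {
        Type: "Fail",
        Error: "ValidationFailed",
        Cause: "New content was not searchable; added vectors were removed",
      },
    },
  };
}
```

Calling `JSON.stringify(buildDefinition(cfg), null, 2)` yields a definition document that can be inspected or compared with the deployed state machine. Routing the rollback branch to a Fail state makes a failed refresh visible in the execution history instead of reporting success after cleanup.
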

Clone repo

git clone https://github.com/aws-samples/serverless-patterns/
cd serverless-patterns/sfn-s3vectors-rag-refresh-cdk

Deploy

npm install
cd lambda && npm install && cd ..
cdk deploy


Testing

See the GitHub repo for detailed testing instructions.

Cleanup

Delete the stack: cdk destroy.

Created by:

Ben Freiberg

Ben is a Senior Solutions Architect at Amazon Web Services (AWS) based in Frankfurt, Germany.