Amazon S3 to AWS Lambda durable functions to Amazon Textract and Amazon Bedrock

Amazon S3 → AWS Lambda (Durable) → Amazon Textract → Amazon Bedrock → Amazon DynamoDB

Extract text from documents with Amazon Textract and summarize with Amazon Bedrock using an AWS Lambda durable function.

This pattern demonstrates a durable document processing pipeline using AWS Lambda durable functions.
When a document (PDF, PNG, or JPG) is uploaded to Amazon S3, it triggers a durable Lambda function.
The function starts an asynchronous Amazon Textract text detection job and polls for completion using waitForCondition with exponential backoff.
Once text extraction completes, the extracted text is sent to Amazon Bedrock (Amazon Nova Lite) for summarization.
Results including the summary are stored in Amazon DynamoDB.
Durable functions provide automatic checkpointing, so if the function is interrupted during the long-running Textract polling, it resumes from the last checkpoint without re-executing completed steps.
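
To make the checkpointing behavior concrete, here is a minimal conceptual sketch. It is not the durable functions SDK: it assumes an in-memory Map as a stand-in for the checkpoint log that the service manages, and the names CheckpointStore, step, and processDocument, as well as the placeholder job ID, are hypothetical.

```typescript
// Conceptual sketch only: hypothetical names, not the durable functions SDK API.

/** Completed step results keyed by step name; stands in for a durable checkpoint log. */
type CheckpointStore = Map<string, unknown>;

/** Runs `fn` once per step name; on a later run, returns the stored result instead. */
async function step<T>(
  store: CheckpointStore,
  name: string,
  fn: () => Promise<T>,
): Promise<T> {
  if (store.has(name)) {
    return store.get(name) as T; // completed earlier: skip re-execution
  }
  const result = await fn();
  store.set(name, result); // record the result before moving on
  return result;
}

/** Three-step pipeline; `log` records which step bodies actually executed. */
async function processDocument(
  store: CheckpointStore,
  log: string[],
  summarize: (text: string) => Promise<string>,
): Promise<string> {
  const jobId = await step(store, "start-job", async () => {
    log.push("start-job");
    return "job-123"; // placeholder job ID (assumption)
  });
  const text = await step(store, "extract-text", async () => {
    log.push("extract-text");
    return `text from ${jobId}`;
  });
  return step(store, "summarize", async () => {
    log.push("summarize");
    return summarize(text);
  });
}
```

If the first invocation fails inside the summarize step, a second invocation with the same store reuses the stored job ID and extracted text and runs only the summarization again.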
Example use cases: invoice processing, contract analysis, insurance document intake, and compliance review.
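
The polling step described above can be sketched in plain TypeScript. This is an illustration under stated assumptions, not the pattern's code and not the signature of waitForCondition: the helper names backoffDelayMs and pollUntilDone and the defaults (5 s initial delay, factor 2, 60 s cap, 10 attempts) are hypothetical. The status strings mirror the JobStatus values that Textract returns for asynchronous jobs.

```typescript
// Hypothetical helpers; the durable functions SDK provides its own waiting primitive.

/** Delay before the next check: initialMs * factor^attempt, capped at maxMs. */
function backoffDelayMs(
  attempt: number, // 0-based attempt index
  initialMs = 5000,
  factor = 2,
  maxMs = 60000,
): number {
  return Math.min(initialMs * Math.pow(factor, attempt), maxMs);
}

type JobStatus = "IN_PROGRESS" | "SUCCEEDED" | "FAILED" | "PARTIAL_SUCCESS";

/**
 * Calls `check` until it reports a status other than IN_PROGRESS,
 * waiting longer between checks each time.
 * `sleep` is injected so the caller decides how waiting is implemented.
 */
async function pollUntilDone(
  check: () => Promise<JobStatus>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 10,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await check();
    if (status !== "IN_PROGRESS") {
      return status; // terminal status: stop polling
    }
    await sleep(backoffDelayMs(attempt));
  }
  throw new Error(`Job still in progress after ${maxAttempts} checks`);
}
```

With these defaults the waits grow as 5 s, 10 s, 20 s, 40 s, and then stay at the 60 s cap, which keeps the number of status calls low for long-running jobs.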

Clone repo

git clone https://github.com/aws-samples/serverless-patterns/
cd serverless-patterns/s3-lambda-textract-bedrock-durable-cdk-ts

Deploy

Clone the repository: git clone https://github.com/aws-samples/serverless-patterns
Change directory: cd serverless-patterns/s3-lambda-textract-bedrock-durable-cdk-ts
Install dependencies: npm install
Deploy the CDK stack: cdk deploy


Testing

Get the S3 bucket name: BUCKET_NAME=$(aws cloudformation describe-stacks --stack-name S3LambdaTextractBedrockDurableStack --query 'Stacks[0].Outputs[?OutputKey==`DocumentBucketName`].OutputValue' --output text)
Upload a test document: aws s3 cp test-document.pdf s3://$BUCKET_NAME/
Check DynamoDB for results: TABLE_NAME=$(aws cloudformation describe-stacks --stack-name S3LambdaTextractBedrockDurableStack --query 'Stacks[0].Outputs[?OutputKey==`ResultsTableName`].OutputValue' --output text) && aws dynamodb scan --table-name $TABLE_NAME

Cleanup

Delete the stack: cdk destroy

Created by:

Marco Jahn

Senior Solutions Architect - ISV, Amazon Web Services
