[{"data":1,"prerenderedAt":53},["ShallowReactive",2],{"workflow-distributed-map-ingest-analyze-cdk":3},{"id":4,"title":5,"cleanup":6,"contributors":10,"deploy":13,"description":17,"diagram":18,"extension":19,"framework":20,"gitHub":21,"introBox":30,"level":34,"meta":35,"resources":36,"s3URL":39,"services":40,"simplicity":44,"stem":45,"testing":46,"type":50,"usecase":51,"videoId":29,"__hash__":52},"workflows\u002Fworkflows\u002Fdistributed-map-ingest-analyze-cdk.json","Distributed Map - Ingest & Analyze Historical Storm Data",{"headline":7,"text":8},"Cleanup",[9],"1. Delete the stack: \u003Ccode>cdk destroy\u003C\u002Fcode>.",[11,12],"content\u002Fcontributors\u002Frevanth-anireddy.json","content\u002Fcontributors\u002Fpraveen-marthala.json",{"text":14},[15,16],"1. Bootstrap CDK, if needed: \u003Ccode>cdk bootstrap aws:\u002F\u002F{your-aws-account-number}\u002F{your-aws-region}\u003C\u002Fcode>","2. Deploy the stack: \u003Ccode>cdk deploy\u003C\u002Fcode>","Wait for an asynchronous job to finish before moving on to the next state","\u002Fassets\u002Fimages\u002Fworkflows\u002Fdistributed-map-ingest-analyze-cdk.png","json","AWS CDK",{"template":22,"payloads":27},{"repoURL":23,"templateDir":24,"templateFile":25,"ASL":26},"https:\u002F\u002Fgithub.com\u002Faws-samples\u002Fstep-functions-workflows-collection\u002Ftree\u002Fmain\u002Fingest-and-analyze-historical-storm-events\u002F","ingest-and-analyze-historical-storm-events","app.py","statemachine\u002Fstatemachine.asl.json",[28],{"headline":29,"payloadURL":29},"",{"headline":31,"text":32},"How it works",[33],"This workflow uses the Distributed Map feature of AWS Step Functions to iterate over the raw compressed files (.gz) in the S3 bucket and decompress them at scale. In the same orchestration, an AWS Glue crawler creates or updates the schema of the storm events. 
Once the crawl is complete, the state machine runs an Athena query to retrieve information from the AWS Glue Data Catalog tables.","300",{},{"headline":37,"bullets":38},"Additional resources",[],null,[41,42,43],"s3","glue","sfn","3 - Application","workflows\u002Fworkflows\u002Fdistributed-map-ingest-analyze-cdk",{"headline":47,"text":48},"Testing",[49],"See the GitHub repo for detailed testing instructions.","Standard","Data Processing","aPLzCwTGdczOeMjt94papphAK81Kvp2U0Q-KYobJ_MQ",1778846889003]