I have a cloud Formation template for a AWS Batch POC with 6 resources.
- 3 AWS::IAM::Role
- 1 AWS::Batch::ComputeEnvironment
- 1 AWS::Batch::JobQueue
- 1 AWS::Batch::JobDefinition
The AWS::IAM::Role have the policy "arn:aws:iam::aws:policy/AdministratorAccess" (In order to avoid issues.)
The roles are used:
- 1 into the AWS::Batch::ComputeEnvironment
- 2 into the AWS::Batch::JobDefinition
But even with the policy "arn:aws:iam::aws:policy/AdministratorAccess" I get "CannotPullContainerError: Error response from daemon: Get https://********.dkr.ecr.eu-west-1.amazonaws.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" when I rin a job.
Disclainer: All is FARGATE (Compute enviroment and Job), not EC2
AWSTemplateFormatVersion: '2010-09-09'
Description: Creates a POC AWS Batch environment.
Parameters:
Environment:
Type: String
Description: 'Environment Name'
Default: TEST
Subnets:
Type: List<AWS::EC2::Subnet::Id>
Description: 'List of Subnets to boot into'
ImageName:
Type: String
Description: 'Name and tag of Process Container Image'
Default: 'upload:6.0.0'
Resources:
BatchServiceRole:
Type: 'AWS::IAM::Role'
Properties:
RoleName: !Join ['', ['Demo', BatchServiceRole]]
AssumeRolePolicyDocument:
Version: 2012-10-17
Statement:
- Effect: 'Allow'
Principal:
Service: 'batch.amazonaws.com'
Action: 'sts:AssumeRole'
ManagedPolicyArns:
- 'arn:aws:iam::aws:policy/AdministratorAccess'
BatchContainerRole:
Type: 'AWS::IAM::Role'
Properties:
RoleName: !Join ['', ['Demo', BatchContainerRole]]
AssumeRolePolicyDocument:
Version: 2012-10-17
Statement:
-
Effect: 'Allow'
Principal:
Service:
- 'ecs-tasks.amazonaws.com'
Action:
- 'sts:AssumeRole'
ManagedPolicyArns:
- 'arn:aws:iam::aws:policy/AdministratorAccess'
BatchJobRole:
Type: 'AWS::IAM::Role'
Properties:
RoleName: !Join ['', ['Demo', BatchJobRole]]
AssumeRolePolicyDocument:
Version: 2012-10-17
Statement:
- Effect: 'Allow'
Principal:
Service: 'ecs-tasks.amazonaws.com'
Action: 'sts:AssumeRole'
ManagedPolicyArns:
- 'arn:aws:iam::aws:policy/AdministratorAccess'
BatchCompute:
Type: "AWS::Batch::ComputeEnvironment"
Properties:
ComputeEnvironmentName: DemoContentInput
ComputeResources:
MaxvCpus: 256
SecurityGroupIds:
- sg-0b33333333333333
Subnets: !Ref Subnets
Type: FARGATE
ServiceRole: !Ref BatchServiceRole
State: ENABLED
Type: Managed
Queue:
Type: "AWS::Batch::JobQueue"
DependsOn: BatchCompute
Properties:
ComputeEnvironmentOrder:
- ComputeEnvironment: DemoContentInput
Order: 1
Priority: 1
State: "ENABLED"
JobQueueName: DemoContentInput
ContentInputJob:
Type: "AWS::Batch::JobDefinition"
Properties:
Type: Container
ContainerProperties:
Command:
- -v
- process
- new-file
- -o
- s3://contents/{content_id}/{content_id}.mp4
Environment:
- Name: SECRETS
Value: !Join [ ':', [ '{{resolve:secretsmanager:common.secrets:SecretString:aws_access_key_id}}', '{{resolve:secretsmanager:common.secrets:SecretString:aws_secret_access_key}}' ] ]
- Name: APPLICATION
Value: upload
- Name: API_KEY
Value: '{{resolve:secretsmanager:common.secrets:SecretString:fluzo.api_key}}'
- Name: CLIENT
Value: upload-container
- Name: ENVIRONMENT
Value: !Ref Environment
- Name: SETTINGS
Value: !Join [ ':', [ '{{resolve:secretsmanager:common.secrets:SecretString:aws_access_key_id}}', '{{resolve:secretsmanager:common.secrets:SecretString:aws_secret_access_key}}', 'upload-container' ] ]
ExecutionRoleArn: 'arn:aws:iam::**********:role/DemoBatchJobRole'
Image: !Join ['', [!Ref 'AWS::AccountId','.dkr.ecr.', !Ref 'AWS::Region', '.amazonaws.com/', !Ref ImageName ] ]
JobRoleArn: !Ref BatchContainerRole
ResourceRequirements:
- Type: VCPU
Value: 1
- Type: MEMORY
Value: 2048
JobDefinitionName: DemoContentInput
PlatformCapabilities:
- FARGATE
RetryStrategy:
Attempts: 1
Timeout:
AttemptDurationSeconds: 600
Into AWS::Batch::JobQueue:ContainerProperties:ExecutionRoleArn I harcoded the arn because if write !Ref BatchJobRole I get an error. But it's no my goal with this question.
The question is how to avoid "CannotPullContainerError: Error response from daemon: Get https://********.dkr.ecr.eu-west-1.amazonaws.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" when I run a Job.
question from:
https://stackoverflow.com/questions/65672036/aws-batch-cloudformation-cannotpullcontainererror 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…