ECS Task Role AccessDenied is always fixable if you stop blaming AWS
January 22, 2026
You know the vibe: the container is healthy, the service is green, the app starts… and then your logs say:
AccessDeniedException: User is not authorized to perform s3:GetObject
At that point, most teams do the classic panic dance. They slap AmazonS3FullAccess on some role, redeploy, and pray. Sometimes it “works” and sometimes it still fails, which is even worse because now it feels random.
It isn’t random. It’s usually one of three things:
You gave permissions to the wrong role
The right role exists, but the task isn’t using it
The role is right, but you’re missing a second permission edge (KMS, resource policies, STS, VPC endpoints)
This post is the production-grade runbook for debugging it without guessing.
The mental model that stops the pain
In ECS you have two IAM roles that people constantly mix up:
Task execution role
This is what ECS needs to start your task: pulling images, writing logs, fetching secrets at startup, etc. AWS calls it the “task execution IAM role.”
Task role
This is what your application code uses once it’s running, when it calls AWS APIs via an SDK (S3, DynamoDB, SQS, Secrets Manager, you name it). AWS calls it the “task IAM role.”
If your app is throwing AccessDenied while calling AWS APIs, the permissions almost always belong on the task role, not the execution role.
Why this works cleanly: ECS delivers the role credentials to the container via a standard container credentials flow (SDK reads an env var like AWS_CONTAINER_CREDENTIALS_RELATIVE_URI and fetches temporary creds).
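You can watch that flow from a shell inside the container. A minimal sanity check, assuming the standard ECS setup (169.254.170.2 is the documented ECS credentials endpoint; the response should include the role ARN plus temporary keys and an expiry):

# The ECS agent injects this for the SDK credential chain
echo $AWS_CONTAINER_CREDENTIALS_RELATIVE_URI

# Fetch the temporary credentials the SDK would use (don't paste these into tickets)
curl -s 169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI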
The fastest way to see what identity your container is actually using
You want to stop arguing about what role “should” be used and instead prove what role is used.
Inside the container (or via ECS Exec), run:
aws sts get-caller-identity
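The output should name your task role as an assumed-role ARN, roughly like this (account ID, role name, and task ID are placeholders):

{
  "UserId": "AROAEXAMPLEROLEID:f1a2b3c4d5e6f7a8",
  "Account": "123456789012",
  "Arn": "arn:aws:sts::123456789012:assumed-role/myServiceTaskRole/f1a2b3c4d5e6f7a8"
}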
If you don’t have AWS CLI in the image, do the same with your SDK (print the caller identity once on startup), or temporarily add a minimal debug endpoint that calls STS and returns the ARN.
If you see an ARN you didn’t expect, the problem is upstream: task definition, role attachment, trust policy, or your SDK credential chain.
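If you don't have a shell into the task yet, ECS Exec is the quickest way in, assuming it's enabled on the service (cluster, task ID, and container name below are placeholders):

aws ecs execute-command \
  --cluster my-cluster \
  --task f1a2b3c4d5e6f7a8 \
  --container api \
  --interactive \
  --command "/bin/sh"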
The classic failure mode
You added permissions to the execution role, redeployed, still got AccessDenied.
That is completely consistent with how ECS is designed. The execution role is for ECS “plumbing,” the task role is for your app runtime calls.
So the real question becomes:
Does your task definition actually set a task role?
In the task definition JSON, you usually want both:
executionRoleArn
taskRoleArn
Example skeleton:
{
  "family": "my-service",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/myServiceTaskRole",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789012.dkr.ecr.eu-west-2.amazonaws.com/my-api:latest",
      "essential": true,
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-service",
          "awslogs-region": "eu-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
If taskRoleArn is missing, your app will not have the intended permissions.
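You don't need the console to check. One describe call shows both roles on the registered task definition (family name is a placeholder; a None in the first column means no task role is set):

aws ecs describe-task-definition \
  --task-definition my-service \
  --query 'taskDefinition.[taskRoleArn, executionRoleArn]' \
  --output text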
Trust policy checks that waste hours if you forget them
Even if you set taskRoleArn, ECS can only assume it if the role trust policy allows the ECS tasks service principal.
Your task role trust policy should look like this (the important part is the principal):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
If this is wrong, you’ll see “access denied” style failures that look like permissions but are actually “role cannot be assumed.”
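One command tells you what the live role actually trusts (role name is a placeholder):

aws iam get-role \
  --role-name myServiceTaskRole \
  --query 'Role.AssumeRolePolicyDocument' \
  --output json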
AWS has a solid write-up on ECS role best practices and why task roles are the right isolation boundary.
The boring but correct way to write the policy
Let’s say your app needs to read from one S3 bucket prefix, and nothing else.
Do this (least privilege), not AmazonS3FullAccess:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadArtifacts",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::my-bucket/private/artifacts/*"]
    },
    {
      "Sid": "ListPrefix",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::my-bucket"],
      "Condition": {
        "StringLike": {
          "s3:prefix": ["private/artifacts/*"]
        }
      }
    }
  ]
}
If you only grant GetObject and forget ListBucket, you’ll get weird “works sometimes” behaviour depending on whether your code lists before reading.
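To attach that policy and sanity-check an action before redeploying anything, something like this works (role name, policy name, and object key are placeholders; simulate-principal-policy only evaluates identity policies, not bucket or key policies):

# Attach the policy above as an inline policy on the task role
aws iam put-role-policy \
  --role-name myServiceTaskRole \
  --policy-name s3-read-artifacts \
  --policy-document file://task-role-policy.json

# Dry-run the exact action and resource that is failing
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/myServiceTaskRole \
  --action-names s3:GetObject \
  --resource-arns arn:aws:s3:::my-bucket/private/artifacts/build.tar.gz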
The hard mode gotchas that make people think ECS is cursed
These are the ones that bite experienced engineers because they are not obvious.
1) SSE-KMS encrypted S3 objects
Your role can have S3 permissions and still fail because the object is encrypted with KMS and you don’t have kms:Decrypt for the key.
Symptoms: S3 calls fail even though policy “looks right.”
Fix: Add KMS permissions on the key (and check key policy too).
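The extra statement on the task role looks something like this (the key ARN is a placeholder, and the key policy itself also has to allow the role, or at least delegate to IAM):

{
  "Sid": "DecryptArtifactsKey",
  "Effect": "Allow",
  "Action": ["kms:Decrypt"],
  "Resource": ["arn:aws:kms:eu-west-2:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab"]
}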
2) Bucket policy or resource policy overrides you
An IAM allow does not automatically win: an explicit deny in the bucket policy beats it, and for cross-account access the bucket policy has to allow your task role as well. The same goes for Secrets Manager resource policies.
Symptoms: You swear the role has permissions, but AccessDenied persists.
Fix: Inspect the resource policy and make sure it allows your task role ARN.
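Pulling the bucket policy takes one command (bucket name is a placeholder; an error saying there is no bucket policy is also useful information):

aws s3api get-bucket-policy \
  --bucket my-bucket \
  --query Policy \
  --output text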
3) Your SDK is not using ECS container credentials
ECS injects a container credentials endpoint for the SDK to use. If your app is overriding credentials (env vars, shared config, a hardcoded profile), it might ignore the task role entirely. AWS documents how container credential providers work and what variables SDKs use.
Symptoms: get-caller-identity shows an unexpected principal.
Fix: remove overrides, ensure the SDK is allowed to use default credential resolution, and verify the ECS-provided env var exists.
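From inside the container, a quick environment check catches most overrides, since static keys or a profile in the environment generally win over container credentials in the default chain:

# Anything here beyond the container credentials URI is a potential override
env | grep '^AWS_' | sort

# This should be set on ECS; if it's missing, the SDK has no path to the task role
echo $AWS_CONTAINER_CREDENTIALS_RELATIVE_URI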
4) You’re debugging the execution role instead of the task role
This happens a lot when the logs are fine (execution role works), but runtime calls fail (task role missing/wrong). The roles have different jobs.
The no-guesswork triage flow I use in real systems
Confirm the failing AWS action and resource from the error text
From inside the container, run aws sts get-caller-identity (or SDK equivalent)
Confirm the task definition has taskRoleArn set
Confirm the task role trust policy allows ecs-tasks.amazonaws.com
Confirm the IAM policy includes the exact action and the correct resource ARN
Check resource policies (S3 bucket policy, KMS key policy, Secrets Manager policy)
If still stuck, use AWS’s ECS IAM role config troubleshooting guide as a structured checklist
This is the difference between a senior engineer and a chaos goblin: you turn “it’s broken” into a deterministic elimination process.
Two pictures worth keeping in your head
If you whiteboard this for your team, these are the two diagrams to draw.
Diagram 1: two lanes
ECS agent lane: “Pull image, send logs, fetch secrets at startup” → Execution role
Application lane: “Call S3, DynamoDB, SQS, Secrets at runtime” → Task role
Diagram 2: credentials flow
Task role → STS temporary creds → ECS injects container credentials endpoint → SDK reads env var → API call succeeds
Copy-ready ending checklist
If your ECS task is throwing AccessDenied, check this in order:
Task definition has taskRoleArn set (not just execution role)
Task role trust policy allows ECS tasks (ecs-tasks.amazonaws.com)
Your container is actually using that role (STS caller identity)
Policy matches the exact action + resource ARN
Resource policies and KMS aren’t silently blocking you
SDK is not overridden away from container credentials