As you noted, when you create an EMR cluster, the tags are the same for all nodes (Master, Core, Task).
You will find this process complicated when using the AWS CLI. My recommendation is to review the examples below and then write a Python program to do this.
Process to add your own tags to the EC2 instances.
STEP 1: List your EMR Clusters:
aws emr list-clusters
This will output JSON:
{
    "Clusters": [
        {
            "Id": "j-ABCDEFGHIJKLM",
            "Name": "'MyCluster'",
            "Status": {
                "State": "WAITING",
                "StateChangeReason": {
                    "Message": "Cluster ready after last step completed."
                },
                "Timeline": {
                    "CreationDateTime": 1536626095.303,
                    "ReadyDateTime": 1536626568.482
                }
            },
            "NormalizedInstanceHours": 0
        }
    ]
}
STEP 2: Make a note of the Cluster ID from the JSON:
"Id": "j-ABCDEFGHIJKLM",
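If you script steps 1 and 2 in Python, picking out the cluster ID could look roughly like the sketch below. It assumes you have captured the JSON printed by `aws emr list-clusters` as a string. The function name `find_cluster_ids` and the set of states I keep are my own choices for illustration, not anything defined by the AWS CLI.

```python
import json

# Cluster states that still have instances worth tagging.
# This selection is an assumption for the sketch; adjust as needed.
ACTIVE_STATES = {"STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING"}


def find_cluster_ids(list_clusters_json, states=ACTIVE_STATES):
    """Return the Id of every cluster whose Status.State is in `states`."""
    data = json.loads(list_clusters_json)
    return [
        cluster["Id"]
        for cluster in data.get("Clusters", [])
        if cluster["Status"]["State"] in states
    ]


# Trimmed-down version of the sample output shown above.
sample = """
{"Clusters": [{"Id": "j-ABCDEFGHIJKLM",
               "Name": "'MyCluster'",
               "Status": {"State": "WAITING"}}]}
"""
print(find_cluster_ids(sample))
```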
STEP 3: Describe your EMR Cluster:
aws emr describe-cluster --cluster-id j-ABCDEFGHIJKLM
This will output JSON (I have truncated this output to just the MASTER section):
{
    "Cluster": {
        "Id": "j-ABCDEFGHIJKLM",
        "Name": "'Test01'",
        ....
        "InstanceGroups": [
            {
                "Id": "ig-2EHOYXFABCDEF",
                "Name": "Master Instance Group",
                "Market": "ON_DEMAND",
                "InstanceGroupType": "MASTER",
                "InstanceType": "m3.xlarge",
                "RequestedInstanceCount": 1,
                "RunningInstanceCount": 1,
                "Status": {
                    "State": "RUNNING",
                    "StateChangeReason": {
                        "Message": ""
                    },
                    "Timeline": {
                        "CreationDateTime": 1536626095.316,
                        "ReadyDateTime": 1536626533.886
                    }
                },
                "Configurations": [],
                "EbsBlockDevices": [],
                "ShrinkPolicy": {}
            },
            ....
        ]
    }
}
STEP 4: InstanceGroups is an array. Find the entry where InstanceGroupType is MASTER. Make a note of the Id:
"Id": "ig-2EHOYXFABCDEF",
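In Python, steps 3 and 4 come down to a filter over the InstanceGroups array. The sketch below works on the already-parsed output of `aws emr describe-cluster` (a dict, for example from `json.loads`). The helper name `find_instance_group_ids` is hypothetical, and the sketch assumes the cluster uses instance groups, as in the output above.

```python
def find_instance_group_ids(describe_cluster_output, group_type="MASTER"):
    """Return the Id of each instance group whose InstanceGroupType matches."""
    groups = describe_cluster_output["Cluster"].get("InstanceGroups", [])
    return [g["Id"] for g in groups if g["InstanceGroupType"] == group_type]


# Trimmed-down version of the describe-cluster output shown above.
sample_cluster = {
    "Cluster": {
        "Id": "j-ABCDEFGHIJKLM",
        "InstanceGroups": [
            {"Id": "ig-2EHOYXFABCDEF", "InstanceGroupType": "MASTER"},
        ],
    }
}
print(find_instance_group_ids(sample_cluster))
```

Passing a different `group_type` selects the other node roles, which is what you need if each role should get its own tag.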
STEP 5: List your cluster instances:
aws emr list-instances --cluster-id j-ABCDEFGHIJKLM
This will output JSON (I have truncated the output):
{
    "Instances": [
        ....
        {
            "Id": "ci-31LGK4KIECHNY",
            "Ec2InstanceId": "i-0524ec45912345678",
            "PublicDnsName": "ec2-52-123-201-221.us-west-2.compute.amazonaws.com",
            "PublicIpAddress": "52.123.201.221",
            "PrivateDnsName": "ip-172-31-41-111.us-west-2.compute.internal",
            "PrivateIpAddress": "172.31.41.111",
            "Status": {
                "State": "RUNNING",
                "StateChangeReason": {},
                "Timeline": {
                    "CreationDateTime": 1536626164.073,
                    "ReadyDateTime": 1536626533.886
                }
            },
            "InstanceGroupId": "ig-2EHOYXFABCDEF",
            "Market": "ON_DEMAND",
            "InstanceType": "m3.xlarge",
            "EbsVolumes": []
        }
    ]
}
STEP 6: Find the entry whose InstanceGroupId matches ig-2EHOYXFABCDEF. This will give you the EC2 Instance ID for the MASTER: "Ec2InstanceId": "i-0524ec45912345678"
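The matching in steps 5 and 6 can be sketched the same way. This assumes the output of `aws emr list-instances` has been parsed into a dict. The name `ec2_ids_for_group` is made up for this example.

```python
def ec2_ids_for_group(list_instances_output, instance_group_id):
    """Return the Ec2InstanceId of every instance in the given instance group."""
    return [
        inst["Ec2InstanceId"]
        for inst in list_instances_output.get("Instances", [])
        if inst.get("InstanceGroupId") == instance_group_id
    ]


# Trimmed-down version of the list-instances output shown above.
sample_instances = {
    "Instances": [
        {
            "Id": "ci-31LGK4KIECHNY",
            "Ec2InstanceId": "i-0524ec45912345678",
            "InstanceGroupId": "ig-2EHOYXFABCDEF",
        },
    ]
}
print(ec2_ids_for_group(sample_instances, "ig-2EHOYXFABCDEF"))
```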
STEP 7: Tag your EC2 instance:
aws ec2 create-tags --resources i-0524ec45912345678 --tags Key=EMR,Value=MASTER
The above steps might be simpler with CLI filters and/or jq, but this should be enough information so that you know how to find and tag the EMR Master instance.
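For the Python program I recommended, here is a minimal sketch using boto3 (a third-party package, so `pip install boto3` first). It leans on the `InstanceGroupTypes` filter of `list_instances`, which lets you skip the describe-cluster lookup in steps 3 and 4. The function names `tag_emr_instances` and `main`, the `RUNNING` state filter, and the cluster ID are my own placeholders. Treat this as a starting point under those assumptions, not a finished tool.

```python
def tag_emr_instances(emr, ec2, cluster_id, group_type, tags):
    """Tag the running EC2 instances in a cluster's groups of `group_type`.

    `emr` and `ec2` are boto3 clients; `tags` is a list of
    {"Key": ..., "Value": ...} dicts. Returns the tagged instance IDs.
    """
    instance_ids = []
    paginator = emr.get_paginator("list_instances")
    pages = paginator.paginate(
        ClusterId=cluster_id,
        InstanceGroupTypes=[group_type],  # "MASTER", "CORE" or "TASK"
        InstanceStates=["RUNNING"],       # assumption: skip terminated nodes
    )
    for page in pages:
        instance_ids.extend(i["Ec2InstanceId"] for i in page["Instances"])

    # create_tags rejects an empty resource list, so only call it when needed.
    if instance_ids:
        ec2.create_tags(Resources=instance_ids, Tags=tags)
    return instance_ids


def main():
    import boto3  # imported here so the helper above can be used without it

    cluster_id = "j-ABCDEFGHIJKLM"  # placeholder: use your own cluster ID
    tagged = tag_emr_instances(
        boto3.client("emr"),
        boto3.client("ec2"),
        cluster_id,
        "MASTER",
        [{"Key": "EMR", "Value": "MASTER"}],
    )
    print(tagged)
```

Call `main()` once your AWS credentials and region are configured. Repeating the call with `"CORE"` and `"TASK"` and different tag values gives each node role its own tags.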