AWS CLI
Listing EC2 Instances
I need to list the instances and parse that list so I know what I am working with. Though not needed for creating snapshots, this was helpful in learning how aws-cli works. I started with a Server Fault post[1] and broke down the steps it contained. The example provided was:
aws ec2 describe-instances --filters Name=vpc-id,Values=vpc-e2f17e8b --query 'Reservations[].Instances[].Tags[?Key==`Name`].Value[]'
But the query statement didn't make much sense to me, so at this point I started to replicate it in my lab. The key is to look at the output of the plain describe-instances command first:
{
    "Reservations": [
        {
            "Instances": [
                {
                    "Monitoring": {
                        "State": "disabled"
                    },
                    "PublicDnsName": "ec2-xxx-xxx-xxx-xxx.us-east-2.compute.amazonaws.com",
                    "StateReason": {
                        "Message": "Client.UserInitiatedShutdown: User initiated shutdown",
                        "Code": "Client.UserInitiatedShutdown"
                    },
                    "State": {
                        "Code": 80,
                        "Name": "stopped"
                    ...
                    "InstanceId": "i-xxxxxxxxxxad6183c",
                    ...
                    "Tags": [
                        {
                            "Value": "My-VM",
                            "Key": "Name"
                        }
                    ],
Since the information I need is nested, I will need to drill down.[2] Starting with Reservations, then Instances, I can then select the information I need.
NOTE: The key names used in the query are case sensitive.
[root@aws-cli ~]# aws ec2 describe-instances --query 'Reservations[].Instances[].{Instance_name:Tags[?Key==`Name`].Value,ID:InstanceId,State:State.Name,Volume:BlockDeviceMappings[].Ebs.VolumeId}'
[
    {
        "Instance_name": [
            "My-VM"
        ],
        "Volume": [
            "vol-xxxxxxxxxxxxxx142"
        ],
        "State": "stopped",
        "ID": "i-xxxxxxxxxxad6183c"
    }
]
To break this down:
- "Reservations[]" This will query ALL reservations.
- "Instances[]" This will query ALL instances
- "{}" This creates a JSON object (a JMESPath multiselect hash), not an array, since we want several named values from inside each entry of Instances.
- "Instance_name" is an arbitrary label; you can put anything you want here as long as it has no spaces. There might be a way to use spaces, but you shouldn't use them anyway.
- ":Tags[?Key==`Name`].Value" This is a JMESPath filter expression: from the Tags list it keeps only the entries whose Key equals Name, then returns their Value.[3] I needed it to get the human readable name I gave the instance.
- "ID" is an arbitrary name.
- ":InstanceId" will pull the instance ID.
- "State" is an arbitrary name.
- ":State.Name" will pull the human readable state of the instance. In this case "stopped".
- "Volume" is an arbitrary name.
- ":BlockDeviceMappings[].Ebs.VolumeId" Will grab the VolumeIDs that we will need later.
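To make the shape of the "{}" expression easier to see, here is a small sketch that assembles the same kind of query string from Label:Path pairs. The helper name build_query is hypothetical and not part of aws-cli; it only builds the string that would be handed to --query.

```shell
# Hypothetical helper (not part of aws-cli): join "Label:Path" pairs into
# a JMESPath multiselect hash that is applied to every instance.
build_query() {
    local IFS=,
    # "$*" joins all arguments using the first character of IFS (a comma).
    printf 'Reservations[].Instances[].{%s}\n' "$*"
}

query=$(build_query 'ID:InstanceId' 'State:State.Name')
echo "$query"
# The string could then be used as:
#   aws ec2 describe-instances --query "$query"
```

Each label on the left of a colon becomes a key in the output object, and each path on the right is looked up inside the instance record.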
To only list Instance IDs for processing
[root@aws-cli ~]# aws ec2 describe-instances --query 'Reservations[].Instances[].InstanceId' --output text | sed -e 's/\s\+/\n/g'
i-xxxxxxxxxxad6183c
i-xxxxxxxxxxad6345c
To grab the associated volume IDs
[root@aws-cli ~]# aws ec2 describe-instances --query 'Reservations[].Instances[].InstanceId' --output text | sed -e 's/\s\+/\n/g' | while read line; do aws ec2 describe-instances --instance-ids "$line" --query 'Reservations[].Instances[].BlockDeviceMappings[].Ebs.VolumeId' --output text | sed -e 's/\s\+/\n/g' ; done
vol-xxxxxxxxxxxxxx142
vol-xxxxxxxxxxxxxxc0f
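The sed expression in these commands exists only to turn whitespace-separated text output into one ID per line so that "while read" can loop over it. A minimal sketch of the same idea with tr, run on made-up sample data instead of a live aws-cli call (the function name split_ids and the IDs are assumptions for illustration):

```shell
# Sample text standing in for `aws ec2 describe-instances ... --output text`
# (made-up IDs, tab separated on a single line).
sample=$(printf 'i-0aaa\ti-0bbb\ti-0ccc')

# Turn every run of whitespace into a single newline: one ID per line.
split_ids() {
    tr -s '[:space:]' '\n'
}

printf '%s\n' "$sample" | split_ids | while read -r id; do
    echo "would process $id"
done
```

The tr form avoids the GNU-specific \s and \+ in the sed pattern; either approach should give the same list for this kind of input.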
Creating Snapshots
According to Amazon[4], you should stop the instance before taking a snapshot of its root volume so that the state is clean. We will see about that. If you can afford the downtime, go for it! Otherwise we will look at making snapshots of live root volumes.
The other thing we will be doing is creating snapshots based on tags. I don't want to snapshot all machines, just the critical ones that I get yelled at about when they are not working.
#!/bin/bash
# Snapshot the volume of every instance that carries the tag value $tag.
now=$(date +%s)
tag="My_Tag"
aws ec2 describe-instances --filters "Name=tag-value,Values=$tag" --query 'Reservations[].Instances[].{ID:InstanceId}' --output text | while read -r id; do
    # awk picks the second field of the text output (the value after the label).
    vol=$(aws ec2 describe-instances --instance-ids "$id" --query 'Reservations[].Instances[].{Volume:BlockDeviceMappings[].Ebs.VolumeId}' --output text | awk '{print $2}')
    name=$(aws ec2 describe-instances --instance-ids "$id" --query 'Reservations[].Instances[].{Instance_name:Tags[?Key==`Name`].Value}' --output text | awk '{print $2}')
    # Create the snapshot and keep its ID so it can be tagged.
    snapid=$(aws ec2 create-snapshot --description "$name $id $now" --volume-id "$vol" | grep -oE 'snap-[0-9a-z]+')
    # creation_date is stored in epoch seconds; the cleanup script relies on it.
    aws ec2 create-tags --resources "$snapid" --tags "Key=function,Value=$tag" "Key=source,Value=$id" "Key=creation_date,Value=$now"
done
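Because the creation_date tag holds raw epoch seconds, it is not very readable when you look at a snapshot later. A small sketch for turning such a value back into a calendar date, assuming GNU date (the function name epoch_to_date is hypothetical):

```shell
# Convert an epoch timestamp (as stored in the creation_date tag) to a
# UTC calendar date. Assumes GNU date, which accepts "-d @SECONDS".
epoch_to_date() {
    date -u -d "@$1" +%Y-%m-%d
}

epoch_to_date 1500000000   # prints 2017-07-14
```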
Deleting Snapshots
Since you have to pay for the storage of these snapshots, you probably want to delete old ones.[5] We want to search through the snapshots using tags, then delete the ones older than a set time. In my case I want to delete any older than two weeks (1209600 seconds). All my timestamps are in seconds, which makes the math easy.
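#!/bin/bash
# Delete tagged snapshots that are older than two weeks.
now=$(date +%s)
tag="My_Tag"
max_age=1209600   # two weeks in seconds
aws ec2 describe-snapshots --filters "Name=tag-value,Values=$tag" --query 'Snapshots[].{ID:SnapshotId}' --output text | while read -r snapid; do
    # Read back the epoch timestamp stored in the creation_date tag.
    cdate=$(aws ec2 describe-snapshots --snapshot-ids "$snapid" --query 'Snapshots[].Tags[?Key==`creation_date`]' --output text | awk '{print $2}')
    diff=$((now - cdate))
    # -gt compares numerically; ">" inside [ ] is a redirection and is always true here.
    if [ "$diff" -gt "$max_age" ]; then
        aws ec2 delete-snapshot --snapshot-id "$snapid"
    fi
done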
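The age check is the part of this script that is easiest to get wrong, so here it is isolated as a sketch that runs without touching AWS. The function name is_expired is hypothetical; the two-week limit comes from the text. Note that inside [ ] a numeric comparison needs -gt, since > would be treated as an output redirection.

```shell
max_age=$((14 * 24 * 60 * 60))   # two weeks in seconds = 1209600

# Succeeds (exit status 0) when the snapshot is older than max_age.
# Arguments: current epoch time, snapshot creation epoch time.
is_expired() {
    local now=$1 created=$2
    [ $((now - created)) -gt "$max_age" ]
}

now=$(date +%s)
if is_expired "$now" $((now - 15 * 86400)); then
    echo "15-day-old snapshot: delete"
fi
if ! is_expired "$now" $((now - 86400)); then
    echo "1-day-old snapshot: keep"
fi
```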
Restoring Snapshots
The process for restoring a snapshot for a live system is:
- Create a volume from the snapshot.[6]
- Attach the volume to a new instance of the same type as the one being replaced.
- Boot the new instance, and ensure it is responding correctly.
- Re-assign the elastic IP to the new instance.[7]
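The steps above map onto a handful of aws-cli calls. The following is only a dry-run sketch: it prints the commands rather than running them, the function name restore_plan is hypothetical, and every ID and the IP address are placeholders rather than real resources.

```shell
# Dry run only: prints one aws-cli command per restore step, runs nothing.
# Arguments (all placeholders): snapshot ID, availability zone,
# new instance ID, elastic IP address.
restore_plan() {
    local snapid=$1 az=$2 newid=$3 ip=$4
    echo "aws ec2 create-volume --availability-zone $az --snapshot-id $snapid"
    echo "aws ec2 attach-volume --device /dev/xvda --instance-id $newid --volume-id <new-volume-id>"
    echo "aws ec2 start-instances --instance-ids $newid"
    echo "aws ec2 wait instance-status-ok --instance-ids $newid"
    echo "aws ec2 associate-address --instance-id $newid --public-ip $ip --allow-reassociation"
}

restore_plan snap-EXAMPLE us-east-2a i-EXAMPLE 203.0.113.10
```

The volume ID in the second line is left as a placeholder because it is only known once create-volume has returned.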
This is easy enough using the web GUI, but doing it by hand is prone to error and takes longer than it should. Since I take daily snapshots, the following script pulls from the last snapshot and restores it. Alternatively, an option could be added to select which day to restore from.
Syntax as follows
[root@aws-cli ~]# siterestore.sh yoursite.tld
The script being built is as follows.
#!/bin/bash
## Grab the site name from the first command-line argument.
sitename=$1

## Run query for the site data.
sitequery=$(aws ec2 describe-instances --filters "Name=tag-value,Values=$sitename" --query 'Reservations[].Instances[]' --output text)
if [ -z "$sitequery" ]; then
    echo "$sitename not found. Check for typos"
else
    echo "Gathering data"
    ## Collapse the JSON to one line with single spaces so the grep patterns below
    ## (VOLID in particular) match regardless of the bash version.
    values=$(aws ec2 describe-instances --filters "Name=tag-value,Values=$sitename" "Name=tag-key,Values=Name" --query 'Reservations[].Instances[].{IP:PublicIpAddress,VPC:VpcId,ID:InstanceId,Type:InstanceType,AZ:Placement.AvailabilityZone,SEC:SecurityGroups[].GroupId,VOLID:BlockDeviceMappings[].Ebs.{ID:VolumeId},AMI:ImageId,KEY:KeyName,subnetid:SubnetId}' | tr -s '[:space:]' ' ')
    ipaddress=$(grep -oE "\"IP\": \"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\"" <<<"$values" | awk 'BEGIN {FS="\""}{print $4}')
    type=$(grep -Eo "\"Type\": \"[a-z0-9]+\.[a-z0-9]+\"" <<<"$values" | awk 'BEGIN {FS="\""}{print $4}')
    az=$(grep -Eo "\"AZ\": \"us-[a-z0-9-]+\"" <<<"$values" | grep -Eo 'us-[a-z0-9-]+' | head -1)
    security=$(grep -Eo 'sg-[0-9a-z]+' <<<"$values")
    id=$(grep -Eo "\"ID\": \"i-[0-9a-z]+\"" <<<"$values" | awk 'BEGIN {FS="\""}{print $4}')
    volid=$(grep -Eo "\"VOLID\": \[ \{ \"ID\": \"vol-[0-9a-z]+\"" <<<"$values" | awk 'BEGIN {FS="\""}{print $6}')
    ## AMI IDs are hexadecimal, so match a-f as well as digits.
    imageid=$(grep -Eo 'ami-[0-9a-f]+' <<<"$values")
    keyname=$(grep -Eo "\"KEY\": \"[0-9a-zA-Z-]+\"" <<<"$values" | awk 'BEGIN {FS="\""}{print $4}')
    subnetid=$(grep -Eo 'subnet-[a-z0-9]+' <<<"$values")

    # Search for last snapshot created for this site.
    echo "Finding latest snapshot"
    dayago=$(date +%Y-%m-%d --date=yesterday)
    volvalues=$(aws ec2 describe-snapshots --filters "Name=volume-id,Values=$volid" "Name=start-time,Values=$dayago*" --query 'Snapshots[].{State:State,CreationDate:StartTime,SnapID:SnapshotId}')
    snapid=$(grep -Eo "\"SnapID\": \"snap-[0-9a-z]+" <<<"$volvalues" | grep -Eo 'snap-[0-9a-z]+')
    if [ -z "$snapid" ]; then
        echo "No snapshot from $dayago found for $volid. Stopping."
        exit 1
    fi

    # Create volume from snapshot
    echo "Creating volume from snapshot..."
    echo "AZ: $az ID: $snapid"
    volvalues=$(aws ec2 create-volume --availability-zone "$az" --snapshot-id "$snapid" --volume-type gp2)
    volidnew=$(grep -Eo "\"VolumeId\": \"vol-[0-9a-z]+\"" <<<"$volvalues" | awk 'BEGIN {FS="\""}{print $4}')

    # Check that volume is created
    volstatus=$(aws ec2 describe-volumes --filters "Name=volume-id,Values=$volidnew" --query 'Volumes[].State' | grep -Eo '[a-z]+')
    if [ -z "$volstatus" ]; then
        echo "Can not determine volume creation status. Stopping."
        exit 1
    else
        until [ "$volstatus" = "available" ]; do
            volstatus=$(aws ec2 describe-volumes --filters "Name=volume-id,Values=$volidnew" --query 'Volumes[].State' | grep -Eo '[a-z]+')
            if [ -z "$volstatus" ]; then
                echo "Can not determine volume creation status. Stopping."
                exit 1
            fi
            echo "Still creating volume..."
            sleep 3
        done
        echo "Volume has been created"
    fi

    # Re-label / tag / name volumes and instances
    echo "Re-tagging assets"
    aws ec2 create-tags --resources "$id" "$volid" --tags "Key=Name,Value=${sitename}_old" > /dev/null
    aws ec2 create-tags --resources "$volidnew" --tags "Key=Name,Value=$sitename" > /dev/null

    # Create instance for the new volume to use
    echo "Creating new instance..."
    # $security is left unquoted on purpose: it may hold several group IDs.
    newvalues=$(aws ec2 run-instances --image-id "$imageid" --count 1 --instance-type "$type" --key-name "$keyname" --security-group-ids $security --subnet-id "$subnetid")
    newid=$(grep -Eo "\"InstanceId\": \"i-[a-z0-9]+\"" <<<"$newvalues" | grep -Eo 'i-[0-9a-z]+')
    if [ -z "$newid" ]; then
        echo "Could not identify new instance"
        exit 1
    else
        newidstatus=$(aws ec2 describe-instances --filters "Name=instance-id,Values=$newid" --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\": \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
        echo "Waiting 30 seconds for things to settle"
        sleep 10
        echo "...20 seconds"
        sleep 10
        echo "...10 seconds"
        sleep 5
        echo "...5 seconds"
        aws ec2 stop-instances --instance-ids "$newid" --force > /dev/null
        until [ "$newidstatus" = "stopped" ]; do
            newidstatus=$(aws ec2 describe-instances --filters "Name=instance-id,Values=$newid" --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\": \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
            echo "Waiting 30 seconds for new instance to stop..."
            sleep 10
            echo "...20 seconds"
            sleep 10
            echo "...10 seconds"
            sleep 5
            echo "...5 seconds"
            sleep 2
            echo "...3"
            sleep 1
            echo "...2"
            sleep 1
            echo "...1"
            aws ec2 stop-instances --instance-ids "$newid" --force > /dev/null
        done
        aws ec2 create-tags --resources "$newid" --tags "Key=Name,Value=$sitename" > /dev/null
    fi

    # Remove volume from new instance
    echo "Removing temp volume from new instance"
    tempvolid=$(aws ec2 describe-volumes --filters "Name=attachment.instance-id,Values=$newid" | grep -Eo 'vol-[0-9a-z]+' | head -1)
    aws ec2 detach-volume --volume-id "$tempvolid" > /dev/null
    # Wait for the detach to finish; a fixed one-second sleep can race with it.
    aws ec2 wait volume-available --volume-ids "$tempvolid"
    aws ec2 delete-volume --volume-id "$tempvolid" > /dev/null
    echo "Attaching volume to new instance"
    aws ec2 attach-volume --device /dev/xvda --instance-id "$newid" --volume-id "$volidnew"
    echo "Starting new instance"
    aws ec2 start-instances --instance-ids "$newid"
    newidstatus=$(aws ec2 describe-instances --filters "Name=instance-id,Values=$newid" --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\": \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
    until [ "$newidstatus" = "running" ]; do
        newidstatus=$(aws ec2 describe-instances --filters "Name=instance-id,Values=$newid" --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\": \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
        echo "Waiting for new instance to be in running state"
        sleep 2
    done

    # Verify new instance passed checks, then move elastic IP to new instance
    until [ "$statuscheck" = "passed" ]; do
        statuscheck=$(aws ec2 describe-instance-status --instance-ids "$newid" --query 'InstanceStatuses[].InstanceStatus.Details[].Status' | grep -Eo '[A-Za-z]+')
        echo "Waiting for instance to pass status checks..."
        sleep 10
    done
    echo "Moving elastic IP to new instance"
    aws ec2 associate-address --instance-id "$newid" --public-ip "$ipaddress" --allow-reassociation > /dev/null

    # Stop old instance
    aws ec2 stop-instances --instance-ids "$id" --force > /dev/null

    # Clear variables just to make sure we don't break something.
    unset statuscheck tempvolid ipaddress type az security id volid snapid volvalues volidnew volstatus newvalues newid imageid keyname subnetid
fi
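The script polls AWS in several until loops that never give up, so a resource stuck in the wrong state would hang it forever. One possible way to bound those loops is sketched here. The names wait_for and fake_status are hypothetical, and fake_status only stands in for an aws-cli status query so that the sketch runs without AWS.

```shell
# Poll a command until it prints the wanted value, giving up after a
# number of attempts. Returns 0 on success, 1 on timeout.
# Arguments: wanted value, max attempts, delay in seconds, command...
wait_for() {
    local want=$1 tries=$2 delay=$3 got
    shift 3
    while [ "$tries" -gt 0 ]; do
        got=$("$@")
        if [ "$got" = "$want" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Stand-in for a status query: prints "pending" twice, then "available".
# A file holds the call count because $(...) runs in a subshell.
counter=$(mktemp)
echo 0 > "$counter"
fake_status() {
    local n
    n=$(( $(cat "$counter") + 1 ))
    echo "$n" > "$counter"
    if [ "$n" -ge 3 ]; then echo available; else echo pending; fi
}

if wait_for available 5 0 fake_status; then
    echo "Volume has been created"
else
    echo "Timed out waiting for volume"
fi
rm -f "$counter"
```

In the restore script the stand-in could be replaced by a real query, for example: wait_for available 20 3 aws ec2 describe-volumes --volume-ids "$volidnew" --query 'Volumes[0].State' --output text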
[1] https://serverfault.com/questions/578921/how-would-you-go-about-listing-instances-using-aws-cli-in-certain-vpc-with-the-t
[2] https://docs.aws.amazon.com/cli/latest/userguide/controlling-output.html#controlling-output-filter
[3] https://github.com/aws/aws-cli/issues/621
[4] https://docs.aws.amazon.com/cli/latest/reference/ec2/create-snapshot.html
[5] https://docs.aws.amazon.com/cli/latest/reference/ec2/delete-snapshot.html
[6] https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-snapshots.html
[7] https://docs.aws.amazon.com/cli/latest/reference/ec2/associate-address.html