From Michael's Information Zone

Listing EC2 Instances

I need to list the instances and parse that list so I know what I am working with. Though not needed for creating snapshots, this was helpful for learning how aws-cli works. I started with a Server Fault post[1] and broke down the steps it contained. The example provided was:

aws ec2 describe-instances --filters Name=vpc-id,Values=vpc-e2f17e8b --query 'Reservations[].Instances[].Tags[?Key==`Name`].Value[]'

But the query statement didn't make much sense to me, so I started replicating it in my lab. The key is to look at the output of the plain describe-instances command, abbreviated here:

{
    "Reservations": [
        {
            "Instances": [
                {
                    "Monitoring": {
                        "State": "disabled"
                    },
                    "PublicDnsName": "",
                    "StateReason": {
                        "Message": "Client.UserInitiatedShutdown: User initiated shutdown",
                        "Code": "Client.UserInitiatedShutdown"
                    },
                    "State": {
                        "Code": 80,
                        "Name": "stopped"
                    },
                    "InstanceId": "i-xxxxxxxxxxad6183c",

                    "Tags": [
                        {
                            "Value": "My-VM",
                            "Key": "Name"
                        }
                    ]
                }
            ]
        }
    ]
}

Since the information I need is nested, I will need to drill down.[2] Starting with Reservations, then Instances, I can then select the information I need.
NOTE: The key names used in the query are case sensitive.

[root@aws-cli ~]# aws ec2 describe-instances --query 'Reservations[].Instances[].{Instance_name:Tags[?Key==`Name`].Value,ID:InstanceId,State:State.Name,Volume:BlockDeviceMappings[].Ebs.VolumeId}'
        "Instance_name": [
        "Volume": [
        "State": "stopped", 
        "ID": "i-xxxxxxxxxxad6183c"

To break this down:

  • "Reservations[]" queries ALL reservations.
  • "Instances[]" queries ALL instances.
  • "{}" builds a hash (a set of key/value pairs) for each instance, since we want multiple values found inside of Instances.
  • "Instance_name" is an arbitrary name. You can put anything you want here without spaces. There might be a way to use spaces, but you shouldn't use them anyway.
  • ":Tags[?Key==`Name`].Value" is a filter expression.[3] It keeps only the tags whose Key equals Name and returns their Value, which is the human readable name I gave the instance.
  • "ID" is an arbitrary name.
  • ":InstanceId" pulls the instance ID.
  • "State" is an arbitrary name.
  • ":State.Name" pulls the human readable state of the instance. In this case "stopped".
  • "Volume" is an arbitrary name.
  • ":BlockDeviceMappings[].Ebs.VolumeId" grabs the volume IDs that we will need later.

To only list Instance IDs for processing

[root@aws-cli ~]# aws ec2 describe-instances --query 'Reservations[].Instances[].InstanceId' --output text | sed -e 's/\s\+/\n/g'
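
The sed expression above relies on \s and on \n in the replacement, which as far as I know are GNU extensions. A minimal sketch of a more portable splitter follows; the function name split_tokens is my own and not part of aws-cli.

```shell
# Split whitespace-separated tokens (tabs, spaces, newlines) into one per line.
# tr turns every whitespace character into a newline and squeezes repeats;
# sed then drops any empty line left over from leading whitespace.
split_tokens() {
    tr -s '[:space:]' '\n' | sed '/^$/d'
}
```

It can be used in place of the sed call, for example by piping the describe-instances text output into split_tokens.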

To grab the associated volume IDs

[root@aws-cli ~]# aws ec2 describe-instances --query 'Reservations[].Instances[].InstanceId' --output text | sed -e 's/\s\+/\n/g' | while read line; do aws ec2 describe-instances --instance-ids "$line" --query 'Reservations[].Instances[].BlockDeviceMappings[].Ebs.VolumeId' --output text| sed -e 's/\s\+/\n/g' ; done
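
The loop above prints volume IDs but loses track of which instance each one belongs to. Here is a small sketch that keeps the pairing, assuming instance IDs arrive on standard input one per line. The function name volumes_by_instance is hypothetical; the aws call is the same one used above.

```shell
# Print "instance-id volume-id" pairs, one pair per line.
# Reads instance IDs from stdin, one per line.
volumes_by_instance() {
    local id vol
    while read -r id; do
        # Skip blank input lines.
        [ -n "$id" ] || continue
        aws ec2 describe-instances --instance-ids "$id" \
            --query 'Reservations[].Instances[].BlockDeviceMappings[].Ebs.VolumeId' \
            --output text | tr -s '[:space:]' '\n' |
        while read -r vol; do
            if [ -n "$vol" ]; then
                printf '%s %s\n' "$id" "$vol"
            fi
        done
    done
}
```

Feeding it the one-per-line instance ID list from the previous command should give output that is easy to grep for a single instance.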

Launching Instances

Launch with startup bash script

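  • Create the shell script[4] and save it as web.txt, the file name passed to --user-data below. The first line must be the #! line so that the instance runs it as a script.
#!/bin/bash
dd if=/dev/zero of=/SWAP bs=1024 count=6291453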
chmod 0600 /SWAP
mkswap /SWAP
swapon /SWAP
echo "/SWAP swap swap defaults 0 0" >> /etc/fstab
amazon-linux-extras install -y lamp-mariadb10.2-php7.2
yum -y install mod_ssl yum-cron php-mbstring php-zip htop firewalld mariadb-server php
ln -sf /usr/share/zoneinfo/America/New_York /etc/localtime
sed -i 's/update_cmd\ =\ default/update_cmd\ =\ security/; s/apply_updates\ =\ no/apply_updates\ =\ yes/' /etc/yum/yum-cron.conf
systemctl start firewalld
systemctl enable firewalld
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source service name=ssh accept'
firewall-cmd --permanent --remove-service=ssh
firewall-cmd --reload
rm -f /etc/httpd/conf.d/welcome.conf
sed -i 's/expose_php\ =\ On/expose_php\ =\ off/; s/upload_max_filesize\ =\ 2M/upload_max_filesize\ =\ 128M/; s/post_max_size\ =\ 8M/post_max_size\ =\ 64M/; s/max_execution_time\ =\ 30/max_execution_time\ =\ 180/' /etc/php.ini
cat << EOF >> /etc/httpd/conf/httpd.conf
ServerTokens Prod
ServerSignature Off
LoadModule deflate_module modules/
EOF
systemctl enable httpd
systemctl enable mariadb
systemctl start httpd
systemctl start mariadb
mysql -e "UPDATE mysql.user SET Password=PASSWORD('password') WHERE User='root';"
mysql -e "DELETE FROM mysql.user WHERE User='root' AND Host NOT IN ('localhost', '', '::1');"
mysql -e "DELETE FROM mysql.user WHERE User='';"
mysql -e "DROP DATABASE test;"
mysql -e "FLUSH PRIVILEGES;"
usermod -a -G apache ec2-user
chown -R ec2-user:apache /var/www
chmod 2775 /var/www
find /var/www -type d -exec chmod 2775 {} \;
find /var/www -type f -exec chmod 0664 {} \;

  • Then run the following as you normally would.
[root@aws-cli]# aws ec2 run-instances --image-id ami-8c122be9 --count 1 --instance-type m5.large --key-name <your key name> --subnet-id <your subnet id> --security-group-ids <your security group> --instance-initiated-shutdown-behavior stop --user-data file://web.txt
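
Before launching, it can help to sanity check the user data file, since a script that does not start with #! is not executed at boot. The sketch below is my own helper (check_user_data is a hypothetical name). The 16384 byte limit is my understanding of the documented raw user data size limit, so treat it as an assumption.

```shell
# Return 0 if the file looks usable as an EC2 user data shell script.
# Assumptions: the script must begin with "#!" and the raw size limit
# is 16384 bytes (16 KB).
check_user_data() {
    local file="$1" first_line size
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    # Read only the first line; read may fail on a missing final newline,
    # but the variable is still filled in.
    IFS= read -r first_line < "$file" || true
    case "$first_line" in
        '#!'*) ;;
        *) echo "$file does not start with #!" >&2; return 1 ;;
    esac
    # The arithmetic expansion strips any padding that wc adds.
    size=$(($(wc -c < "$file")))
    if [ "$size" -gt 16384 ]; then
        echo "$file is $size bytes, over the assumed 16384 byte limit" >&2
        return 1
    fi
    return 0
}
```

For example, running check_user_data web.txt and only calling run-instances when it succeeds.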


Creating Snapshots

According to Amazon[5], you should stop the instance before taking a snapshot to ensure the state is clean. If you have time for that, go for it! Otherwise we will look at making snapshots of live root volumes.

The other thing we will be doing is creating snapshots based on tags. I don't want to snapshot all machines, just the critical ones that I get yelled at about when they are not working.

# $tag holds the tag value that marks the instances to snapshot.
now=$(date +%s)
aws ec2 describe-instances --filters "Name=tag-value,Values=$tag" --query 'Reservations[].Instances[].{ID:InstanceId}' --output text | while read -r id; do
    vol=$(aws ec2 describe-instances --instance-ids "$id" --query 'Reservations[].Instances[].{Volume:BlockDeviceMappings[].Ebs.VolumeId}' --output text | awk '{print $2}')
    name=$(aws ec2 describe-instances --instance-ids "$id" --query 'Reservations[].Instances[].{Instance_name:Tags[?Key==`Name`].Value}' --output text | awk '{print $2}')
    # Pull the snapshot ID out of the create-snapshot response.
    snapid=$(aws ec2 create-snapshot --description "$name $id $now" --volume-id "$vol" | grep -oE 'snap-[0-9a-z]+')
    aws ec2 create-tags --resources "$snapid" --tags "Key=function,Value=$tag" "Key=source,Value=$id" "Key=creation_date,Value=$now"
done
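
As an alternative sketch, the snapshot can be created and tagged in a single call, which avoids a window where the snapshot exists without its tags. This assumes an aws-cli version whose create-snapshot accepts --tag-specifications. The function name snapshot_with_tags and its argument order are my own.

```shell
# Create a snapshot of one volume and tag it in the same call.
# Arguments: volume ID, source instance ID, tag value, creation time (epoch seconds).
# Prints the new snapshot ID.
snapshot_with_tags() {
    local vol="$1" source_id="$2" tag="$3" now="$4"
    aws ec2 create-snapshot \
        --volume-id "$vol" \
        --description "$source_id $now" \
        --tag-specifications "ResourceType=snapshot,Tags=[{Key=function,Value=$tag},{Key=source,Value=$source_id},{Key=creation_date,Value=$now}]" \
        --query 'SnapshotId' --output text
}
```

Inside the loop above it would be called as snapshot_with_tags "$vol" "$id" "$tag" "$now".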

Deleting Snapshots

Since you have to pay for the storage of these snapshots, you probably want to delete old ones.[6] We want to search through the snapshots using tags, then delete the ones older than a set time. In my case I want to delete any older than two weeks, which is 1209600 seconds. All my time stamps are in seconds, which makes the math easy.

now=$(date +%s)
aws ec2 describe-snapshots --filters "Name=tag-value,Values=$tag" --query 'Snapshots[].{ID:SnapshotId}' --output text | while read -r snapid; do
    cdate=$(aws ec2 describe-snapshots --snapshot-ids "$snapid" --query 'Snapshots[].Tags[?Key==`creation_date`]' --output text | awk '{print $2}')
    diff=$((now - cdate))
    # Use -gt for a numeric comparison; inside [ ] a bare > is a redirection and always succeeds.
    if [ "$diff" -gt 1209600 ]; then
        aws ec2 delete-snapshot --snapshot-id "$snapid"
    fi
done
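
The age test is the part most worth getting right, because a wrong comparison deletes snapshots that should be kept. A minimal sketch of it as a standalone function follows; is_older_than is a name I made up, and all three arguments are in seconds.

```shell
# Succeed (exit 0) when the snapshot is strictly older than max_age.
# Arguments: current time, creation time, maximum age (all in seconds).
is_older_than() {
    local now="$1" created="$2" max_age="$3"
    # -gt compares integers. A bare > inside [ ] would be treated by the
    # shell as output redirection, so the test would always succeed.
    [ $((now - created)) -gt "$max_age" ]
}
```

For example, is_older_than "$now" "$cdate" 1209600 succeeds only for snapshots more than 14 days old.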

Restoring Snapshots

The process for restoring a snapshot for a live system is:

  • Create a volume from the snapshot.[7]
  • Attach the volume to a new instance, the same type as the one being replaced.
  • Boot the new image, and ensure it is responding correctly.
  • Re-assign the elastic IP to the new instance.[8]

This is easy enough using the web GUI, but that is prone to error and takes longer than it should. Since I take daily snapshots, the following script pulls the latest snapshot and restores from it. Alternatively, an option could be added to select which day to restore from.

Syntax as follows

[root@aws-cli ~]# yoursite.tld

The script being built is as follows.


#!/bin/bash
##Grab the site name from the first argument.
##Run query for the site data.
sitequery=$(aws ec2 describe-instances --filters "Name=tag-value,Values=$1" --query 'Reservations[].Instances[]' --output text)
if [ -z "$sitequery" ]; then
	echo "$1 not found. Check for typos"
	exit 1
fi
	echo "Gathering data"
	values=$(aws ec2 describe-instances --filters "Name=tag-value,Values=$1" "Name=tag-key,Values=Name" --query 'Reservations[].Instances[].{IP:PublicIpAddress,VPC:VpcId,ID:InstanceId,Type:InstanceType,AZ:Placement.AvailabilityZone,SEC:SecurityGroups[].GroupId,VOLID:BlockDeviceMappings[].Ebs.{ID:VolumeId},AMI:ImageId,KEY:KeyName,subnetid:SubnetId}')
	ipaddress=$(grep -oE "\"IP\":\ \"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\"" <<<$values | awk 'BEGIN {FS="\""}{print $4}')
	type=$(grep -Eo "\"Type\": \"[a-z][0-9].[a-z]+\"" <<<$values | awk 'BEGIN {FS="\""}{print $4}')
	az=$(grep -Eo "\"AZ\": \"us-[a-z0-9-]+\"" <<<$values| grep -Eo us-[a-z0-9-]+ | head -1)
	security=$(grep -Eo sg-[0-9a-z]+ <<<$values)
	id=$(grep -Eo "\"ID\":\ \"i-[0-9a-z]+\"" <<<$values | awk 'BEGIN {FS="\""}{print $4}')
	volid=$(grep -Eo "\"VOLID\":\ \[\ \{\ \"ID\":\ \"vol-[0-9a-z]+\"" <<<$values | awk 'BEGIN {FS="\""}{print $6}')
	imageid=$(grep -Eo 'ami-[0-9a-f]+' <<<$values)
	keyname=$(grep -Eo "\"KEY\":\ \"[0-9a-zA-Z-]+\"" <<<$values | awk 'BEGIN {FS="\""}{print $4}')
	subnetid=$(grep -Eo subnet-[a-z0-9]+  <<<$values)

#Search for last snapshot created for this site.
	echo "Finding latest snapshot"
	dayago=$(date +%Y-%m-%d --date=yesterday)
	volvalues=$(aws ec2 describe-snapshots --filter "Name=volume-id,Values=$volid" "Name=start-time,Values=$dayago*" --query 'Snapshots[].{State:State,CreationDate:StartTime,SnapID:SnapshotId}')
	snapid=$(grep -Eo "\"SnapID\":\ \"snap-[0-9a-z]+" <<<$volvalues | grep -Eo snap-[0-9a-z]+)

#Create volume from snapshot
	echo "Creating volume from snapshot..."
	echo "AZ: $az ID: $snapid"
	volvalues=$(aws ec2 create-volume --availability-zone $az --snapshot-id $snapid --volume-type gp2)
	volidnew=$(grep -Eo "\"VolumeId\":\ \"vol-[0-9a-z]+\"" <<<$volvalues | awk 'BEGIN {FS="\""}{print $4}')

#Check that volume is created
	volstatus=$(aws ec2 describe-volumes --filters Name=volume-id,Values=$volidnew --query 'Volumes[].State' | grep -Eo [a-z]+)
	if [ -z "$volstatus" ]; then
		echo "Can not determine volume creation status. Stopping."
		exit 1
	else
		until [ "$volstatus" = "available" ]; do
			volstatus=$(aws ec2 describe-volumes --filters Name=volume-id,Values=$volidnew --query 'Volumes[].State' | grep -Eo '[a-z]+')
			if [ -z "$volstatus" ]; then
				echo "Can not determine volume creation status. Stopping."
				exit 1
			fi
			echo "Still creating volume..."
		done
		echo "Volume has been created"
	fi
#Re-label / tag / name volumes and instances
	echo "Re-tagging assets"
	aws ec2 create-tags --resources "$id" "$volid" --tags Key=Name,Value=$1\_old > /dev/null
	aws ec2 create-tags --resources $volidnew --tags Key=Name,Value=$1 > /dev/null
#Create instance for the new volume to use
	echo "Creating new instance..."
	newvalues=$(aws ec2 run-instances --image-id $imageid --count 1 --instance-type $type --key-name $keyname --security-group-ids $security --subnet-id $subnetid)
	newid=$(grep -Eo "\"InstanceId\":\ \"i-[a-z0-9]+\"" <<<$newvalues | grep -Eo i-[0-9a-z]+)
	if [ -z "$newid" ]; then
		echo "Could not identify new instance"
		exit 1
	else
		newidstatus=$(aws ec2 describe-instances --filters Name=instance-id,Values=$newid --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\":\ \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
		echo "Waiting 30 seconds for things to settle"
		sleep 10
		echo "...20 seconds"
		sleep 10
		echo "...10 seconds"
		sleep 5
		echo "...5 seconds"
		aws ec2 stop-instances --instance-ids $newid --force > /dev/null
		until [ "$newidstatus" = "stopped" ]; do
			newidstatus=$(aws ec2 describe-instances --filters Name=instance-id,Values=$newid --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\":\ \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
			echo "Waiting 30 seconds for new instance to stop..."
			sleep 10
			echo "...20 seconds"
			sleep 10
			echo "...10 seconds"
			sleep 5
			echo "...5 seconds"
			sleep 2
			echo "...3"
			sleep 1
			echo "...2"
			sleep 1
			echo "...1"
			aws ec2 stop-instances --instance-ids $newid --force > /dev/null
		done
	fi
	aws ec2 create-tags --resources "$newid" --tags Key=Name,Value=$1 > /dev/null
#Remove volume from new instance
	echo "Removing temp volume from new instance"
	tempvolid=$(aws ec2 describe-volumes --filters Name=attachment.instance-id,Values=$newid | grep -Eo vol-[0-9a-z]+ | head -1)
	aws ec2 detach-volume --volume-id $tempvolid > /dev/null
	aws ec2 wait volume-available --volume-ids $tempvolid
	aws ec2 delete-volume --volume-id $tempvolid > /dev/null
	echo "Attaching volume to new instance"
	aws ec2 attach-volume --device=/dev/xvda --instance-id $newid --volume-id $volidnew
	echo "Starting new instance"
	aws ec2 start-instances --instance-ids $newid

	newidstatus=$(aws ec2 describe-instances --filters Name=instance-id,Values=$newid --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\":\ \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
	until [ "$newidstatus" = "running" ]; do
		newidstatus=$(aws ec2 describe-instances --filters Name=instance-id,Values=$newid --query 'Reservations[].Instances[].State.{State:Name}' | grep -Eo "\"State\":\ \"[a-z]+\"" | awk 'BEGIN {FS="\""}{print $4}')
		echo "Waiting for new instance to be in running state"
		sleep 2
	done
#Verify new instance passed checks, then move elastic IP to new instance
	until [ "$statuscheck" = "passed" ]; do
		statuscheck=$(aws ec2 describe-instance-status --instance-ids $newid --query 'InstanceStatuses[].InstanceStatus.Details[].Status' | grep -Eo '[A-Za-z]+')
		echo "Waiting for instance to pass status checks..."
		sleep 10
	done
	echo "Moving elastic IP to new instance"
	aws ec2 associate-address --instance-id $newid --public-ip $ipaddress --allow-reassociation > /dev/null

#Stop old instance
aws ec2 stop-instances --instance-ids $id --force > /dev/null

#Clear variables just to make sure we don't break something.

unset statuscheck
unset tempvolid
unset ipaddress
unset type
unset az
unset security
unset id
unset volid
unset snapid
unset volvalues
unset volidnew
unset volstatus
unset newvalues
unset newid
unset imageid
unset keyname
unset subnetid
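
The script above repeats the same "query the state, sleep, try again" loop several times. A rough sketch of a reusable helper follows. It adds a retry limit, which the original loops do not have, and it stops on empty output the same way the volume check does. The name poll_until and its argument order are my own invention.

```shell
# poll_until EXPECTED INTERVAL MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND until its output equals EXPECTED.
# Returns 0 on a match, 1 on empty output or after MAX_TRIES attempts.
poll_until() {
    local expected="$1" interval="$2" max_tries="$3"
    shift 3
    local tries=0 current=""
    while [ "$tries" -lt "$max_tries" ]; do
        current=$("$@")
        if [ -z "$current" ]; then
            echo "Can not determine status. Stopping." >&2
            return 1
        fi
        if [ "$current" = "$expected" ]; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    echo "Timed out waiting for '$expected' (last seen: '$current')" >&2
    return 1
}
```

With a small wrapper such as get_state() { aws ec2 describe-instances --instance-ids "$1" --query 'Reservations[].Instances[].State.Name' --output text; }, the running check becomes poll_until running 2 150 get_state "$newid". The AWS CLI also ships waiters such as aws ec2 wait instance-running and aws ec2 wait instance-stopped, which may remove the need for a hand-written loop altogether.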