From Fedora Project Wiki

Revision as of 06:32, 12 October 2011 by Gholms

Amazon Web Services (AWS) comprise a public cloud, a collection of computing services that allows one to build and run software services in Amazon's data centers. Fedora publishes system images for AWS's virtual machine platform, Amazon Elastic Compute Cloud (EC2), which allows one to create virtual machines in the cloud with very little effort. The objective of this primer is to familiarize the reader with EC2's terminology and functionality. For more detailed documentation, see the AWS website.

EC2 Concepts

What follows are some short explanations of EC2 terminology. For more detailed information, see the EC2 documentation.

Images and Instances

A machine image is a snapshot of a system (specifically its / filesystem) that provides the basis for a virtual machine in EC2. When you run a new virtual machine in EC2 you choose a machine image to use as a template. The new virtual machine is then an instance of that machine image that contains its own copy of everything in the image. The instance keeps running until you stop or terminate it, or until it fails. If an instance fails, you can launch a new one from the same image. You can create multiple instances of a single machine image. Each instance will be independent of the others.

You can use a single image or multiple images, depending on your needs. From a single image, you can launch different types of instances. An instance type defines what hardware the instance has, including the amount of memory, disk space, and CPU power.

Amazon, Fedora, other groups, and individuals publish images for public use. You might only need to use images that reputable sources provide, and you can simply customize the resulting instances to suit your needs as you launch them. You can also create your own machine images, but that is beyond the scope of this document.

Machine images in EC2 are sometimes referred to as AMIs.

Machine images have identifiers that begin with ami, such as ami-6ebe4507. Instances have identifiers that begin with the letter i, such as i-12459dbd.

Regions and Availability Zones

Amazon hosts datacenters in many parts of the world. Those in a particular part of the world make up a region. Regions' names are based on their locations, such as us-east-1.

Regions are broken up into availability zones, which are designed to isolate failures from one another but still provide faster communication than communication between regions. Distributing a web application amongst several availability zones can help improve its reliability if an availability zone encounters problems. Availability zones' names are based on the regions in which they reside, such as us-east-1a.
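The naming relationship between availability zones and regions can be sketched in shell. This is only an illustration: zone_to_region is a hypothetical helper, not part of euca2ools, and it assumes zone names always consist of a region name followed by a single lowercase letter, as in the examples above.

```shell
# Derive a region name from an availability zone name by dropping the
# trailing zone letter. zone_to_region is a hypothetical helper and
# assumes zone names look like <region><letter>.
zone_to_region() {
    case "$1" in
        *[0-9][a-z]) printf '%s\n' "${1%[a-z]}" ;;
        *) echo "not an availability zone name: $1" >&2; return 1 ;;
    esac
}

zone_to_region us-east-1a   # prints us-east-1
```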

Storage

EC2 instances use one or more of three types of storage provided by AWS:

Amazon S3

Amazon Simple Storage Service (S3) is a web service-based storage system that is accessible inside EC2 and elsewhere on the Internet. As this primer will not focus on S3, see the Amazon S3 documentation for more details.

Elastic Block Store (EBS)

Amazon Elastic Block Store (EBS) provides instances with persistent, disk-like storage that you can attach to and detach from instances, similar to portable disk drives. By creating EBS volumes and attaching them to instances you can store data that you wish to be portable to more than one instance in the event an instance fails or is replaced. Since instances' root filesystems tend to have limited space, volumes also provide a simple way of adding disk capacity to instances.

Volumes have identifiers that begin with vol, such as vol-ffe93704.

You can create a backup snapshot of a volume. From the snapshot you can then create a new volume and attach it to another instance. You can create multiple volumes from the same snapshot. Each volume will be independent of the others.

Snapshots have identifiers that begin with snap, such as snap-773491a0.
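Because each kind of resource described so far has its own identifier prefix, a script can tell them apart by name alone. The following is a minimal sketch; resource_type is a hypothetical helper, and it only knows the four prefixes mentioned in this primer.

```shell
# Classify an EC2 identifier by its prefix. resource_type is a
# hypothetical helper covering only the prefixes this primer mentions.
resource_type() {
    case "$1" in
        ami-*)  echo image ;;
        i-*)    echo instance ;;
        vol-*)  echo volume ;;
        snap-*) echo snapshot ;;
        *)      echo unknown; return 1 ;;
    esac
}

# Print each example identifier from the text alongside its type.
for id in ami-6ebe4507 i-12459dbd vol-ffe93704 snap-773491a0; do
    printf '%s\t%s\n' "$id" "$(resource_type "$id")"
done
```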

Instance Storage

Some instance types have instance storage, scratch space that persists only as long as an instance runs. Instance storage is destroyed when an instance stops, terminates, or fails. For this reason, it is also referred to as ephemeral storage.

When EC2 was first introduced, all machine images were backed by instance storage, meaning that their instances' root filesystems were stored in instance storage. Machine images can now also be backed by EBS, meaning that their instances' root filesystems instead reside on EBS volumes.

Security Groups

A security group defines firewall rules for your instances. These rules specify which incoming network traffic should be delivered to an instance (e.g., accept web traffic on port 80 or SSH traffic on port 22). All other traffic is ignored. You can modify the rules for a group at any time.

Every instance runs inside of a security group. You can create your own security groups, or you can use the default security group that EC2 provides for you. When you run a new instance it will run in the default security group unless you choose a different one.

Getting Started with Fedora on EC2

Get Account Details

To use AWS you need to create an online account. You can do this by going to the AWS web site, clicking on Create an AWS Account, and following the instructions.

Amazon AWS is not free
AWS is designed as a pay-as-you-go online service. Much of EC2 is free for new users; the rest is available for per-hour or per-month fees that are detailed on the EC2 Website. As such, Amazon requests a credit card number to keep on file with your new account. If you are participating in a Fedora Test Day, the Fedora Cloud SIG may be able to provide you with a sponsored account or financial reimbursement.

You can interact with EC2 either through a web-based management console or through euca2ools, a suite of command line tools designed for services like EC2. This tutorial focuses on using EC2 with euca2ools at the command line.

To use the command line tools you first need to obtain access keys for your account. You can find them by going to the AWS management console on the web, clicking your name at the top, followed by Security Credentials, and scrolling down to the section titled Access Credentials. Make note of the Access Key ID and the Secret Access Key that appears beside it. Both of them should be long strings of alphanumeric characters. Create a file called .iamrc in your home directory that contains those keys in this format:

AWSAccessKeyId=your_access_key_id
AWSSecretKey=your_secret_key

Since euca2ools is designed to work with all AWS-compatible clouds, not just AWS itself, it needs to know which cloud to contact. Create a file called .eucarc in your home directory with the following content to point it toward AWS:

export AWS_CREDENTIAL_FILE=~/.iamrc
export EC2_URL=https://ec2.amazonaws.com/
export S3_URL=https://s3.amazonaws.com/
export EUARE_URL=https://iam.amazonaws.com/

source "$AWS_CREDENTIAL_FILE"
export EC2_ACCESS_KEY=$AWSAccessKeyId
export EC2_SECRET_KEY=$AWSSecretKey
export AWS_ACCESS_KEY=$AWSAccessKeyId
export AWS_SECRET_ACCESS_KEY=$AWSSecretKey

Finally, add these settings to your shell's environment by running:

source ~/.eucarc
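After sourcing .eucarc it can be useful to confirm that the settings actually reached your shell. The sketch below checks a few of the variables exported above without printing their values. check_euca_env is a hypothetical helper, and the choice of which variables to check is an assumption.

```shell
# Report which of the variables exported by .eucarc are unset or empty.
# check_euca_env is a hypothetical helper; it never prints the values,
# so secret keys do not end up on the screen.
check_euca_env() {
    missing=0
    for name in EC2_URL S3_URL EC2_ACCESS_KEY EC2_SECRET_KEY; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_euca_env; then
    echo "environment looks complete"
fi
```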

Do Initial Setup

Install the Command Line Tools

Install the euca2ools package. To do so with yum, run:

yum install euca2ools

Choose a Region

Choose an EC2 region to use. Things to consider when choosing a region include its geographic location, the pricing for instances in that region, and whether the image you wish to use is available in that region. You can get a list of regions by running euca-describe-regions, which results in a list such as this:

REGION  eu-west-1       ec2.eu-west-1.amazonaws.com
REGION  us-east-1       ec2.us-east-1.amazonaws.com
REGION  ap-northeast-1  ec2.ap-northeast-1.amazonaws.com
REGION  us-west-1       ec2.us-west-1.amazonaws.com
REGION  ap-southeast-1  ec2.ap-southeast-1.amazonaws.com

Once you have chosen an EC2 region, you can make euca2ools use it by editing the line that contains EC2_URL in your .eucarc file:

export EC2_URL=https://ec2.us-east-1.amazonaws.com/

Then reload the settings into your shell's environment:

source ~/.eucarc
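The EC2_URL value for a region can be derived from the endpoint column of the region listing. The following sketch does that lookup. region_url is a hypothetical helper, and it assumes the three-column layout shown in the sample listing above.

```shell
# Look up a region in euca-describe-regions-style output read from
# standard input and print the matching EC2_URL value.
# region_url is a hypothetical helper, not part of euca2ools.
region_url() {
    awk -v region="$1" '
        $1 == "REGION" && $2 == region { printf "https://%s/\n", $3; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Sample input copied from the region listing in this section.
regions='REGION  eu-west-1       ec2.eu-west-1.amazonaws.com
REGION  us-east-1       ec2.us-east-1.amazonaws.com'

printf '%s\n' "$regions" | region_url us-east-1   # prints https://ec2.us-east-1.amazonaws.com/
```

With live data the same function could be fed by a pipe, as in euca-describe-regions | region_url us-east-1.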

Create a Key Pair

The primary way of logging into Fedora instances is via SSH. Since Fedora instances have no passwords, you need an SSH key pair to log in to them. The private half of this key pair is stored on your computer, while the public half is stored in EC2 so instances can download it as they start. This allows you to securely log into your instances without a password.

You can have multiple key pairs. Each key pair has its own name. Key pairs are specific to each EC2 region.

Choose a name for a new key pair and then use the euca-add-keypair command to create it and write the private key to a file. Be sure to choose a name that is easy to remember.

euca-add-keypair mykey > mykey.pem

Key pairs are irreplaceable
EC2 does not store the private halves of key pairs. The time you run euca-add-keypair is the only chance you will have to save a copy of the private key. There is no way to recover a lost private key from EC2.
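Since the private key file is the only copy, it is worth restricting its permissions right after saving it. OpenSSH normally refuses to use a private key that other users can read. The sketch below uses a hypothetical helper named secure_private_key and the file name from the example above.

```shell
# Restrict a private key file so that only its owner can read or write it.
# secure_private_key is a hypothetical helper.
secure_private_key() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    chmod 600 "$1"
}

# Apply it to the key saved earlier, if that file exists in this directory.
if [ -f mykey.pem ]; then
    secure_private_key mykey.pem
fi
```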

You can use euca-describe-keypairs to display a list of your key pairs.

$ euca-describe-keypairs
KEYPAIR mykey1  7b:9b:33:cf:bf:12:4d:62:b6:7c:fa:02:f2:f7:bc:59:e3:7e:40:fb
KEYPAIR mykey2  f9:93:1e:73:4b:2e:c1:0d:7f:79:e1:bc:c0:d0:7c:95:32:55:b7:dd

You can use euca-delete-keypair to delete a key pair. Deleting a key pair does not remove it from instances that are already running; it merely prevents new instances from using it.

euca-delete-keypair mykey1

Set up a Security Group

Each security group has its own set of firewall rules. While this tutorial uses the default security group that EC2 provides for you, you can also create your own security groups.

The euca-authorize command lets you tell EC2 to allow traffic from ranges of IP addresses and ports into a security group. To allow access to SSH (TCP port 22) running on instances in the default security group, run the following command:

euca-authorize default -p 22 -s your_ip_address/32

If you do not specify a range of IP addresses then the port(s) you choose will be open to the entire Internet. For example, the following command allows SSH access from any machine on the Internet, not just your computer:

euca-authorize default -p 22

To allow pings and other ICMP traffic you can run:

euca-authorize default -P icmp

The opposite of euca-authorize is euca-revoke. You can use euca-describe-groups to obtain a list of security groups and the firewall permissions you have applied to them.
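The /32 suffix in the SSH rule above limits the range to a single address. As a rough sketch, a script could build that value from a plain IPv4 address as follows. host_cidr is a hypothetical helper, and 203.0.113.5 is a placeholder address from a range reserved for documentation.

```shell
# Turn a single IPv4 address into the /32 range passed to the -s option.
# host_cidr is a hypothetical helper. It checks only the overall shape of
# the address (four groups of one to three digits), not that each group
# is at most 255.
host_cidr() {
    if printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
        printf '%s/32\n' "$1"
    else
        echo "not an IPv4 address: $1" >&2
        return 1
    fi
}

host_cidr 203.0.113.5   # prints 203.0.113.5/32
```

The result could then be passed along, for example as euca-authorize default -p 22 -s "$(host_cidr 203.0.113.5)".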

Run an Instance

You are now ready to run an instance.

Choose an Image

The Cloud SIG maintains an index of machine images published by Fedora. While all of the images for a given release behave the same, they differ by architecture, EC2 region, and where the root filesystem is stored (that is, instance store or EBS). Choose the image that is most appropriate for you and note its ID, which begins with ami.

Choose an Instance Type

Amazon offers several instance types, which it summarizes on the EC2 web site. As of the time of writing, the smallest and cheapest instance types are m1.small and t1.micro, though each of those carries a restriction: m1.small instances must use the i386 architecture, and t1.micro instances have no instance storage and therefore must boot from EBS. If the image you choose meets neither of these restrictions then you need to use a larger and more expensive instance type.

Run an Instance

Run a new instance of the image and instance type you chose with euca-run-instances. To be able to log into the new instance, you must also specify the name of the key pair you created earlier. For example, to run a t1.micro instance of the image ami-7f5a063a with a key pair named mykey, run the following command:

$ euca-run-instances ami-7f5a063a -t t1.micro -k mykey
RESERVATION  r-4d5ea00a  0123456789ab  default
INSTANCE     i-910fbbd6  ami-7f5a063a  pending  0  mykey  t1.micro  2011-10-11T00:00:00.000Z us-east-1c  aki-9ba0f1de

The output of euca-run-instances contains the ID of the instance you just started. In the example above, the instance's ID is i-910fbbd6. You will need this ID to use tools that need to refer to the instance.
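When scripting, the instance ID can be pulled out of that output rather than copied by hand. The following is a minimal sketch that assumes the line layout shown in the example above. instance_ids is a hypothetical helper.

```shell
# Print the instance IDs found in euca-run-instances-style output read
# from standard input. instance_ids is a hypothetical helper.
instance_ids() {
    awk '$1 == "INSTANCE" { print $2 }'
}

# Sample input copied from the example output in this section.
sample='RESERVATION  r-4d5ea00a  0123456789ab  default
INSTANCE     i-910fbbd6  ami-7f5a063a  pending  0  mykey  t1.micro  2011-10-11T00:00:00.000Z us-east-1c  aki-9ba0f1de'

printf '%s\n' "$sample" | instance_ids   # prints i-910fbbd6
```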

Instances start in the pending state. When they finish booting they change to the running state. When you terminate them they change to the shutting-down and finally terminated states.

Log into the Instance

As the instance starts it obtains an IP address from EC2 and changes to the running state. You can check on your instances by running euca-describe-instances, optionally with the ID of the instance in question. When the instance is ready (or nearly ready) to use, euca-describe-instances will display the address you can use to log into it:

$ euca-describe-instances
RESERVATION  r-4d5ea00a  0123456789ab  default
INSTANCE     i-910fbbd6  ami-7f5a063a  ec2-204-236-168-22.us-east-1.compute.amazonaws.com  ip-10-170-15-23.us-east-1.compute.internal  running  0  mykey  t1.micro  2011-10-11T00:00:00.000Z us-east-1c  aki-9ba0f1de

The public address of the instance in this example is ec2-204-236-168-22.us-east-1.compute.amazonaws.com. Other useful bits of information from this command include the availability zone in which the instance is running (us-east-1c) and the time that the instance started.
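The state and public address can also be extracted from this output by a script. Because a pending instance has no address yet, the sketch below searches the fields by pattern instead of relying on fixed column positions. Both function names are hypothetical, and the list of states is limited to the four named in this primer.

```shell
# Print the state of the first instance in euca-describe-instances-style
# output read from standard input. Only the four states named in this
# primer are recognized.
instance_state() {
    awk '$1 == "INSTANCE" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^(pending|running|shutting-down|terminated)$/) { print $i; exit }
    }'
}

# Print the public address of the first instance, if it has one yet.
instance_public_dns() {
    awk '$1 == "INSTANCE" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /\.amazonaws\.com$/) { print $i; exit }
    }'
}

# Sample input copied from the example output in this section.
sample='INSTANCE     i-910fbbd6  ami-7f5a063a  ec2-204-236-168-22.us-east-1.compute.amazonaws.com  ip-10-170-15-23.us-east-1.compute.internal  running  0  mykey  t1.micro  2011-10-11T00:00:00.000Z us-east-1c  aki-9ba0f1de'

printf '%s\n' "$sample" | instance_state        # prints running
printf '%s\n' "$sample" | instance_public_dns
```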

Once the instance is running you can log into it with ssh. On Fedora's images you should log in as the user ec2-user:

$ ssh -i mykey.pem ec2-user@ec2-204-236-168-22.us-east-1.compute.amazonaws.com
[ec2-user@i-910fbbd6 ~]$ cat /etc/fedora-release
Fedora release 16 (Verne)

You can now use the instance as you would use any other computer running Fedora.

Terminate the Instance

When you finish using the instance you should terminate it with euca-terminate-instances to free up resources and reduce costs:

euca-terminate-instances i-910fbbd6

Using EBS

Managing Volumes

EBS volumes act like removable disks that you can attach to instances, except that you can create and destroy them at will. Each volume is specific to an availability zone. What follows are descriptions of how to use volumes.

Creating Volumes

You can create a volume of nearly any size, in 1 GiB increments. As of the time of writing, the maximum size of a volume is 1 TiB. To create a new, empty volume, choose a size (in GiB) and the availability zone in which to create it and supply those values to euca-create-volume:

$ euca-create-volume -s 10 -z us-east-1c
VOLUME  vol-23ca3542  10  creating  2011-10-11T00:00:00.000Z

The command's output contains the ID of the newly-created volume. In the example above, the volume's ID is vol-23ca3542. You will need this ID to use tools that need to refer to the volume.

Describing Volumes

euca-describe-volumes lists all volumes available to you in the entire region, along with where they are attached:

$ euca-describe-volumes
VOLUME  vol-23ca3542  10  us-east-1c  available  2011-10-11T00:00:00.000Z

Using Volumes

For an instance to make use of a volume you must first attach the volume to the instance. You also need to supply a device name that the volume should appear as from inside the instance. The device name you choose must be /dev/sdX, where X is a letter. It will appear inside the instance as either /dev/sdX or /dev/xvdX.

euca-attach-volume -i i-910fbbd6 -d /dev/sdf vol-23ca3542

Volumes are zone-specific
Each volume exists only within one availability zone. A volume in a given zone can therefore only be attached to instances that are running in the same zone.
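A script can compare zones before attempting to attach a volume. The sketch below extracts the zone from one line of euca-describe-instances or euca-describe-volumes output by looking for a field shaped like a zone name. zone_of and same_zone are hypothetical helpers, and the pattern assumes zone names look like us-east-1c.

```shell
# Print the first whitespace-separated field that looks like an
# availability zone name (for example us-east-1c).
zone_of() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[a-z]+-[a-z]+-[0-9]+[a-z]$/) { print $i; exit }
    }'
}

# Succeed only if both zone names are non-empty and identical.
same_zone() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Sample lines copied from the example output in this primer.
instance_line='INSTANCE     i-910fbbd6  ami-7f5a063a  ec2-204-236-168-22.us-east-1.compute.amazonaws.com  ip-10-170-15-23.us-east-1.compute.internal  running  0  mykey  t1.micro  2011-10-11T00:00:00.000Z us-east-1c  aki-9ba0f1de'
volume_line='VOLUME  vol-23ca3542  10  us-east-1c  available  2011-10-11T00:00:00.000Z'

if same_zone "$(printf '%s\n' "$instance_line" | zone_of)" \
             "$(printf '%s\n' "$volume_line" | zone_of)"; then
    echo "same zone: the volume can be attached"
else
    echo "different zones: the volume cannot be attached"
fi
```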

Once you have attached a volume to the instance it will appear as a disk in the instance's /dev directory, ready to be formatted and used.

[ec2-user@i-910fbbd6 ~]$ sudo mkfs.ext4 /dev/xvdf
[ec2-user@i-910fbbd6 ~]$ sudo mount /dev/xvdf /mnt
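Because a volume attached as /dev/sdX may appear inside the instance under either name, a script running on the instance can check both. This is a sketch under that assumption. device_candidates and find_attached_device are hypothetical helpers.

```shell
# Print the names under which a volume attached as /dev/sdX may appear
# inside the instance.
device_candidates() {
    case "$1" in
        /dev/sd[a-z]) ;;
        *) echo "expected /dev/sdX, got: $1" >&2; return 1 ;;
    esac
    letter=${1#/dev/sd}
    printf '/dev/sd%s\n/dev/xvd%s\n' "$letter" "$letter"
}

# Print whichever candidate currently exists as a block device.
find_attached_device() {
    for dev in $(device_candidates "$1"); do
        if [ -b "$dev" ]; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}

device_candidates /dev/sdf
```

Inside the instance, find_attached_device /dev/sdf would print the name to pass to mkfs.ext4 and mount.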

When you finish using a volume you can unmount it from within the instance and then detach it:

[ec2-user@i-910fbbd6 ~]$ sudo umount /dev/xvdf
[ec2-user@i-910fbbd6 ~]$ logout
$ euca-detach-volume vol-23ca3542

Deleting Volumes

When you finish using a volume you can delete it to free up resources and reduce costs:

euca-delete-volume vol-23ca3542

Using Snapshots

Volume snapshots provide an easy way to save a backup copy of an entire volume. Unlike a volume, a snapshot is available to all availability zones within a region, which makes snapshots the simplest way to copy a volume between availability zones.

Creating a Snapshot

You can create a snapshot by passing the ID of the volume you wish to snapshot to euca-create-snapshot:

$ euca-create-snapshot vol-23ca3542
SNAPSHOT  snap-00acc96e  vol-23ca3542  pending  2011-10-11T00:00:00.000Z

The command's output contains the ID of the newly-created snapshot. In the example above, the snapshot's ID is snap-00acc96e. You will need this ID to use tools that need to refer to the snapshot.

Volumes should not change while creating snapshots
Snapshots take time to complete. While a snapshot is in progress, ensure that the contents of the volume do not change to avoid data corruption. You can monitor a snapshot's progress with euca-describe-snapshots.

Describing Snapshots

euca-describe-snapshots will provide a list of all snapshots available to you in the region:

$ euca-describe-snapshots
SNAPSHOT   snap-00acc96e   vol-042d3a6a  completed  2011-10-12T05:56:29.000Z  100%

Dealing with too much output
By default, euca-describe-snapshots will list all snapshots that you can access, including those that you do not own. To narrow down the command's output you can supply a list of snapshots to the command or use any of its numerous methods of filtering output.
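To check on one snapshot without reading the whole listing, a script can filter the output by snapshot ID and report just the status and progress. The sketch below assumes the column layout of the sample line above. snapshot_progress is a hypothetical helper.

```shell
# Print the status and progress of one snapshot from
# euca-describe-snapshots-style output read from standard input.
# snapshot_progress is a hypothetical helper; it exits non-zero if the
# snapshot ID is not present in the input.
snapshot_progress() {
    awk -v id="$1" '
        $1 == "SNAPSHOT" && $2 == id {
            progress = "unknown"
            for (i = 5; i <= NF; i++)
                if ($i ~ /^[0-9]+%$/) progress = $i
            print $4, progress
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Sample input copied from the example output in this section.
sample='SNAPSHOT   snap-00acc96e   vol-042d3a6a  completed  2011-10-12T05:56:29.000Z  100%'

printf '%s\n' "$sample" | snapshot_progress snap-00acc96e   # prints completed 100%
```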

Creating Volumes from Snapshots

To copy the contents of a snapshot to a new volume, run euca-create-volume and specify a snapshot instead of a size:

euca-create-volume --snapshot snap-00acc96e -z us-east-1c

You can create multiple volumes from the same snapshot. Each volume will be independent of the others.

Deleting Snapshots

To delete a snapshot, use euca-delete-snapshot. Any volumes created from that snapshot will be unaffected.

euca-delete-snapshot snap-00acc96e