aws s3, cloudberry

Amazon S3

I have two AWS accounts and I made the mistake of mixing services between them. More specifically, I hosted an application on one account but used S3 on the other, so I perpetually had to switch back and forth between accounts to access all the services I used. At first I thought fixing it would be a non-issue, but it proved to be a rather daunting task.

Bucket naming in S3

In S3, all buckets must have unique names. You cannot use a name if it’s already taken (much like domain names). Since I was already using the bucket, creating a bucket with the same name in the other account and copying its contents was not an option. My second idea was to create the target bucket with a temporary name, copy the contents, delete the first one and rename the target bucket. Well, guess what? You cannot rename a bucket either! Another problem is that when you delete a bucket you cannot create a new one with the same name right away. I’m guessing this is because of the redundancy S3 provides: it takes time to propagate the operation to all the nodes. My tests showed that I could re-create the bucket in the other account only after 45 to 50 minutes.
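Waiting for a deleted bucket's name to become available again can be automated instead of checked by hand. Here is a minimal sketch, assuming a hypothetical `try_create` callable that wraps whatever S3 client you use and returns True once the bucket has been created. The function name, polling interval and timeout are illustrative choices, not part of any AWS API.

```python
import time
from typing import Callable


def wait_for_bucket_name(
    try_create: Callable[[], bool],
    interval_seconds: float = 60.0,
    timeout_seconds: float = 2 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call try_create until it succeeds and return the number of attempts.

    try_create is a hypothetical wrapper around your S3 client: it should
    return True when the bucket was created and False while the name is
    still unavailable. Raises TimeoutError once the time budget is used up.
    """
    attempts = 0
    waited = 0.0
    while True:
        attempts += 1
        if try_create():
            return attempts
        # Give up if another full interval would exceed the time budget.
        if waited + interval_seconds > timeout_seconds:
            raise TimeoutError(
                f"bucket name still unavailable after {attempts} attempts"
            )
        sleep(interval_seconds)
        waited += interval_seconds
```

The `sleep` parameter is injectable only so the loop can be exercised without actually waiting.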

To develop or not to develop

My initial instinct was to develop a tool to handle this operation, but I decided to check out what’s already available first. I was occasionally using Cloudberry but wanted to look at its competitors, hoping one of the tools would support the functionality I needed.

Cloudberry Explorer for Amazon S3

I find this tool quite handy. It has lots of functions and a nice, intuitive interface. It comes in two flavours: Free and Pro. I have used the free version so far, and unless you are a big enterprise it seems sufficient. It allows you to manage multiple AWS accounts and to copy objects between accounts, but not to move a bucket (after my findings above I wasn’t very hopeful anyway).

AmazonS3 CloudBerry Main

As you can see in the menu bar, it supports lots of features.

S3 Browser

This one comes in a free version as well as a paid one. The free version is limited to 2 accounts and you can only see one account at a time.

S3 Browser

I tried to copy a file from one account and paste it into the other, but I got an Access Denied error. I could do the same thing with Cloudberry in seconds by simply dragging and dropping to the target folder.

Bucket Explorer

The third candidate only has a 30-day trial version as opposed to a free one. The second I installed it I knew it was a loser for me because it doesn’t support multiple accounts. Also, as you can see below, the UI is hideous, so this is not a tool for me.

Bucket explorer

..and the winner is

Cloudberry won by a landslide! It looks far superior to both of the other tools combined.

Operation Bucket Migration

So I backed up everything locally and deleted the source bucket so that I could create one with the same name in the new account. After periodically checking for 45 minutes I finally created the bucket and uploaded the files. I set the permissions and the operation was completed without any casualties.. Well, at least I thought that was the case..

Nobody is perfect!

After I uploaded the images I reloaded my blog. The first image re-appeared and I was ready for the celebrations, which were abruptly interrupted by the missing images in the second post. The images were nowhere to be found locally in either of the two backups I took. I think Cloudberry has a bug when handling filenames with hyphens. I’m still not certain that is the case, but that’s the only characteristic that differs from the other files. Anyway, the moral of the story is: triple-check everything before initiating a destructive process and don’t trust external tools blindly.
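One way to do that triple-checking is to compare the local backup against a list of what should be in it before deleting anything. Below is a minimal sketch, assuming you recorded a manifest of object keys and their MD5 digests beforehand. How that manifest is built is up to you, and the names `md5_of` and `missing_from_backup` are made up for this example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def missing_from_backup(expected: Dict[str, str], backup_dir: Path) -> List[str]:
    """Return the keys that are absent from backup_dir or whose MD5 differs.

    expected maps an object key (e.g. "images/my-photo.jpg") to the MD5 hex
    digest recorded before the destructive step. This manifest is an
    assumption of the sketch, not something the tools above produce.
    """
    problems = []
    for key, digest in sorted(expected.items()):
        local = backup_dir / key
        if not local.is_file() or md5_of(local) != digest:
            problems.append(key)
    return problems
```

Only when the returned list is empty would it be reasonably safe to delete the source bucket.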


awsdevops auto_scaling

Amazon Web Services (AWS) Auto-Scaling

Auto-scaling has always been a feature of Amazon Web Services (AWS). Until today, it could be done in 2 ways:

  • Using the command line tools (see the Resources section for the link)
  • Using Elastic Beanstalk to deploy your application

Yesterday (10/12/2013) they announced that they had added Auto-Scaling support to the AWS console. I was planning to set up auto-scaling for my blog anyway, so I cannot think of a better time to try this.

Auto-scaling using AWS Management Console

Step 01: Launch Configuration

First we tell AWS what we want to launch. This step is a lot like creating a new EC2 instance. You start by selecting an AMI, so before I started I created an AMI of my current blog and selected that one for the launch configuration. Then we select the instance properties. In this wizard we have the option of using spot instances. Since spot instances can be terminated whenever the spot price rises above your bid, they are not a good fit for Internet-facing applications, so I’ll skip that part.

Step 02: Auto Scaling Group

At the end of the Launch Configuration wizard we can choose to create an auto-scaling group with that launch configuration and jump right into Step 2. First we specify the name and the initial instance count for the group. We also need to choose at least 1 availability zone. I always select all of them; I’m not sure if there is any trade-off in narrowing down your selection.

An important point to pay attention to here is to expand the Advanced Details section, because it contains the load balancer selection. For web applications, auto-scaling makes sense when the instances are behind a load balancer. Otherwise new instances could not be reached anyway. Once you create the auto-scaling group you cannot associate it with an ELB, so make sure you select your load balancer at this step.

Create Auto Scaling Group

Next comes another important step: specifying scaling policies. Basically, this means telling AWS which action to take when it needs to scale up or down, and when to take it. “When” is defined by CloudWatch alarms. For scaling up, I added an alarm for average CPU utilization over 80% for 5 minutes, and for scaling down, one for CPU utilization under 20% for 5 minutes. When the high CPU alarm goes off it will take the action we select, which in my case is adding 1 more instance. Scaling down is just the opposite: remove 1 instance from the existing machine farm.
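To make the policy logic concrete, here is a small simulation of the decision described above. It is only a sketch of the idea, not how CloudWatch or Auto Scaling is implemented: it assumes one CPU sample per minute, treats "for 5 minutes" as the average over the last five samples, and uses a hypothetical maximum group size of 2.

```python
from typing import List


def desired_capacity(
    cpu_samples: List[float],
    current: int,
    minimum: int = 1,
    maximum: int = 2,   # hypothetical upper bound for the group
    high: float = 80.0,  # scale-up threshold (percent CPU)
    low: float = 20.0,   # scale-down threshold (percent CPU)
    periods: int = 5,    # number of one-minute samples to average
) -> int:
    """Return the instance count after evaluating both alarms once."""
    window = cpu_samples[-periods:]
    if len(window) < periods:
        return current  # not enough data to evaluate either alarm
    average = sum(window) / len(window)
    if average > high:
        return min(current + 1, maximum)  # high CPU alarm: add 1 instance
    if average < low:
        return max(current - 1, minimum)  # low CPU alarm: remove 1 instance
    return current
```

The clamping to `minimum` and `maximum` mirrors the group size limits: the low CPU alarm can fire on a single idle instance, but the group never drops below its minimum.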

Create Auto Scaling Group

In the next step we define the notifications we want to receive when an AS event is triggered. I would definitely like to know everything that happens to my machines, so I requested an email for all events.

Create Auto Scaling Group

That’s all it takes to create an AS group using the wizard.

Testing the scaling

The easiest way to test the auto-scaling group is to terminate the instance it just launched. As you can see below, once I killed the instance it immediately launched another one to match the minimum number requirement of the AS group. So the auto-scaling group is working, but how can I be sure that it will launch a new instance when I need it most? Time to make it sweat a little! But first we have to set up an environment to create load on the system:

Installing Siege

The easiest and simplest load testing tool I know is a Linux-based one called Siege. To prepare my simple load testing environment I quickly downloaded siege:

wget http://www.joedog.org/pub/siege/siege-latest.tar.gz

tar -xzvf siege-latest.tar.gz

It requires a C compiler which doesn’t come out-of-the-box with an Amazon Linux AMI. So first we need to install that:

yum install gcc*

Then configure it by running the following in the extracted directory:

./configure

At the end of the configuration it instructs us to run the following commands:

Siege configuration

So after running make and make install, Siege is ready to go. We can check the configuration with:

/usr/local/bin/siege -C

It should display the current version and other details about the tool.

Siege Configuration

Ready to go

Now, we have a micro instance running Siege and a small instance launched by auto-scaling.

AWS Instances

Auto-scaling is supposed to launch another instance and add it to the load balancer if the CPU usage is too high on the existing one. Let’s see if it’s really working.

Under Siege!

I first created a URL file from my sitemap so that the load can be more realistic. I fired up 20 threads and it started to bombard my site:
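Building the URL file from a sitemap is easy to script. The following is a minimal sketch, assuming a standard sitemaps.org `urlset` document. The function names and the `urls.txt` file name in the note below are made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

# Namespace used by standard sitemap files (sitemaps.org protocol 0.9).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(sitemap_xml: str) -> List[str]:
    """Return every <loc> URL found in the sitemap, in document order."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    ]


def write_url_file(urls: Iterable[str], path: str) -> None:
    """Write one URL per line, the layout a URL list file normally uses."""
    with open(path, "w", encoding="utf-8") as f:
        for url in urls:
            f.write(url + "\n")
```

The resulting file can then be handed to Siege with its `-f` option, with `-c` setting the number of concurrent users, for example `siege -c 20 -f urls.txt`.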

Siege in Action

When I tried to load my site it was incredibly slow. The CPU usage kept rising on the single instance until the CloudWatch alarm went off, which triggered auto-scaling to launch a new instance.

AWS Instances

Now, I had 2 instances to share the load but that could only happen if the new instance was added to the Elastic Load Balancer (ELB) automatically. After a few minutes it passed the health checks and went in service.

Auto-scaling using AWS Management Console - Elastic Load Balancer Overview

At this point I had 2 instances and when I tried to load posts from my blog I noticed it was quite fast again. The CPU usage graph below tells how it all went down:

Auto-scaling using AWS Management Console - CPU utilization

My first instance (orange) was running silently and peacefully until it was attacked by Siege. After a few minutes of hard times the cavalry came to the rescue (blue instance) and started getting its fair share of the load. Then the ELB distributed the load as evenly as possible, making the system run smoothly again. OK, so the system can withstand a spike and scale itself, but it costs money. What’s going to happen after the storm? I stopped Siege and sure enough, as we’d expect, after a few minutes the low CPU alarm went off and set the instance count back to 1 by terminating one of the instances.

AWS Instances

Also, I was notified at every step of this process, so I could keep track of my instances at all times.

Auto-scaling using AWS Management Console - Notifications

Architecture of the system

So at this point the architecture of the system looks like this:

Auto-scaling using AWS Management Console - System Architecture

I’m planning to cover some basics (EC2, RDS, S3) in more detail in a later post. Also I’ll try to add more AWS services and enhance this architecture as I go along.

Final Words

  • If you are planning to use auto-scaling in a production environment, make sure to back up all your stuff externally. Also create snapshots of all the volumes.
  • Even though network traffic is cheap, it still costs money. So for extended tests I suggest you keep an eye on your billing statement.
  • On the Amazon Linux AMI, Apache and MySQL don’t start automatically, so you may need to update your configuration like I did. I used the script I found here.

Resources

devops

DevOps (Development + Operations) has recently become one of the most popular terms in the IT world. From what I’ve read and listened to so far, my understanding is that it is all about continuous deployment (or delivery). Basically, you have to automate everything from development to deployment to practice DevOps.

Current problem

Traditionally, successful deployment is a huge challenge. It is mostly a manual and cumbersome process, and because of its sensitive nature system admins are not huge fans of deployments. Another challenge is the miscommunication (or no communication in some cases) between system admin and development teams. They are generally run by different high-level executives and their priorities conflict most of the time.

Solution

On the philosophical side, DevOps is about bringing these teams together to work in harmony. Having social events that both teams attend is key to building confidence among team members. As Richard Campbell (from the RunAsRadio and .NET Rocks podcasts) says, “Pizza and beer is a global lubricant”.

Dev…

On the development side, the key requirement is continuous integration. You have to be able to run unit tests and acceptance tests automatically on build servers. This means development has to be done in short sprints, in an agile way, with frequent check-ins. One step beyond this stage is continuous deployment.

…Ops

This is where the IT team comes into play. When the whole system is automated, deploying to production frequently and without much headache becomes possible. Cloud computing is one of the core technologies that make DevOps possible. The ability to manage virtual machines programmatically (e.g. with AWS or OpenStack) leads to a whole bunch of possibilities.

This is a fairly complex topic encompassing many disciplines and technologies. Also it’s quite dynamic and open to innovation. Definitely worth keeping an eye on.
