devops git, powershell

Having lots of projects and assets stored on GitHub, I thought it might be a good idea to create periodic backups of my entire GitHub account (all repositories, both public and private). The beauty of it is that since Git is open source, I can migrate my account anywhere and even host it on my own server on AWS.

Challenges

With the above goal in mind, I started to outline what’s necessary to achieve this task:

  1. Automate calling the GitHub API to get all repos, including private ones. (Of course, one should be aware of the GitHub API rate limit, which is currently 5,000 requests per hour. If you use up all your allowance with your scripts, you may not be able to use the API yourself. The good thing is that they return how many calls are left before you exceed your quota in the x-ratelimit-remaining HTTP header of their responses.)
  2. Pull all the latest versions for all branches. Overwrite local versions in cases of conflict.
  3. Find a way to easily transfer a Git repository (a compressed, single-file version rather than individual files) if transferring to another medium is required (such as an S3 bucket).

With these challenges ahead, I first started looking into getting the repos from GitHub:

Consuming GitHub API via PowerShell

First, I shopped around for existing libraries for this task, such as PowerShellForGitHub by Microsoft, but it didn’t work for me. I couldn’t even get the samples on their Wiki to run; it kept giving a “cmdlet not found” error, so I gave up.

I found a nice video on Channel 9 about consuming REST APIs via PowerShell, which uses the GitHub API as a case study. It was perfect for me as my goal was to use the GitHub API anyway, and since this is a generic approach to consuming APIs it can come in handy in the future as well. It’s quite easy using basic authentication.

Authorization

The first step is to create a Personal Access Token with the repo scope. (Make sure to copy the value before you close the page; there is no way to retrieve it afterwards.)

After the access token has been obtained, I had to generate the authorization header as shown in the Channel 9 video:

# Build a Basic authentication header from the account name and the personal access token
$token = '<YOUR GITHUB ACCOUNT NAME>:<PERSONAL ACCESS TOKEN>'
$base64Token = [System.Convert]::ToBase64String([char[]]$token)
$headers = @{
    Authorization = 'Basic {0}' -f $base64Token
}

# Ask the GitHub API for the authenticated user's repositories (public and private)
$response = Invoke-RestMethod -Headers $headers -Uri 'https://api.github.com/user/repos'

This way I was able to get the repositories, including the private ones, but by default it returns 30 records per page, so I had to traverse the pages.

Handling pagination

GitHub sends the next and the last page URLs in the Link header:

<https://api.github.com/user/repos?page=2>; rel="next", <https://api.github.com/user/repos?page=3>; rel="last"

The challenge here is that the Invoke-RestMethod response doesn’t seem to allow access to the headers, which is a huge bummer as there is useful info in them, as shown in the screenshot:

GitHub response headers in Postman
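For what it’s worth, Invoke-WebRequest does return the raw response, headers included, so the Link header can at least be inspected that way; a quick sketch (the exact header-name casing may vary):

# Invoke-WebRequest keeps the raw response, so the pagination info in the Link
# header can be read directly
$raw = Invoke-WebRequest -UseBasicParsing -Headers $headers -Uri 'https://api.github.com/user/repos'
$raw.Headers['Link']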

At this point, I wanted to use PSGitHub, which is mentioned in the video, but as of this writing it doesn’t support getting all repositories. In fact, a note in the project says “We need to figure out how to handle data pagination”, which made me think we are on the same page here (no pun intended!)

GitHub supports a page size parameter (e.g. per_page=50), but the documentation says the maximum value is 100. Although it is tempting to use that, as it would bring all my repos and leave some room for future ones as well, I wanted to go with a more permanent solution. So I decided to keep requesting pages as long as there are objects returned, like this:

$page = 1

Do
{
    # Request one page of repositories at a time
    $response = Invoke-RestMethod -Headers $headers -Uri "https://api.github.com/user/repos?page=$page"
    
    foreach ($obj in $response)
    {
        Write-Host ($obj.id)
    }
    
    $page = $page + 1
}
# Keep requesting pages until GitHub returns an empty one
While ($response.Count -gt 0)

Now, in the foreach loop, I of course have to do something with the repo information instead of just printing the id.
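For example, instead of printing the id, the loop body can collect each repo’s name and clone URL into a list for the cloning step below (a sketch; clone_url is one of the fields the API returns for every repository):

# Collected across all pages; used later to clone or update each repository
$allRepos = @()

# ...and inside the Do block, instead of the Write-Host loop:
$allRepos += $response | Select-Object name, clone_url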

Cloning / pulling repositories

At this point I was able to get all my repositories. The GitHub API only handles account information, so now I needed to be able to run actual Git commands to get my code.

First, I installed PowerShell on my Mac, which is quite simple, as specified in the documentation:

brew tap caskroom/cask
brew cask install powershell

With Git already installed on my machine, all that was left was running Git commands in the PowerShell terminal to clone or update each repo, such as:

git fetch --all
git reset --hard origin/master

Since this is just going to be a backup copy, I don’t want to deal with merge conflicts, so I just overwrite everything local.

Another approach could be deleting the old repo and cloning it from scratch, but I think it would be a bit wasteful to do that every time for each and every repository.
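Here is a sketch of that per-repository logic, using the repo list gathered from the API and a hypothetical backup folder (Git credentials for the private repos are assumed to be configured already, e.g. via a credential helper):

$backupRoot = "C:\GitHubBackup"   # hypothetical backup location

foreach ($repo in $allRepos)
{
    $repoPath = Join-Path $backupRoot $repo.name

    if (Test-Path $repoPath)
    {
        # Existing copy: overwrite local state with whatever is on the remote
        git -C $repoPath fetch --all
        git -C $repoPath reset --hard origin/master
    }
    else
    {
        # First backup run for this repo: clone it
        git clone $repo.clone_url $repoPath
    }
}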

Putting it all together

Now that I have all the bits and pieces, I have to glue them together in a meaningful script that can be scheduled, and here it is:

Conclusion and Future Improvements

This version accomplishes the basic task of backing up an entire GitHub account, but it can be improved in a few ways. Maybe I can post a follow-up article covering those improvements. A few ideas that come to mind are:

  • Get Gists (private and public) as well.
  • Add an option to exclude repos by name or by type (i.e. get only private ones, or get all except repo123)
  • Add an option to export them to a “non-git” medium such as an S3 bucket using git bundle, which turns out to be a great tool for packing everything in a repository into a single file (see the sketch after this list)
  • Create a Docker image that contains all the necessary software (Git, PowerShell, the backup script, etc.) so that it can be distributed without any setup requirements.
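As a teaser for the git bundle idea, this is roughly what that step could look like (the paths are made up):

# Pack every branch and tag of the repository into a single self-contained file,
# which can then be copied to any medium (an S3 bucket, an external disk, etc.)
git -C C:\GitHubBackup\repo123 bundle create C:\GitHubBackup\repo123.bundle --all

# Restoring it later is just a clone from the bundle file:
# git clone C:\GitHubBackup\repo123.bundle repo123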


awsdevops s3, mysql, powershell

I have an application that uses a MySQL database. Because of cost concerns it’s running on an EC2 instance instead of RDS. As it’s not a managed environment, the burden of backing up my data falls on me. This is a small step-by-step guide that details how I’m backing up my MySQL database to AWS S3 with PowerShell.

PART 01 - AWS SETUP

  1. Create a bucket (e.g. “application-backups”) on AWS S3 using the AWS Management Console.
  2. Create a new IAM user (e.g. “upload-backup-to-s3”).
  3. Create a new policy using the Management Console. The policy will only give enough permissions to put objects into a single S3 bucket:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::xxxxxxx"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": [
                "arn:aws:s3:::xxxxxxx/*"
            ]
        }
    ]
}

In the S3 bucket properties, copy the ARN and replace the x’s above with it.

  4. Customize the S3 bucket lifecycle settings to determine how long you want old backups retained in your bucket. In my case I set them to expire after 21 days, which I think is a large enough window for relevant database backups. I probably wouldn’t restore anything older than 21 days anyway.

  5. [Optional - For email notifications] Create an SES user by clicking SES -> SMTP Settings -> Create My SMTP Credentials

Make sure you don’t make the mistake I made, which was creating an IAM user with a policy that can send emails. On the SMTP Settings page there’s a note right below the button:

Your SMTP user name and password are not the same as your AWS access key ID and secret access key. Do not attempt to use your AWS credentials to authenticate yourself against the SMTP endpoint.

When you click the button, it creates an IAM user too, but the secret key it generates is 44 bytes, whereas the IAM user I had created had a secret key of 40 characters. Anyway, the bottom line is that in order to be able to send emails via SES, create the user as described above and all should be fine.
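With those SMTP credentials in hand, sending the notification email from PowerShell can be as simple as the sketch below; the region endpoint, addresses and credential values are placeholders, and the From address must be verified in SES:

# SES SMTP endpoint for the region the credentials were created in (placeholder)
$smtpServer = "email-smtp.eu-west-1.amazonaws.com"

# The user name/password generated by the "Create My SMTP Credentials" button
$smtpPassword = ConvertTo-SecureString "<SES SMTP PASSWORD>" -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential ("<SES SMTP USER NAME>", $smtpPassword)

Send-MailMessage -SmtpServer $smtpServer -Port 587 -UseSsl -Credential $credential `
    -From "backup@example.com" -To "me@example.com" `
    -Subject "MySQL backup uploaded to S3" `
    -Body "The nightly database backup completed and was uploaded successfully."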

PART 02 - POWERSHELL SCRIPT

  1. Download and install AWS Tools for Windows PowerShell (https://aws.amazon.com/powershell/)

  2. Create a script as shown below. In a nutshell what the script does is:

a. Execute mysqldump command (Comes with MySQL Server)

b. Zip the backup file (which reduces the size significantly)

c. Upload the zip file to S3 bucket

d. Send a notification email using SES

e. Delete the local files

This is the full script:

  • For SQL Server databases, there’s a PowerShell cmdlet called Backup-SqlDatabase, but for MySQL I think the most straightforward way is using mysqldump, which comes with MySQL Server.

  • For password-protected zip files, you can take a look at this article (I haven’t tried it myself)
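For reference, here is a minimal sketch of steps (a) to (e); the paths, database name, MySQL credentials and bucket name are placeholders, and the AWS keys of the upload user are assumed to be already configured for AWS Tools for PowerShell:

# (a) Dump the database with mysqldump (placeholder path, credentials and database name)
$timestamp = Get-Date -Format "yyyyMMdd-HHmmss"
$dumpFile  = "C:\Backups\mydb-$timestamp.sql"
$zipFile   = "C:\Backups\mydb-$timestamp.zip"

& "C:\Program Files\MySQL\MySQL Server 5.7\bin\mysqldump.exe" --user=backupuser --password=secret --result-file=$dumpFile mydb

# (b) Zip the dump; plain-text SQL dumps compress very well
Compress-Archive -Path $dumpFile -DestinationPath $zipFile

# (c) Upload the zip to the S3 bucket created in Part 01
Write-S3Object -BucketName "application-backups" -File $zipFile -Key "mydb-$timestamp.zip"

# (d) Send the notification email via SES SMTP (e.g. with Send-MailMessage as shown earlier)

# (e) Clean up the local files
Remove-Item $dumpFile, $zipFile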

Final step: Schedule the script by using Windows Task Scheduler

This is quite straightforward. Just create a task and schedule it for how often you want to back up your database.

In the Actions section, enter “powershell” as “Program/script” and the path of your PowerShell script as the argument, and that’s it.
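If you prefer to script this step as well, the same task can be registered from PowerShell; a sketch with a hypothetical script path and schedule:

# Run the backup script every day at 02:00 (hypothetical path and schedule)
$action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-File C:\Scripts\Backup-MySqlToS3.ps1"
$trigger = New-ScheduledTaskTrigger -Daily -At 2am

Register-ScheduledTask -TaskName "MySQL backup to S3" -Action $action -Trigger $trigger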


hobby synology, music, streaming

I’ve been a Spotify customer for quite a long time, but I recently realized that I wasn’t using it enough to justify 10 quid per month. Amazon made a great offer of a 4-month subscription for only £0.99 and I’m trying that out now, but the quality of the service hasn’t impressed me so far. Then it dawned on me: I already have lots of MP3s from my old archives, I have a fast internet connection and I have a Synology. Why not just build my own streaming service?

One device to rule them all: Synology

Every day I’m growing more fond of my Synology and regretting all the time I didn’t utilize it fully.

For streaming audio, we need server and client software. The server side comes with Synology: Audio Station.

The Server

Using Synology Audio Station is a breeze. You simply connect to the Synology over the network and copy your albums into the music folder. Try to have cover art named “cover.jpg” so that your albums show up nicely in the user interface.

The Client

Synology has a suite of iOS applications which are available in the Apple App Store. The one I’m using for audio streaming is called DS Audio.

Using Synology’s Control Panel, you can create a specific user for listening to music only. This way, even if that account is compromised, the attacker will only have read-only access to your music library.

Connecting to the server

There are two ways of connecting to your server:

  1. Dynamic DNS
  2. Quick Connect (QC)

Dynamic DNS is built-in functionality, but you’d need a Synology account. Basically, your Synology pings their server so that it can detect IP changes.

QC is the way I chose to go. It’s a proprietary technology by Synology. The nice thing about QC is that when you are connected to your local network it uses the internal IP, so it doesn’t use mobile data. When you’re outside, it uses the external IP and connects over the Internet.

Features

  • You can download all the music you want from your own library without any limitations. There’s no limit set for manual downloads. For automatic downloads you can choose from no caching to caching everything or choose a fixed size from 250MB to 20GB.
  • When you’re offline you don’t need to log in. On the login form there’s a link to Downloaded Songs, so you can skip logging in and go straight to your local cache.
  • You can pin your favourite albums to home screen.
  • Creating a playlist or adding songs to playlists is cumbersome (on iPhone at least):
    • Select a song and tap on … next to the song.
    • Tap Add. This will add the song to the play queue.
    • Tap the Play button in the top right corner.
    • Tap the playlist icon in the top right corner.
    • Tap the same icon again, which is now in the top left corner, to go into edit mode.
    • Now tap the radio buttons to the left of the songs to select them.
    • When done, tap the icon in the bottom left corner. This will open the Add to Playlist screen (finally!)
    • Here you can choose an existing playlist or create a new one by tapping the + icon.

Considering how easily this can be done in the Spotify client, this really needs to be improved.

  • In the Library or Downloaded Songs sections, you can organise your music by Album, Artist, Composer, Genre and Folder. Of course, in order for the Artist/Composer/Genre classification to work, you have to have your music properly tagged.
  • The client has a Radio feature with built-in support for SHOUTCast.

SHOUTCast

  • You can rate songs. There’s a built-in Top Rated playlist. By rating them, you can play your favourite songs without needing to add them to playlists, which is a neat feature.

Conclusion

I think having full control over my own music is great, and even though the DS Audio client has some drawbacks it’s worth it as it’s completely free. You can also set it up as a secondary streaming service in addition to your favourite paid one, just in case, so that you have a backup solution.
