Published: Tue 18 July 2017
By Jonathan Eifrig
In AWS.
... or, 'Why can't I give my bucket a Route 53 DNS name?'
TL;DR: Because S3 buckets don't get an IP number just because their
'Use this bucket to host a website' toggle is checked. I knew that...
The Setup
I spent a good chunk of yesterday morning trying to figure out why I was
unable to give a DNS entry in Route 53 to the S3 bucket that was hosting this
blog. Resolving this problem gave me a much better understanding of how S3
implements HTTP access to S3 buckets and how DNS, HTTP, and TCP interact in
non-obvious ways.
Now, I knew I wanted to host this blog as a static site in S3, as this would
be a super-cheap hosting solution and it would give me a chance to experiment
with adding more AWS functionality later. So, after a little fooling
around with Pelican I had a skeleton website ready to go, loaded into an S3
bucket.
Just to review, an Amazon S3 bucket is a simple object store where you can
put arbitrary data as opaque 'objects', and associate each object with a
key value. There's an API for accessing S3 objects programmatically, but
there is also an option to make the contents of an S3 bucket available via
HTTP; this is what makes hosting a website in an S3 bucket so attractive.
When you create an S3 bucket you can enable this functionality by toggling
the 'Use this bucket to host a website' button in the AWS console. If you do this,
the console will reward you with a little note like
Endpoint : http://scratch-bucket.s3-website-us-east-1.amazonaws.com
This is great! All I would need to do would be to enable this feature and my
website would be up and running. Sure
enough, pasting http://scratch-bucket.s3-website-us-east-1.amazonaws.com
into my browser's address bar brought up my scratch website. Success!
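As a rough sketch of how that endpoint name is put together: it looks like the bucket name followed by a region-specific suffix. The helper below (website_endpoint is a made-up name, not part of any AWS library) only reproduces the dash-style form shown by the console above; I believe some other regions use a slightly different separator, so treat the pattern as an assumption outside us-east-1.

```python
def website_endpoint(bucket: str, region: str = "us-east-1") -> str:
    """Build a website endpoint URL in the dash-style form shown by the console.

    Assumption: the 's3-website-<region>' pattern from the example above.
    Other regions may format the suffix differently.
    """
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"


if __name__ == "__main__":
    print(website_endpoint("scratch-bucket"))
```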
All that was left to do was give this endpoint a more convenient URL and I'd be
ready to go. How hard could that be?....
Pain And Woe...
Now, I had read the Route 53 docs. I had watched the informative videos. I knew
that I could use Route 53 (Amazon's DNS service) to map a DNS name to my
bucket. I could just make an entry for blog.foobar.com pointing to my
bucket and I'd be off to the races. Simple!
The way you're supposed to do this using Route 53 is by creating an 'alias'
record for blog.foobar.com. However, trying to create such an alias in the
Route 53 console resulted in the pulldown menu being populated with an unhelpful
collection of S3 buckets:
— S3 website endpoints —
No Targets Available
OK, well, perhaps there was something wrong with Route 53 or my bucket; I'd
fix that later. We don't need this helpful pulldown menu: we know the name
of the S3 bucket we want to use and we know they're globally unique, so we'll
just jam it into the edit box ourselves! But putting 'scratch-bucket' into
the 'Alias Target' box just gave another error when trying to save the
DNS resource record:
The record set could not be saved because:
- Alias Target contains an invalid value.
Well, OK, trying to make an alias wasn't working. But I'm smart: all I want
is for clients trying to access 'blog.foobar.com' to be routed to
'scratch-bucket.s3-website-us-east-1.amazonaws.com', and I know how to do
that: I just needed a DNS CNAME record for 'blog.foobar.com' that tells DNS
clients that they should redirect to the S3 address instead, right?
But this didn't work, either: it didn't even resolve correctly! What was
going on?
The Light Finally Dawns...
Hmmm... Alright, I knew that the scratch bucket was accessible: maybe I could
just make a DNS A record that associated 'blog.foobar.com' with the IP number of
the S3 bucket? First I had to fire up nslookup(1) to see what that IP number was:
> nslookup scratch-bucket.s3-website-us-east-1.amazonaws.com.
Server: 192.168.1.1
Address: 192.168.1.1#53
Non-authoritative answer:
scratch-bucket.com.s3-website-us-east-1.amazonaws.com canonical name = s3-website-us-east-1.amazonaws.com.
Name: s3-website-us-east-1.amazonaws.com
Address: 52.216.65.234
Oh. Of course.
Just because S3 buckets that have their 'Bucket Hosting' toggle turned on have
DNS entries doesn't mean they have their own IP numbers. Indeed, nslookup
shows us that all such S3 buckets in an Amazon region are handled by a pool of
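IP numbers. The HTTP handler that winds up handling an incoming request relies on the
HTTP 'Host:' header to determine which S3 bucket to look in. That would explain why
the CNAME got me nowhere: the browser still asks for 'blog.foobar.com', and there was
no bucket by that name.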
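To make that concrete, here is a toy model of name-based routing, not Amazon's actual implementation. It assumes the handler reads the hostname from the HTTP 'Host' header, strips the regional website suffix if present, and otherwise treats the whole hostname as the bucket name. BUCKETS, bucket_for_host and handle_request are all names invented for this sketch.

```python
from typing import Dict, Tuple

# Regional suffix taken from the endpoint shown earlier in this post.
WEBSITE_SUFFIX = ".s3-website-us-east-1.amazonaws.com"

# Hypothetical in-memory stand-in for every website bucket in one region.
BUCKETS: Dict[str, Dict[str, str]] = {
    "scratch-bucket": {"index.html": "<h1>scratch</h1>"},
}


def bucket_for_host(host: str) -> str:
    """Map the Host header value to a bucket name (assumed behaviour)."""
    host = host.lower().rstrip(".")
    if host.endswith(WEBSITE_SUFFIX):
        # Endpoint-style name: the bucket is whatever precedes the suffix.
        return host[: -len(WEBSITE_SUFFIX)]
    # Custom domain: the whole hostname has to be the bucket name.
    return host


def handle_request(host: str, key: str = "index.html") -> Tuple[int, str]:
    """Return (status, body) the way a shared front end might."""
    bucket = BUCKETS.get(bucket_for_host(host))
    if bucket is None:
        return 404, "NoSuchBucket"
    if key not in bucket:
        return 404, "NoSuchKey"
    return 200, bucket[key]


if __name__ == "__main__":
    print(handle_request("scratch-bucket.s3-website-us-east-1.amazonaws.com"))
    print(handle_request("blog.foobar.com"))
```

In this model every request lands on the same shared handler no matter which name the client looked up; only the hostname it sends decides which bucket answers. A request for 'blog.foobar.com' fails until a bucket with exactly that name exists.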
TL;DR : If you want to host a website in AWS using S3, you need to give your
S3 bucket the same name as your DNS hostname.
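For reference, once the bucket is named after the hostname, the alias record is (as far as I understand the Route 53 change-batch format) an A record whose AliasTarget points at the regional website endpoint rather than at the bucket. The sketch below only builds that structure as a Python dict; alias_change_batch is a made-up helper, and the hosted zone ID is a placeholder that you would have to replace with the value AWS publishes for your region's S3 website endpoint.

```python
import json


def alias_change_batch(hostname: str, endpoint_zone_id: str, endpoint_dns_name: str) -> dict:
    """Build a change batch for an alias A record (shape assumed from the Route 53 API)."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    # Must match the bucket name exactly, e.g. a bucket called blog.foobar.com.
                    "Name": hostname,
                    "Type": "A",
                    "AliasTarget": {
                        # Zone ID of the S3 website endpoint, not of your own zone.
                        "HostedZoneId": endpoint_zone_id,
                        "DNSName": endpoint_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    batch = alias_change_batch(
        "blog.foobar.com.",
        "ZONE-ID-PLACEHOLDER",  # placeholder, look up the real value for your region
        "s3-website-us-east-1.amazonaws.com.",
    )
    print(json.dumps(batch, indent=2))
```

Note that the alias target is the shared regional endpoint, not 'scratch-bucket': the bucket is selected purely by the record's own name.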
This is actually a terrible feature of AWS. There's really nothing to stop a
bad actor from essentially cybersquatting on S3 bucket names. I shouldn't be
able to create an S3 bucket named 'sortinghat.google.com' and thereby
prevent Google from hosting a website at that address using S3.