Amazon Route 53 • CloudFront • DNS Troubleshooting • Beginner Incident Response
Project Type: Real-world AWS troubleshooting incident
Services Used: Amazon Route 53, Amazon CloudFront, ACM
Objective: Restore website reachability after missed AWS domain verification and investigate DNS failure
Earlier today (26th March), one of my previously working domains suddenly stopped resolving. This turned out to be due to my own oversight, as I had missed an AWS verification email when I first set the site up. I later received a notification that the domain had been suspended and needed email verification to be restored. After completing the verification, I expected the site to come back online shortly, but it remained unreachable. Not ideal timing, especially as I’m currently job hunting and relying on the site as part of my CV.
At first, I thought this might just be a delay caused by DNS propagation. After testing the domain more carefully, it became clear that this was not a normal delay. The domain was returning NXDOMAIN, which means the DNS response said the name does not exist at all, rather than the lookup being slow or returning an outdated answer.
The goal of this troubleshooting exercise was to identify whether the outage was temporary or whether there was still a configuration issue inside AWS.
The website uses a Route 53 hosted zone for DNS and points the root domain to a CloudFront distribution. ACM validation CNAME records were also present for certificate-related checks. I plan on creating a separate post in the future outlining how I built this site and the services I used.
User Browser
↓
DNS Lookup for jackdanielpainter.com
↓
Amazon Route 53 Hosted Zone
↓
Apex A Record / Alias
↓
Amazon CloudFront Distribution
↓
Website Content
The main problem was that the website could not be reached in a browser, even after domain verification had been completed.
A DNS lookup from the command line returned:
nslookup jackdanielpainter.com
*** can't find jackdanielpainter.com: Non-existent domain
This was the key clue. A result like this does not normally point to a web server problem. It points to a DNS-level problem, which means the domain itself is not resolving correctly.
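To make the distinction between a DNS-level failure and a web server failure concrete, here is a minimal Python sketch. It is not something I ran during the incident; the function name `classify_reachability` and its return labels are my own illustrative choices. It tries name resolution first and only then attempts a TCP connection, so the two failure types are reported separately.

```python
import socket


def classify_reachability(hostname, port=443,
                          resolve=socket.getaddrinfo,
                          connect=socket.create_connection):
    """Return 'dns_failure', 'connection_failure', or 'reachable'.

    The resolve/connect parameters exist only so the function can be
    exercised without network access (hypothetical design choice).
    """
    # Step 1: name resolution. NXDOMAIN surfaces here as socket.gaierror.
    try:
        resolve(hostname, port)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: the name resolved, so any failure now is at the server/network level.
    try:
        conn = connect((hostname, port), timeout=5)
    except OSError:
        return "connection_failure"

    conn.close()
    return "reachable"
```

Calling `classify_reachability("jackdanielpainter.com")` during an outage like this one would be expected to fall into the DNS branch, because resolution fails before any connection to CloudFront is attempted.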
I checked the Route 53 registered domain details first. These showed that the domain itself was active and using AWS nameservers.
This suggested that the issue was no longer caused by the domain being suspended. It also suggested that the problem was somewhere between the Route 53 DNS records and the CloudFront target.
I then checked the hosted zone and found these records:
jackdanielpainter.com A → d1e90n9pxjm7ij.cloudfront.net
jackdanielpainter.com NS → ns-649.awsdns-17.net
ns-172.awsdns-21.com
ns-1352.awsdns-41.org
ns-1618.awsdns-10.co.uk
jackdanielpainter.com SOA
ACM validation CNAME records
At first glance, this looked correct. However, the website was still down.
The next useful test was to query the domain’s nameservers directly:
nslookup -type=ns jackdanielpainter.com
This confirmed that the domain was delegated to the expected AWS nameservers. That was an important finding because it ruled out one of the most common Route 53 problems: mismatched nameservers.
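The nameserver comparison can also be done mechanically. The sketch below is illustrative only: `normalize_ns` and `delegation_mismatch` are hypothetical helper names, and the two input lists are assumed to have been copied by hand from the registered domain page and from the hosted zone's NS record. It ignores case and trailing dots, which often differ between tools.

```python
def normalize_ns(name):
    """Lowercase a nameserver name and drop surrounding whitespace and the trailing dot."""
    return name.strip().rstrip(".").lower()


def delegation_mismatch(registrar_ns, hosted_zone_ns):
    """Compare two nameserver lists and report names present on only one side."""
    reg = {normalize_ns(n) for n in registrar_ns}
    zone = {normalize_ns(n) for n in hosted_zone_ns}
    return {
        "only_at_registrar": sorted(reg - zone),
        "only_in_hosted_zone": sorted(zone - reg),
    }


# Nameservers from the hosted zone shown earlier in this post.
HOSTED_ZONE_NS = [
    "ns-649.awsdns-17.net",
    "ns-172.awsdns-21.com",
    "ns-1352.awsdns-41.org",
    "ns-1618.awsdns-10.co.uk",
]
```

If both lists in the result are empty, the delegation matches and mismatched nameservers can be ruled out, which is what I found here.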
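So at that stage, the troubleshooting path looked like this: the domain registration was active, the nameserver delegation matched the AWS nameservers, and the hosted zone contained the expected records, yet the domain was still returning NXDOMAIN.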
That meant the issue was not simply “wait longer”. There was still a DNS record problem that needed attention.
The issue turned out to be with the apex A record for the root domain. The website depended on Route 53 returning the CloudFront target correctly for jackdanielpainter.com.
Even though the record looked present in the hosted zone, the domain was still failing resolution. That meant the website was not going to recover by itself.
From a beginner AWS perspective, this was a useful lesson: seeing a record in the console does not always mean the DNS response is healthy. The real test is what external DNS tools return.
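One way to see what external DNS returns, without resolver caches in the way, is to ask each authoritative nameserver directly. The sketch below only builds the `nslookup` commands and offers an optional runner. The helper names are hypothetical, and it assumes `nslookup` is installed and that the nameserver list comes from the hosted zone's NS record.

```python
import subprocess


def build_authoritative_queries(domain, nameservers, record_type="A"):
    """Build one nslookup command per nameserver, each querying that server directly."""
    return [
        ["nslookup", f"-type={record_type}", domain, ns]
        for ns in nameservers
    ]


def run_queries(commands):
    """Run each command and return (command, stdout) pairs. Requires network access."""
    results = []
    for cmd in commands:
        done = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, done.stdout))
    return results
```

If the authoritative servers themselves answer with a non-existent domain result, the problem is in the hosted zone data, not in caching or propagation.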
To recover the site, I focused on the Route 53 apex record for the root domain and attempted to recreate it correctly.
The record needed to represent the root domain and point traffic to CloudFront:
Record name: jackdanielpainter.com
Record type: A
Target: CloudFront distribution
While working on this, I also hit a Route 53 console issue where deleting the A record and recreating it immediately produced an error.
Thankfully, the deletion had not fully cleared yet, even though the record already looked deleted in the console. No harm was done and no further action was required.
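I made the change in the console, but the same record can be described as a Route 53 change batch. The sketch below is an assumption about how that could look, not what I actually ran. `build_apex_alias_change` is a hypothetical helper, and it uses the `UPSERT` action, which creates or replaces the record in a single change and so might sidestep the delete-then-recreate timing problem. `Z2FDTNDATAQYW2` is the fixed hosted zone ID that Route 53 uses for CloudFront alias targets.

```python
# Fixed hosted zone ID used for all CloudFront alias targets in Route 53.
CLOUDFRONT_ALIAS_ZONE_ID = "Z2FDTNDATAQYW2"


def build_apex_alias_change(domain, cloudfront_domain):
    """Build a change batch that points the apex A record at a CloudFront distribution."""
    return {
        "Comment": f"Point {domain} at CloudFront",
        "Changes": [
            {
                # UPSERT creates the record if missing, or replaces it if present.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": domain,
                    "Type": "A",
                    # Alias records carry no TTL; Route 53 manages it.
                    "AliasTarget": {
                        "HostedZoneId": CLOUDFRONT_ALIAS_ZONE_ID,
                        "DNSName": cloudfront_domain,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


batch = build_apex_alias_change(
    "jackdanielpainter.com", "d1e90n9pxjm7ij.cloudfront.net"
)
```

With boto3 installed and credentials configured, a dictionary like `batch` could be passed as the `ChangeBatch` argument to the Route 53 client's `change_resource_record_sets` call, together with the ID of your own hosted zone.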
This incident was a good example of working through AWS troubleshooting step by step rather than guessing.
The main technical lesson was understanding the difference between:
As someone still learning AWS, this incident helped me understand a few important things more clearly.
This was a valuable reminder that cloud troubleshooting works best when you isolate the problem layer by layer:
Domain Registration
↓
Nameserver Delegation
↓
Hosted Zone Records
↓
CloudFront Target
↓
Website Response
If one layer fails, the whole site can appear offline.
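The layered approach above can be sketched as a small Python routine that walks the layers in order and stops at the first one that fails. This is a toy illustration only: `first_failing_layer` is a hypothetical name, and each check is assumed to be a function you supply that returns True when its layer looks healthy.

```python
LAYERS = [
    "Domain Registration",
    "Nameserver Delegation",
    "Hosted Zone Records",
    "CloudFront Target",
    "Website Response",
]


def first_failing_layer(checks):
    """Walk (layer_name, check) pairs in order and return the first failing layer.

    Returns None when every check passes. Later layers are not evaluated once
    a failure is found, because they depend on the earlier ones.
    """
    for name, check in checks:
        if not check():
            return name
    return None


# Example wiring that mirrors this incident: registration and delegation
# looked fine, but the hosted zone record was not being served correctly.
incident_checks = [
    (LAYERS[0], lambda: True),
    (LAYERS[1], lambda: True),
    (LAYERS[2], lambda: False),
    (LAYERS[3], lambda: True),
    (LAYERS[4], lambda: True),
]
```

Checking from the top down keeps the investigation honest: there is little point testing the website response while the DNS layer above it is still failing.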