Reading Katie Zhu’s post on NPR’s news app architecture got me curious about a setup where most of the content is static and hosted on S3, while EC2 is used primarily to generate that content and upload it to S3. The benefits were obvious:
- Cost: S3 is cheaper than EC2.
- Reliable: S3 doesn’t go down nearly as frequently as EC2.
- Scalable: Since it’s primarily static, you don’t have to worry about additional capacity or dealing with caching, databases, and all the other fun things.
- Simpler: There are no weird server issues here. As long as you generate the right content and your rendering is good, you don’t need to worry about a web server acting up.
I’ve been meaning to write a script that would scrape Hacker News in order to show me the top content I missed while sleeping. I had some time this weekend and decided to give it a go using this “pseudo-static” approach. The result is called Yet Another Hacker News Reader (YAHNR) and you can take a look at the code on GitHub. It turned out to be pretty simple to write, and the most difficult part was thinking differently about the problem. Whereas I’d normally keep the content in a database, I ended up storing it in static JSON files, and instead of having the logic to generate the HTML page live on a web server, I moved it into Mustache templates.
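To make the “pseudo-static” idea concrete, here is a minimal sketch of the generate step in Python. It is not YAHNR’s actual code: the sample stories, the file names, and the `render` and `build_site` helpers are all made up for illustration, and `render` only handles plain `{{name}}` tags as a stand-in for a real Mustache library. It also assumes the page is rendered at generation time, which may differ from how the real project does it.

```python
import html
import json
import re
from pathlib import Path

# Hypothetical sample data; the real project scrapes Hacker News.
STORIES = [
    {"title": "Ask HN: Why S3?", "url": "https://example.com/b", "points": 85},
    {"title": "Show HN: A <static> site", "url": "https://example.com/a", "points": 120},
]

# Hypothetical template using only simple Mustache-style variable tags.
ITEM_TEMPLATE = '<li><a href="{{url}}">{{title}}</a> ({{points}} points)</li>'


def render(template, context):
    """Replace {{name}} tags with HTML-escaped values (a tiny Mustache subset)."""
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: html.escape(str(context.get(m.group(1), ""))),
        template,
    )


def build_site(stories, out_dir):
    """Write stories.json and index.html into out_dir and return both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Highest-scoring stories first.
    ranked = sorted(stories, key=lambda s: s["points"], reverse=True)

    # The JSON file plays the role a database table would otherwise play.
    json_path = out / "stories.json"
    json_path.write_text(json.dumps(ranked, indent=2), encoding="utf-8")

    # The HTML is produced once, here, rather than on every request.
    items = "\n".join(render(ITEM_TEMPLATE, s) for s in ranked)
    html_path = out / "index.html"
    html_path.write_text(f"<ul>\n{items}\n</ul>\n", encoding="utf-8")
    return json_path, html_path
```

Something like `build_site(STORIES, "public")` could run on a schedule on the EC2 box, with the two resulting files then uploaded to the S3 bucket; the upload step is left out here.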