Migrating my Hugo blog from GitLab/AWS S3 to GitHub Pages with Actions


Until Now - The State of the union

This blog is generated using Hugo, an awesome static site generator. Until now, the workflow I used to deploy it was:

  • Push a commit to the source repository on GitLab
  • GitLab CI kicks off on receiving the push
    • CI downloads the latest version of Hugo and generates the static site
    • CI runs aws-cli to sync the new files to AWS S3
  • S3 serves the static site
  • Cloudflare provides:
    • DNS services (so I can use https://shantanugoel.com without having to prefix it with a www)
    • CDN/Caching services for resilience and keeping S3 bills low for data transfer
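
A rough sketch of what the build-and-sync part of such a CI job might look like is below. This is an illustration, not the actual GitLab CI configuration: the function name deploy_to_s3 and the bucket name example-bucket are placeholders, and the --delete flag (which removes remote files that no longer exist locally) is an assumption about how the sync would typically be run.

```shell
# Sketch of a Hugo-to-S3 deploy step.
# "deploy_to_s3" and "example-bucket" are hypothetical names; use your own.
deploy_to_s3() {
  bucket="${1:-example-bucket}"
  # Generate the static site into ./public
  hugo || return 1
  # Copy new/changed files to S3 and drop remote files deleted locally
  aws s3 sync public/ "s3://$bucket" --delete
}
```

Calling deploy_to_s3 my-bucket from the root of the Hugo project would build the site and then sync it, assuming both hugo and aws-cli are installed and AWS credentials are configured.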

Why? What broke the camel’s back

I was mostly happy with this setup, but a few niggles sat at the back of my mind:

  • aws s3 sync was finicky, and occasionally, albeit rarely, the sync didn’t complete.
  • Deploying an update took a long time. Whenever I published a new post, it took 5 minutes or more to appear, partly because GitLab CI was slow but mostly because of the time spent in s3 sync.
  • I also wanted to reduce the number of moving parts. I could have used GitLab Pages, but it proved unreliable whenever I tried it.
  • Once the site ran into tens of thousands of files, frequent updates started to incur S3 API costs, especially because it was hard to upload only the changed files. I tried a few methods based on hashes and the like, and they could have been improved further, but I wanted to get rid of the issue altogether.
  • Finally, the last push was the announcement of GitHub Actions. I had the itch, as usual, to try out the new and shiny, and so here we are.
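
The "upload only changed files" idea in the list above boils down to comparing content hashes between two builds. Below is a minimal sketch of that comparison. It is one possible approach, not necessarily the one tried for this blog. The function names make_manifest and changed_files are made up for this illustration, and it assumes GNU coreutils (sha256sum, comm) and file names without newlines or backslashes.

```shell
# Write a sorted "hash  ./relative/path" line for every file under directory $1.
make_manifest() {
  (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort)
}

# Print the paths that are new or whose hash differs between
# the old manifest ($1) and the new manifest ($2).
changed_files() {
  # comm -13 keeps only the lines that appear in the second file alone;
  # sed then strips the 64-character hash and the two separator characters.
  LC_ALL=C comm -13 "$1" "$2" | sed 's/^[0-9a-f]\{64\}  //'
}

# Small demonstration on a throwaway directory.
site="$(mktemp -d)"
printf 'hello\n' > "$site/index.html"
printf 'body{}\n' > "$site/style.css"
make_manifest "$site" > "$site.old"

# Simulate a rebuild in which only one file changes.
printf 'hello again\n' > "$site/index.html"
make_manifest "$site" > "$site.new"

changed="$(changed_files "$site.old" "$site.new")"
echo "$changed"
```

Only the paths printed by changed_files would need to be uploaded, which keeps the number of S3 API requests proportional to what actually changed rather than to the size of the site.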

What? All I need is everything

So, what I wanted to do was replace the source storage (GitLab to GitHub), CI (GitLab to GitHub Actions) and site host (AWS S3 to GitHub Pages). I kept Cloudflare’s DNS and CDN services as is because they are awesome, and I still want to continue to use the naked domain without www. Another thing I wanted while going this way was to keep the source repository and pages repository on GitHub separate because I want to be able to keep drafts in GitHub without having them available publicly.

We’ll see in the next section how all this was put together.

How? Let’s get our hands dirty, shall we!

The migration turned out to be pretty simple. This is what my setup looks like now:

  • Make a private source repository on GitHub; let’s call it shantanugoel.com-source

    • This repository holds the website’s source
  • Make another, public repository that will contain the published site, e.g. https://github.com/shantanugoel/shantanugoel.com

  • Go to the settings section of this repository (e.g. https://github.com/shantanugoel/shantanugoel.com/settings/ ) and set it to serve GitHub Pages from the master branch

  • Generate an SSH key pair on your machine (e.g. with ssh-keygen), then go to the Deploy Keys section of this repository (e.g. https://github.com/shantanugoel/shantanugoel.com/settings/keys ) and add the public key as a deploy key with write access to the repository. Keep the private key handy for the next step.

  • Go to your source repository’s Secrets section (e.g. https://github.com/shantanugoel/shantanugoel.com-source/settings/secrets ), and click on “Add a new secret”.

    • Add a new secret named ACTIONS_DEPLOY_KEY and set its value to the private key from the previous step.
    • This secret is used in the Actions workflow that you’ll create in the next step.
  • Add a GitHub Actions workflow specification to the source repository at .github/workflows/deploy.yml, relative to the root of the repository

    • The file must be valid YAML and its name can be anything ending in .yml (or .yaml), but the directory path needs to be exactly as above.
    • The contents of the file are below, with comments explaining each part.
# The workflow name can be anything you like
name: CI

# This workflow runs whenever a change is pushed to the master branch of this repository
on:
  push:
    branches:
    - master

jobs:
  build-deploy:

    runs-on: ubuntu-18.04

    steps:
    # This action is provided by GitHub. It checks out the repository along with its submodules
    # (the Hugo theme I use is a submodule)
    - uses: actions/checkout@v1
      with:
        submodules: true
    
    # This action downloads the latest Hugo
    - name: Setup Hugo
      uses: peaceiris/actions-hugo@v2
      with:
        hugo-version: 'latest'

    # Run hugo to generate the site and minify css/js etc
    - name: Build
      run: hugo --minify

    # Publish the website to the designated GitHub Pages repository
    # ACTIONS_DEPLOY_KEY passes our deploy key to the action, since a workflow in the source repository
    # does not have write access to other repositories by default
    # EXTERNAL_REPOSITORY specifies the GitHub Pages repository; without it,
    # this action publishes to the current repository itself
    # PUBLISH_BRANCH specifies the branch which is set to serve GitHub Pages
    # PUBLISH_DIR is the directory where Hugo generates the site contents
    - name: Deploy
      uses: peaceiris/actions-gh-pages@v2
      env:
        ACTIONS_DEPLOY_KEY: ${{ secrets.ACTIONS_DEPLOY_KEY }}
        EXTERNAL_REPOSITORY: shantanugoel/shantanugoel.com
        PUBLISH_BRANCH: master
        PUBLISH_DIR: ./public
  • Now, anytime you commit and push a change to your source repository, it will be live on your destination within a minute or so.
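
The key pair for the deploy key step above can be created locally with ssh-keygen. This is a minimal sketch, assuming OpenSSH is installed. The file name gh-pages, the key comment, and the temporary directory are arbitrary choices for illustration.

```shell
# Create a throwaway directory so the key files do not clutter the project.
KEY_DIR="$(mktemp -d)"

# Generate a 4096-bit RSA key pair with an empty passphrase (-N ""),
# since the CI job cannot type one in. "gh-pages" is a hypothetical file name.
ssh-keygen -q -t rsa -b 4096 -N "" -C "gh-pages deploy key" -f "$KEY_DIR/gh-pages"

echo "Public key (add as a deploy key with write access):"
cat "$KEY_DIR/gh-pages.pub"
echo "Private key file (paste its contents into the ACTIONS_DEPLOY_KEY secret):"
echo "$KEY_DIR/gh-pages"
```

The contents of gh-pages.pub go into the Deploy Keys page of the published-site repository, and the contents of gh-pages go into the ACTIONS_DEPLOY_KEY secret of the source repository. Once both are saved, the local copies can be deleted.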
