This post is a draft

A HAR file is really useful when you want to capture all the network requests made while a web page loads in a browser. It can help when troubleshooting resources that fail to load, and it can help when looking at page performance. Typically you’ll create a HAR file from your browser’s developer tools: in Chrome it’s on the Network tab of Developer Tools. You may also get HAR files from third-party tools, e.g. WebPageTest offers HAR as well as a number of other formats after you’ve run a test. If you’re going to share a HAR file then you must first sanitise it to remove any confidential information such as cookies and access tokens. See the Krebs on Security article “Hackers Stole Access Tokens from Okta’s Support Unit” for an example of what can go wrong.

Disclaimer: WebPageTest is now owned by Catchpoint, who offer commercial products in this area. We use Catchpoint tools at work, but I'm not endorsing Catchpoint.
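On the sanitising point above, here is a rough sketch of what a first pass could look like in Python. It assumes the HAR has already been loaded with json.load, and the list of header names is my own assumption rather than anything exhaustive. It does not look at query strings, POST bodies or response content, all of which can also hold secrets, so treat it as a starting point and not a complete sanitiser.

```python
import copy

# Header names treated as sensitive here; an assumption, not an exhaustive list.
SENSITIVE_HEADERS = {"authorization", "proxy-authorization", "cookie", "set-cookie"}


def redact_har(har: dict) -> dict:
    """Return a copy of a parsed HAR with cookies and auth headers redacted."""
    cleaned = copy.deepcopy(har)  # leave the caller's data untouched
    for entry in cleaned.get("log", {}).get("entries", []):
        for side in ("request", "response"):
            message = entry.get(side, {})
            for header in message.get("headers", []):
                if header.get("name", "").lower() in SENSITIVE_HEADERS:
                    header["value"] = "REDACTED"
            # HAR stores parsed cookies separately from the raw headers.
            for cookie in message.get("cookies", []):
                cookie["value"] = "REDACTED"
    return cleaned
```

You would load the file with json.load, pass the result through redact_har and write it back out with json.dump, then still open the result and check it by eye before sharing.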

While you can create one-off HAR exports in Chrome easily, there may be scenarios where you want to capture a regular series of HAR files, e.g. there’s a performance issue and you’re wondering if it’s related to the time of day or day of the week. So I decided to Dockerise Chromium, run it headless and call the container from a Python script that I added to a cron job. I’m not a Docker or Python expert, so plenty of faults can be found.
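To give an idea of the calling side, here is a minimal sketch of the kind of Python script cron could run. It is not my exact script. The image tag chrome-har, the output directory and the two-minute timeout are all made-up values for illustration. It relies on the image described in this post, where the entrypoint passes its arguments to chrome-har-capturer, and on that tool's -o option for writing the HAR to a file.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path

IMAGE = "chrome-har"  # hypothetical tag given to the image at build time


def build_command(url: str, out_dir: Path, now: datetime) -> list[str]:
    """Build a `docker run` command that writes a timestamped HAR into out_dir."""
    har_name = now.strftime("%Y%m%dT%H%M%SZ") + ".har"
    return [
        "docker", "run", "--rm",
        "-v", f"{out_dir}:/data",   # bind-mount the host directory over /data
        IMAGE,
        "-o", f"/data/{har_name}",  # handed to chrome-har-capturer by the entrypoint
        url,
    ]


def capture(url: str, out_dir: Path) -> None:
    """Run one capture; raises if docker exits non-zero or takes too long."""
    out_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_command(url, out_dir.resolve(), datetime.now(timezone.utc))
    subprocess.run(cmd, check=True, timeout=120)
```

A call such as capture("https://example.com", Path("hars")) at the bottom of the script, plus a crontab entry that runs the script every hour, would then build up a series of HAR files named by UTC timestamp.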

The Dockerfile is pretty simple:

FROM node:latest

# Install Chromium
RUN apt-get update && apt-get install chromium -y

# Install chrome-har-capturer
RUN npm install chrome-har-capturer -g

COPY ./docker-entrypoint.sh /

EXPOSE 9222

VOLUME ["/data"]

ENTRYPOINT ["/docker-entrypoint.sh"]

The entrypoint script to go with this is:
#!/bin/bash

# Run Chromium headless in the background with remote debugging enabled so chrome-har-capturer can connect to it
/usr/bin/chromium --no-sandbox --remote-debugging-address=0.0.0.0 --remote-debugging-port=9222 --headless --disable-gpu &

# Give Chromium a few seconds to start; connecting too early can fail with "connection refused"
sleep 5

# Now run chrome-har-capturer, passing through any arguments given to the container
exec /usr/local/bin/chrome-har-capturer "$@"

It’s based on the instructions for the chrome-har-capturer package, which have you run Chrome with remote debugging enabled. I still need to go back and work out why I’m running Chromium with the --no-sandbox option. I had to add the sleep because chrome-har-capturer tries to connect straight away and Chromium isn’t always ready by then.
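A fixed sleep is a guess, so one alternative would be to poll the debugging port until it accepts connections. The sketch below shows the idea in Python. It is not what the entrypoint above does, and the timeout and interval values are arbitrary. A port that accepts a TCP connection is only a rough signal that Chromium is ready, and to use this inside the container you would need Python in the image or the same loop rewritten in bash.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.25) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait briefly, then try again.
            time.sleep(interval)
    return False
```

For the setup in this post the call would be something like wait_for_port("127.0.0.1", 9222), with the capture only started if it returns True.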