Achieve a High Cache Hit Rate with Varnish
Improving the cache hit rate of Varnish will directly improve your application performance. But let's take a step back before we deepdive into the details.
What's Varnish?
Varnish is a high performance HTTP accelerator and a caching reverse proxy specifically designed to speed up the delivery of content. Did you know the core of Fastly (a CDN) is also build on top of Varnish? You'll put Varnish in front of your application and let Varnish serve the content. Cool!
How to increase the hit rate?
A high hit rate means more requests are served from the cache, reducing response time(s) & saving backend resources. Here are some strategies to improve the cache hit rate with Varnish.
Monitor requests
Knowledge is power. You need to understand whether requests are being missed or explicitly passed due to your VCL config.
Cache bypasses can be listed with
varnishtop -i ReqUrl -q "VCL_call eq 'PASS'"
Cache misses can be listed with
varnishtop -i ReqUrl -q "VCL_call eq 'MISS'"
Check memory usage
Varnish stores objects in memory to give it the speed it has. When Varnish (or the server for that matter...) has no room, Varnish will start evicting objects from memory. Evicted objects will obviously lower your cache hit rate as well!
Inspect your cache storage for Varnish by running
varnishstat -f "SMA.*.g_*" -1
Keep an eye on g_bytes
& g_space
. If g_space
is too low it could be an indication that objects are being evicted. You can check n_lru_nuked
with varnishstat
to confirm objects are being evicted due to memory pressure. Time to allocate more space for Varnish or improve your caching strategy.
Normalizing requests
Different url's for the same content? Bye-bye hit rate. Normalizing requests is the process of transforming requests into a consistent format before our cache rules are applied. Small differences in requests can create separate cache entries (for the same content), which is something we'd like to avoid as much as possible. Here's something we can do.
Remove query string params.
This only removes them internally. Analytics & tracking still work (yay?).
sub vcl_recv {
# Remove tracking query string parameters used by analytics tools
if (req.url ~ "(\?|&)(_branch_match_id|_bta_[a-z]+|_bta_c|_bta_tid|_ga|_gl|_ke|_kx|campid|cof|customid|cx|dclid|dm_i|ef_id|epik|fbclid|gad_source|gbraid|gclid|gclsrc|gdffi|gdfms|gdftrk|hsa_acc|hsa_ad|hsa_cam|hsa_grp|hsa_kw|hsa_mt|hsa_net|hsa_src|hsa_tgt|hsa_ver|ie|igshid|irclickid|matomo_campaign|matomo_cid|matomo_content|matomo_group|matomo_keyword|matomo_medium|matomo_placement|matomo_source|mc_[a-z]+|mc_cid|mc_eid|mkcid|mkevt|mkrid|mkwid|msclkid|mtm_campaign|mtm_cid|mtm_content|mtm_group|mtm_keyword|mtm_medium|mtm_placement|mtm_source|nb_klid|ndclid|origin|pcrid|piwik_campaign|piwik_keyword|piwik_kwd|pk_campaign|pk_keyword|pk_kwd|redirect_log_mongo_id|redirect_mongo_id|rtid|s_kwcid|sb_referer_host|sccid|si|siteurl|sms_click|sms_source|sms_uph|srsltid|toolid|trk_contact|trk_module|trk_msg|trk_sid|ttclid|twclid|utm_[a-z]+|utm_campaign|utm_content|utm_creative_format|utm_id|utm_marketing_tactic|utm_medium|utm_source|utm_source_platform|utm_term|vmcid|wbraid|yclid|zanpid)=") {
set req.url = regsuball(req.url, "(_branch_match_id|_bta_[a-z]+|_bta_c|_bta_tid|_ga|_gl|_ke|_kx|campid|cof|customid|cx|dclid|dm_i|ef_id|epik|fbclid|gad_source|gbraid|gclid|gclsrc|gdffi|gdfms|gdftrk|hsa_acc|hsa_ad|hsa_cam|hsa_grp|hsa_kw|hsa_mt|hsa_net|hsa_src|hsa_tgt|hsa_ver|ie|igshid|irclickid|matomo_campaign|matomo_cid|matomo_content|matomo_group|matomo_keyword|matomo_medium|matomo_placement|matomo_source|mc_[a-z]+|mc_cid|mc_eid|mkcid|mkevt|mkrid|mkwid|msclkid|mtm_campaign|mtm_cid|mtm_content|mtm_group|mtm_keyword|mtm_medium|mtm_placement|mtm_source|nb_klid|ndclid|origin|pcrid|piwik_campaign|piwik_keyword|piwik_kwd|pk_campaign|pk_keyword|pk_kwd|redirect_log_mongo_id|redirect_mongo_id|rtid|s_kwcid|sb_referer_host|sccid|si|siteurl|sms_click|sms_source|sms_uph|srsltid|toolid|trk_contact|trk_module|trk_msg|trk_sid|ttclid|twclid|utm_[a-z]+|utm_campaign|utm_content|utm_creative_format|utm_id|utm_marketing_tactic|utm_medium|utm_source|utm_source_platform|utm_term|vmcid|wbraid|yclid|zanpid)=[-_A-z0-9+(){}%.*]+&?", "");
set req.url = regsub(req.url, "[?|&]+$", "");
}
}
Trailing slashes
Keep it consistent. Do I need to add more context here? 🙊
Cookies
Varnish by default will not cache any object from the backend with a Set-Cookie
header. If you're sure your page doesn't need cookies, strip them!
sub vcl_recv {
unset req.http.Cookie; # only do this when you're absolutely sure what it is actually doing ;)
}
Cache-Control
The Cache-Control
header instructs Varnish, via the max-age
directive, how long the object should be stored in cache. For static
(but versioned) assets like js/fonts/images you can let your backend respond with a reasonable max-age=31536000
value (1 year).
Serve stale content
Varnish allows you to serve stale content while revalidating the content in the background asynchronously. This increases (perceived) performance & helps with traffic spikes.
sub vcl_backend_response {
set beresp.grace = 5m;
}
Debugging (client side)
Sometimes you just want to check quickly: was it a hit or miss?
sub vcl_deliver {
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
} else {
set resp.http.X-Cache = "MISS";
}
}
Then use DevTools → Network tab → Response Headers → X-Cache
.
Checklist
Below I created a small checklist you can use to increase your cache hit rate 😊.
- [] Monitor `varnishstat` / `varnishtop` for misses and passes
- [] Ensure enough memory is allocated
- [] Strip useless query params (utm, fbclid, etc.)
- [] Normalize cookies, host headers, and hash inputs
- [] Respect `Cache-Control` headers (and set them properly in your app)
- [] Enable grace mode to serve stale content while revalidating
- [] Add debugging headers for visibility
Hope it helps!