March 31, 2019

A long while back, I experimented with making a Go web server behave differently based on its heap utilization. I used runtime.ReadMemStats, but I didn’t run ReadMemStats on every request because it stops the world.

More recently, I started getting curious about logging memory utilization, against a production system, where we found that it was important to see the memory in use for each and every web request. I thought back to my old friend ReadMemStats, but I was frightened of stopping the world.

I also found Go’s pprof.Lookup("heap"), which, I think, tells you some interesting information about the state of the heap the last time the garbage collector ran. I thought it would be interesting to compare the two.

Benchmark in Isolation

The first thing I did was write a naïve benchmark of the two methods, available here: https://github.com/agocs/goMemProfileTester/blob/master/mem_test.go

It should be noted that my Go version is go1.11.5 darwin/amd64

The results surprised me.

go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/agocs/goMemProfileTester
BenchmarkMemstats-8    	  100000	     13966 ns/op
BenchmarkPProfHeap-8   	   10000	   1453765 ns/op
PASS
ok  	github.com/agocs/goMemProfileTester	16.125s

runtime.ReadMemStats stabilized at about 14μs per operation, while pprof.Lookup("heap") stabilized at 1.4ms, just about 100x slower!

However, I realize that a benchmark doesn’t reflect the complexities of multiple goroutines stopping the world on top of one another. I’d need to write a webserver to simulate this properly.

#Heavily Loadded Webserver

So I wrote a webserver that Does Stuff: https://github.com/agocs/goMemProfileTester/blob/master/main.go

It fills a slice with some number of randomly generated float64s, and then sorts them. I figured that was a good combination of memory intensive and CPU intensive work.

I used the Vegeta tool for load testing. The exact command was:

echo "GET <http://localhost:8080/readMem$N/>" | vegeta attack -duration 10s | vegeta report

where $N is one of 0, 1, or 2, corresponding to Control, ReadMemStats, or pprof.Lookup.

Vegeta launches 50 requests per second against the endpoint for 10 seconds, and then generates a report on the results.

Preliminary results were promising!