Involved Source Files
elf.go
label.go
map.go
pe.go
Package pprof writes runtime profiling data in the format expected
by the pprof visualization tool.
# Profiling a Go program
The first step to profiling a Go program is to enable profiling.
Support for profiling benchmarks built with the standard testing
package is built into go test. For example, the following command
runs benchmarks in the current directory and writes the CPU and
memory profiles to cpu.prof and mem.prof:
go test -cpuprofile cpu.prof -memprofile mem.prof -bench .
To add equivalent profiling support to a standalone program, add
code like the following to your main function:
var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to `file`")
var memprofile = flag.String("memprofile", "", "write memory profile to `file`")
func main() {
	flag.Parse()
	if *cpuprofile != "" {
		f, err := os.Create(*cpuprofile)
		if err != nil {
			log.Fatal("could not create CPU profile: ", err)
		}
		defer f.Close() // error handling omitted for example
		if err := pprof.StartCPUProfile(f); err != nil {
			log.Fatal("could not start CPU profile: ", err)
		}
		defer pprof.StopCPUProfile()
	}
	// ... rest of the program ...
	if *memprofile != "" {
		f, err := os.Create(*memprofile)
		if err != nil {
			log.Fatal("could not create memory profile: ", err)
		}
		defer f.Close() // error handling omitted for example
		runtime.GC() // get up-to-date statistics
		// Lookup("allocs") creates a profile similar to go test -memprofile.
		// Alternatively, use Lookup("heap") for a profile
		// that has inuse_space as the default index.
		if err := pprof.Lookup("allocs").WriteTo(f, 0); err != nil {
			log.Fatal("could not write memory profile: ", err)
		}
	}
}
There is also a standard HTTP interface to profiling data. Adding
the following line will install handlers under the /debug/pprof/
URL to download live profiles:
import _ "net/http/pprof"
See the net/http/pprof package for more details.
Profiles can then be visualized with the pprof tool:
go tool pprof cpu.prof
There are many commands available from the pprof command line.
Commonly used commands include "top", which prints a summary of the
top program hot-spots, and "web", which opens an interactive graph
of hot-spots and their call graphs. Use "help" for information on
all pprof commands.
For more information about pprof, see
https://github.com/google/pprof/blob/main/doc/README.md.
pprof_rusage.go
proto.go
proto_other.go
protobuf.go
protomem.go
runtime.go
Package-Level Type Names (total 18, in which 2 are exported)
A Profile is a collection of stack traces showing the call sequences
that led to instances of a particular event, such as allocation.
Packages can create and maintain their own profiles; the most common
use is for tracking resources that must be explicitly closed, such as files
or network connections.
A Profile's methods can be called from multiple goroutines simultaneously.
Each Profile has a unique name. A few profiles are predefined:
goroutine - stack traces of all current goroutines
heap - a sampling of memory allocations of live objects
allocs - a sampling of all past memory allocations
threadcreate - stack traces that led to the creation of new OS threads
block - stack traces that led to blocking on synchronization primitives
mutex - stack traces of holders of contended mutexes
These predefined profiles maintain themselves and panic on an explicit
[Profile.Add] or [Profile.Remove] method call.
The CPU profile is not available as a Profile. It has a special API,
the [StartCPUProfile] and [StopCPUProfile] functions, because it streams
output to a writer during profiling.
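A package-maintained profile for resources that must be closed might look like the sketch below. The profile name example.com/mypkg.openConns, the conn type, and the open helper are all hypothetical; only NewProfile, Add, Remove, and Count come from this package.

```go
package main

import (
	"fmt"
	"runtime/pprof"
)

// openConns is a hypothetical custom profile; the name follows the
// 'import/path.' prefix convention and is made up for this sketch.
var openConns = pprof.NewProfile("example.com/mypkg.openConns")

// conn stands in for a resource that must be explicitly closed.
type conn struct{ id int }

// open creates a conn and records the caller's stack under it.
func open(id int) *conn {
	c := &conn{id: id}
	openConns.Add(c, 1) // skip=1: start the trace at open's caller
	return c
}

// Close removes the conn's stack from the profile.
func (c *conn) Close() { openConns.Remove(c) }

func main() {
	a, b := open(1), open(2)
	fmt.Println("open:", openConns.Count()) // two stacks are recorded here
	a.Close()
	b.Close()
	fmt.Println("open:", openConns.Count()) // none remain after both closes
}
```

Using the pointer as the Add value keeps each entry unique and lets Close remove exactly the stack that was recorded when the resource was opened.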
# Heap profile
The heap profile reports statistics as of the most recently completed
garbage collection; it elides more recent allocation to avoid skewing
the profile away from live data and toward garbage.
If there has been no garbage collection at all, the heap profile reports
all known allocations. This exception helps mainly in programs running
without garbage collection enabled, usually for debugging purposes.
The heap profile tracks both the allocation sites for all live objects in
the application memory and for all objects allocated since the program start.
Pprof's -inuse_space, -inuse_objects, -alloc_space, and -alloc_objects
flags select which to display, defaulting to -inuse_space (live objects,
scaled by size).
# Allocs profile
The allocs profile is the same as the heap profile but changes the default
pprof display to -alloc_space, the total number of bytes allocated since
the program began (including garbage-collected bytes).
# Block profile
The block profile tracks time spent blocked on synchronization primitives,
such as [sync.Mutex], [sync.RWMutex], [sync.WaitGroup], [sync.Cond], and
channel send/receive/select.
Stack traces correspond to the location that blocked (for example,
[sync.Mutex.Lock]).
Sample values correspond to cumulative time spent blocked at that stack
trace, subject to time-based sampling specified by
[runtime.SetBlockProfileRate].
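The following sketch shows one way to collect a block profile: it sets the sampling rate, blocks once on a channel receive, and then inspects the predefined "block" profile. The function name recordBlocking and the 10ms delay are assumptions chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
	"time"
)

// recordBlocking enables block profiling, blocks once on a channel
// receive, and returns the number of records in the block profile.
func recordBlocking() int {
	runtime.SetBlockProfileRate(1) // rate 1 includes every blocking event
	defer runtime.SetBlockProfileRate(0)

	ch := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(ch)
	}()
	<-ch // blocks until the channel is closed; this stack is recorded

	return pprof.Lookup("block").Count()
}

func main() {
	n := recordBlocking()
	var buf bytes.Buffer
	if err := pprof.Lookup("block").WriteTo(&buf, 1); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Println("records:", n, "bytes:", buf.Len())
}
```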
# Mutex profile
The mutex profile tracks contention on mutexes, such as [sync.Mutex],
[sync.RWMutex], and runtime-internal locks.
Stack traces correspond to the end of the critical section causing
contention. For example, a lock held for a long time while other goroutines
are waiting to acquire the lock will report contention when the lock is
finally unlocked (that is, at [sync.Mutex.Unlock]).
Sample values correspond to the approximate cumulative time other goroutines
spent blocked waiting for the lock, subject to event-based sampling
specified by [runtime.SetMutexProfileFraction]. For example, if a caller
holds a lock for 1s while 5 other goroutines are waiting for the entire
second to acquire the lock, its unlock call stack will report 5s of
contention.
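The scenario above can be approximated with the sketch below, in which one goroutine holds a lock while n others wait for it. The function name contend and the 20ms hold time are assumptions; the reported durations depend on scheduling and are not shown.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// contend holds mu while n goroutines wait for it, then returns the
// number of records in the mutex profile.
func contend(n int) int {
	prev := runtime.SetMutexProfileFraction(1) // report every contention event
	defer runtime.SetMutexProfileFraction(prev)

	var mu sync.Mutex
	var wg sync.WaitGroup
	mu.Lock()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // waits until the holder unlocks
			mu.Unlock()
		}()
	}
	time.Sleep(20 * time.Millisecond) // let the waiters queue up
	mu.Unlock()                       // contention is attributed to this call stack
	wg.Wait()
	return pprof.Lookup("mutex").Count()
}

func main() {
	fmt.Println("mutex profile records:", contend(5))
}
```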
Runtime-internal locks are always reported at the location
"runtime._LostContendedRuntimeLock". More detailed stack traces for
runtime-internal locks can be obtained by setting
`GODEBUG=runtimecontentionstacks=1` (see package [runtime] docs for
caveats).
count func() int
m map[any][]uintptr
mu sync.Mutex
name string
write func(io.Writer, int) error
Add adds the current execution stack to the profile, associated with value.
Add stores value in an internal map, so value must be suitable for use as
a map key and will not be garbage collected until the corresponding
call to [Profile.Remove]. Add panics if the profile already contains a stack for value.
The skip parameter has the same meaning as [runtime.Caller]'s skip
and controls where the stack trace begins. Passing skip=0 begins the
trace in the function calling Add. For example, given this
execution stack:
Add
called from rpc.NewClient
called from mypkg.Run
called from main.main
Passing skip=0 begins the stack trace at the call to Add inside rpc.NewClient.
Passing skip=1 begins the stack trace at the call to NewClient inside mypkg.Run.
Count returns the number of execution stacks currently in the profile.
Name returns this profile's name, which can be passed to [Lookup] to reobtain the profile.
Remove removes the execution stack associated with value from the profile.
It is a no-op if the value is not in the profile.
WriteTo writes a pprof-formatted snapshot of the profile to w.
If a write to w returns an error, WriteTo returns that error.
Otherwise, WriteTo returns nil.
The debug parameter enables additional output.
Passing debug=0 writes the gzip-compressed protocol buffer described
in https://github.com/google/pprof/tree/main/proto#overview.
Passing debug=1 writes the legacy text format with comments
translating addresses to function names and line numbers, so that a
programmer can read the profile without tools.
The predefined profiles may assign meaning to other debug values;
for example, when printing the "goroutine" profile, debug=2 means to
print the goroutine stacks in the same form that a Go program uses
when dying due to an unrecovered panic.
func Lookup(name string) *Profile
func NewProfile(name string) *Profile
func Profiles() []*Profile
func net/http/pprof.collectProfile(p *Profile) (*profile.Profile, error)
var allocsProfile *Profile
var blockProfile *Profile
var goroutineProfile *Profile
var heapProfile *Profile
var mutexProfile *Profile
var threadcreateProfile *Profile
A countProfile is a set of stack traces to be printed as counts
grouped by stack trace. There are multiple implementations:
all that matters is that we can find out how many traces there are
and obtain each trace in turn.
(countProfile) Label(i int) *labelMap
(countProfile) Len() int
(countProfile) Stack(i int) []uintptr
*runtimeProfile
stackProfile
func printCountProfile(w io.Writer, debug int, name string, p countProfile) error
keysByCount sorts keys with higher counts first, breaking ties by key string order.
count map[string]int
keys []string
(*keysByCount) Len() int
(*keysByCount) Less(i, j int) bool
(*keysByCount) Swap(i, j int)
*keysByCount : sort.Interface
labelContextKey is the type of contextKeys used for profiler labels.
labelMap is the representation of the label set held in the context type.
This is an initial implementation, but it will be replaced with something
that admits incremental immutable modification more efficiently.
LabelSet LabelSet
LabelSet.list []label
String satisfies Stringer and returns key, value pairs in a consistent
order.
*labelMap : expvar.Var
*labelMap : fmt.Stringer
*labelMap : runtime.stringer
*labelMap : context.stringer
*labelMap : github.com/aws/smithy-go/middleware.stringer
func labelValue(ctx context.Context) labelMap
firstPCFrames and firstPCSymbolizeResult hold the results of the
allFrames call for the first (leaf-most) PC this locInfo represents.
firstPCSymbolizeResult symbolizeFlag
Location ID assigned by the profileBuilder.
Sequence of PCs, including the fake PCs returned by the traceback
to represent inlined functions.
https://github.com/golang/go/blob/d6f2f833c93a41ec1c68e49804b8387a06b131c5/src/runtime/traceback.go#L347-L368
// A string that uniquely identifies a particular program version with high probability.
// The limit of the address range occupied by this mapping.
// map entry was faked; /proc/self/maps wasn't available
// The object this entry is loaded from.
funcs symbolizeFlag
// Offset in the binary that corresponds to the first mapped address.
initialized as reading mapping
// Address at which the binary (or DLL) is loaded into memory.
pcDeck is a helper to detect a sequence of inlined functions from
a stack trace returned by the runtime.
The stack traces returned by runtime's traceback functions are fully
expanded (at least for Go functions) and include the fake pcs representing
inlined functions. The profile proto expects the inlined functions to be
encoded in one Location message.
https://github.com/google/pprof/blob/5e965273ee43930341d897407202dd5e10e952cb/proto/profile.proto#L177-L184
Runtime does not directly expose whether a frame is for an inlined function
and looking up debug info is not ideal, so we use a heuristic to filter
the fake pcs and restore the inlined and entry functions. Inlined functions
have the following properties:
Frame's Func is nil (note: also true for non-Go functions), and
Frame's Entry matches its entry function frame's Entry (note: could also be true for recursive calls and non-Go functions), and
Frame's Name does not match its entry function frame's name (note: inlined functions cannot be directly recursive).
While reading and processing the pcs in a stack trace one by one (from leaf to root),
we use pcDeck to temporarily hold the observed pcs and their expanded frames
until we observe the entry function frame.
firstPCFrames indicates the number of frames associated with the first
(leaf-most) PC in the deck.
firstPCSymbolizeResult holds the results of the allFrames call for the
first (leaf-most) PC in the deck.
frames []runtime.Frame
pcs []uintptr
symbolizeResult symbolizeFlag
(*pcDeck) reset()
tryAdd tries to add the pc and Frames expanded from it (most likely one,
since the stack trace is already fully expanded) and the symbolizeResult
to the deck. If it fails, the caller needs to flush the deck and retry.
A profileBuilder writes a profile incrementally from a
stream of profile samples delivered by the runtime.
deck pcDeck
end time.Time
// Package path-qualified function name to Function.ID
havePeriod bool
// list of locInfo starting with the given PC.
m profMap
mem []memMap
pb protobuf
period int64
start time.Time
stringMap map[string]int
strings []string
encoding state
zw *gzip.Writer
addCPUData adds the CPU profiling data to the profile.
The data must be a whole number of records, as delivered by the runtime.
len(tags) must be equal to the number of records in data.
(*profileBuilder) addMapping(lo, hi, offset uint64, file, buildID string)
(*profileBuilder) addMappingEntry(lo, hi, offset uint64, file, buildID string, fake bool)
appendLocsForStack appends the location IDs for the given stack trace to the given
location ID slice, locs. The addresses in the stack are return PCs or 1 + the PC of
an inline marker as the runtime traceback function returns.
It may return an empty slice even if locs is non-empty, for example if locs consists
solely of runtime.goexit. We still count these empty stacks in profiles in order to
get the right cumulative sample count.
It may emit to b.pb, so there must be no message encoding in progress.
build completes and returns the constructed profile.
emitLocation emits the new location and function information recorded in the deck
and returns the location ID encoded in the profile protobuf.
It emits to b.pb, so there must be no message encoding in progress.
It resets the deck.
(*profileBuilder) flush()
pbLabel encodes a Label message to b.pb.
pbLine encodes a Line message to b.pb.
pbMapping encodes a Mapping message to b.pb.
pbSample encodes a Sample message to b.pb.
pbValueType encodes a ValueType message to b.pb.
readMapping reads /proc/self/maps and writes mappings to b.pb.
It saves the address ranges of the mappings in b.mem for use
when emitting locations.
stringIndex adds s to the string table if not already present
and returns the index of s in the string table.
func newProfileBuilder(w io.Writer) *profileBuilder
symbolizeFlag keeps track of symbolization result.
0 : no symbol lookup was performed
1<<0 (lookupTried) : symbol lookup was performed
1<<1 (lookupFailed): symbol lookup was performed but failed
func allFrames(addr uintptr) ([]runtime.Frame, symbolizeFlag)
const lookupFailed
const lookupTried
Package-Level Functions (total 60, in which 12 are exported)
Do calls f with a copy of the parent context with the
given labels added to the parent's label map.
Goroutines spawned while executing f will inherit the augmented label-set.
Each key/value pair in labels is inserted into the label map in the
order provided, overriding any previous value for the same key.
The augmented label map will be set for the duration of the call to f
and restored once f returns.
ForLabels invokes f with each label set on the context.
The function f should return true to continue iteration or false to stop iteration early.
Label returns the value of the label with the given key on ctx, and a boolean indicating
whether that label exists.
Labels takes an even number of strings representing key-value pairs
and makes a [LabelSet] containing them.
A label overwrites a prior label with the same key.
Currently only the CPU and goroutine profiles utilize any labels
information.
See https://golang.org/issue/23458 for details.
Lookup returns the profile with the given name, or nil if no such profile exists.
NewProfile creates a new profile with the given name.
If a profile with that name already exists, NewProfile panics.
The convention is to use an 'import/path.' prefix to create
separate name spaces for each package.
For compatibility with various tools that read pprof data,
profile names should not contain spaces.
Profiles returns a slice of all the known profiles, sorted by name.
SetGoroutineLabels sets the current goroutine's labels to match ctx.
A new goroutine inherits the labels of the goroutine that created it.
This is a lower-level API than [Do], which should be used instead when possible.
StartCPUProfile enables CPU profiling for the current process.
While profiling, the profile will be buffered and written to w.
StartCPUProfile returns an error if profiling is already enabled.
On Unix-like systems, StartCPUProfile does not work by default for
Go code built with -buildmode=c-archive or -buildmode=c-shared.
StartCPUProfile relies on the SIGPROF signal, but that signal will
be delivered to the main program's SIGPROF signal handler (if any)
not to the one used by Go. To make it work, call [os/signal.Notify]
for [syscall.SIGPROF], but note that doing so may break any profiling
being done by the main program.
StopCPUProfile stops the current CPU profile, if any.
StopCPUProfile only returns after all the writes for the
profile have completed.
WithLabels returns a new [context.Context] with the given labels added.
A label overwrites a prior label with the same key.
WriteHeapProfile is shorthand for [Lookup]("heap").WriteTo(w, 0).
It is preserved for backwards compatibility.
countBlock returns the number of records in the blocking profile.
countGoroutine returns the number of goroutines.
countHeap returns the number of records in the heap profile.
countMutex returns the number of records in the mutex profile.
countThreadCreate returns the size of the current ThreadCreateProfile.
elfBuildID returns the GNU build ID of the named ELF binary,
without introducing a dependency on debug/elf and its dependencies.
expandInlinedFrames copies the call stack from pcs into dst, expanding any
PCs corresponding to inlined calls into the corresponding PCs for the inlined
functions. Returns the number of frames copied to dst.
newProfileBuilder returns a new profileBuilder.
CPU profiling data obtained from the runtime can be added
by calling b.addCPUData, and then the eventual profile
can be obtained by calling b.build.
peBuildID returns a best effort unique ID for the named executable.
It would be wasteful to calculate the hash of the whole file,
instead use the binary name and the last modified time for the buildid.
printCountCycleProfile outputs block profile records (for block or mutex profiles)
in the pprof-proto format. Cycle counts are translated to time durations
because the proto expects count and time (nanoseconds) instead of count
and the number of cycles for block and contention profiles.
printCountProfile prints a countProfile at the specified debug level.
The profile will be in compressed proto format unless debug is nonzero.
printStackRecord prints the function + source line information
for a single stack trace.
readProfile, provided by the runtime, returns the next chunk of
binary CPU profiling stack trace data, blocking until data is available.
If profiling is turned off and all the profile data accumulated while it was
on has been returned, readProfile returns eof=true.
The caller must save the returned data and tags before calling readProfile again.
runtime_expandFinalInlineFrame is defined in runtime/symtab.go.
runtime_FrameStartLine is defined in runtime/symtab.go.
runtime_FrameSymbolName is defined in runtime/symtab.go.
runtime_getProfLabel is defined in runtime/proflabel.go.
runtime_setProfLabel is defined in runtime/proflabel.go.
scaleHeapSample adjusts the data from a heap Sample to
account for its probability of appearing in the collected
data. Heap profiles are a sampling of the memory allocation
requests in a program. We estimate the unsampled value by dividing
each collected sample by its probability of appearing in the
profile. Heap profiles rely on a Poisson process to determine
which samples to collect, based on the desired average collection
rate R. The probability of a sample of size S appearing in that
profile is 1-exp(-S/R).
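The scaling described above amounts to multiplying by 1/(1-exp(-S/R)). The sketch below is a hedged approximation rather than the package's code: the name scaleSample is hypothetical, S is taken to be the average size per sampled allocation (size/count), and rates of 1 or less are assumed to need no adjustment.

```go
package main

import (
	"fmt"
	"math"
)

// scaleSample estimates unsampled count and size from sampled values,
// given the average sampling rate in bytes.
func scaleSample(count, size, rate int64) (int64, int64) {
	if count == 0 || size == 0 {
		return 0, 0
	}
	if rate <= 1 {
		// Assumption: rate 1 means every allocation was sampled.
		return count, size
	}
	avgSize := float64(size) / float64(count)
	// Divide by the probability 1-exp(-S/R) of the sample being collected.
	scale := 1 / (1 - math.Exp(-avgSize/float64(rate)))
	return int64(float64(count) * scale), int64(float64(size) * scale)
}

func main() {
	const rate = 512 * 1024
	c, s := scaleSample(10, 10*4096, rate) // ten sampled 4 KiB allocations
	fmt.Println("estimated count:", c, "estimated bytes:", s)
}
```

Small allocations are scaled up heavily because each one is unlikely to be sampled, while allocations much larger than R are sampled almost always and are left nearly unchanged.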