// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Garbage collector (GC).
//
// The GC runs concurrently with mutator threads, is type accurate (aka precise), and allows multiple
// GC threads to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is
// non-generational and non-compacting. Allocation is done using size segregated per P allocation
// areas to minimize fragmentation while eliminating locks in the common case.
//
// The algorithm decomposes into several steps.
// This is a high level description of the algorithm being used. For an overview of GC a good
// place to start is Richard Jones' gchandbook.org.
//
// The algorithm's intellectual heritage includes Dijkstra's on-the-fly algorithm, see
// Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. 1978.
// On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21, 11 (November 1978),
// 966-975.
// For journal quality proofs that these steps are complete, correct, and terminate see
// Hudson, R., and Moss, J.E.B. Copying Garbage Collection without stopping the world.
// Concurrency and Computation: Practice and Experience 15(3-5), 2003.
//
// 1. GC performs sweep termination.
//
//	a. Stop the world. This causes all Ps to reach a GC safe-point.
//
//	b. Sweep any unswept spans. There will only be unswept spans if
//	this GC cycle was forced before the expected time.
//
// 2. GC performs the mark phase.
//
//	a. Prepare for the mark phase by setting gcphase to _GCmark
//	(from _GCoff), enabling the write barrier, enabling mutator
//	assists, and enqueueing root mark jobs. No objects may be
//	scanned until all Ps have enabled the write barrier, which is
//	accomplished using STW.
//
//	b. Start the world.
//	From this point, GC work is done by mark
//	workers started by the scheduler and by assists performed as
//	part of allocation. The write barrier shades both the
//	overwritten pointer and the new pointer value for any pointer
//	writes (see mbarrier.go for details). Newly allocated objects
//	are immediately marked black.
//
//	c. GC performs root marking jobs. This includes scanning all
//	stacks, shading all globals, and shading any heap pointers in
//	off-heap runtime data structures. Scanning a stack stops a
//	goroutine, shades any pointers found on its stack, and then
//	resumes the goroutine.
//
//	d. GC drains the work queue of grey objects, scanning each grey
//	object to black and shading all pointers found in the object
//	(which in turn may add those pointers to the work queue).
//
//	e. Because GC work is spread across local caches, GC uses a
//	distributed termination algorithm to detect when there are no
//	more root marking jobs or grey objects (see gcMarkDone). At this
//	point, GC transitions to mark termination.
//
// 3. GC performs mark termination.
//
//	a. Stop the world.
//
//	b. Set gcphase to _GCmarktermination, and disable workers and
//	assists.
//
//	c. Perform housekeeping like flushing mcaches.
//
// 4. GC performs the sweep phase.
//
//	a. Prepare for the sweep phase by setting gcphase to _GCoff,
//	setting up sweep state and disabling the write barrier.
//
//	b. Start the world. From this point on, newly allocated objects
//	are white, and allocating sweeps spans before use if necessary.
//
//	c. GC does concurrent sweeping in the background and in response
//	to allocation. See description below.
//
// 5. When sufficient allocation has taken place, replay the sequence
// starting with 1 above. See discussion of GC rate below.

// Concurrent sweep.
//
// The sweep phase proceeds concurrently with normal program execution.
// The heap is swept span-by-span both lazily (when a goroutine needs another span)
// and concurrently in a background goroutine (this helps programs that are not CPU bound).
// At the end of STW mark termination all spans are marked as "needs sweeping".
//
// The background sweeper goroutine simply sweeps spans one-by-one.
//
// To avoid requesting more OS memory while there are unswept spans, when a
// goroutine needs another span, it first attempts to reclaim that much memory
// by sweeping. When a goroutine needs to allocate a new small-object span, it
// sweeps small-object spans for the same object size until it frees at least
// one object. When a goroutine needs to allocate a large-object span from the heap,
// it sweeps spans until it frees at least that many pages into the heap. There is
// one case where this may not suffice: if a goroutine sweeps and frees two
// nonadjacent one-page spans to the heap, it will allocate a new two-page
// span, but there can still be other one-page unswept spans which could be
// combined into a two-page span.
//
// It's critical to ensure that no operations proceed on unswept spans (that would corrupt
// mark bits in GC bitmap). During GC all mcaches are flushed into the central cache,
// so they are empty. When a goroutine grabs a new span into mcache, it sweeps it.
// When a goroutine explicitly frees an object or sets a finalizer, it ensures that
// the span is swept (either by sweeping it, or by waiting for the concurrent sweep to finish).
// The finalizer goroutine is kicked off only when all spans are swept.
// When the next GC starts, it sweeps all not-yet-swept spans (if any).

// GC rate.
// Next GC is after we've allocated an extra amount of memory proportional to
// the amount already in use.
// The proportion is controlled by the GOGC environment variable
// (100 by default). If GOGC=100 and we're using 4M, we'll GC again when we get to 8M
// (this mark is computed by the gcController.heapGoal method). This keeps the GC cost in
// linear proportion to the allocation cost. Adjusting GOGC just changes the linear constant
// (and also the amount of extra memory used).

// Oblets
//
// In order to prevent long pauses while scanning large objects and to
// improve parallelism, the garbage collector breaks up scan jobs for
// objects larger than maxObletBytes into "oblets" of at most
// maxObletBytes. When scanning encounters the beginning of a large
// object, it scans only the first oblet and enqueues the remaining
// oblets as new scan jobs.

package runtime

import (
	"internal/cpu"
	"runtime/internal/atomic"
	"unsafe"
)

const (
	_DebugGC         = 0
	_ConcurrentSweep = true
	_FinBlockSize    = 4 * 1024

	// debugScanConservative enables debug logging for stack
	// frames that are scanned conservatively.
	debugScanConservative = false

	// sweepMinHeapDistance is a lower bound on the heap distance
	// (in bytes) reserved for concurrent sweeping between GC
	// cycles.
	sweepMinHeapDistance = 1024 * 1024
)

func gcinit() {
	if unsafe.Sizeof(workbuf{}) != _WorkbufSize {
		throw("size of Workbuf is suboptimal")
	}
	// No sweep on the first cycle.
	sweep.active.state.Store(sweepDrainedMask)

	// Initialize GC pacer state.
	// Use the environment variable GOGC for the initial gcPercent value.
	// Use the environment variable GOMEMLIMIT for the initial memoryLimit value.
	gcController.init(readGOGC(), readGOMEMLIMIT())

	work.startSema = 1
	work.markDoneSema = 1
	lockInit(&work.sweepWaiters.lock, lockRankSweepWaiters)
	lockInit(&work.assistQueue.lock, lockRankAssistQueue)
	lockInit(&work.wbufSpans.lock, lockRankWbufSpans)
}

// gcenable is called after the bulk of the runtime initialization,
// just before we're about to start letting user code run.
// It kicks off the background sweeper goroutine, the background
// scavenger goroutine, and enables GC.
func gcenable() {
	// Kick off sweeping and scavenging.
	c := make(chan int, 2)
	go bgsweep(c)
	go bgscavenge(c)
	<-c
	<-c
	memstats.enablegc = true // now that runtime is initialized, GC is okay
}

// Garbage collector phase.
// Indicates to the write barrier and synchronization code what task to perform.
var gcphase uint32

// The compiler knows about this variable.
// If you change it, you must change builtin/runtime.go, too.
// If you change the first four bytes, you must also change the write
// barrier insertion code.
var writeBarrier struct {
	enabled bool    // compiler emits a check of this before calling write barrier
	pad     [3]byte // compiler uses 32-bit load for "enabled" field
	needed  bool    // whether we need a write barrier for current GC phase
	cgo     bool    // whether we need a write barrier for a cgo check
	alignme uint64  // guarantee alignment so that compiler can use a 32 or 64-bit load
}

// gcBlackenEnabled is 1 if mutator assists and background mark
// workers are allowed to blacken objects. This must only be set when
// gcphase == _GCmark.
var gcBlackenEnabled uint32

const (
	_GCoff             = iota // GC not running; sweeping in background, write barrier disabled
	_GCmark                   // GC marking roots and workbufs: allocate black, write barrier ENABLED
	_GCmarktermination        // GC mark termination: allocate black, P's help GC, write barrier ENABLED
)

//go:nosplit
func setGCPhase(x uint32) {
	atomic.Store(&gcphase, x)
	writeBarrier.needed = gcphase == _GCmark || gcphase == _GCmarktermination
	writeBarrier.enabled = writeBarrier.needed || writeBarrier.cgo
}

// gcMarkWorkerMode represents the mode that a concurrent mark worker
// should operate in.
//
// Concurrent marking happens through four different mechanisms. One
// is mutator assists, which happen in response to allocations and are
// not scheduled. The other three are variations in the per-P mark
// workers and are distinguished by gcMarkWorkerMode.
type gcMarkWorkerMode int

const (
	// gcMarkWorkerNotWorker indicates that the next scheduled G is not
	// starting work and the mode should be ignored.
	gcMarkWorkerNotWorker gcMarkWorkerMode = iota

	// gcMarkWorkerDedicatedMode indicates that the P of a mark
	// worker is dedicated to running that mark worker. The mark
	// worker should run without preemption.
	gcMarkWorkerDedicatedMode

	// gcMarkWorkerFractionalMode indicates that a P is currently
	// running the "fractional" mark worker. The fractional worker
	// is necessary when GOMAXPROCS*gcBackgroundUtilization is not
	// an integer and using only dedicated workers would result in
	// utilization too far from the target of gcBackgroundUtilization.
	// The fractional worker should run until it is preempted and
	// will be scheduled to pick up the fractional part of
	// GOMAXPROCS*gcBackgroundUtilization.
	gcMarkWorkerFractionalMode

	// gcMarkWorkerIdleMode indicates that a P is running the mark
	// worker because it has nothing else to do. The idle worker
	// should run until it is preempted and account its time
	// against gcController.idleMarkTime.
	gcMarkWorkerIdleMode
)

// gcMarkWorkerModeStrings are the string labels of gcMarkWorkerModes
// to use in execution traces.
var gcMarkWorkerModeStrings = [...]string{
	"Not worker",
	"GC (dedicated)",
	"GC (fractional)",
	"GC (idle)",
}

// pollFractionalWorkerExit reports whether a fractional mark worker
// should self-preempt. It assumes it is called from the fractional
// worker.
func pollFractionalWorkerExit() bool {
	// This should be kept in sync with the fractional worker
	// scheduler logic in findRunnableGCWorker.
	now := nanotime()
	delta := now - gcController.markStartTime
	if delta <= 0 {
		return true
	}
	p := getg().m.p.ptr()
	selfTime := p.gcFractionalMarkTime + (now - p.gcMarkWorkerStartTime)
	// Add some slack to the utilization goal so that the
	// fractional worker isn't behind again the instant it exits.
	return float64(selfTime)/float64(delta) > 1.2*gcController.fractionalUtilizationGoal
}

var work workType

type workType struct {
	full  lfstack          // lock-free list of full workbuf blocks
	empty lfstack          // lock-free list of empty workbuf blocks
	pad0  cpu.CacheLinePad // prevents false-sharing between full/empty and nproc/nwait

	wbufSpans struct {
		lock mutex
		// free is a list of spans dedicated to workbufs, but
		// that don't currently contain any workbufs.
		free mSpanList
		// busy is a list of all spans containing workbufs on
		// one of the workbuf lists.
		busy mSpanList
	}

	// Restore 64-bit alignment on 32-bit.
	_ uint32

	// bytesMarked is the number of bytes marked this cycle.
	// This includes bytes blackened in scanned objects, noscan objects
	// that go straight to black, and permagrey objects scanned by
	// markroot during the concurrent scan phase. This is updated
	// atomically during the cycle. Updates may be batched
	// arbitrarily, since the value is only read at the end of the
	// cycle.
	//
	// Because of benign races during marking, this number may not
	// be the exact number of marked bytes, but it should be very
	// close.
	//
	// Put this field here because it needs 64-bit atomic access
	// (and thus 8-byte alignment even on 32-bit architectures).
	bytesMarked uint64

	markrootNext uint32 // next markroot job
	markrootJobs uint32 // number of markroot jobs

	nproc  uint32
	tstart int64
	nwait  uint32

	// Number of roots of various root types. Set by gcMarkRootPrepare.
	//
	// nStackRoots == len(stackRoots), but we have nStackRoots for
	// consistency.
	nDataRoots, nBSSRoots, nSpanRoots, nStackRoots int

	// Base indexes of each root type. Set by gcMarkRootPrepare.
	baseData, baseBSS, baseSpans, baseStacks, baseEnd uint32

	// stackRoots is a snapshot of all of the Gs that existed
	// before the beginning of concurrent marking. The backing
	// store of this must not be modified because it might be
	// shared with allgs.
	stackRoots []*g

	// Each type of GC state transition is protected by a lock.
	// Since multiple threads can simultaneously detect the state
	// transition condition, any thread that detects a transition
	// condition must acquire the appropriate transition lock,
	// re-check the transition condition and return if it no
	// longer holds or perform the transition if it does.
	// Likewise, any transition must invalidate the transition
	// condition before releasing the lock.
	// This ensures that each transition is performed by exactly
	// one thread and threads that need the transition to happen
	// block until it has happened.
	//
	// startSema protects the transition from "off" to mark or
	// mark termination.
	startSema uint32
	// markDoneSema protects transitions from mark to mark termination.
	markDoneSema uint32

	bgMarkReady note   // signal background mark worker has started
	bgMarkDone  uint32 // cas to 1 when at a background mark completion point
	// Background mark completion signaling

	// mode is the concurrency mode of the current GC cycle.
	mode gcMode

	// userForced indicates the current GC cycle was forced by an
	// explicit user call.
	userForced bool

	// totaltime is the CPU nanoseconds spent in GC since the
	// program started if debug.gctrace > 0.
	totaltime int64

	// initialHeapLive is the value of gcController.heapLive at the
	// beginning of this GC cycle.
	initialHeapLive uint64

	// assistQueue is a queue of assists that are blocked because
	// there was neither enough credit to steal nor enough work to
	// do.
	assistQueue struct {
		lock mutex
		q    gQueue
	}

	// sweepWaiters is a list of blocked goroutines to wake when
	// we transition from mark termination to sweep.
	sweepWaiters struct {
		lock mutex
		list gList
	}

	// cycles is the number of completed GC cycles, where a GC
	// cycle is sweep termination, mark, mark termination, and
	// sweep. This differs from memstats.numgc, which is
	// incremented at mark termination.
	cycles uint32

	// Timing/utilization stats for this cycle.
	stwprocs, maxprocs                 int32
	tSweepTerm, tMark, tMarkTerm, tEnd int64 // nanotime() of phase start

	pauseNS    int64 // total STW time this cycle
	pauseStart int64 // nanotime() of last STW

	// debug.gctrace heap sizes for this cycle.
	heap0, heap1, heap2 uint64
}

// GC runs a garbage collection and blocks the caller until the
// garbage collection is complete. It may also block the entire
// program.
func GC() {
	// We consider a cycle to be: sweep termination, mark, mark
	// termination, and sweep. This function shouldn't return
	// until a full cycle has been completed, from beginning to
	// end. Hence, we always want to finish up the current cycle
	// and start a new one. That means:
	//
	// 1. In sweep termination, mark, or mark termination of cycle
	// N, wait until mark termination N completes and transitions
	// to sweep N.
	//
	// 2. In sweep N, help with sweep N.
	//
	// At this point we can begin a full cycle N+1.
	//
	// 3. Trigger cycle N+1 by starting sweep termination N+1.
	//
	// 4. Wait for mark termination N+1 to complete.
	//
	// 5. Help with sweep N+1 until it's done.
	//
	// This all has to be written to deal with the fact that the
	// GC may move ahead on its own. For example, when we block
	// until mark termination N, we may wake up in cycle N+2.

	// Wait until the current sweep termination, mark, and mark
	// termination complete.
	n := atomic.Load(&work.cycles)
	gcWaitOnMark(n)

	// We're now in sweep N or later. Trigger GC cycle N+1, which
	// will first finish sweep N if necessary and then enter sweep
	// termination N+1.
	gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1})

	// Wait for mark termination N+1 to complete.
	gcWaitOnMark(n + 1)

	// Finish sweep N+1 before returning. We do this both to
	// complete the cycle and because runtime.GC() is often used
	// as part of tests and benchmarks to get the system into a
	// relatively stable and isolated state.
	for atomic.Load(&work.cycles) == n+1 && sweepone() != ^uintptr(0) {
		sweep.nbgsweep++
		Gosched()
	}

	// Callers may assume that the heap profile reflects the
	// just-completed cycle when this returns (historically this
	// happened because this was a STW GC), but right now the
	// profile still reflects mark termination N, not N+1.
	//
	// As soon as all of the sweep frees from cycle N+1 are done,
	// we can go ahead and publish the heap profile.
	//
	// First, wait for sweeping to finish. (We know there are no
	// more spans on the sweep queue, but we may be concurrently
	// sweeping spans, so we have to wait.)
	for atomic.Load(&work.cycles) == n+1 && !isSweepDone() {
		Gosched()
	}

	// Now we're really done with sweeping, so we can publish the
	// stable heap profile. Only do this if we haven't already hit
	// another mark termination.
	mp := acquirem()
	cycle := atomic.Load(&work.cycles)
	if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) {
		mProf_PostSweep()
	}
	releasem(mp)
}

// gcWaitOnMark blocks until GC finishes the Nth mark phase. If GC has
// already completed this mark phase, it returns immediately.
func gcWaitOnMark(n uint32) {
	for {
		// Disable phase transitions.
		lock(&work.sweepWaiters.lock)
		nMarks := atomic.Load(&work.cycles)
		if gcphase != _GCmark {
			// We've already completed this cycle's mark.
			nMarks++
		}
		if nMarks > n {
			// We're done.
			unlock(&work.sweepWaiters.lock)
			return
		}

		// Wait until sweep termination, mark, and mark
		// termination of cycle N complete.
		work.sweepWaiters.list.push(getg())
		goparkunlock(&work.sweepWaiters.lock, waitReasonWaitForGCCycle, traceEvGoBlock, 1)
	}
}

// gcMode indicates how concurrent a GC cycle should be.
type gcMode int

const (
	gcBackgroundMode gcMode = iota // concurrent GC and sweep
	gcForceMode                    // stop-the-world GC now, concurrent sweep
	gcForceBlockMode               // stop-the-world GC now and STW sweep (forced by user)
)

// A gcTrigger is a predicate for starting a GC cycle. Specifically,
// it is an exit condition for the _GCoff phase.
type gcTrigger struct {
	kind gcTriggerKind
	now  int64  // gcTriggerTime: current time
	n    uint32 // gcTriggerCycle: cycle number to start
}

type gcTriggerKind int

const (
	// gcTriggerHeap indicates that a cycle should be started when
	// the heap size reaches the trigger heap size computed by the
	// controller.
	gcTriggerHeap gcTriggerKind = iota

	// gcTriggerTime indicates that a cycle should be started when
	// it's been more than forcegcperiod nanoseconds since the
	// previous GC cycle.
	gcTriggerTime

	// gcTriggerCycle indicates that a cycle should be started if
	// we have not yet started cycle number gcTrigger.n (relative
	// to work.cycles).
	gcTriggerCycle
)

// test reports whether the trigger condition is satisfied, meaning
// that the exit condition for the _GCoff phase has been met. The exit
// condition should be tested when allocating.
func (t gcTrigger) test() bool {
	if !memstats.enablegc || panicking != 0 || gcphase != _GCoff {
		return false
	}
	switch t.kind {
	case gcTriggerHeap:
		// Non-atomic access to gcController.heapLive for performance. If
		// we are going to trigger on this, this thread just
		// atomically wrote gcController.heapLive anyway and we'll see our
		// own write.
		trigger, _ := gcController.trigger()
		return atomic.Load64(&gcController.heapLive) >= trigger
	case gcTriggerTime:
		if gcController.gcPercent.Load() < 0 {
			return false
		}
		lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime))
		return lastgc != 0 && t.now-lastgc > forcegcperiod
	case gcTriggerCycle:
		// t.n > work.cycles, but accounting for wraparound.
		return int32(t.n-work.cycles) > 0
	}
	return true
}

// gcStart starts the GC. It transitions from _GCoff to _GCmark (if
// debug.gcstoptheworld == 0) or performs all of GC (if
// debug.gcstoptheworld != 0).
//
// This may return without performing this transition in some cases,
// such as when called on a system stack or with locks held.
func gcStart(trigger gcTrigger) {
	// Since this is called from malloc and malloc is called in
	// the guts of a number of libraries that might be holding
	// locks, don't attempt to start GC in non-preemptible or
	// potentially unstable situations.
	mp := acquirem()
	if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {
		releasem(mp)
		return
	}
	releasem(mp)
	mp = nil

	// Pick up the remaining unswept/not being swept spans concurrently
	//
	// This shouldn't happen if we're being invoked in background
	// mode since proportional sweep should have just finished
	// sweeping everything, but rounding errors, etc, may leave a
	// few spans unswept. In forced mode, this is necessary since
	// GC can be forced at any point in the sweeping cycle.
	//
	// We check the transition condition continuously here in case
	// this G gets delayed into the next GC cycle.
	for trigger.test() && sweepone() != ^uintptr(0) {
		sweep.nbgsweep++
	}

	// Perform GC initialization and the sweep termination
	// transition.
	semacquire(&work.startSema)
	// Re-check transition condition under transition lock.
	if !trigger.test() {
		semrelease(&work.startSema)
		return
	}

	// For stats, check if this GC was forced by the user.
	work.userForced = trigger.kind == gcTriggerCycle

	// In gcstoptheworld debug mode, upgrade the mode accordingly.
	// We do this after re-checking the transition condition so
	// that multiple goroutines that detect the heap trigger don't
	// start multiple STW GCs.
	mode := gcBackgroundMode
	if debug.gcstoptheworld == 1 {
		mode = gcForceMode
	} else if debug.gcstoptheworld == 2 {
		mode = gcForceBlockMode
	}

	// Ok, we're doing it! Stop everybody else
	semacquire(&gcsema)
	semacquire(&worldsema)

	if trace.enabled {
		traceGCStart()
	}

	// Check that all Ps have finished deferred mcache flushes.
	for _, p := range allp {
		if fg := atomic.Load(&p.mcache.flushGen); fg != mheap_.sweepgen {
			println("runtime: p", p.id, "flushGen", fg, "!= sweepgen", mheap_.sweepgen)
			throw("p mcache not flushed")
		}
	}

	gcBgMarkStartWorkers()

	systemstack(gcResetMarkState)

	work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs
	if work.stwprocs > ncpu {
		// This is used to compute CPU time of the STW phases,
		// so it can't be more than ncpu, even if GOMAXPROCS is.
		work.stwprocs = ncpu
	}
	work.heap0 = atomic.Load64(&gcController.heapLive)
	work.pauseNS = 0
	work.mode = mode

	now := nanotime()
	work.tSweepTerm = now
	work.pauseStart = now
	if trace.enabled {
		traceGCSTWStart(1)
	}
	systemstack(stopTheWorldWithSema)
	// Finish sweep before we start concurrent scan.
	systemstack(func() {
		finishsweep_m()
	})

	// clearpools before we start the GC. If we wait, the memory will not be
	// reclaimed until the next GC cycle.
	clearpools()

	work.cycles++

	// Assists and workers can start the moment we start
	// the world.
	gcController.startCycle(now, int(gomaxprocs), trigger)

	// Notify the CPU limiter that assists may begin.
	gcCPULimiter.startGCTransition(true, now)

	// In STW mode, disable scheduling of user Gs. This may also
	// disable scheduling of this goroutine, so it may block as
	// soon as we start the world again.
	if mode != gcBackgroundMode {
		schedEnableUser(false)
	}

	// Enter concurrent mark phase and enable
	// write barriers.
	//
	// Because the world is stopped, all Ps will
	// observe that write barriers are enabled by
	// the time we start the world and begin
	// scanning.
	//
	// Write barriers must be enabled before assists are
	// enabled because they must be enabled before
	// any non-leaf heap objects are marked. Since
	// allocations are blocked until assists can
	// happen, we want to enable assists as early as
	// possible.
	setGCPhase(_GCmark)

	gcBgMarkPrepare() // Must happen before assist enable.
	gcMarkRootPrepare()

	// Mark all active tinyalloc blocks. Since we're
	// allocating from these, they need to be black like
	// other allocations. The alternative is to blacken
	// the tiny block on every allocation from it, which
	// would slow down the tiny allocator.
	gcMarkTinyAllocs()

	// At this point all Ps have enabled the write
	// barrier, thus maintaining the no white to
	// black invariant. Enable mutator assists to
	// put back-pressure on fast allocating
	// mutators.
	atomic.Store(&gcBlackenEnabled, 1)

	// In STW mode, we could block the instant systemstack
	// returns, so make sure we're not preemptible.
	mp = acquirem()

	// Concurrent mark.
	systemstack(func() {
		now = startTheWorldWithSema(trace.enabled)
		work.pauseNS += now - work.pauseStart
		work.tMark = now
		memstats.gcPauseDist.record(now - work.pauseStart)

		// Release the CPU limiter.
		gcCPULimiter.finishGCTransition(now)
	})

	// Release the world sema before Gosched() in STW mode
	// because we will need to reacquire it later but before
	// this goroutine becomes runnable again, and we could
	// self-deadlock otherwise.
	semrelease(&worldsema)
	releasem(mp)

	// Make sure we block instead of returning to user code
	// in STW mode.
	if mode != gcBackgroundMode {
		Gosched()
	}

	semrelease(&work.startSema)
}

// gcMarkDoneFlushed counts the number of P's with flushed work.
//
// Ideally this would be a captured local in gcMarkDone, but forEachP
// escapes its callback closure, so it can't capture anything.
//
// This is protected by markDoneSema.
var gcMarkDoneFlushed uint32

// gcMarkDone transitions the GC from mark to mark termination if all
// reachable objects have been marked (that is, there are no grey
// objects and can be no more in the future). Otherwise, it flushes
// all local work to the global queues where it can be discovered by
// other workers.
//
// This should be called when all local mark work has been drained and
// there are no remaining workers. Specifically, when
//
//	work.nwait == work.nproc && !gcMarkWorkAvailable(p)
//
// The calling context must be preemptible.
//
// Flushing local work is important because idle Ps may have local
// work queued. This is the only way to make that work visible and
// drive GC to completion.
//
// It is explicitly okay to have write barriers in this function. If
// it does transition to mark termination, then all reachable objects
// have been marked, so the write barrier cannot shade any more
// objects.
func gcMarkDone() {
	// Ensure only one thread is running the ragged barrier at a
	// time.
	semacquire(&work.markDoneSema)

top:
	// Re-check transition condition under transition lock.
	//
	// It's critical that this checks the global work queues are
	// empty before performing the ragged barrier. Otherwise,
	// there could be global work that a P could take after the P
	// has passed the ragged barrier.
	if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) {
		semrelease(&work.markDoneSema)
		return
	}

	// forEachP needs worldsema to execute, and we'll need it to
	// stop the world later, so acquire worldsema now.
	semacquire(&worldsema)

	// Flush all local buffers and collect flushedWork flags.
	gcMarkDoneFlushed = 0
	systemstack(func() {
		gp := getg().m.curg
		// Mark the user stack as preemptible so that it may be scanned.
		// Otherwise, our attempt to force all P's to a safepoint could
		// result in a deadlock as we attempt to preempt a worker that's
		// trying to preempt us (e.g. for a stack scan).
		casgstatus(gp, _Grunning, _Gwaiting)
		forEachP(func(_p_ *p) {
			// Flush the write barrier buffer, since this may add
			// work to the gcWork.
			wbBufFlush1(_p_)

			// Flush the gcWork, since this may create global work
			// and set the flushedWork flag.
			//
			// TODO(austin): Break up these workbufs to
			// better distribute work.
			_p_.gcw.dispose()
			// Collect the flushedWork flag.
			if _p_.gcw.flushedWork {
				atomic.Xadd(&gcMarkDoneFlushed, 1)
				_p_.gcw.flushedWork = false
			}
		})
		casgstatus(gp, _Gwaiting, _Grunning)
	})

	if gcMarkDoneFlushed != 0 {
		// More grey objects were discovered since the
		// previous termination check, so there may be more
		// work to do. Keep going. It's possible the
		// transition condition became true again during the
		// ragged barrier, so re-check it.
		semrelease(&worldsema)
		goto top
	}

	// There was no global work, no local work, and no Ps
	// communicated work since we took markDoneSema.
	// Therefore there are no grey objects and no more objects can be
	// shaded. Transition to mark termination.
	now := nanotime()
	work.tMarkTerm = now
	work.pauseStart = now
	getg().m.preemptoff = "gcing"
	if trace.enabled {
		traceGCSTWStart(0)
	}
	systemstack(stopTheWorldWithSema)
	// The gcphase is _GCmark, it will transition to _GCmarktermination
	// below. The important thing is that the wb remains active until
	// all marking is complete. This includes writes made by the GC.

	// There is sometimes work left over when we enter mark termination due
	// to write barriers performed after the completion barrier above.
	// Detect this and resume concurrent mark. This is obviously
	// unfortunate.
	//
	// See issue #27993 for details.
	//
	// Switch to the system stack to call wbBufFlush1, though in this case
	// it doesn't matter because we're non-preemptible anyway.
	restart := false
	systemstack(func() {
		for _, p := range allp {
			wbBufFlush1(p)
			if !p.gcw.empty() {
				restart = true
				break
			}
		}
	})
	if restart {
		getg().m.preemptoff = ""
		systemstack(func() {
			now := startTheWorldWithSema(true)
			work.pauseNS += now - work.pauseStart
			memstats.gcPauseDist.record(now - work.pauseStart)
		})
		semrelease(&worldsema)
		goto top
	}

	gcComputeStartingStackSize()

	// Disable assists and background workers. We must do
	// this before waking blocked assists.
	atomic.Store(&gcBlackenEnabled, 0)

	// Notify the CPU limiter that GC assists will now cease.
	gcCPULimiter.startGCTransition(false, now)

	// Wake all blocked assists. These will run when we
	// start the world again.
	gcWakeAllAssists()

	// Likewise, release the transition lock. Blocked
	// workers and assists will run when we start the
	// world again.
	semrelease(&work.markDoneSema)

	// In STW mode, re-enable user goroutines. These will be
	// queued to run after we start the world.
	schedEnableUser(true)

	// endCycle depends on all gcWork cache stats being flushed.
	// The termination algorithm above ensured this, apart from
	// allocations made since the ragged barrier.
	gcController.endCycle(now, int(gomaxprocs), work.userForced)

	// Perform mark termination. This will restart the world.
	gcMarkTermination()
}

// World must be stopped and mark assists and background workers must be
// disabled.
func gcMarkTermination() {
	// Start marktermination (write barrier remains enabled for now).
	setGCPhase(_GCmarktermination)

	work.heap1 = gcController.heapLive
	startTime := nanotime()

	mp := acquirem()
	mp.preemptoff = "gcing"
	_g_ := getg()
	_g_.m.traceback = 2
	gp := _g_.m.curg
	casgstatus(gp, _Grunning, _Gwaiting)
	gp.waitreason = waitReasonGarbageCollection

	// Run gc on the g0 stack. We do this so that the g stack
	// we're currently running on will no longer change. Cuts
	// the root set down a bit (g0 stacks are not scanned, and
	// we don't need to scan gc's internal state). We also
	// need to switch to g0 so we can shrink the stack.
	systemstack(func() {
		gcMark(startTime)
		// Must return immediately.
		// The outer function's stack may have moved
		// during gcMark (it shrinks stacks, including the
		// outer function's stack), so we must not refer
		// to any of its variables. Return back to the
		// non-system stack to pick up the new addresses
		// before continuing.
	})

	systemstack(func() {
		work.heap2 = work.bytesMarked
		if debug.gccheckmark > 0 {
			// Run a full non-parallel, stop-the-world
			// mark using checkmark bits, to check that we
			// didn't forget to mark anything during the
			// concurrent mark process.
			startCheckmarks()
			gcResetMarkState()
			gcw := &getg().m.p.ptr().gcw
			gcDrain(gcw, 0)
			wbBufFlush1(getg().m.p.ptr())
			gcw.dispose()
			endCheckmarks()
		}

		// marking is complete so we can turn the write barrier off
		setGCPhase(_GCoff)
		gcSweep(work.mode)
	})

	_g_.m.traceback = 0
	casgstatus(gp, _Gwaiting, _Grunning)

	if trace.enabled {
		traceGCDone()
	}

	// all done
	mp.preemptoff = ""

	if gcphase != _GCoff {
		throw("gc done but gcphase != _GCoff")
	}

	// Record heapInUse for scavenger.
	memstats.lastHeapInUse = gcController.heapInUse.load()

	// Update GC trigger and pacing, as well as downstream consumers
	// of this pacing information, for the next cycle.
	systemstack(gcControllerCommit)

	// Update timing memstats
	now := nanotime()
	sec, nsec, _ := time_now()
	unixNow := sec*1e9 + int64(nsec)
	work.pauseNS += now - work.pauseStart
	work.tEnd = now
	memstats.gcPauseDist.record(now - work.pauseStart)
	atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user
	atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us
	memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS)
	memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow)
	memstats.pause_total_ns += uint64(work.pauseNS)

	// Update work.totaltime.
	sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm)
	// We report idle marking time below, but omit it from the
	// overall utilization here since it's "free".
	markCpu := gcController.assistTime.Load() + gcController.dedicatedMarkTime + gcController.fractionalMarkTime
	markTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm)
	cycleCpu := sweepTermCpu + markCpu + markTermCpu
	work.totaltime += cycleCpu

	// Compute overall GC CPU utilization.
	totalCpu := sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs)
	memstats.gc_cpu_fraction = float64(work.totaltime) / float64(totalCpu)

	// Reset assist time stat.
	//
	// Do this now, instead of at the start of the next GC cycle, because
	// these two may keep accumulating even if the GC is not active.
	mheap_.pages.scav.assistTime.Store(0)

	// Reset sweep state.
	sweep.nbgsweep = 0
	sweep.npausesweep = 0

	if work.userForced {
		memstats.numforcedgc++
	}

	// Bump GC cycle count and wake goroutines waiting on sweep.
	lock(&work.sweepWaiters.lock)
	memstats.numgc++
	injectglist(&work.sweepWaiters.list)
	unlock(&work.sweepWaiters.lock)

	// Release the CPU limiter.
	gcCPULimiter.finishGCTransition(now)

	// Finish the current heap profiling cycle and start a new
	// heap profiling cycle. We do this before starting the world
	// so events don't leak into the wrong cycle.
	mProf_NextCycle()

	// There may be stale spans in mcaches that need to be swept.
	// Those aren't tracked in any sweep lists, so we need to
	// count them against sweep completion until we ensure all
	// those spans have been forced out.
	sl := sweep.active.begin()
	if !sl.valid {
		throw("failed to set sweep barrier")
	}

	systemstack(func() { startTheWorldWithSema(true) })

	// Flush the heap profile so we can start a new cycle next GC.
	// This is relatively expensive, so we don't do it with the
	// world stopped.
	mProf_Flush()

	// Prepare workbufs for freeing by the sweeper.
	// We do this asynchronously because it can take non-trivial time.
	prepareFreeWorkbufs()

	// Free stack spans. This must be done between GC cycles.
	systemstack(freeStackSpans)

	// Ensure all mcaches are flushed. Each P will flush its own
	// mcache before allocating, but idle Ps may not. Since this
	// is necessary to sweep all spans, we need to ensure all
	// mcaches are flushed before we start the next GC cycle.
	systemstack(func() {
		forEachP(func(_p_ *p) {
			_p_.mcache.prepareForSweep()
		})
	})
	// Now that we've swept stale spans in mcaches, they don't
	// count against unswept spans.
	sweep.active.end(sl)

	// Print gctrace before dropping worldsema. As soon as we drop
	// worldsema another cycle could start and smash the stats
	// we're trying to print.
	if debug.gctrace > 0 {
		util := int(memstats.gc_cpu_fraction * 100)

		var sbuf [24]byte
		printlock()
		print("gc ", memstats.numgc,
			" @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ",
			util, "%: ")
		prev := work.tSweepTerm
		for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} {
			if i != 0 {
				print("+")
			}
			print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev))))
			prev = ns
		}
		print(" ms clock, ")
		for i, ns := range []int64{
			sweepTermCpu,
			gcController.assistTime.Load(),
			gcController.dedicatedMarkTime + gcController.fractionalMarkTime,
			gcController.idleMarkTime,
			markTermCpu,
		} {
			if i == 2 || i == 3 {
				// Separate mark time components with /.
				print("/")
			} else if i != 0 {
				print("+")
			}
			print(string(fmtNSAsMS(sbuf[:], uint64(ns))))
		}
		print(" ms cpu, ",
			work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ",
			gcController.lastHeapGoal>>20, " MB goal, ",
			atomic.Load64(&gcController.maxStackScan)>>20, " MB stacks, ",
			gcController.globalsScan>>20, " MB globals, ",
			work.maxprocs, " P")
		if work.userForced {
			print(" (forced)")
		}
		print("\n")
		printunlock()
	}

	semrelease(&worldsema)
	semrelease(&gcsema)
	// Careful: another GC cycle may start now.

	releasem(mp)
	mp = nil

	// now that gc is done, kick off finalizer thread if needed
	if !concurrentSweep {
		// give the queued finalizers, if any, a chance to run
		Gosched()
	}
}

// gcBgMarkStartWorkers prepares background mark worker goroutines. These
// goroutines will not run until the mark phase, but they must be started while
// the world is not stopped and from a regular G stack. The caller must hold
// worldsema.
func gcBgMarkStartWorkers() {
	// Background marking is performed by per-P G's. Ensure that each P has
	// a background GC G.
	//
	// Worker Gs don't exit if gomaxprocs is reduced. If it is raised
	// again, we can reuse the old workers; no need to create new workers.
	for gcBgMarkWorkerCount < gomaxprocs {
		go gcBgMarkWorker()

		notetsleepg(&work.bgMarkReady, -1)
		noteclear(&work.bgMarkReady)
		// The worker is now guaranteed to be added to the pool before
		// its P's next findRunnableGCWorker.

		gcBgMarkWorkerCount++
	}
}

// gcBgMarkPrepare sets up state for background marking.
// Mutator assists must not yet be enabled.
func gcBgMarkPrepare() {
	// Background marking will stop when the work queues are empty
	// and there are no more workers (note that, since this is
	// concurrent, this may be a transient state, but mark
	// termination will clean it up). Between background workers
	// and assists, we don't really know how many workers there
	// will be, so we pretend to have an arbitrarily large number
	// of workers, almost all of which are "waiting". While a
	// worker is working it decrements nwait. If nproc == nwait,
	// there are no workers.
	work.nproc = ^uint32(0)
	work.nwait = ^uint32(0)
}

// gcBgMarkWorkerNode is an entry in the gcBgMarkWorkerPool. It points to a single
// gcBgMarkWorker goroutine.
type gcBgMarkWorkerNode struct {
	// Unused workers are managed in a lock-free stack. This field must be first.
	node lfnode

	// The g of this worker.
	gp guintptr

	// Release this m on park. This is used to communicate with the unlock
	// function, which cannot access the G's stack. It is unused outside of
	// gcBgMarkWorker().
	m muintptr
}

func gcBgMarkWorker() {
	gp := getg()

	// We pass node to a gopark unlock function, so it can't be on
	// the stack (see gopark). Prevent deadlock from recursively
	// starting GC by disabling preemption.
	gp.m.preemptoff = "GC worker init"
	node := new(gcBgMarkWorkerNode)
	gp.m.preemptoff = ""

	node.gp.set(gp)

	node.m.set(acquirem())
	notewakeup(&work.bgMarkReady)
	// After this point, the background mark worker is generally scheduled
	// cooperatively by gcController.findRunnableGCWorker. While performing
	// work on the P, preemption is disabled because we are working on
	// P-local work buffers.
	// When the preempt flag is set, this puts itself
	// into _Gwaiting to be woken up by gcController.findRunnableGCWorker
	// at the appropriate time.
	//
	// When preemption is enabled (e.g., while in gcMarkDone), this worker
	// may be preempted and scheduled as a _Grunnable G from a runq. That is
	// fine; it will eventually gopark again for further scheduling via
	// findRunnableGCWorker.
	//
	// Since we disable preemption before notifying bgMarkReady, we
	// guarantee that this G will be in the worker pool for the next
	// findRunnableGCWorker. This isn't strictly necessary, but it reduces
	// latency between _GCmark starting and the workers starting.

	for {
		// Go to sleep until woken by
		// gcController.findRunnableGCWorker.
		gopark(func(g *g, nodep unsafe.Pointer) bool {
			node := (*gcBgMarkWorkerNode)(nodep)

			if mp := node.m.ptr(); mp != nil {
				// The worker G is no longer running; release
				// the M.
				//
				// N.B. it is _safe_ to release the M as soon
				// as we are no longer performing P-local mark
				// work.
				//
				// However, since we cooperatively stop work
				// when gp.preempt is set, if we releasem in
				// the loop then the following call to gopark
				// would immediately preempt the G. This is
				// also safe, but inefficient: the G must
				// schedule again only to enter gopark and park
				// again. Thus, we defer the release until
				// after parking the G.
				releasem(mp)
			}

			// Release this G to the pool.
			gcBgMarkWorkerPool.push(&node.node)
			// Note that at this point, the G may immediately be
			// rescheduled and may be running.
			return true
		}, unsafe.Pointer(node), waitReasonGCWorkerIdle, traceEvGoBlock, 0)

		// Preemption must not occur here, or another G might see
		// p.gcMarkWorkerMode.

		// Disable preemption so we can use the gcw.
		// If the scheduler wants to preempt us, we'll stop draining,
		// dispose the gcw, and then preempt.
		node.m.set(acquirem())
		pp := gp.m.p.ptr() // P can't change with preemption disabled.

		if gcBlackenEnabled == 0 {
			println("worker mode", pp.gcMarkWorkerMode)
			throw("gcBgMarkWorker: blackening not enabled")
		}

		if pp.gcMarkWorkerMode == gcMarkWorkerNotWorker {
			throw("gcBgMarkWorker: mode not set")
		}

		startTime := nanotime()
		pp.gcMarkWorkerStartTime = startTime
		var trackLimiterEvent bool
		if pp.gcMarkWorkerMode == gcMarkWorkerIdleMode {
			trackLimiterEvent = pp.limiterEvent.start(limiterEventIdleMarkWork, startTime)
		}

		decnwait := atomic.Xadd(&work.nwait, -1)
		if decnwait == work.nproc {
			println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc)
			throw("work.nwait was > work.nproc")
		}

		systemstack(func() {
			// Mark our goroutine preemptible so its stack
			// can be scanned. This lets two mark workers
			// scan each other (otherwise, they would
			// deadlock). We must not modify anything on
			// the G stack. However, stack shrinking is
			// disabled for mark workers, so it is safe to
			// read from the G stack.
			casgstatus(gp, _Grunning, _Gwaiting)
			switch pp.gcMarkWorkerMode {
			default:
				throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
			case gcMarkWorkerDedicatedMode:
				gcDrain(&pp.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)
				if gp.preempt {
					// We were preempted. This is
					// a useful signal to kick
					// everything out of the run
					// queue so it can run
					// somewhere else.
					if drainQ, n := runqdrain(pp); n > 0 {
						lock(&sched.lock)
						globrunqputbatch(&drainQ, int32(n))
						unlock(&sched.lock)
					}
				}
				// Go back to draining, this time
				// without preemption.
				gcDrain(&pp.gcw, gcDrainFlushBgCredit)
			case gcMarkWorkerFractionalMode:
				gcDrain(&pp.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)
			case gcMarkWorkerIdleMode:
				gcDrain(&pp.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)
			}
			casgstatus(gp, _Gwaiting, _Grunning)
		})

		// Account for time and mark us as stopped.
		now := nanotime()
		duration := now - startTime
		gcController.markWorkerStop(pp.gcMarkWorkerMode, duration)
		if trackLimiterEvent {
			pp.limiterEvent.stop(limiterEventIdleMarkWork, now)
		}
		if pp.gcMarkWorkerMode == gcMarkWorkerFractionalMode {
			atomic.Xaddint64(&pp.gcFractionalMarkTime, duration)
		}

		// Was this the last worker and did we run out
		// of work?
		incnwait := atomic.Xadd(&work.nwait, +1)
		if incnwait > work.nproc {
			println("runtime: p.gcMarkWorkerMode=", pp.gcMarkWorkerMode,
				"work.nwait=", incnwait, "work.nproc=", work.nproc)
			throw("work.nwait > work.nproc")
		}

		// We'll releasem after this point and thus this P may run
		// something else. We must clear the worker mode to avoid
		// attributing the mode to a different (non-worker) G in
		// traceGoStart.
		pp.gcMarkWorkerMode = gcMarkWorkerNotWorker

		// If this worker reached a background mark completion
		// point, signal the main GC goroutine.
		if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
			// We don't need the P-local buffers here, allow
			// preemption because we may schedule like a regular
			// goroutine in gcMarkDone (block on locks, etc).
			releasem(node.m.ptr())
			node.m.set(nil)

			gcMarkDone()
		}
	}
}

// gcMarkWorkAvailable reports whether executing a mark worker
// on p is potentially useful. p may be nil, in which case it only
// checks the global sources of work.
func gcMarkWorkAvailable(p *p) bool {
	if p != nil && !p.gcw.empty() {
		return true
	}
	if !work.full.empty() {
		return true // global work available
	}
	if work.markrootNext < work.markrootJobs {
		return true // root scan work available
	}
	return false
}

// gcMark runs the mark (or, for concurrent GC, mark termination).
// All gcWork caches must be empty.
// STW is in effect at this point.
func gcMark(startTime int64) {
	if debug.allocfreetrace > 0 {
		tracegc()
	}

	if gcphase != _GCmarktermination {
		throw("in gcMark expecting to see gcphase as _GCmarktermination")
	}
	work.tstart = startTime

	// Check that there's no marking work remaining.
	if work.full != 0 || work.markrootNext < work.markrootJobs {
		print("runtime: full=", hex(work.full), " next=", work.markrootNext, " jobs=", work.markrootJobs, " nDataRoots=", work.nDataRoots, " nBSSRoots=", work.nBSSRoots, " nSpanRoots=", work.nSpanRoots, " nStackRoots=", work.nStackRoots, "\n")
		panic("non-empty mark queue after concurrent mark")
	}

	if debug.gccheckmark > 0 {
		// This is expensive when there's a large number of
		// Gs, so only do it if checkmark is also enabled.
		gcMarkRootCheck()
	}
	if work.full != 0 {
		throw("work.full != 0")
	}

	// Drop allg snapshot. allgs may have grown, in which case
	// this is the only reference to the old backing store and
	// there's no need to keep it around.
	work.stackRoots = nil

	// Clear out buffers and double-check that all gcWork caches
	// are empty. This should be ensured by gcMarkDone before we
	// enter mark termination.
	//
	// TODO: We could clear out buffers just before mark if this
	// has a non-negligible impact on STW time.
	for _, p := range allp {
		// The write barrier may have buffered pointers since
		// the gcMarkDone barrier.
		// However, since the barrier
		// ensured all reachable objects were marked, all of
		// these must be pointers to black objects. Hence we
		// can just discard the write barrier buffer.
		if debug.gccheckmark > 0 {
			// For debugging, flush the buffer and make
			// sure it really was all marked.
			wbBufFlush1(p)
		} else {
			p.wbBuf.reset()
		}

		gcw := &p.gcw
		if !gcw.empty() {
			printlock()
			print("runtime: P ", p.id, " flushedWork ", gcw.flushedWork)
			if gcw.wbuf1 == nil {
				print(" wbuf1=<nil>")
			} else {
				print(" wbuf1.n=", gcw.wbuf1.nobj)
			}
			if gcw.wbuf2 == nil {
				print(" wbuf2=<nil>")
			} else {
				print(" wbuf2.n=", gcw.wbuf2.nobj)
			}
			print("\n")
			throw("P has cached GC work at end of mark termination")
		}
		// There may still be cached empty buffers, which we
		// need to flush since we're going to free them. Also,
		// there may be non-zero stats because we allocated
		// black after the gcMarkDone barrier.
		gcw.dispose()
	}

	// Flush scanAlloc from each mcache since we're about to modify
	// heapScan directly. If we were to flush this later, then scanAlloc
	// might have incorrect information.
	//
	// Note that it's not important to retain this information; we know
	// exactly what heapScan is at this point via scanWork.
	for _, p := range allp {
		c := p.mcache
		if c == nil {
			continue
		}
		c.scanAlloc = 0
	}

	// Reset controller state.
	gcController.resetLive(work.bytesMarked)
}

// gcSweep must be called on the system stack because it acquires the heap
// lock. See mheap for details.
//
// The world must be stopped.
//
//go:systemstack
func gcSweep(mode gcMode) {
	assertWorldStopped()

	if gcphase != _GCoff {
		throw("gcSweep being done but phase is not GCoff")
	}

	lock(&mheap_.lock)
	mheap_.sweepgen += 2
	sweep.active.reset()
	mheap_.pagesSwept.Store(0)
	mheap_.sweepArenas = mheap_.allArenas
	mheap_.reclaimIndex.Store(0)
	mheap_.reclaimCredit.Store(0)
	unlock(&mheap_.lock)

	sweep.centralIndex.clear()

	if !_ConcurrentSweep || mode == gcForceBlockMode {
		// Special case synchronous sweep.
		// Record that no proportional sweeping has to happen.
		lock(&mheap_.lock)
		mheap_.sweepPagesPerByte = 0
		unlock(&mheap_.lock)
		// Sweep all spans eagerly.
		for sweepone() != ^uintptr(0) {
			sweep.npausesweep++
		}
		// Free workbufs eagerly.
		prepareFreeWorkbufs()
		for freeSomeWbufs(false) {
		}
		// All "free" events for this mark/sweep cycle have
		// now happened, so we can make this profile cycle
		// available immediately.
		mProf_NextCycle()
		mProf_Flush()
		return
	}

	// Background sweep.
	lock(&sweep.lock)
	if sweep.parked {
		sweep.parked = false
		ready(sweep.g, 0, true)
	}
	unlock(&sweep.lock)
}

// gcResetMarkState resets global state prior to marking (concurrent
// or STW) and resets the stack scan state of all Gs.
//
// This is safe to do without the world stopped because any Gs created
// during or after this will start out in the reset state.
//
// gcResetMarkState must be called on the system stack because it acquires
// the heap lock. See mheap for details.
//
//go:systemstack
func gcResetMarkState() {
	// This may be called during a concurrent phase, so lock to make sure
	// allgs doesn't change.
	forEachG(func(gp *g) {
		gp.gcscandone = false // set to true in gcphasework
		gp.gcAssistBytes = 0
	})

	// Clear page marks. This is just 1MB per 64GB of heap, so the
	// time here is pretty trivial.
	lock(&mheap_.lock)
	arenas := mheap_.allArenas
	unlock(&mheap_.lock)
	for _, ai := range arenas {
		ha := mheap_.arenas[ai.l1()][ai.l2()]
		for i := range ha.pageMarks {
			ha.pageMarks[i] = 0
		}
	}

	work.bytesMarked = 0
	work.initialHeapLive = atomic.Load64(&gcController.heapLive)
}

// Hooks for other packages

var poolcleanup func()
var boringCaches []unsafe.Pointer // for crypto/internal/boring

//go:linkname sync_runtime_registerPoolCleanup sync.runtime_registerPoolCleanup
func sync_runtime_registerPoolCleanup(f func()) {
	poolcleanup = f
}

//go:linkname boring_registerCache crypto/internal/boring/bcache.registerCache
func boring_registerCache(p unsafe.Pointer) {
	boringCaches = append(boringCaches, p)
}

func clearpools() {
	// clear sync.Pools
	if poolcleanup != nil {
		poolcleanup()
	}

	// clear boringcrypto caches
	for _, p := range boringCaches {
		atomicstorep(p, nil)
	}

	// Clear central sudog cache.
	// Leave per-P caches alone, they have strictly bounded size.
	// Disconnect cached list before dropping it on the floor,
	// so that a dangling ref to one entry does not pin all of them.
	lock(&sched.sudoglock)
	var sg, sgnext *sudog
	for sg = sched.sudogcache; sg != nil; sg = sgnext {
		sgnext = sg.next
		sg.next = nil
	}
	sched.sudogcache = nil
	unlock(&sched.sudoglock)

	// Clear central defer pool.
	// Leave per-P pools alone, they have strictly bounded size.
	lock(&sched.deferlock)
	// disconnect cached list before dropping it on the floor,
	// so that a dangling ref to one entry does not pin all of them.
	var d, dlink *_defer
	for d = sched.deferpool; d != nil; d = dlink {
		dlink = d.link
		d.link = nil
	}
	sched.deferpool = nil
	unlock(&sched.deferlock)
}

// Timing

// itoaDiv formats val/(10**dec) into buf.
func itoaDiv(buf []byte, val uint64, dec int) []byte {
	i := len(buf) - 1
	idec := i - dec
	for val >= 10 || i >= idec {
		buf[i] = byte(val%10 + '0')
		i--
		if i == idec {
			buf[i] = '.'
			i--
		}
		val /= 10
	}
	buf[i] = byte(val + '0')
	return buf[i:]
}

// fmtNSAsMS nicely formats ns nanoseconds as milliseconds.
func fmtNSAsMS(buf []byte, ns uint64) []byte {
	if ns >= 10e6 {
		// Format as whole milliseconds.
		return itoaDiv(buf, ns/1e6, 0)
	}
	// Format two digits of precision, with at most three decimal places.
	x := ns / 1e3
	if x == 0 {
		buf[0] = '0'
		return buf[:1]
	}
	dec := 3
	for x >= 100 {
		x /= 10
		dec--
	}
	return itoaDiv(buf, x, dec)
}

// Helpers for testing GC.

// gcTestMoveStackOnNextCall causes the stack to be moved on a call
// immediately following the call to this. It may not work correctly
// if any other work appears after this call (such as returning).
// Typically the following call should be marked go:noinline so it
// performs a stack check.
//
// In rare cases this may not cause the stack to move, specifically if
// there's a preemption between this call and the next.
func gcTestMoveStackOnNextCall() {
	gp := getg()
	gp.stackguard0 = stackForceMove
}

// gcTestIsReachable performs a GC and returns a bit set where bit i
// is set if ptrs[i] is reachable.
func gcTestIsReachable(ptrs ...unsafe.Pointer) (mask uint64) {
	// This takes the pointers as unsafe.Pointers in order to keep
	// them live long enough for us to attach specials. After
	// that, we drop our references to them.

	if len(ptrs) > 64 {
		panic("too many pointers for uint64 mask")
	}

	// Block GC while we attach specials and drop our references
	// to ptrs. Otherwise, if a GC is in progress, it could mark
	// them reachable via this function before we have a chance to
	// drop them.
	semacquire(&gcsema)

	// Create reachability specials for ptrs.
	specials := make([]*specialReachable, len(ptrs))
	for i, p := range ptrs {
		lock(&mheap_.speciallock)
		s := (*specialReachable)(mheap_.specialReachableAlloc.alloc())
		unlock(&mheap_.speciallock)
		s.special.kind = _KindSpecialReachable
		if !addspecial(p, &s.special) {
			throw("already have a reachable special (duplicate pointer?)")
		}
		specials[i] = s
		// Make sure we don't retain ptrs.
		ptrs[i] = nil
	}

	semrelease(&gcsema)

	// Force a full GC and sweep.
	GC()

	// Process specials.
	for i, s := range specials {
		if !s.done {
			printlock()
			println("runtime: object", i, "was not swept")
			throw("IsReachable failed")
		}
		if s.reachable {
			mask |= 1 << i
		}
		lock(&mheap_.speciallock)
		mheap_.specialReachableAlloc.free(unsafe.Pointer(s))
		unlock(&mheap_.speciallock)
	}

	return mask
}

// gcTestPointerClass returns the category of what p points to, one of:
// "heap", "stack", "data", "bss", "other". This is useful for checking
// that a test is doing what it's intended to do.
//
// This is nosplit simply to avoid extra pointer shuffling that may
// complicate a test.
//
//go:nosplit
func gcTestPointerClass(p unsafe.Pointer) string {
	p2 := uintptr(noescape(p))
	gp := getg()
	if gp.stack.lo <= p2 && p2 < gp.stack.hi {
		return "stack"
	}
	if base, _, _ := findObject(p2, 0, 0); base != 0 {
		return "heap"
	}
	for _, datap := range activeModules() {
		if datap.data <= p2 && p2 < datap.edata || datap.noptrdata <= p2 && p2 < datap.enoptrdata {
			return "data"
		}
		if datap.bss <= p2 && p2 < datap.ebss || datap.noptrbss <= p2 && p2 <= datap.enoptrbss {
			return "bss"
		}
	}
	KeepAlive(p)
	return "other"
}
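
/*
Illustration only, not part of the runtime. The gctrace line printed by
gcMarkTermination formats its durations with itoaDiv and fmtNSAsMS from the
Timing section. The sketch below copies those two helpers into a standalone
program so their rounding behavior can be observed outside the runtime. The
main function, its buffer, and the sample durations are assumptions chosen
for the example; only the two helper bodies come from this file.

```go
package main

import "fmt"

// itoaDiv formats val/(10**dec) into buf, filling from the end.
// Copied from the Timing section for illustration.
func itoaDiv(buf []byte, val uint64, dec int) []byte {
	i := len(buf) - 1
	idec := i - dec
	for val >= 10 || i >= idec {
		buf[i] = byte(val%10 + '0')
		i--
		if i == idec {
			buf[i] = '.'
			i--
		}
		val /= 10
	}
	buf[i] = byte(val + '0')
	return buf[i:]
}

// fmtNSAsMS formats ns nanoseconds as milliseconds.
// Copied from the Timing section for illustration.
func fmtNSAsMS(buf []byte, ns uint64) []byte {
	if ns >= 10e6 {
		// 10 ms or more: whole milliseconds only.
		return itoaDiv(buf, ns/1e6, 0)
	}
	// Below 10 ms: two significant digits, at most three decimals.
	x := ns / 1e3
	if x == 0 {
		buf[0] = '0'
		return buf[:1]
	}
	dec := 3
	for x >= 100 {
		x /= 10
		dec--
	}
	return itoaDiv(buf, x, dec)
}

func main() {
	// Same scratch size as sbuf in gcMarkTermination.
	var buf [24]byte

	// Hypothetical durations, picked to exercise each branch.
	for _, ns := range []uint64{500, 42000, 250000, 1500000, 12345678} {
		fmt.Printf("%d ns -> %s ms\n", ns, fmtNSAsMS(buf[:], ns))
	}

	// The "@...s" field of gctrace divides milliseconds by 10**3.
	fmt.Printf("1234 ms -> %ss\n", itoaDiv(buf[:], 1234, 3))
}
```

With these inputs the program should print 0, 0.042, 0.25, 1.5 and 12 for the
millisecond values and 1.234 for the seconds value. Sub-microsecond durations
collapse to "0", and anything from 10 ms up loses its fractional part, which
is why short phases in a gctrace line can show as 0 ms.
*/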