Goroutine: Performance & Tradeoffs

Adding more goroutines does not always improve performance: concurrent systems run into practical limits and inefficiencies, chiefly contention, overhead, and resource constraints.

Contention

Contention occurs when multiple goroutines compete for the same resource (e.g., memory, CPU, or I/O).

  • Shared Memory Access: Goroutines might need to access shared variables or data structures. Using synchronization primitives like sync.Mutex or sync.RWMutex can serialize access, causing some goroutines to block.
  • I/O Bottlenecks: When multiple goroutines make network or disk I/O requests simultaneously, they compete for underlying hardware, creating contention.
  • CPU Resources: Even with multiple CPU cores, if the workload is CPU-bound, adding more goroutines won't help beyond the available cores.

Example

package main

import (
    "fmt"
    "sync"
)

func incrementCounter(wg *sync.WaitGroup, mu *sync.Mutex, counter *int) {
    defer wg.Done()
    mu.Lock()
    *counter++
    mu.Unlock()
}

func main() {
    var wg sync.WaitGroup
    var mu sync.Mutex
    counter := 0

    // Launch 1000 goroutines
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go incrementCounter(&wg, &mu, &counter)
    }

    wg.Wait()
    fmt.Println("Final Counter Value:", counter)
}

In this example:

  • Although 1000 goroutines run concurrently, they all serialize at mu.Lock(): only one goroutine can hold the lock and update the counter at a time.
  • The critical section therefore executes sequentially, so adding more goroutines adds blocking rather than speedup.

Overhead

Every goroutine has a small but non-zero cost. Adding a large number of goroutines can increase the overhead for the Go runtime.

  • Goroutine Scheduling: The Go runtime schedules goroutines onto OS threads. Managing thousands of goroutines adds scheduling overhead.
  • Memory Consumption: Each goroutine starts with a small stack (e.g., 2KB), which can grow as needed. Creating millions of goroutines can lead to significant memory usage.
  • Garbage Collection (GC): A large number of goroutines and shared memory usage can increase the workload for the garbage collector.
Example

package main

import (
    "fmt"
    "time"
)

func work() {
    time.Sleep(2 * time.Second) // Simulate a workload
}

func main() {
    for i := 0; i < 1_000_000; i++ { // Launch 1 million goroutines
        go work()
    }
    fmt.Println("All goroutines started")
    time.Sleep(5 * time.Second)
}

In this example:

  • While Go's runtime is efficient, 1 million goroutines at roughly 2KB of stack each need on the order of 2 GB of memory before doing any useful work, and scheduling that many can cause significant delays or exhaust memory entirely.

Resource Limits

Adding more goroutines cannot overcome the physical and logical resource limits of the system.

  • CPU-Bound Tasks: If the system has 4 CPU cores and the workload is entirely CPU-bound, adding more than 4 goroutines doesn't help. The additional goroutines will just increase context-switching overhead.
  • I/O-Bound Tasks: The underlying hardware (e.g., disk or network) might have limits on the number of simultaneous operations it can handle.

Example

package main

import (
    "fmt"
    "io"
    "net/http"
    "sync"
)

func fetchURL(url string, wg *sync.WaitGroup) {
    defer wg.Done()
    resp, err := http.Get(url)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    defer resp.Body.Close()
    io.Copy(io.Discard, resp.Body)
}

func main() {
    var wg sync.WaitGroup
    url := "https://example.com"

    for i := 0; i < 1000; i++ { // Launch 1000 concurrent requests
        wg.Add(1)
        go fetchURL(url, &wg)
    }

    wg.Wait()
    fmt.Println("All requests completed")
}

In this example:

  • If the network or the server cannot handle 1000 simultaneous connections, requests will fail or be queued, limiting performance.

How to Optimize

If you're reaching a point of diminishing returns with goroutines, consider these strategies:

Control Goroutine Concurrency

Use worker pools to limit the number of concurrent goroutines. This reduces contention and resource usage.

Example: Worker Pool

package main

import (
    "fmt"
    "time"
)

func worker(id int, jobs <-chan int, results chan<- int) {
    for job := range jobs {
        fmt.Printf("Worker %d processing job %d\n", id, job)
        time.Sleep(time.Second) // Simulate work
        results <- job * 2
    }
}

func main() {
    const numJobs = 5
    const numWorkers = 2

    jobs := make(chan int, numJobs)
    results := make(chan int, numJobs)

    for w := 1; w <= numWorkers; w++ {
        go worker(w, jobs, results)
    }

    for j := 1; j <= numJobs; j++ {
        jobs <- j
    }
    close(jobs)

    for a := 1; a <= numJobs; a++ {
        fmt.Println("Result:", <-results)
    }
}

Load Balancing

Distribute work evenly across the available resources (CPU cores, backend servers, network links) so that no single resource becomes a bottleneck while others sit idle.

Optimize Resource Usage

Profile your application using tools like pprof to identify bottlenecks in memory, CPU, or I/O.