Garbage Collection (GC) and Memory Performance in Rails

Garbage Collection (GC) is a critical aspect of memory management in Ruby and Rails applications. It automatically reclaims memory that is no longer in use, preventing memory leaks and ensuring efficient memory utilization. However, improper handling of memory allocation and GC can lead to performance bottlenecks, especially in large applications.

Ruby’s GC manages memory by tracking objects in the heap. When an object is no longer referenced (i.e., it is no longer reachable), the GC marks it for collection and frees the memory. Ruby uses a generational garbage collection algorithm, which is optimized for performance by focusing on younger objects (which are more likely to be garbage).

Steps in Ruby’s GC:

  1. Marking: Traverse all live objects starting from root objects (e.g., global variables, stack frames) and mark them as “in use.”
  2. Sweeping: Free memory occupied by unmarked objects (garbage).
  3. Compacting (optional): Rearrange live objects to reduce memory fragmentation.
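
These phases can be observed from Ruby itself. Below is a minimal sketch using the standard GC.stat and GC::Profiler APIs; the allocation loop is only there to generate short-lived garbage.

GC::Profiler.enable

runs_before = GC.stat[:count]
100_000.times { Object.new }                       # churn through short-lived objects
GC.start(full_mark: true, immediate_sweep: true)   # force a full mark-and-sweep pass

puts "GC runs since enabling: #{GC.stat[:count] - runs_before}"
puts GC::Profiler.result                           # per-collection timing report
GC::Profiler.disable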

Types of GC in Ruby

1) Mark-and-Sweep GC:

  • The default GC algorithm in older Ruby versions.
  • It scans all objects and frees unreachable ones.
  • Example: Ruby 1.8 used this approach.

2) Generational GC:

  • Introduced in Ruby 2.1.
  • Divides objects into “young” and “old” generations.
  • Young objects are collected more frequently, as they are more likely to become garbage.
# Ruby automatically uses generational GC
obj = Object.new
# obj is in the young generation
GC.start(full_mark: true, immediate_sweep: true) # Force a full GC
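
GC.stat exposes separate counters for the two generations, which is a quick way to confirm that minor collections run far more often than major ones (key names below are from CRuby's GC.stat and may vary slightly between versions):

stats = GC.stat
puts stats[:minor_gc_count]   # frequent, cheap collections that only scan young objects
puts stats[:major_gc_count]   # rarer full collections that also scan the old generation
puts stats[:old_objects]      # objects that have been promoted to the old generation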

3) Incremental GC:

  • Introduced in Ruby 2.2.
  • Reduces GC pause times by breaking the marking phase into smaller steps.
# Ruby automatically uses incremental GC
GC.start(full_mark: false) # Perform an incremental GC

4) Compaction GC:

  • Introduced in Ruby 2.7.
  • Reduces memory fragmentation by moving live objects closer together.
GC.auto_compact = true # Enable automatic compaction (available from Ruby 3.0)
GC.compact # Manually trigger compaction

Memory Allocation Best Practices

1) Use Symbols Instead of Strings:

Symbols are interned: the same symbol is a single object reused everywhere, while each bare string literal normally allocates a new String.

# Bad: Allocates new strings
hash = { "key" => "value" }

# Good: Uses symbols
hash = { key: :value }
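
A rough way to see the difference is to count allocations with ObjectSpace; a minimal sketch (exact counts depend on the Ruby version and on whether frozen string literals are enabled):

GC.start
before = ObjectSpace.count_objects[:T_STRING]
10_000.times { "key" }                               # each bare literal allocates a new String
puts ObjectSpace.count_objects[:T_STRING] - before   # close to 10_000, minus any already collected

before = ObjectSpace.count_objects[:T_SYMBOL]
10_000.times { :key }                                # the same Symbol object is reused every time
puts ObjectSpace.count_objects[:T_SYMBOL] - before   # roughly 0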

2) Freeze Strings:

Freeze strings that are evaluated repeatedly (for example, inside loops or hot methods). A frozen literal is deduplicated by the interpreter, so the same object is reused instead of a new String being allocated on every pass.

# Bad: allocates a fresh "foo" string on every iteration
1000.times { process("foo") }

# Good: the frozen literal is deduplicated and reused
1000.times { process("foo".freeze) }

3) Use Efficient Data Structures:

Choose the right data structure for the task.

# Bad: Uses an array for lookups
array = ["a", "b", "c"]
array.include?("b")

# Good: Uses a set for constant-time lookups
require "set" # needed before Ruby 3.2, where Set is not yet autoloaded
set = Set.new(["a", "b", "c"])
set.include?("b")

4) Garbage Collection:

  1. Let Ruby handle GC automatically; avoid frequent manual GC calls.
  2. Compact memory periodically in long-running applications.
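
For the second point, a hedged sketch of periodic compaction in a long-running process (the hourly interval and the background thread are illustrative, not a prescribed Rails pattern):

if GC.respond_to?(:compact)   # GC.compact is available from Ruby 2.7
  Thread.new do
    loop do
      sleep 3600              # once an hour; tune for your workload
      GC.compact              # defragment the heap between busy periods
    end
  end
end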

5) Memory Management:

  1. Use lazy enumerables to reduce intermediate object creation (see the sketch after this list).
  2. Paginate or batch process large datasets.
  3. Monitor memory usage with tools like memory_profiler and rack-mini-profiler.
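
The sketch referenced in the first point: a lazy chain processes elements one at a time and never materializes the million-element intermediate arrays.

first_ten = (1..1_000_000).lazy
  .map { |i| i * i }
  .select(&:even?)
  .first(10)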

6) Avoid Circular References

Ruby’s mark-and-sweep GC can reclaim cycles once nothing outside them is reachable, but circular references keep the entire object graph alive for as long as any part of it is still referenced, so they make it easy to retain far more memory than intended. Break links you no longer need so the graph can be collected sooner.

# Bad: Circular reference
class Node
  attr_accessor :next
end

node1 = Node.new
node2 = Node.new
node1.next = node2
node2.next = node1

# Good: Break circular references
node1.next = nil
node2.next = nil
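
If you need to keep a reference for bookkeeping without preventing collection, the standard library’s WeakRef is one option; a minimal sketch:

require "weakref"

ref = WeakRef.new(Node.new)
ref.weakref_alive?   # => true until the GC reclaims the underlying Node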

7) Avoid Large Payloads

Pass only necessary data to background jobs to reduce memory usage.

# Bad: Serializes a large object into the job payload
MyJob.perform_later(large_object)

# Good: Passes an ID or minimal data
MyJob.perform_later(object.id)
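
On the worker side, the job can re-fetch the record by its ID. A hypothetical sketch (the class, model, and process helper are placeholders):

class MyJob < ApplicationJob
  queue_as :default

  def perform(user_id)
    user = User.find(user_id)   # load only what the job actually needs
    process(user)               # placeholder for the real work
  end
end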

8) Avoid Unnecessary Object Creation

Minimize the creation of temporary or duplicate objects to reduce memory usage.

Bad Practice:

# Creates a new string object on every iteration
array = [1, 2, 3]
array.map { |i| "#{i}" }

Best Practice:

# Converts directly with to_s, skipping the extra string created by interpolation
array = [1, 2, 3]
array.map(&:to_s)

9) Use find_each Instead of all

find_each processes records in batches, reducing memory usage compared to loading all records at once.

Bad Practice:

User.all.each do |user|
  process(user)
end

Best Practice:

User.find_each(batch_size: 1000) do |user|
  process(user)
end
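
A related option is in_batches, which yields whole relations rather than individual records and pairs well with bulk operations; a sketch (the newsletter column is illustrative):

User.in_batches(of: 1000) do |batch|
  batch.update_all(newsletter: true)   # one UPDATE per batch, no per-record Ruby objects
end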

10) Avoid Large Hashes/Arrays

Large data structures retained in memory can cause leaks if not managed properly.

Bad Practice:

# Stores unnecessary data in memory
users = User.all.to_a

Best Practice:

# Process records in small chunks
User.find_each(batch_size: 500) do |user|
  process(user)
end

11) Limit String Mutations

Repeatedly modifying strings can create multiple objects in memory.

Bad Practice:

string = ""
1000.times { string += "data" }

Best Practice:

string = String.new
1000.times { string << "data" }

12) Avoid Retaining References to Objects

Long-lived variables or global references can prevent objects from being garbage collected.

Bad Practice:

# Retains references indefinitely
@users = User.all

Best Practice:

# Use local variables
users = User.all

13) Use Batching for Large Tasks

Process data in smaller batches to avoid loading large amounts of data into memory.

Bad Practice:

class LargeJob
  def perform
    User.all.each { |user| process(user) }
  end
end

Best Practice:

class LargeJob
  def perform
    User.find_each(batch_size: 1000) { |user| process(user) }
  end
end

14) Expire Cache Keys

Always set expiration times (expires_in) for cache entries to prevent indefinite retention.

Bad Practice:

Rails.cache.write("users", User.all)

Best Practice:

Rails.cache.write("users", User.all, expires_in: 1.hour)

15) Avoid Caching Large Datasets

Large datasets can lead to excessive memory usage.

Bad Practice:

Rails.cache.write("large_dataset", LargeModel.all)

Best Practice:

Rails.cache.write("small_dataset", LargeModel.limit(100).pluck(:id, :name))

16) Use Low-Level Cache

Use Rails.cache.fetch with a block to ensure proper retrieval and expiration.

Example:

data = Rails.cache.fetch("key", expires_in: 10.minutes) do
  fetch_data_from_db
end

17) Use Cache Compression

Enable compression for large cache entries to reduce memory usage.

Example:

Rails.cache.write("key", large_object, compress: true)
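
Compression can also be configured once at the cache-store level. A sketch assuming a Redis cache store (the store choice and the 4 KB threshold are illustrative):

# config/environments/production.rb
config.cache_store = :redis_cache_store, {
  url: ENV["REDIS_URL"],
  compress: true,
  compress_threshold: 4.kilobytes   # only compress entries larger than this
}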

Choosing the Right Tools for Your Application

For Memory Profiling:

  1. Use memory_profiler and ObjectSpace for detailed memory leak analysis (a short sketch follows this list).
  2. Use derailed_benchmarks for benchmarking overall memory usage.
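
For the first tool, a minimal memory_profiler sketch (the profiled block and the output path are illustrative):

require "memory_profiler"

report = MemoryProfiler.report do
  User.limit(100).to_a              # any code path you suspect of allocating heavily
end
report.pretty_print(to_file: "tmp/memory_report.txt")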

For GC Monitoring:

  1. Use GC.stat for real-time insights into GC behavior.
  2. Use GC.compact (Ruby 2.7+) to handle memory fragmentation.

For Background Jobs:

  1. Use sidekiq-memory-killer to manage memory usage in Sidekiq workers.

For Performance Monitoring:

  1. Use rack-mini-profiler for real-time profiling in development.
  2. Use New Relic or ScoutAPM for production-grade performance monitoring.

For SQL Query Optimization:

  1. Use Bullet to identify and fix N+1 queries.

Best Practices Summary

Garbage Collection

  • Let Ruby handle GC automatically. Manual GC calls can cause unnecessary overhead; call GC.start only when debugging memory issues.
  • Compact memory in long-running apps. Reduces memory fragmentation, especially for apps with high allocation rates. Example: GC.compact.
  • Tune GC parameters for large apps. Optimizes GC frequency and heap size. Example: set environment variables such as RUBY_GC_HEAP_GROWTH_FACTOR=1.5 and RUBY_GC_MALLOC_LIMIT=2000000.
  • Monitor GC with GC.stat and enable profiling in development. Helps track GC activity and optimize settings. Example: puts GC.stat or GC::Profiler.enable.
  • Avoid frequent manual GC calls. They interrupt application flow and add overhead. Use GC.start sparingly, e.g., GC.start if condition.
  • Avoid excessive object promotion (young → old generation). Reduces GC pressure and improves performance. Avoid creating temporary objects that live long enough to be promoted.
  • Rely on incremental GC for shorter pauses. Spreads GC work over time, reducing response delays. Enabled by default in Ruby 2.2+.
  • Restart background processes periodically to clear retained memory. Prevents leaks from accumulating. Example: use sidekiq-memory-killer to restart workers once memory exceeds a threshold.

Object Management

  • Use immutable objects where possible. Reduces allocations and unnecessary garbage. Example: "constant".freeze.
  • Avoid creating unnecessary temporary objects. Reduces memory usage and garbage creation. Example: array.map(&:to_s) instead of interpolating inside the block (see item 8 above).
  • Use in-place operations to avoid creating new objects. Reduces overhead in loops and repeated operations. Example: str << "data" instead of str += "data".
  • Reuse objects in performance-critical sections. Prevents constant allocation and deallocation. Example: use a connection or object pool for reusable objects.

ActiveRecord Queries

  • Use find_each instead of all for batch processing. Prevents loading every record into memory at once. Example: User.find_each(batch_size: 1000) { |user| process(user) }.
  • Select only required fields with .pluck or .select. Reduces memory used by ActiveRecord objects. Example: User.pluck(:id, :name).
  • Avoid N+1 queries by preloading associations. Prevents excessive memory and database usage. Example: Post.includes(:comments).each { ... }.
  • Stream results for large datasets. Processes data incrementally instead of loading everything into memory. Example: send_data generate_csv, stream: true.
  • Cache query results where appropriate. Reduces redundant database calls and memory usage. Example: Rails.cache.fetch("key") { expensive_query }.

Background Jobs

  • Process large datasets in smaller batches. Prevents excessive memory usage by workers. Example: User.find_each(batch_size: 500) { |user| process(user) }.
  • Avoid passing large objects as job arguments. Reduces memory and serialization overhead in job queues. Pass IDs instead: MyJob.perform_async(user.id).
  • Restart workers periodically for long-running jobs. Clears accumulated memory and refreshes GC state. Use sidekiq-memory-killer or restart workers manually.
  • Clear temporary objects within the job after processing. Frees memory held by temporary variables. Example: set references to nil after use (temp_var = nil).

Caching

  • Set expiration times for cache entries. Prevents indefinite memory retention and stale entries. Example: Rails.cache.write("users", User.all, expires_in: 1.hour).
  • Avoid caching large datasets in memory. Large cache entries bloat memory unnecessarily. Cache only what you need: Rails.cache.write("summary", User.limit(100).pluck(:id, :name)).
  • Use compression for large cache entries. Saves memory by compressing large objects. Example: Rails.cache.write("key", large_object, compress: true).
  • Clear stale or unused cache entries periodically. Prevents memory bloat from growing caches. Example: run Rails.cache.clear during scheduled cleanup tasks.

Monitoring Memory Usage

  • Use memory_profiler to track memory leaks. Identifies sources of memory bloat in specific code blocks. Example: MemoryProfiler.report { User.all.to_a }.
  • Analyze live objects with ObjectSpace. Tracks object types and counts to identify leaks. Example: ObjectSpace.each_object(String).count.
  • Use heap dumps for advanced memory analysis. Dumps all live objects so retention can be analyzed offline. Example: ObjectSpace.dump_all(output: File.open("heap.json", "w")).
  • Monitor gem memory usage with derailed_benchmarks. Identifies memory-hungry gems in your application. Example: bundle exec derailed bundle:mem.

Performance Optimization

  • Use lazy enumerables for large operations. Avoids creating intermediate arrays in memory. Example: (1..1_000_000).lazy.map { ... }.
  • Stream large responses instead of preloading them. Reduces memory usage for large file downloads or API responses. Example: send_data large_file, stream: true.
  • Minimize string mutations in loops. Avoids creating excessive temporary objects. Example: string << "data" instead of string += "data".
  • Precompile assets in production. Serves compressed, minified files and reduces memory usage. Example: rails assets:precompile.

Testing for Memory Leaks

  • Enable memory profiling tools in development. Detects memory leaks early, before deployment. Example: rack-mini-profiler or memory_profiler.
  • Test heap allocation with ObjectSpace.dump_all. Helps analyze memory usage at runtime. Example: ObjectSpace.dump_all(output: File.open("heap.json", "w")).
  • Profile memory usage per endpoint with rack-mini-profiler. Tracks memory per request and highlights bottlenecks.
  • Log and monitor GC activity in production. Shows how GC behaves under load. Example: enable GC profiling with GC::Profiler.enable.

Official Documentation

  1. Ruby Garbage Collection Documentation
    Learn about Ruby’s built-in garbage collection, including GC.stat and GC.compact.
  2. Rails Performance Guide
    A comprehensive guide from the Rails team on optimizing Rails applications.
  3. ObjectSpace Documentation
    Details on tracking and analyzing objects in Ruby’s object space.

Articles and Tutorials

  1. Understanding Ruby Garbage Collection
  2. How to Detect and Fix Memory Leaks in Ruby

Monitoring and SaaS Tools

  1. Scout APM
    • Real-time memory and performance monitoring for Ruby applications.
  2. Skylight
    • Performance monitoring specifically designed for Rails applications.
  3. New Relic
    • Advanced application monitoring for Ruby, with GC and memory insights.
  4. AppSignal
    • A lightweight alternative for Rails performance monitoring and GC analysis.
