Garbage Collection (GC) is a critical aspect of memory management in Ruby and Rails applications. It automatically reclaims memory that is no longer in use, preventing memory leaks and ensuring efficient memory utilization. However, improper handling of memory allocation and GC can lead to performance bottlenecks, especially in large applications.
Ruby’s GC manages memory by tracking objects in the heap. When an object is no longer referenced (i.e., it is no longer reachable), the GC marks it for collection and frees the memory. Ruby uses a generational garbage collection algorithm, which is optimized for performance by focusing on younger objects (which are more likely to be garbage).
Steps in Ruby’s GC:
- Marking: Traverse all live objects starting from root objects (e.g., global variables, stack frames) and mark them as “in use.”
- Sweeping: Free memory occupied by unmarked objects (garbage).
- Compacting (optional): Rearrange live objects to reduce memory fragmentation.
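These phases can be observed from Ruby itself; a minimal sketch using the counters GC.stat exposes:

```ruby
# Watch a full GC cycle happen: GC.stat exposes counters that the
# marking and sweeping phases update.
runs_before = GC.stat(:count)   # total GC runs so far in this process
GC.start(full_mark: true)       # force a full mark-and-sweep pass
runs_after = GC.stat(:count)
puts "GC runs: #{runs_before} -> #{runs_after}"
```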
Types of GC in Ruby
1) Mark-and-Sweep GC:
- The default GC algorithm in older Ruby versions.
- It scans all objects and frees unreachable ones.
- Example: Ruby 1.8 used this approach.
2) Generational GC:
- Introduced in Ruby 2.1.
- Divides objects into “young” and “old” generations.
- Young objects are collected more frequently, as they are more likely to become garbage.
# Ruby automatically uses generational GC
obj = Object.new
# obj is in the young generation
GC.start(full_mark: true, immediate_sweep: true) # Force a full GC
3) Incremental GC:
- Introduced in Ruby 2.2.
- Reduces GC pause times by breaking the marking phase into smaller steps.
# Ruby automatically uses incremental GC
GC.start(full_mark: false) # Trigger a minor GC (incremental marking is handled automatically)
4) Compaction GC:
- Introduced in Ruby 2.7.
- Reduces memory fragmentation by moving live objects closer together.
GC.auto_compact = true # Enable automatic compaction (Ruby 3.0+)
GC.compact # Manually trigger compaction (Ruby 2.7+)
Memory Allocation Best Practices
1) Use Symbols Instead of Strings:
Symbols are interned: every use of the same name points to a single immutable object, so repeated symbol keys do not allocate the way repeated string literals do.
# Bad: Allocates new strings
hash = { "key" => "value" }
# Good: Uses symbols
hash = { key: :value }
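A quick way to see the difference: symbols with the same name are one shared object, while plain string literals are not:

```ruby
# :status always refers to the single interned Symbol object, while each
# bare string literal normally builds a fresh String.
sym_same = :status.equal?(:status)    # true: one shared object
str_same = "status".equal?("status")  # false without frozen string literals
puts "symbols shared: #{sym_same}, strings shared: #{str_same}"
```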
2) Freeze Strings:
Freeze string literals to avoid allocating a new object each time they are evaluated (or enable it file-wide with the # frozen_string_literal: true magic comment). Note that upcase still returns a new string; the saving is in the receiver.
# Bad: allocates a new "foo" on every evaluation
"foo".upcase
# Good: CRuby reuses a single frozen object for the literal
"foo".freeze.upcase
3) Use Efficient Data Structures:
Choose the right data structure for the task: Array#include? scans every element (O(n)), while Set#include? is a hash lookup (O(1)).
# Bad: linear scan on every lookup
array = ["a", "b", "c"]
array.include?("b")
# Good: constant-time lookup with a Set
require "set"
set = Set.new(["a", "b", "c"])
set.include?("b")
4) Garbage Collection:
- Let Ruby handle GC automatically; avoid frequent manual GC calls.
- Compact memory periodically in long-running applications.
5) Memory Management:
- Use lazy enumerables to reduce intermediate object creation.
- Paginate or batch process large datasets.
- Monitor memory usage with tools like memory_profiler and rack-mini-profiler.
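As a quick illustration of how lazy enumerables avoid intermediate arrays:

```ruby
# Lazy enumerables evaluate one element at a time, so the map and select
# steps never build million-element intermediate arrays.
result = (1..Float::INFINITY).lazy
                             .map    { |n| n * 2 }
                             .select { |n| (n % 3).zero? }
                             .first(3)
puts result.inspect  # => [6, 12, 18]
```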
6) Clear References You No Longer Need
Ruby's mark-and-sweep GC can collect circular references on its own (unlike reference-counting collectors), but any object graph that is still reachable from a live root stays in memory, cycles included. Clearing links lets large graphs be reclaimed sooner.
# A pair of nodes that reference each other
class Node
  attr_accessor :next
end
node1 = Node.new
node2 = Node.new
node1.next = node2
node2.next = node1
# Clear the links once the graph is no longer needed
node1.next = nil
node2.next = nil
7) Avoid Large Payloads
Pass only necessary data to background jobs to reduce memory usage.
# Bad: serializes a large object into the job payload
MyJob.perform_later(large_object)
# Good: Passes an ID or minimal data
MyJob.perform_later(object.id)
8) Avoid Unnecessary Object Creation
Minimize the creation of temporary or duplicate objects to reduce memory usage.
Bad Practice:
# Interpolating a single value just wraps to_s in extra work
array = [1, 2, 3]
array.map { |i| "#{i}" }
Best Practice:
# Call to_s directly; note that freezing a freshly built string
# does not avoid its allocation
array = [1, 2, 3]
array.map(&:to_s)
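To check claims like this, allocations can be counted with GC.stat; allocated_objects below is a hypothetical helper written for this sketch, not a gem API:

```ruby
# A rough allocation counter built on CRuby's cumulative
# :total_allocated_objects statistic.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

with_interpolation = allocated_objects { (1..100).map { |i| "#{i}" } }
with_to_s          = allocated_objects { (1..100).map(&:to_s) }
puts "interpolation: #{with_interpolation}, to_s: #{with_to_s}"
```

Exact counts vary by Ruby version, so treat the numbers as relative, not absolute.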
9) Use find_each Instead of all
find_each processes records in batches, reducing memory usage compared to loading all records at once.
Bad Practice:
User.all.each do |user|
  process(user)
end
Best Practice:
User.find_each(batch_size: 1000) do |user|
  process(user)
end
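find_each is ActiveRecord-specific, but the underlying batching idea can be sketched in plain Ruby with each_slice:

```ruby
# The same batching idea in plain Ruby: each_slice yields fixed-size
# chunks instead of handing over the whole collection at once.
ids = (1..10).to_a
batches = ids.each_slice(4).to_a
batches.each { |batch| puts batch.inspect }
```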
10) Avoid Large Hashes/Arrays
Large data structures retained in memory can cause leaks if not managed properly.
Bad Practice:
# Stores unnecessary data in memory
users = User.all.to_a
Best Practice:
# Process records in small chunks
User.find_each(batch_size: 500) do |user|
  process(user)
end
11) Limit String Mutations
Repeatedly modifying strings can create multiple objects in memory.
Bad Practice:
string = ""
1000.times { string += "data" }
Best Practice:
string = String.new
1000.times { string << "data" }
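The difference is easy to verify: << mutates the same object, while += rebinds the variable to a new one:

```ruby
# << mutates the receiver in place; += allocates a brand-new String
# and rebinds the variable on every iteration.
buffer = +"data"            # unary + yields an unfrozen copy
original = buffer
buffer << "-more"
in_place = buffer.equal?(original)  # true: still the same object

copy = "data"
copy += "-more"                     # a different, newly allocated String
puts "in place: #{in_place}, value: #{buffer}"
```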
12) Avoid Retaining References to Objects
Long-lived variables or global references can prevent objects from being garbage collected.
Bad Practice:
# Retains references indefinitely
@users = User.all
Best Practice:
# Use local variables
users = User.all
13) Use Batching for Large Tasks
Process data in smaller batches to avoid loading large amounts of data into memory.
Bad Practice:
class LargeJob
  def perform
    User.all.each { |user| process(user) }
  end
end
Best Practice:
class LargeJob
  def perform
    User.find_each(batch_size: 1000) { |user| process(user) }
  end
end
14) Expire Cache Keys
Always set an expiration time (expires_in) for cache entries to prevent indefinite retention.
Bad Practice:
Rails.cache.write("users", User.all)
Best Practice:
Rails.cache.write("users", User.all, expires_in: 1.hour)
15) Avoid Caching Large Datasets
Large datasets can lead to excessive memory usage.
Bad Practice:
Rails.cache.write("large_dataset", LargeModel.all)
Best Practice:
Rails.cache.write("small_dataset", LargeModel.limit(100).pluck(:id, :name))
16) Use Low-Level Cache
Use Rails.cache.fetch with a block so the value is computed on a cache miss and expires correctly.
Example:
data = Rails.cache.fetch("key", expires_in: 10.minutes) do
  fetch_data_from_db
end
17) Use Cache Compression
Enable compression for large cache entries to reduce memory usage.
Example:
Rails.cache.write("key", large_object, compress: true)
Choosing the Right Tools for Your Application
For Memory Profiling:
- Use memory_profiler and ObjectSpace for detailed memory leak analysis.
- Use derailed_benchmarks for benchmarking overall memory usage.
For GC Monitoring:
- Use GC.stat for real-time insights into GC behavior.
- Use GC.compact (Ruby 2.7+) to handle memory fragmentation.
For Background Jobs:
- Use sidekiq-memory-killer to manage memory usage in Sidekiq workers.
For Performance Monitoring:
- Use rack-mini-profiler for real-time profiling in development.
- Use New Relic or Scout APM for production-grade performance monitoring.
For SQL Query Optimization:
- Use Bullet to identify and fix N+1 queries.
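As a starting point for GC monitoring, a few GC.stat counters worth watching (key availability can vary slightly across Ruby versions):

```ruby
# A quick GC health snapshot: useful baseline counters when monitoring
# GC behavior in development or production.
snapshot = GC.stat.slice(:count, :minor_gc_count, :major_gc_count, :heap_live_slots)
snapshot.each { |key, value| puts "#{key}: #{value}" }
```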
| Category | Best Practice | Why It's Important | Code Example |
| --- | --- | --- | --- |
| Garbage Collection | Let Ruby handle GC automatically. | Manual GC calls can cause unnecessary overhead. | `GC.start` only when debugging memory issues. |
| | Compact memory in long-running apps. | Reduces memory fragmentation, especially for apps with high allocation. | `GC.compact` |
| | Tune GC parameters for large apps. | Optimizes GC frequency and heap size for better performance. | Set ENV variables like `RUBY_GC_HEAP_GROWTH_FACTOR=1.5` and `RUBY_GC_MALLOC_LIMIT=2000000`. |
| | Monitor GC with `GC.stat` and enable profiling in development. | Helps track GC activity and optimize GC settings. | `puts GC.stat` or enable profiling with `GC::Profiler.enable`. |
| | Avoid frequent manual GC calls. | Interrupts application flow unnecessarily and creates overhead. | Use `GC.start` sparingly, e.g., `GC.start if condition`. |
| | Avoid excessive object promotion (young → old generation). | Reduces GC pressure and improves performance. | Avoid creating temporary long-lived objects. |
| | Enable incremental GC for reduced pause time. | Spreads GC work over time, reducing application response delays. | Supported by default in Ruby 2.2+. |
| | Restart background processes periodically to clear retained memory. | Prevents memory leaks and unmanaged GC accumulation. | Use `sidekiq-memory-killer` to restart workers after memory exceeds a threshold. |
| Object Management | Use immutable objects where possible. | Reduces memory allocations and prevents unnecessary garbage. | `"constant".freeze` |
| | Avoid creating unnecessary temporary objects. | Reduces memory usage and garbage creation. | `array.map(&:to_s)` instead of `array.map { \|i\| "#{i}" }` |
| | Use in-place operations to avoid creating new objects. | Reduces memory overhead in loops and repeated operations. | `str << "data"` instead of `str += "data"`. |
| | Reuse objects in performance-critical sections. | Prevents frequent allocation and deallocation of objects. | Use a connection or object pool for reusable objects. |
| ActiveRecord Queries | Use `find_each` instead of `all` for batch processing. | Prevents loading all records into memory at once. | `User.find_each(batch_size: 1000) { \|user\| process(user) }` |
| | Select only required fields with `.pluck` or `.select`. | Reduces memory used by ActiveRecord objects. | `User.pluck(:id, :name)` |
| | Avoid N+1 queries by preloading associations. | Prevents excessive memory and database usage. | `Post.includes(:comments).each { \|post\| ... }` |
| | Stream results for large datasets to reduce memory usage. | Processes data incrementally to avoid loading everything into memory. | `send_data generate_csv, stream: true` |
| | Cache query results where appropriate. | Reduces redundant database calls and memory usage. | `Rails.cache.fetch("key") { expensive_query }` |
| Background Jobs | Process large datasets in smaller batches. | Prevents excessive memory usage by workers. | `User.find_each(batch_size: 500) { \|user\| process(user) }` |
| | Avoid passing large objects as job arguments. | Reduces memory and serialization overhead in job queues. | Pass IDs instead: `MyJob.perform_async(user.id)`. |
| | Restart workers periodically for long-running jobs. | Clears accumulated memory leaks and refreshes GC state. | Use `sidekiq-memory-killer` or restart workers manually. |
| | Clear temporary objects within the job after processing. | Frees memory occupied by temporary variables. | Set objects to `nil` after use: `temp_var = nil; GC.start`. |
| Caching | Set expiration times for cache entries. | Prevents indefinite memory retention and stale cache entries. | `Rails.cache.write("users", User.all, expires_in: 1.hour)` |
| | Avoid caching large datasets in memory. | Large cache entries can bloat memory unnecessarily. | Cache only what you need: `Rails.cache.write("summary", User.limit(100).pluck(:id, :name))`. |
| | Use compression for large cache entries. | Saves memory by compressing large objects. | `Rails.cache.write("key", large_object, compress: true)` |
| | Clear stale or unused cache entries periodically. | Prevents memory bloat from growing caches. | Use `Rails.cache.clear` during scheduled cleanup tasks. |
| Monitoring Memory Usage | Use `memory_profiler` to track memory leaks. | Identifies sources of memory bloat in specific code blocks. | `MemoryProfiler.report { User.all.to_a }` |
| | Analyze live objects with `ObjectSpace`. | Tracks object types and counts to identify leaks. | `ObjectSpace.each_object(String).count` |
| | Use heap dumps for advanced memory analysis. | Dumps all live objects to analyze memory retention and leaks. | `ObjectSpace.dump_all(output: File.open("heap.json", "w"))` |
| | Monitor gem memory usage with `derailed_benchmarks`. | Identifies memory-hungry gems in your application. | `bundle exec derailed bundle:mem` |
| Performance Optimization | Use lazy enumerables for large operations. | Avoids creating intermediate arrays in memory. | `(1..1_000_000).lazy.map { \|x\| ... }` |
| | Stream large responses instead of preloading them. | Reduces memory usage for large file downloads or API responses. | `send_data large_file, stream: true` |
| | Minimize string mutations in loops. | Avoids creating excessive temporary objects. | Use `string << "data"` instead of `string += "data"`. |
| | Precompile assets in production to save memory. | Reduces memory usage by serving compressed and minified files. | Use `rails assets:precompile`. |
| Testing for Memory Leaks | Enable memory profiling tools in development. | Detects memory leaks early before deployment. | Use `rack-mini-profiler` or `memory_profiler`. |
| | Test heap allocation with `ObjectSpace.dump_all`. | Helps analyze memory usage at runtime. | `ObjectSpace.dump_all(output: File.open("heap.json", "w"))` |
| | Profile memory usage on endpoints with `rack-mini-profiler`. | Tracks memory usage per request and identifies bottlenecks. | Integrate `rack-mini-profiler` in your Rails app. |
| | Log and monitor GC activity in production. | Provides insights into how GC behaves under load. | Enable GC profiling with `GC::Profiler.enable`. |
Official Documentation
- Ruby Garbage Collection Documentation: Ruby's built-in garbage collection, including GC.stat and GC.compact.
- Rails Performance Guide: a comprehensive guide from the Rails team on optimizing Rails applications.
- ObjectSpace Documentation: details on tracking and analyzing objects in Ruby's object space.
Articles and Tutorials
- Understanding Ruby Garbage Collection (BigBinary Blog: Ruby GC): a deep dive into Ruby's GC, including how generational GC and incremental GC work.
- How to Detect and Fix Memory Leaks in Ruby (Scout APM: Memory Leaks in Ruby): a tutorial on identifying and fixing memory leaks using memory_profiler and ObjectSpace.
Monitoring and SaaS Tools
- Scout APM: real-time memory and performance monitoring for Ruby applications.
- Skylight: performance monitoring specifically designed for Rails applications.
- New Relic: advanced application monitoring for Ruby, with GC and memory insights.
- AppSignal: a lightweight alternative for Rails performance monitoring and GC analysis.