Optimizing Ruby on Rails App for High Performance (Part 1)
Introduction:
Ruby on Rails is a powerful framework that can be optimized for high performance. With the right techniques and tools, you can significantly improve the performance of your Rails application. In this series, I will explore some tips and tricks for optimizing Ruby on Rails.
There are several ways to optimize a Rails application, for example:
Optimizing Rails, optimizing Ruby, caching, database optimization, load balancing, and using optimization tools.
Part 1: Optimize Using ActiveRecord:
Knowing how to use ActiveRecord well will help you optimize Rails, so what is ActiveRecord?
Active Record is the M in MVC — the model — which is the system layer responsible for representing business data and logic. Active Record facilitates the creation and use of business objects whose data requires persistent storage in a database. It is an implementation of the Active Record pattern which itself is a description of an Object Relational Mapping system. ActiveRecord is the default ORM for Rails which interacts with your database by generating and executing SQL.
So how can I write fast code using ActiveRecord?
It depends on where the bottleneck is, so start by measuring to find the slow part.
Here are some tips for optimizing ActiveRecord:
NOTE: Benchmark results depend on the machine, the database, and the data set, so read the numbers below as relative comparisons.
1- Avoid Using “SELECT *”:
Avoid using the “SELECT *” syntax when querying the database, and instead, only select the specific columns that are needed.
Example:
require 'benchmark'
Benchmark.bm do |x|
  x.report('Without Select with specific columns') { User.select('*').where('age > 17') }
  x.report('Select with specific columns') { User.select(:id, :name).where('age > 17') }
end
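Explain:
The first query uses select('*'), which selects all columns from the users table. The second query uses select(:id, :name), which selects only the id and name columns, so less data is read, transferred, and turned into objects.
Keep in mind that a relation is lazy: it only hits the database when its records are needed (for example via to_a or load), so call one of those inside the block when the timing should include the query itself.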
Result:
user system total real
Without Select with specific columns
0.001934 0.001727 0.003661 ( 0.003655)
Select with specific columns
0.000040 0.000001 0.000041 ( 0.000040)
=>
[#<Benchmark::Tms:0x0000000107dd4020
@cstime=0.0,
@cutime=0.0,
@label="Without Select with specific columns",
@real=0.003655000007711351,
@stime=0.001727000000000034,
@total=0.0036610000000000253,
@utime=0.0019339999999999913>,
#<Benchmark::Tms:0x0000000107ddf1f0
@cstime=0.0,
@cutime=0.0,
@label="Select with specific columns",
@real=3.999998443759978e-05,
@stime=1.0000000000287557e-06,
@total=4.100000000006876e-05,
@utime=4.0000000000040004e-05>]
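To get a feel for why a narrower column list helps, here is a minimal plain-Ruby sketch with no database involved. It compares the serialized size of full rows with rows reduced to two columns. The ROWS data, the column names, and the project helper are made up for illustration; real savings depend on your schema and database driver.

```ruby
# Hypothetical table rows; in a real app these would come from the database.
ROWS = Array.new(1_000) do |i|
  { id: i, name: "user#{i}", email: "user#{i}@example.com",
    bio: "x" * 500, age: 18 + (i % 50) }
end

# Keep only the requested columns, like SELECT id, name instead of SELECT *.
def project(rows, *columns)
  rows.map { |row| row.slice(*columns) }
end

full_bytes   = Marshal.dump(ROWS).bytesize
narrow_bytes = Marshal.dump(project(ROWS, :id, :name)).bytesize

puts "all columns: #{full_bytes} bytes"
puts "id and name: #{narrow_bytes} bytes"
```

The wide bio column dominates the payload here, which is the typical case where selecting specific columns pays off.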
2- Use pluck:
You can use the pluck method to select only the columns that are needed. The pluck method returns an array of values for a given column or set of columns, without building ActiveRecord objects for the rows.
Like selecting specific columns instead of using select('*'), this can improve performance by reducing the amount of data that needs to be loaded from the database.
For example:
User.where('age > 17').pluck(:id, :name, :email)
This returns an array of [id, name, email] arrays for the users older than 17, without loading any other columns.
Using pluck can be especially useful when you only need a small subset of data from a large table, as it minimizes the amount of data loaded into memory and can improve the overall speed of your application.
Difference between select and pluck:
select returns an ActiveRecord::Relation.
pluck returns an array of raw values.
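The difference in return shape can be imitated in plain Ruby. This is only an analogy, not ActiveRecord itself: FakeUser, USER_ROWS, select_like, and pluck_like are hypothetical stand-ins written for this sketch.

```ruby
# Hypothetical stand-in for a model class; this is not ActiveRecord.
FakeUser = Struct.new(:id, :name, :email, keyword_init: true)

USER_ROWS = [
  { id: 1, name: "Ana",  email: "ana@example.com" },
  { id: 2, name: "Omar", email: "omar@example.com" }
].freeze

# select-like: one object per row, with only the chosen attributes set.
def select_like(rows, *columns)
  rows.map { |row| FakeUser.new(**row.slice(*columns)) }
end

# pluck-like: plain values and no objects; a single column gives a flat array.
def pluck_like(rows, *columns)
  return rows.map { |row| row[columns.first] } if columns.size == 1

  rows.map { |row| row.values_at(*columns) }
end

p select_like(USER_ROWS, :id, :name).first.name # => "Ana"
p pluck_like(USER_ROWS, :name)                  # => ["Ana", "Omar"]
p pluck_like(USER_ROWS, :id, :name)             # => [[1, "Ana"], [2, "Omar"]]
```

When you only need the values (for a dropdown, an ID list, a CSV export), the pluck shape skips the cost of building one object per row.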
3- Avoid unnecessary database queries:
In Ruby, the ||= operator is called the “conditional assignment” operator. It assigns the value on the right-hand side to the variable on the left-hand side only if that variable is currently nil or false.
You can use this technique to avoid unnecessary database queries: assign the query result to an instance variable with ||=, and the query will not run again as long as the instance variable already holds a value other than nil or false. This can improve the performance of your Ruby on Rails application by minimizing the number of database queries that are executed.
Example:
require 'benchmark'
n = 1000
Benchmark.bm do |x|
  # Runs the query on every iteration.
  x.report("Without Query Cache") do
    n.times do
      User.where('age > 17').limit(1000).to_a
    end
  end
  # Runs the query once; later iterations reuse the memoized array.
  x.report("With Query Cache") do
    n.times do
      @users ||= User.where('age > 17').limit(1000).to_a
    end
  end
end
Explain:
The first report runs the query on every iteration: it finds users with an age greater than 17, limits the result to 1000 records, and loads them into an array, so the database is hit 1000 times.
The second report runs the same query only on the first iteration. The result is stored in an instance variable and reused on every later iteration.
Result:
user system total real
Without Query Cache 11.016094 0.306331 11.322425 ( 14.048669)
With Query Cache 0.010477 0.000280 0.010757 ( 0.013251)
=>
[#<Benchmark::Tms:0x000000010ba05fa8
@cstime=0.0,
@cutime=0.0,
@label="Without Query Cache",
@real=14.0486689999816,
@stime=0.306331,
@total=11.322424999999999,
@utime=11.016093999999999>,
#<Benchmark::Tms:0x000000010baf6480
@cstime=0.0,
@cutime=0.0,
@label="With Query Cache",
@real=0.013251000025775284,
@stime=0.00027999999999994696,
@total=0.010756999999999795,
@utime=0.010476999999999848>]
For more Info: rails-activerecord-caching
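One detail of ||= is worth sketching: because it re-assigns whenever the variable is nil or false, a lookup that legitimately returns nil is repeated on every call. The plain-Ruby sketch below shows this, together with a common defined? based alternative. The Report class and its expensive_lookup method are hypothetical and only stand in for a database query.

```ruby
# Hypothetical service object; expensive_lookup stands in for a database query.
class Report
  attr_reader :lookups

  def initialize(result)
    @result = result
    @lookups = 0
  end

  # Plain ||= memoization: re-runs the lookup whenever the cached value is nil or false.
  def users
    @users ||= expensive_lookup
  end

  # Also caches nil/false, by checking whether the instance variable was ever set.
  def users_strict
    return @users_strict if defined?(@users_strict)

    @users_strict = expensive_lookup
  end

  private

  def expensive_lookup
    @lookups += 1
    @result
  end
end

hit = Report.new(%w[ana omar])
3.times { hit.users }
puts hit.lookups    # 1

miss = Report.new(nil)
3.times { miss.users }
puts miss.lookups   # 3

strict = Report.new(nil)
3.times { strict.users_strict }
puts strict.lookups # 1
```

For queries that return arrays, like the benchmark above, plain ||= is enough, since an empty array is still truthy in Ruby.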
4- Indexed Querying:
Querying unindexed columns can lead to full table scans, which are inefficient and slow down the application as the table grows.
To avoid this, index the columns that are frequently used in queries. An index is a data structure that helps the database quickly locate the rows that match a query, which can significantly improve performance.
Example:
1- Without Index:
require 'benchmark'
Benchmark.bm do |x|
  x.report("Without Index") do
    User.where('age > 17').first
  end
end
Explain:
This queries the users table without an index on the age column. The where method filters on age greater than 17 and first returns the first matching record. Without an index the database may have to scan the table, which gets slower as the data grows.
Result:
user system total real
Without Index 0.013519 0.011939 0.025458 ( 0.042348)
=>
[#<Benchmark::Tms:0x000000010a88e590
@cstime=0.0,
@cutime=0.0,
@label="Without Index",
@real=0.042347999988123775,
@stime=0.011939000000000033,
@total=0.02545800000000009,
@utime=0.013519000000000059>]
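2- With Index:
class AddIndexToAgeColumnInUsers < ActiveRecord::Migration[7.0]
  # Needed with algorithm: :concurrently on PostgreSQL, which cannot build
  # the index inside the migration's transaction.
  disable_ddl_transaction!

  def change
    add_index(:users, :age, algorithm: :concurrently)
  end
end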
Then run the benchmark again:
Benchmark.bm do |x|
  x.report("With Index") do
    User.where('age > 17').first
  end
end
Explain:
This runs the same query after adding an index on the age column. The database can now use the index to locate rows with age greater than 17 instead of scanning the table, and the benefit grows with the size of the data.
Result:
user system total real
With Index 0.001651 0.000219 0.001870 ( 0.003978)
=>
[#<Benchmark::Tms:0x000000010b21e128
@cstime=0.0,
@cutime=0.0,
@label="With Index",
@real=0.003978000022470951,
@stime=0.00021899999999996922,
@total=0.0018699999999998163,
@utime=0.001650999999999847>]
For more Info: rails-postgresql-queries
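As a rough mental model of what an index buys you, the sketch below compares a full scan over unsorted values with a binary search over a sorted copy. It is plain Ruby, not a database: AGES, SORTED_AGES, and the two count helpers are hypothetical, and a real B-tree index is more involved than a sorted array.

```ruby
require "benchmark"

# Hypothetical "table": unsorted ages, like rows stored in insertion order.
AGES = Array.new(100_000) { |i| (i * 7919) % 100 }.freeze

# "Index": the same values kept sorted, so a range boundary can be found by binary search.
SORTED_AGES = AGES.sort.freeze

# Full scan: every row is examined.
def count_older_scan(ages, limit)
  ages.count { |age| age > limit }
end

# Index-style lookup: binary search for the first value above the limit;
# everything to its right matches.
def count_older_indexed(sorted_ages, limit)
  first_match = sorted_ages.bsearch_index { |age| age > limit }
  first_match ? sorted_ages.size - first_match : 0
end

Benchmark.bm(8) do |x|
  x.report("scan")  { 20.times { count_older_scan(AGES, 17) } }
  x.report("index") { 20.times { count_older_indexed(SORTED_AGES, 17) } }
end
```

The scan touches all 100,000 values on every call, while the binary search needs roughly 17 comparisons. Keeping the sorted copy up to date is the price, which mirrors the write overhead of a real index.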
5- Cache ActiveRecord:
Query Cache is a Rails feature that caches the result set of each SQL query in memory. If the same query is executed again during the same request, the cached result is returned instead of running the query again, which removes redundant SQL executions.
The query cache only lives for the duration of a request. To reuse results across requests without going through ActiveRecord each time, you can use Rails.cache.fetch.
NOTE: Caching has limitations and should be used according to the needs of the application. It is also important to plan a cache invalidation strategy (for example an expiry time) so that stale data is not served from the cache.
Example:
require 'benchmark'
# Warm the cache once; the block only runs when the key is missing or expired.
Rails.cache.fetch("first_user_with_age_greater_than_17", expires_in: 5.minutes) do
  User.where('age > 17').first
end
n = 1000
Benchmark.bm do |x|
  x.report("Without Cache") do
    n.times { User.where('age > 17').first }
  end
  x.report("With Cache") do
    # Without a block, fetch only reads the key and returns nil on a miss.
    n.times { Rails.cache.fetch('first_user_with_age_greater_than_17') }
  end
end
Explain:
The first report fetches the same user record without caching, so every iteration queries the database, which slows performance.
The second report reads the result that was cached with Rails.cache.fetch. The query retrieves the first user record with an age greater than 17, and the cache entry is set to expire after 5 minutes. When the key is missing or expired, the block is executed and its result is stored in the cache. When the key is present, the result is returned from the cache instead of executing the query again.
Result:
user system total real
Without Cache 0.174255 0.029472 0.203727 ( 0.465946)
With Cache 0.006842 0.000187 0.007029 ( 0.007029)
=>
[#<Benchmark::Tms:0x00000001077f4290
@cstime=0.0,
@cutime=0.0,
@label="Without Cache",
@real=0.4659459999820683,
@stime=0.029471999999999998,
@total=0.20372700000000005,
@utime=0.17425500000000005>,
#<Benchmark::Tms:0x000000010790cd80
@cstime=0.0,
@cutime=0.0,
@label="With Cache",
@real=0.007029000000329688,
@stime=0.00018699999999999273,
@total=0.0070290000000000075,
@utime=0.006842000000000015>]
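The read-through behaviour described above (run the block on a miss, reuse the value on a hit, run it again after expiry) can be sketched in a few lines of plain Ruby. TinyCache is a hypothetical class written for this illustration, not the Rails implementation, and its injectable clock exists only so that expiry can be shown without sleeping.

```ruby
# Hypothetical in-memory cache imitating the read-through behaviour of
# Rails.cache.fetch; it is not the Rails implementation.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @store = {}
  end

  # Return the cached value for key, or run the block, store its result, and return it.
  def fetch(key, expires_in: nil)
    entry = @store[key]
    return entry.value if entry && !expired?(entry)

    value = yield
    @store[key] = Entry.new(value, expires_in && @clock.call + expires_in)
    value
  end

  private

  def expired?(entry)
    entry.expires_at && @clock.call >= entry.expires_at
  end
end

now = 0.0
cache = TinyCache.new(clock: -> { now })
queries = 0
load_user = -> { queries += 1; "user-1" } # stands in for the database query

cache.fetch("first_adult", expires_in: 300, &load_user) # miss: runs the block
cache.fetch("first_adult", expires_in: 300, &load_user) # hit: block skipped
now += 301                                              # pretend five minutes passed
cache.fetch("first_adult", expires_in: 300, &load_user) # expired: runs the block again
puts queries # 2
```

The expiry time is the simplest invalidation strategy: it bounds how stale a cached value can get, at the cost of one extra query per expiry window.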
6- Eager loading and avoiding N+1 queries:
An N+1 problem occurs when associated records are queried inside a loop: one query loads the parent records, and then one more query runs for each of them. Eager loading avoids this by loading the associated records up front in a fixed number of queries, which reduces the number of database queries and improves performance. In ActiveRecord this can be done with the includes method or the preload method.
Example:
require 'benchmark'
Benchmark.bm do |x|
  x.report("Without Includes") do
    User.where('age > 17').each do |user|
      user.orders.select(:id)
    end
  end
  x.report("With Includes") do
    User.includes(:orders).where('age > 17').each do |user|
      user.orders.select(:id)
    end
  end
  x.report("Eager Loading with Joins") do
    User.joins(:orders).where('age > 17').each do |user|
      user.orders.select(:id)
    end
  end
  x.report("Eager Loading with Preloading") do
    User.where('age > 17').preload(:orders).each do |user|
      user.orders.select(:id)
    end
  end
end
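Explain:
The first query fetches all User records where the age is greater than 17 and then touches the orders association of each user inside the loop. Whenever those orders are actually loaded, each user triggers its own query, which is the N+1 pattern: one query for the users plus one per user.
The second query uses eager loading by calling the includes method, which loads the associated Order records for all matching users up front, usually with one additional query. The number of queries then stays constant instead of growing with the number of users.
The third query uses the joins method, which performs an INNER JOIN between users and orders. It returns only users that have at least one order and repeats a User record once for each matching Order, so the result may contain duplicates. It does not load the Order records themselves, so reading user.orders can still trigger one query per user. In other words, joins is a way to filter, not a way to eager load.
The fourth query uses preload, which always loads the associated Order records with a separate query before you iterate over the User records. This also keeps the number of queries constant.
Note that user.orders.select(:id) builds a new lazy relation: nothing is queried until its records are read, and when they are, the database is queried again even if the orders were already eager loaded. To benefit from eager loading, read from the loaded association instead, for example with user.orders.map(&:id).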
Result:
user system total real
Without Includes 86.801217 37.577502 124.378719 (124.912928)
With Includes 97.626392 11.808023 109.434415 (111.486347)
Eager Loading with Joins 2.702298 0.052320 2.754618 ( 2.786072)
Eager Loading with Preloading 94.085891 37.495771 131.581662 (133.945658)
=>
[#<Benchmark::Tms:0x00000002e397da20
@cstime=0.0,
@cutime=0.0,
@label="Without Includes",
@real=124.91292799997609,
@stime=37.577501999999996,
@total=124.378719,
@utime=86.80121700000001>,
#<Benchmark::Tms:0x00000002e576d148
@cstime=0.0,
@cutime=0.0,
@label="With Includes",
@real=111.4863469999982,
@stime=11.808022999999999,
@total=109.434415,
@utime=97.626392>,
#<Benchmark::Tms:0x000000010e22e1b0
@cstime=0.0,
@cutime=0.0,
@label="Eager Loading with Joins",
@real=2.786072000017157,
@stime=0.0523200000000017,
@total=2.754618000000015,
@utime=2.702298000000013>,
#<Benchmark::Tms:0x00000002b2e55fb0
@cstime=0.0,
@cutime=0.0,
@label="Eager Loading with Preloading",
@real=133.94565799998236,
@stime=37.495771000000005,
@total=131.581662,
@utime=94.08589099999998>]
For more Info: rails-performance-3-tips-for-removing-n-1-queries, rails-n-1-queries-and-eager-loading-10eh
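The difference in query counts can be made visible with a small plain-Ruby simulation. FakeDb and the helper methods below are hypothetical and only count how often the "database" is asked for something; the second approach mirrors the idea behind preload (one query for the users, one for all of their orders, then grouping in memory).

```ruby
# Hypothetical in-memory "database" that counts queries; this is not ActiveRecord.
class FakeDb
  attr_reader :query_count

  def initialize(users:, orders:)
    @users = users
    @orders = orders
    @query_count = 0
  end

  def users_older_than(age)
    @query_count += 1
    @users.select { |u| u[:age] > age }
  end

  def orders_for_user(user_id)
    @query_count += 1
    @orders.select { |o| o[:user_id] == user_id }
  end

  # One query for many users, like WHERE user_id IN (...).
  def orders_for_users(user_ids)
    @query_count += 1
    @orders.select { |o| user_ids.include?(o[:user_id]) }
  end
end

def build_db
  users  = (1..50).map { |id| { id: id, age: 18 + id } }
  orders = (1..150).map { |id| { id: id, user_id: (id % 50) + 1 } }
  FakeDb.new(users: users, orders: orders)
end

# N+1: one query for the users, then one more per user.
def order_ids_n_plus_one(db)
  db.users_older_than(17).to_h do |user|
    [user[:id], db.orders_for_user(user[:id]).map { |o| o[:id] }]
  end
end

# Preload-style: one query for the users, one for all their orders, grouped in memory.
def order_ids_preloaded(db)
  users = db.users_older_than(17)
  by_user = db.orders_for_users(users.map { |u| u[:id] }).group_by { |o| o[:user_id] }
  users.to_h { |user| [user[:id], by_user.fetch(user[:id], []).map { |o| o[:id] }] }
end

slow = build_db
fast = build_db
same = order_ids_n_plus_one(slow) == order_ids_preloaded(fast)
puts "same result: #{same}"
puts "N+1 queries: #{slow.query_count}"     # 51
puts "preload queries: #{fast.query_count}" # 2
```

Both approaches return the same data; only the number of round trips differs, and that number grows with the user count in the N+1 version.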
7- Batch processing:
When processing large datasets, you can use batch processing to load and process the data in small groups, one group at a time. This keeps memory usage low and can improve performance.
Example:
User.where('age > 17').limit(100000).find_in_batches(batch_size: 10000) do |batch|
  batch.each do |user|
    user.orders.first
  end
end
Explain:
ActiveRecord’s find_in_batches method loads records from the database in batches instead of loading all records at once. This helps to avoid memory bloat, because only one batch of records is held in memory at a time.
By specifying a batch size of 10,000, the records are loaded 10,000 at a time (the default is 1,000). The best size depends on how large the rows are and how much memory is available.
This approach helps when working with large data sets because it bounds memory usage; in exchange, it issues one query per batch instead of a single large query.
For more Info: Blog kiprosh, Rails Guide
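The idea behind batching by primary key can be sketched in plain Ruby: remember the last id seen and ask for the next slice after it, instead of loading everything at once. TABLE, query_after, and each_batch are hypothetical names for this illustration, and the sketch assumes rows are read in ascending primary-key order, which is the order the ActiveRecord batch methods use.

```ruby
# Hypothetical table: rows in primary-key order, with gaps in the ids.
TABLE = (1..25).map { |i| { id: i * 3, name: "user#{i}" } }.freeze

# Stand-in for: SELECT * FROM users WHERE id > ? ORDER BY id LIMIT ?
def query_after(last_id, limit)
  TABLE.select { |row| row[:id] > last_id }.first(limit)
end

# Yield rows in batches, remembering the last id seen instead of using OFFSET.
def each_batch(batch_size:)
  last_id = 0
  loop do
    batch = query_after(last_id, batch_size)
    break if batch.empty?

    yield batch
    break if batch.size < batch_size # a short batch means the table is exhausted

    last_id = batch.last[:id]
  end
end

each_batch(batch_size: 10) do |batch|
  puts "batch of #{batch.size}, ids #{batch.first[:id]}..#{batch.last[:id]}"
end
```

Only one batch is held in memory at a time, and each batch costs one query, which is the trade-off described above.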
Conclusion:
Optimizing Ruby on Rails for high performance requires a holistic approach that includes techniques such as caching, database optimization, load balancing, and the use of third-party tools. By implementing these techniques, you can significantly improve the performance of your Rails application and ensure it can handle high traffic volumes.
There are other ActiveRecord tips for optimizing a Rails app. I will try to cover some of them, along with other ways to optimize a Rails app, in the next parts.