Today I was debugging performance issues in one of our reports, and I could not make heads nor tails of it for a long time.
I was pretty sure the problem was with one of the more complex scopes where UNION was being used, but having spent a couple hours trying several approaches, I ended up pasting the scope in GPT and asking for tips.
This is what I stared out with:
scope :special_records, ->(project_id:) do
project_config_entries = CompanyProjects::TeklaReports::TotalsConfigEntry.where(project_id: project_id)
specials_from_report_file = unscoped.where(special: true).select(:id)
specials_by_exact_config_match = unscoped.where(uid: project_config_entries.lookup_exact.select(:lookup_term)).select(:id)
specials_by_prefix_config_match = unscoped.joins(
"INNER JOIN (#{project_config_entries.lookup_prefix.to_sql}) AS specials_config_entries ON #{table_name}.uid LIKE specials_config_entries.lookup_term || '%'"
).select(:id)
union_sql = [
specials_from_report_file.to_sql,
specials_by_exact_config_match.to_sql,
specials_by_prefix_config_match.to_sql
].join("\nUNION\n")
where(id: from("(#{union_sql}) AS singles_from_several_sources").select(:id))
end
The tricky part here is that I need to UNION several datasources that could identify a record as "special", and I need (or at least I thought I did) to JOIN using LIKE to get the prefix-based match lookup to work.
Turns out an EXISTS pattern can be used both for the JOIN and for the ultimate WHERE id IN
, leading to several orders of magnitude improvement - in my test data I went from 30s report generation query time to under 1s!
And this is what I ended up with:
scope :specials, ->(project_id:) do
project_config_entries = CompanyProjects::TeklaReports::TotalsConfigEntry.where(project_id: project_id)
specials_from_report_file = unscoped.where(special: true).select(:id)
specials_by_exact_config_match = unscoped.where(
"EXISTS (
SELECT 1 FROM (#{project_config_entries.lookup_exact.to_sql}) AS specials_config_entries
WHERE specials_config_entries.lookup_term = #{table_name}.uid
)"
).select(:id)
specials_by_prefix_config_match = unscoped.where(
"EXISTS (
SELECT 1 FROM (#{project_config_entries.lookup_prefix.to_sql}) AS specials_config_entries
WHERE #{table_name}.uid LIKE specials_config_entries.lookup_term || '%'
)"
).select(:id)
union_sql = [
specials_from_report_file.to_sql,
specials_by_exact_config_match.to_sql,
specials_by_prefix_config_match.to_sql
].join("\nUNION\n")
where(
"EXISTS (
SELECT 1 FROM (#{union_sql}) AS specials_from_several_sources
WHERE specials_from_several_sources.id = #{table_name}.id
)"
)
end
Notice that replacing WHERE field IN
with WHERE EXISTS (SELECT 1 ..
is a general pattern and can yield good results wherever the subquery has a lot of records.