I've been looking deeper into this prompted by a PM from xrg who warned me not to invest the effort patching before checking that Postgres can still make use of the index when the query gets as complex as is used in A2B. The query I've been working on improving tonight is (the actual digits used have been obscured):
Code:
SELECT tariffgroupname, lcrtype, idtariffgroup, cc_tariffgroup_plan.idtariffplan, tariffname, destination,
cc_ratecard.id, dialprefix, destination, buyrate, buyrateinitblock, buyrateincrement, rateinitial, initblock, billingblock,
connectcharge, disconnectcharge, stepchargea, chargea, timechargea, billingblocka, stepchargeb, chargeb,
timechargeb, billingblockb, stepchargec, chargec, timechargec, billingblockc,
cc_tariffplan.id_trunk AS tp_id_trunk, tp_trunk.trunkprefix AS tp_trunk, tp_trunk.providertech AS tp_providertech,
tp_trunk.providerip AS tp_providerip, tp_trunk.removeprefix AS tp_removeprefix,
cc_ratecard.id_trunk AS rc_id_trunk, rt_trunk.trunkprefix AS rc_trunkprefix, rt_trunk.providertech AS rc_providertech,
rt_trunk.providerip AS rc_providerip, rt_trunk.removeprefix AS rc_removeprefix,
musiconhold,
tp_trunk.failover_trunk AS tp_failover_trunk,
rt_trunk.failover_trunk AS rt_failover_trunk,
tp_trunk.addparameter AS tp_addparameter_trunk,
rt_trunk.addparameter AS rt_addparameter_trunk,
id_outbound_cidgroup,
freetimetocall_package_offer, freetimetocall, packagetype, billingtype, startday, id_cc_package_offer
FROM cc_tariffgroup
RIGHT JOIN cc_tariffgroup_plan ON cc_tariffgroup.id=3
INNER JOIN cc_tariffplan ON (cc_tariffplan.id=cc_tariffgroup_plan.idtariffplan )
LEFT JOIN cc_ratecard ON cc_ratecard.idtariffplan=cc_tariffplan.id
LEFT JOIN cc_trunk AS rt_trunk ON cc_ratecard.id_trunk=rt_trunk.id_trunk
LEFT JOIN cc_trunk AS tp_trunk ON cc_tariffplan.id_trunk=tp_trunk.id_trunk
LEFT JOIN cc_package_offer ON cc_package_offer.id=cc_tariffgroup.id_cc_package_offer
WHERE (dialprefix=SUBSTRING('923XXXXXXX',1,length(dialprefix)) OR dialprefix='defaultprefix')
AND startingdate<= CURRENT_TIMESTAMP AND (expirationdate > CURRENT_TIMESTAMP OR expirationdate IS NULL OR LENGTH(expirationdate)<5)
AND startdate<= CURRENT_TIMESTAMP AND (stopdate > CURRENT_TIMESTAMP OR stopdate IS NULL OR LENGTH(stopdate)<5)
AND (starttime <= 7945 AND endtime >=7945)
AND idtariffgroup='3'
AND ( dnidprefix=SUBSTRING('00923XXXXXXXXX',1,length(dnidprefix)) OR (dnidprefix='all' AND 0 = (SELECT count(dnidprefix) FROM cc_tariffplan RIGHT JOIN cc_tariffgroup_plan ON cc_tariffgroup_plan.idtariffgroup=3 WHERE dnidprefix=SUBSTRING('00923XXXXXXXXX',1,length(dnidprefix)) ) ))
AND ( calleridprefix=SUBSTRING('XXXXXXXXXX',1,length(calleridprefix)) OR (calleridprefix='all' AND 0 = (SELECT count(calleridprefix) FROM cc_tariffplan RIGHT JOIN cc_tariffgroup_plan ON cc_tariffgroup_plan.idtariffgroup=3 WHERE calleridprefix=SUBSTRING('XXXXXXXXXX',1,length(calleridprefix)) ) ))
ORDER BY LENGTH(dialprefix) DESC
This is a real-world example gleaned from a production server.
Using ANALYZE EXPLAIN we can see what Postgres' query planner makes of it:
Code:
Sort (cost=13495.23..13495.24 rows=4 width=406) (actual time=769.143..769.146 rows=2 loops=1)
Sort Key: length(cc_ratecard.dialprefix)
InitPlan
-> Aggregate (cost=8.09..8.09 rows=1 width=7) (actual time=0.140..0.141 rows=1 loops=1)
-> Nested Loop (cost=0.00..8.07 rows=4 width=7) (actual time=0.123..0.123 rows=0 loops=1)
-> Seq Scan on cc_tariffplan (cost=0.00..6.47 rows=1 width=7) (actual time=0.120..0.120 rows=0 loops=1)
Filter: (dnidprefix = "substring"('00923XXXXXXXXX'::text, 1, length(dnidprefix)))
-> Seq Scan on cc_tariffgroup_plan (cost=0.00..1.56 rows=4 width=0) (never executed)
Filter: (idtariffgroup = 3)
-> Aggregate (cost=8.09..8.09 rows=1 width=7) (actual time=0.063..0.065 rows=1 loops=1)
-> Nested Loop (cost=0.00..8.07 rows=4 width=7) (actual time=0.049..0.049 rows=0 loops=1)
-> Seq Scan on cc_tariffplan (cost=0.00..6.47 rows=1 width=7) (actual time=0.045..0.045 rows=0 loops=1)
Filter: (calleridprefix = "substring"('XXXXXXXXXX'::text, 1, length(calleridprefix)))
-> Seq Scan on cc_tariffgroup_plan (cost=0.00..1.56 rows=4 width=0) (never executed)
Filter: (idtariffgroup = 3)
-> Nested Loop Left Join (cost=0.00..13479.01 rows=4 width=406) (actual time=205.005..769.115 rows=2 loops=1)
Join Filter: ("inner".id = "outer".id_cc_package_offer)
-> Nested Loop Left Join (cost=0.00..13474.19 rows=4 width=390) (actual time=204.885..768.926 rows=2 loops=1)
-> Nested Loop Left Join (cost=0.00..13453.05 rows=4 width=330) (actual time=204.842..768.846 rows=2 loops=1)
-> Nested Loop (cost=0.00..13431.91 rows=4 width=270) (actual time=204.773..768.745 rows=2 loops=1)
Join Filter: ("inner".idtariffplan = "outer".idtariffplan)
-> Nested Loop (cost=0.00..13.76 rows=1 width=81) (actual time=0.724..3.554 rows=3 loops=1)
Join Filter: ("outer".id = "inner".idtariffplan)
-> Seq Scan on cc_tariffplan (cost=0.00..7.55 rows=1 width=39) (actual time=0.284..0.666 rows=27 loops=1)
Filter: ((startingdate <= ('now'::text)::timestamp(6) with time zone) AND ((expirationdate > ('now'::text)::timestamp(6) with time zone) OR (expirationdate IS NULL) OR (length((expirationdate)::text) < 5)) AND ((dnidprefix = "substring"('00923XXXXXXXXX'::text, 1, length(dnidprefix))) OR ((dnidprefix = 'all'::text) AND (0 = $0))) AND ((calleridprefix = "substring"('XXXXXXXXXX'::text, 1, length(calleridprefix))) OR ((calleridprefix = 'all'::text) AND (0 = $1))))
-> Nested Loop Left Join (cost=0.00..6.15 rows=4 width=42) (actual time=0.015..0.094 rows=4 loops=27)
-> Seq Scan on cc_tariffgroup_plan (cost=0.00..1.56 rows=4 width=8) (actual time=0.005..0.026 rows=4 loops=27)
Filter: (idtariffgroup = 3)
-> Seq Scan on cc_tariffgroup (cost=0.00..1.14 rows=1 width=34) (actual time=0.004..0.008 rows=1 loops=108)
Filter: (id = 3)
-> Seq Scan on cc_ratecard (cost=0.00..13407.81 rows=828 width=197) (actual time=13.120..254.958 rows=26 loops=3)
Filter: (((dialprefix = "substring"('923XXXXXXXXX'::text, 1, length(dialprefix))) OR (dialprefix = 'defaultprefix'::text)) AND (startdate <= ('now'::text)::timestamp(6) with time zone) AND ((stopdate > ('now'::text)::timestamp(6) with time zone) OR (stopdate IS NULL) OR (length((stopdate)::text) < 5)) AND (starttime <= 7945) AND (endtime >= 7945))
-> Index Scan using cc_trunk_pkey on cc_trunk rt_trunk (cost=0.00..5.27 rows=1 width=64) (actual time=0.017..0.019 rows=1 loops=2)
Index Cond: ("outer".id_trunk = rt_trunk.id_trunk)
-> Index Scan using cc_trunk_pkey on cc_trunk tp_trunk (cost=0.00..5.27 rows=1 width=64) (actual time=0.004..0.006 rows=1 loops=2)
Index Cond: ("outer".id_trunk = tp_trunk.id_trunk)
-> Seq Scan on cc_package_offer (cost=0.00..1.09 rows=9 width=24) (actual time=0.011..0.028 rows=9 loops=2)
Total runtime: 769.675 ms
Ouch! 0.77 seconds to look up a rate, and contrary to my expectations it also returned two matches.
I tested this 'feature' again by looking up '441392XXXXXX' and got back a rate for 441392 as expected, but additionally I got a rate for 441 and then my fallback of 44 alone.
Code:
tariffgroupname | lcrtype | idtariffgroup | idtariffplan | tariffname | destination | id | dialprefix | destination | buyrate | buyrateinitblock
--------------------+---------+---------------+--------------+--------------------------+---------------+-------+------------+---------------+---------+------------------+
Pay As You Go - EU | 0 | 3 | 17 | A-Z Rates - EU Customers | UK, exeter | 85196 | 441392 | UK, exeter | 0.XX | 1 |
Pay As You Go - EU | 0 | 3 | 17 | A-Z Rates - EU Customers | UK geographic | 84986 | 441 | UK geographic | 0.XX | 1 |
Pay As You Go - EU | 0 | 3 | 17 | A-Z Rates - EU Customers | UK other | 79694 | 44 | UK other | X.XX | 1 |
Ouch once more! I'm guessing here, but I think this might interact very badly with failover. Surely this query is only meant to be returning the
longest match(es)?
So anyway, ignoring what I believed to a bug for a moment, I carried on and tried unrolling the query to remove that nasty sequential scan on cc_ratecard. The first query I run changes the WHERE clause to start with "WHERE (dialprefix LIKE '923XXXXXXXXX%'". Successive queries shorten this clause by one character until a match is found for the rate 923. EXPLAIN ANALYZE now says:
Code:
Sort (cost=333.10..333.11 rows=1 width=406) (actual time=3.371..3.373 rows=1 loops=1)
Sort Key: length(cc_ratecard.dialprefix)
InitPlan
-> Aggregate (cost=8.09..8.09 rows=1 width=7) (actual time=0.063..0.065 rows=1 loops=1)
-> Nested Loop (cost=0.00..8.07 rows=4 width=7) (actual time=0.056..0.056 rows=0 loops=1)
-> Seq Scan on cc_tariffplan (cost=0.00..6.47 rows=1 width=7) (actual time=0.053..0.053 rows=0 loops=1)
Filter: (dnidprefix = "substring"('00923XXXXXXXXX'::text, 1, length(dnidprefix)))
-> Seq Scan on cc_tariffgroup_plan (cost=0.00..1.56 rows=4 width=0) (never executed)
Filter: (idtariffgroup = 3)
-> Aggregate (cost=8.09..8.09 rows=1 width=7) (actual time=0.048..0.050 rows=1 loops=1)
-> Nested Loop (cost=0.00..8.07 rows=4 width=7) (actual time=0.043..0.043 rows=0 loops=1)
-> Seq Scan on cc_tariffplan (cost=0.00..6.47 rows=1 width=7) (actual time=0.040..0.040 rows=0 loops=1)
Filter: (calleridprefix = "substring"('XXXXXXXXXX'::text, 1, length(calleridprefix)))
-> Seq Scan on cc_tariffgroup_plan (cost=0.00..1.56 rows=4 width=0) (never executed)
Filter: (idtariffgroup = 3)
-> Nested Loop Left Join (cost=1.14..316.92 rows=1 width=406) (actual time=0.802..3.359 rows=1 loops=1)
Join Filter: ("inner".id = "outer".id_cc_package_offer)
-> Nested Loop Left Join (cost=1.14..315.71 rows=1 width=390) (actual time=0.728..3.282 rows=1 loops=1)
-> Nested Loop Left Join (cost=1.14..310.43 rows=1 width=330) (actual time=0.701..3.243 rows=1 loops=1)
-> Nested Loop (cost=1.14..305.14 rows=1 width=270) (actual time=0.674..3.210 rows=1 loops=1)
Join Filter: ("inner".idtariffplan = "outer".idtariffplan)
-> Nested Loop (cost=1.14..10.38 rows=1 width=81) (actual time=0.523..2.684 rows=3 loops=1)
Join Filter: ("outer".id = "inner".idtariffplan)
-> Seq Scan on cc_tariffplan (cost=0.00..7.55 rows=1 width=39) (actual time=0.160..0.418 rows=27 loops=1)
Filter: ((startingdate <= ('now'::text)::timestamp(6) with time zone) AND ((expirationdate > ('now'::text)::timestamp(6) with time zone) OR (expirationdate IS NULL) OR (length((expirationdate)::text) < 5)) AND ((dnidprefix = "substring"('00923XXXXXXXXX'::text, 1, length(dnidprefix))) OR ((dnidprefix = 'all'::text) AND (0 = $0))) AND ((calleridprefix = "substring"('XXXXXXXXXX'::text, 1, length(calleridprefix))) OR ((calleridprefix = 'all'::text) AND (0 = $1))))
-> Nested Loop Left Join (cost=1.14..2.78 rows=4 width=42) (actual time=0.013..0.072 rows=4 loops=27)
-> Seq Scan on cc_tariffgroup_plan (cost=0.00..1.56 rows=4 width=8) (actual time=0.005..0.026 rows=4 loops=27)
Filter: (idtariffgroup = 3)
-> Materialize (cost=1.14..1.15 rows=1 width=34) (actual time=0.002..0.003 rows=1 loops=108)
-> Seq Scan on cc_tariffgroup (cost=0.00..1.14 rows=1 width=34) (actual time=0.007..0.013 rows=1 loops=1)
Filter: (id = 3)
-> Index Scan using ind_cc_ratecard_dialprefix, ind_cc_ratecard_dialprefix on cc_ratecard (cost=0.00..293.87 rows=71 width=197) (actual time=0.021..0.138 rows=13 loops=3)
Index Cond: (((dialprefix >= '923'::text) AND (dialprefix < '924'::text)) OR (dialprefix = 'defaultprefix'::text))
Filter: (((dialprefix ~~ '923%'::text) OR (dialprefix = 'defaultprefix'::text)) AND (startdate <= ('now'::text)::timestamp(6) with time zone) AND ((stopdate > ('now'::text)::timestamp(6) with time zone) OR (stopdate IS NULL) OR (length((stopdate)::text) < 5)) AND (starttime <= 7945) AND (endtime >= 7945))
-> Index Scan using cc_trunk_pkey on cc_trunk rt_trunk (cost=0.00..5.27 rows=1 width=64) (actual time=0.008..0.011 rows=1 loops=1)
Index Cond: ("outer".id_trunk = rt_trunk.id_trunk)
-> Index Scan using cc_trunk_pkey on cc_trunk tp_trunk (cost=0.00..5.27 rows=1 width=64) (actual time=0.003..0.005 rows=1 loops=1)
Index Cond: ("outer".id_trunk = tp_trunk.id_trunk)
-> Seq Scan on cc_package_offer (cost=0.00..1.09 rows=9 width=24) (actual time=0.004..0.022 rows=9 loops=1)
Total runtime: 3.675 ms
Much more reasonable. These queries took a total of (9•1.12ms+3.67ms)==13.75ms, over 50x quicker than when we started.
As an added bonus this optimised query only spits out the longest matching rate(s), which is what I
thought the original was meant to do. Tomorrow I think I might start looking at how much work would be involved in modifying the rate engine to unroll the query in this manner, unless some PHP wizard would like to complete the task for me.
As ever, all input/criticism is eagerly received.