- Category: [[Database Optimization]]
- When filter some data, instead of loading all data into memory then filter, predicate pushdown allows filtering at the query execution layer/storage layer, in order to only load the filtered data into memory.
- Benefit:
- Reduce I/O
- Reduce Time
- How it works
- Before (without predicate pushdown)
- Read: The engine read 100% data from storage. logseq.order-list-type:: number
- Transfer: The engine transfers all data over the network to the compute engine. logseq.order-list-type:: number
- Filter: The compute engine loads data into memory and applies
WHERE. logseq.order-list-type:: number
- After
- Read: The compute engine sends the query and the filter to the storage/metadata layer. logseq.order-list-type:: number
- Filter: The storage/metadata layer uses statistics based on the
WHEREto skip entire redundant blocks of data. logseq.order-list-type:: number - Transfer: Smaller data transfers over the network. logseq.order-list-type:: number
- Before (without predicate pushdown)
- Pitfalls
- Casting: something like
WHERE CAST(string_col AS INTEGER) > 5will not work.- Because the compute engine has to read all data to do the cast before filtering (which Holistics usually use).
- Complex UDF.
- Unsorted: If data random across row partitions, the predicate pushdown won’t work effectively.
- Casting: something like
/ Predicate Pushdown
Created Mon, 25 May 2026 00:00:00 +0000
Modified Mon, 25 May 2026 06:02:25 +0000