Chinh (lelouvincx) / Predicate Pushdown

Created Mon, 25 May 2026 00:00:00 +0000 Modified Mon, 25 May 2026 06:02:25 +0000
178 Words
  • Category: [[Database Optimization]]
  • When filter some data, instead of loading all data into memory then filter, predicate pushdown allows filtering at the query execution layer/storage layer, in order to only load the filtered data into memory.
  • Benefit:
    • Reduce I/O
    • Reduce Time
  • How it works
    • Before (without predicate pushdown)
      • Read: The engine read 100% data from storage. logseq.order-list-type:: number
      • Transfer: The engine transfers all data over the network to the compute engine. logseq.order-list-type:: number
      • Filter: The compute engine loads data into memory and applies WHERE. logseq.order-list-type:: number
    • After
      • Read: The compute engine sends the query and the filter to the storage/metadata layer. logseq.order-list-type:: number
      • Filter: The storage/metadata layer uses statistics based on the WHERE to skip entire redundant blocks of data. logseq.order-list-type:: number
      • Transfer: Smaller data transfers over the network. logseq.order-list-type:: number
  • Pitfalls
    • Casting: something like WHERE CAST(string_col AS INTEGER) > 5 will not work.
      • Because the compute engine has to read all data to do the cast before filtering (which Holistics usually use).
    • Complex UDF.
    • Unsorted: If data random across row partitions, the predicate pushdown won’t work effectively.