Building a Dynamic, Single-Pass Funnel Explorer using MATCH_RECOGNIZE and Templated Filters


If you’ve ever tried to build a Funnel Explorer in SQL, you know the pain of writing massive, slow queries filled with nested LEFT JOINs and hardcoded event names. I recently worked through a much better pattern using Snowflake and BigQuery’s native MATCH_RECOGNIZE function combined with Omni’s templated filters, and I wanted to share it here!

Instead of just hardcoding a “Winners” funnel (users who complete every step), this pattern uses regex-style matching to build a true Funnel Explorer—tracking both completions and drop-offs, all while letting business users dynamically pick the steps.

Why this pattern is awesome (and kudos to Felipe Hoffa for writing this up in 2021!)

  1. It’s fast and cheap: MATCH_RECOGNIZE reads the events table exactly once. No self-joins blowing up your compute costs.

  2. It handles drop-offs gracefully: By using the optional ? modifier in the pattern definition, the engine logs users who start the funnel even if they never finish it.

  3. 100% Self-Serve for Stakeholders: By tying Omni’s filter-only fields to your event types using suggest_from_field, stakeholders get a dropdown of every live event in your database to build their own funnels on the fly.

The Omni Model

Here is the YAML/SQL.

# Reference this view as funnel_explorer
schema: omni_dbt_ecomm
sql: |
  WITH base_filtered_events AS (
    SELECT * FROM "ECOMM"."EVENTS"
    WHERE 1=1
      -- Omni Templated Filter syntax for dynamic WHERE clauses
      AND {{# funnel_explorer.date_filter.filter }} created_at {{/ funnel_explorer.date_filter.filter }}
  ),
  funnel_matches AS (
    SELECT 
      session_id,
      step_1_time,
      step_2_time,
      step_3_time
    FROM base_filtered_events
    MATCH_RECOGNIZE (
      PARTITION BY session_id
      ORDER BY created_at
      
      MEASURES 
        FIRST(s1.created_at) AS step_1_time,
        FIRST(s2.created_at) AS step_2_time,
        FIRST(s3.created_at) AS step_3_time
        
      ONE ROW PER MATCH
      
      PATTERN ( s1 not_s2* s2? not_s3* s3? )
      
      -- Omni Parameter Injection: Grabbing the raw 'value' of the filter-only fields
      DEFINE
        s1     AS event_type = {{filters.funnel_explorer.step_1_event.value}},
        s2     AS event_type = {{filters.funnel_explorer.step_2_event.value}},
        s3     AS event_type = {{filters.funnel_explorer.step_3_event.value}},
        
        not_s2 AS event_type <> {{filters.funnel_explorer.step_2_event.value}},
        not_s3 AS event_type <> {{filters.funnel_explorer.step_3_event.value}}
    )
  )

  SELECT
    COUNT(step_1_time) AS total_step_1,
    COUNT(step_2_time) AS total_step_2,
    COUNT(step_3_time) AS total_step_3,

    ROUND(COUNT(step_2_time) / NULLIF(COUNT(step_1_time), 0), 2) AS conv_rate_1_to_2,
    ROUND(COUNT(step_3_time) / NULLIF(COUNT(step_2_time), 0), 2) AS conv_rate_2_to_3,

    ROUND(COUNT(step_3_time) / NULLIF(COUNT(step_1_time), 0), 2) AS overall_conversion_rate
  FROM funnel_matches

# ---------------------------------------------------------
# OUTPUT DIMENSIONS
# ---------------------------------------------------------
dimensions:
  total_step_1:
    sql: '"TOTAL_STEP_1"'

  total_step_2:
    sql: '"TOTAL_STEP_2"'

  total_step_3:
    sql: '"TOTAL_STEP_3"'

  conv_rate_1_to_2:
    sql: '"CONV_RATE_1_TO_2"'
    format: PERCENT

  conv_rate_2_to_3:
    sql: '"CONV_RATE_2_TO_3"'
    format: PERCENT

  overall_conversion_rate:
    sql: '"OVERALL_CONVERSION_RATE"'
    format: PERCENT

# ---------------------------------------------------------
# OMNI FILTER-ONLY FIELDS (Acting as Parameters)
# ---------------------------------------------------------
filters:
  date_filter:
    type: timestamp

  step_1_event:
    type: string
    suggest_from_field: omni_dbt_ecomm__events.event_type
    default_filter:
      is: Home
    filter_single_select_only: true

  step_2_event:
    type: string
    suggest_from_field: omni_dbt_ecomm__events.event_type
    default_filter:
      is: Cart
    filter_single_select_only: true

  step_3_event:
    type: string
    suggest_from_field: omni_dbt_ecomm__events.event_type
    default_filter:
      is: Purchase
    filter_single_select_only: true

What if your database doesn’t support MATCH_RECOGNIZE?

There was an interesting paper from 2025 Democratize MATCH_RECOGNIZE ! with an interesting and performant pattern you can try. Here’s a query tested on Databricks:


WITH RECURSIVE 
  -- 1. PREPROCESSING: Filter out the noise first!
  preprocessing AS (
    SELECT 
      session_id,
      created_at,
      event_type,
      ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY created_at) as rid 
    FROM ecomm.events
  ),

  -- 2. BREADTH-FIRST SEARCH (The Automaton)
  funnel_bfs AS (
    SELECT 
      session_id, 
      created_at AS step_1_time,
      CAST(NULL AS TIMESTAMP) AS step_2_time,
      CAST(NULL AS TIMESTAMP) AS step_3_time,
      rid,
      1 AS current_state 
    FROM preprocessing 
    WHERE event_type = 'Home'
    
    UNION ALL
    
    SELECT 
      b.session_id, 
      b.step_1_time,
      CASE WHEN b.current_state = 1 AND p.event_type = 'Cart' THEN p.created_at ELSE b.step_2_time END,
      CASE WHEN b.current_state = 2 AND p.event_type = 'Purchase' THEN p.created_at ELSE b.step_3_time END,
      p.rid,
      CASE 
        WHEN b.current_state = 1 AND p.event_type = 'Cart' THEN 2
        WHEN b.current_state = 2 AND p.event_type = 'Purchase' THEN 3
        ELSE b.current_state 
      END AS current_state
    FROM funnel_bfs b
    JOIN preprocessing p 
      ON b.session_id = p.session_id 
     AND p.rid = b.rid + 1
    WHERE b.current_state < 3
  ),

  -- 3. MATCH EVALUATION
  final_matches AS (
    SELECT 
      session_id,
      step_1_time,
      MAX(step_2_time) AS step_2_time,
      MAX(step_3_time) AS step_3_time
    FROM (SELECT * FROM funnel_bfs LIMIT 10000000) --spark.sql.cte.recursion.rowLimit
    GROUP BY session_id, step_1_time
  )

-- 4. AGGREGATE
SELECT
  COUNT(step_1_time) AS total_step_1,
  COUNT(step_2_time) AS total_step_2,
  COUNT(step_3_time) AS total_step_3,
  ROUND(COUNT(step_2_time) * 100.0 / NULLIF(COUNT(step_1_time), 0), 2) AS conv_rate_1_to_2
FROM final_matches;