Weekly Report — Apr 6 – Apr 12, 2026 (friday mode)
Work
Internal
DONE Fix failed dbt tests — snowplow events (DAT-569)
- Both PRs merged and deployed: dbt #848 and prefect #393.
- Ran backfill and snowplow pipeline — all runs successful. Confirmed via Prefect flow runs (snowplow, backfill).
DONE Check Zendesk — Tenant Inovonics (DAT-571)
- Triggered by Vincent’s Slack message: Zendesk Customer Support dashboard showed 0 for Inovonics. Traced full lineage from fct_zendesk_tickets → itg_mappings__zendesk_tenant → stg_zendesk__organizations → _airbyte_raw_organizations and found org 13430326317849 missing entirely.
- Root cause: Airbyte Zendesk connector v0.2.6 uses offset-based pagination capped at 10K records by Zendesk API. Raw table had exactly 10,000 rows; 680 of 911 org IDs in tickets were missing. Entire downstream lineage from staging to mart was broken.
- Wrote upgrade plan evaluating 3 options: (1) Upgrade Airbyte platform v0.35 → v2.0 (fixes root cause for all connectors, but high DevOps effort + risk to HubSpot syncs during migration), (2) Rebuild in Prefect (data-team-only, orgs-only ≈ 2–3 days), (3) Use Fivetran (managed SaaS, ≈ 2–3 days, requires refactoring all stg_zendesk__*.sql models). Waiting for review.
- Interim fix applied: manually upgraded connector in Airbyte UI to v2.6.6. Vincent confirmed dashboard now shows data. Long-term Airbyte upgrade tracked as a Linear project.
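The truncation described above can be caught early with a coverage check between the org IDs referenced by tickets and the org IDs present in the raw table. The following is a minimal sketch, assuming both ID lists have already been pulled into Python; the function name and the sample IDs are hypothetical, and the 10,000 cap is the figure reported in the root cause.

```python
from typing import Iterable

OFFSET_PAGINATION_CAP = 10_000  # cap reported above for offset-based pagination


def check_org_coverage(ticket_org_ids: Iterable[int], raw_org_ids: Iterable[int]) -> dict:
    """Compare org IDs referenced by tickets against org IDs present in the raw table."""
    raw_list = list(raw_org_ids)
    raw = set(raw_list)
    referenced = set(ticket_org_ids)
    missing = referenced - raw
    return {
        "raw_row_count": len(raw_list),
        "referenced": len(referenced),
        "missing": sorted(missing),
        # A row count sitting exactly on the cap suggests the sync was truncated.
        "looks_truncated": len(raw_list) == OFFSET_PAGINATION_CAP,
    }


# Hypothetical sample: the raw table holds IDs 0..9999, tickets reference one ID beyond it.
report = check_org_coverage(
    ticket_org_ids=[5, 42, 13430326317849],
    raw_org_ids=range(10_000),
)
print(report["missing"], report["looks_truncated"])
```

A check like this could run as a scheduled data test so that a capped sync shows up before a dashboard reports 0.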
WAITING DAT-555 — Fix fct_job_queue_performance BigQuery resource limit (DAT-565)
- Parent issue: Investigate dbt Mart Product flow failures — 3 sub-issues: DAT-567 (PARSE_JSON, done), DAT-566 (uniqueness test, analyzing), DAT-565 (resource limit, PR reviewing).
- Decision (discussed with anh Triet): keep the model at minute-level granularity for DevOps debug tracing. Originally planned to remove it, but DevOps needs it for job queue overload diagnosis. The in-app Job Monitoring dashboard is not a full replacement since it lacks historical minute-level data.
- Root cause: the model performs a CROSS JOIN between a minute-level timespine and jobs table. _dbt_max_partition was 6 months behind (2025-10-08), causing unbounded lookback on incremental runs — consuming 5.7M CPU seconds vs 724K limit.
- Fix: cap incremental lookback to 7 days using greatest() + coalesce() on _dbt_max_partition. Added _loaded_at column to distinguish current-run data from historical partitions (needed because insert_overwrite leaves old partitions untouched). Added singular test assert_job_queue_performance_lookback_within_7_days.sql.
- PRs open and awaiting review: dbt #854 (cap lookback, gemini-code-assist reviewed with feedback on NULL handling — addressed) and internal-aml-project #69 (remove downstream AML dataset/models, gemini approved). Notion doc: DE-212.
- Backfill plan: ~26 weekly chunks once PR is merged.
- Dashboard cleanup: delete [WIP] Job Queue Performace 3.0, keep Job Queue Performance Estimation (v2) for DevOps tracing, review Report Job Queue Performance Monitoring with DevOps for potential redundancy.
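The lookback cap and the chunked backfill above can be sketched in Python to make the arithmetic explicit. This is an illustration only, not the SQL in dbt #854: the function names and the run date are hypothetical, and the NULL handling simply mirrors why coalesce() is wrapped inside greatest() (BigQuery's GREATEST returns NULL if any argument is NULL).

```python
from datetime import date, timedelta
from typing import Iterator, Optional, Tuple

MAX_LOOKBACK_DAYS = 7


def lookback_start(max_partition: Optional[date], run_date: date) -> date:
    """Return the earliest date an incremental run may scan (never more than 7 days back)."""
    floor = run_date - timedelta(days=MAX_LOOKBACK_DAYS)
    candidate = max_partition if max_partition is not None else floor  # coalesce()
    return max(candidate, floor)  # greatest()


def weekly_chunks(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield half-open [chunk_start, chunk_end) windows of at most 7 days covering [start, end)."""
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + timedelta(days=7), end)
        yield cursor, chunk_end
        cursor = chunk_end


run_date = date(2026, 4, 8)        # hypothetical run date
stale_partition = date(2025, 10, 8)  # stale partition date reported above

print(lookback_start(stale_partition, run_date))  # 2026-04-01
print(lookback_start(None, run_date))             # 2026-04-01
chunks = list(weekly_chunks(stale_partition, run_date))
print(len(chunks))                                # 26
```

With the hypothetical run date of Apr 8, the gap back to 2025-10-08 is 182 days, which is where a figure of about 26 weekly chunks comes from.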
WAITING Calendly data pipeline (DAT-283 — “Sync all calls from Calendly”)
- Linear scope: currently only demo/onboarding calls ingested; need all 50+ event types (training, customer success, case study, etc.) for growth team’s sales rep performance tracking and Lead Funnel enrichment.
- Major progress this week: defined full source schema from Calendly OpenAPI spec (42 API paths, 49 schemas). Narrowed to 7 tables present in BigQuery src_calendly: users, event_type, event, event_membership, event_invitee, event_guest, question_and_answer. Published ERD on dbdiagram.
- Schema design decisions: flattened nested API objects (UTM params, location) into columns; broke arrays into separate relational tables; included Fivetran metadata columns for lineage and soft-delete handling; all PKs follow Calendly URI pattern.
- Set up Fivetran as interim ingestion tool (bypassing broken Zapier → Google Sheets path that only had 400 of 4,438 events). Data now syncing to src_calendly in BigQuery.
- Wrote full 6-phase dbt modeling plan: (1) source definition + seed mapping for event type categories (Sales/Retention/Solo/Internal/Marketing), (2) staging layer (3 views: events, invitees, hosts), (3) domain layer (2 tables: consolidated events at event-invitee grain, normalized Q&A), (4) mart layer (fct_calendly_events), (5) fix downstream fct_sales_leads join (replace broken event_uuid + src_gsheets with dom_calendly__events on invitee_email_domain), (6) deprecate 5 legacy models.
- PR open: dbt #853 — review required, awaiting anh Dong.
- LEARNING: when doing data pipeline ingestion, always find an OpenAPI spec for source schema reference. Critical for tracking API changes.
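The schema design decisions above (nested objects flattened into columns, arrays split into child tables keyed by the parent URI) can be illustrated with a small sketch. The payload shape, field names, and the example.invalid URI below are assumptions for illustration and are not copied from the Calendly spec or from the Fivetran output.

```python
from typing import Any, Dict, List, Tuple


def flatten_invitee(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, List[Dict[str, Any]]]]:
    """Flatten nested dicts into prefixed columns; move arrays of objects into child tables."""
    row: Dict[str, Any] = {}
    child_rows: Dict[str, List[Dict[str, Any]]] = {}
    uri = record["uri"]  # the parent primary key, following the URI-as-PK pattern

    def walk(obj: Dict[str, Any], prefix: str) -> None:
        for key, value in obj.items():
            column = f"{prefix}_{key}" if prefix else key
            if isinstance(value, dict):
                walk(value, column)  # nested object -> prefixed columns
            elif isinstance(value, list):
                # array of objects -> one child row per element, linked by the parent URI
                child_rows[column] = [{"invitee_uri": uri, **item} for item in value]
            else:
                row[column] = value

    walk(record, "")
    return row, child_rows


# Hypothetical payload for illustration only.
sample = {
    "uri": "https://example.invalid/invitees/abc",
    "email": "a@example.com",
    "tracking": {"utm_source": "newsletter", "utm_campaign": None},
    "questions_and_answers": [{"question": "Company size?", "answer": "50", "position": 0}],
}
row, child_rows = flatten_invitee(sample)
print(row)
print(child_rows["questions_and_answers"])
```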
DONE Review Usage Monitoring dashboard
- Anh Hieu asked to review his AML changes on us.holistics.io: added tenant model, cleaned dataset (1st commit), updated filters on dashboards (2nd commit). Discussed in person, completed Apr 8.
TODO Lead Funnel by Sales Motion (DAT-560)
- Linear scope: add sales_motion dimension (call-first, trial-first, trial-only, call-only) to fct_sales_leads so DTS106 dataset can slice lead metrics by channel. 4 sub-issues: DAT-561 (enrich call sources, PR reviewing), DAT-562 (add dimension, backlog), DAT-563 (fix rep assignment, backlog), DAT-564 (re-audit with BizOps, backlog).
- Moved detailed context to dedicated page [[Lead Funnel by Sales Motion]]. Synthesized classification logic, revenue planning baseline ($330K MRR target → 59 raw leads/month), data quality audit findings (Fanserv/Novigi mis-categorized due to same-day UTC ordering; missing call sources via non-Calendly channels; event_uuid NULL since Dec 2024), and override mechanism (HubSpot “DataOps Override” property as fallback).
- Next step: review and make a plan; blocked on completing Calendly pipeline first (DAT-283 provides the enriched call data needed for DAT-561).
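The four sales_motion values can be sketched as a small classifier. The real rules live on the [[Lead Funnel by Sales Motion]] page, so the definitions here are assumptions: "first" is taken to mean the earlier of the first call and the first trial start, ties go to call-first, and full UTC timestamps are compared rather than dates, which is one plausible way to avoid the same-day ordering problem noted in the audit.

```python
from datetime import datetime, timezone
from typing import Optional


def classify_sales_motion(
    first_call_at: Optional[datetime],
    first_trial_at: Optional[datetime],
) -> Optional[str]:
    """Classify a lead by which touchpoint came first (assumed definitions, see lead-in)."""
    if first_call_at is None and first_trial_at is None:
        return None  # no touchpoint recorded
    if first_trial_at is None:
        return "call-only"
    if first_call_at is None:
        return "trial-only"
    # Compare full timestamps so two events on the same UTC day keep their order.
    return "call-first" if first_call_at <= first_trial_at else "trial-first"


# Hypothetical same-day events: trial at 02:00 UTC, call at 09:00 UTC.
trial = datetime(2026, 4, 8, 2, 0, tzinfo=timezone.utc)
call = datetime(2026, 4, 8, 9, 0, tzinfo=timezone.utc)
print(classify_sales_motion(call, trial))  # trial-first
print(classify_sales_motion(call, None))   # call-only
```

A date-level comparison would see both events as Apr 8 and could not order them, which is why the sketch keeps the time component.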
TODO Create agent skill for fixing dbt data pipelines in #data-ops-bot — carry-over from W13, not started.
TODO Fix excluding internal testing Zoho accounts (DAT-524) — carry-over since W7, not started.
Add2Cart
DONE Ingest countries data into Retailers table — completed Apr 5. Bridge elimination (Phases 0–5) fully done in W14.
DONE Document Add2Cart Dashboard Implementation Process — completed Apr 11, moved to Backlog/Done.
WAITING Guidance for countries-level data ingestion (D4 doc) — waiting for anh Huy’s review.
TODO Project retro with anh Dong — carry forward to W16.
Presales
DONE Onboarding call 1 with Jonas Chorum — completed Apr 10
- Hotel PMS company (~1,000 properties), single-tenant architecture (each customer = separate DB on SQL Server). Participants: Ahmad (PM for SMS host copy), Kevin Lane (tech lead for Sprinter Miller), Mitchell (SWE), plus product strategy lead. Led by anh Huy, I supported with technical prep.
- Slack thread: originally scheduled Mar 31, moved 1h later (11pm VNT). anh Huy flagged it as “embedding + dynamic data source setup call” and asked me to sit in and support Chukwudi.
- Vibe-coded a dynamic data source demo at holistics-embed-demo.pages.dev showing per-hotel JWT routing with data_source_name in AML.
- Impression: they have high expectations on the embedded solution. This deal will be a long run. Key limitation: cross-database aggregation (50-60 DBs) requires a data mart.
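The per-hotel JWT routing idea from the demo can be sketched with a generic HS256 token built from the standard library. This is not the Holistics embed payload format: the user_attributes key, the hotel-to-data-source mapping, and the secret are hypothetical, and only data_source_name comes from the note above.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used by JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: str) -> str:
    """Build a compact JWT signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":"), sort_keys=True).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_hs256(token: str, secret: str) -> bool:
    """Recompute the signature and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)


def decode_payload(token: str) -> dict:
    """Decode the payload segment (no verification) for inspection."""
    body = token.split(".")[1]
    return json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))


# Hypothetical mapping: one database per hotel, as in a single-tenant setup.
HOTEL_TO_DATA_SOURCE = {"hotel_a": "pms_hotel_a", "hotel_b": "pms_hotel_b"}


def build_embed_token(hotel_id: str, secret: str, ttl_seconds: int = 3600) -> str:
    payload = {
        "exp": int(time.time()) + ttl_seconds,
        # Hypothetical claim layout: the data source the embedded dashboard should query.
        "user_attributes": {"data_source_name": HOTEL_TO_DATA_SOURCE[hotel_id]},
    }
    return sign_hs256(payload, secret)


token = build_embed_token("hotel_b", secret="demo-secret")
print(decode_payload(token)["user_attributes"]["data_source_name"])
```

The token has to be signed server-side so the hotel-to-database mapping cannot be changed from the browser.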
DONE Answer Erin and Harsha on embed portal questions (Showbie) — completed Apr 11
- Erin’s questions: (1) JWT token expired in sandbox embed URL → explained it’s a security measure, just regenerate in sandbox; (2) can left-side panel be restricted? → clarified what access control options exist; (3) dashboard title shows in all embedding; (4) pointed to Row-Level Permission docs for per-account filtering; (5) confirmed tabs can use different underlying datasets.
- Erin’s follow-up: comparing single dashboard iframe vs embed portal — confirmed single dashboard supports export + drill-through but NOT email subscriptions. Chukwudi confirmed the current quote is for single dashboard approach. I added: can pivot to embed portal with only 1 dashboard (no explore) to get email subscriptions while hiding explore panel.
- Harsha’s question: asked about Looker-like dynamic schema switching ({{ _user_attributes['schema'] }}). Chukwudi pointed to Dynamic Data Source docs. I added Dynamic Schema docs with AML code example using H.current_user.schema variable.
- Built a Custom Embed tester section in holistics-embed-demo.pages.dev so presales team can quickly demo how embed looks in a real application.
- LEARNING: the UX gap between embed single dashboard and embed portal is real — prospects need to compare functionalities side-by-side (export yes/no, email subscription yes/no, explore yes/no).
- LEARNING: In the future, hooli should be a unified toolbox for everyone to try embed portal in their real application. Customers: Basata, Superbexperience, Showbie, Innerspace, Jonas Chorum.
TODO Polish embedding documentation — vibe-coded holistics-embed-demo (React + Cloudflare Pages, JWT-based, multi-user simulation, RLS demo). Waiting on anh Tai and anh Huy to clarify scope.
Docs
TODO Add demo video for local development docs — waiting for team trigger.
Personal / Tooling
Vibe-coded holistics-embed-demo.pages.dev — full-stack React + Cloudflare Pages app demonstrating secure JWT embedding, multi-user impersonation for RLS testing, embed portal, Ask AI, single dashboard, and a custom embed URL tester. Dev tools panel shows raw JWT payload and generated iframe URL.
Bought ATK X1 Ultimate V2 mouse. No Bluetooth (1K dongle only). Has a web-based config panel at hub.atk.pro for Motion Sync and Straight Line Correction.
Edited Huế trip videos — published on YouTube.
Found the book Staff Engineer — saved for future reading. “At the moment don’t feel like I’m ready enough to be at that level.”
Settled on mochi.cards as Flashcard app (completed Apr 6).
Learning & Notes
LEARNING Asset dependency lineage typically flows: dbt models → AML models (table → query) → datasets → query reports → dashboard widgets → dashboards → schedules/alerts/shareable links/embed links.
LEARNING When doing data pipeline ingestion, always find an OpenAPI spec (like Calendly’s) for source schema reference. This is important because when the source API changes, you have a reference to update data code accordingly.
LEARNING dbt incremental models: capping lookback windows (e.g., 7 days) with greatest() prevents resource exhaustion when _dbt_max_partition falls behind. Add _loaded_at column to distinguish current-run data from historical partitions when using insert_overwrite.
LEARNING Embed analytics UX: the gap between embed single dashboard and embed portal is a consistent friction point across prospects. Building a live demo app (not just docs) helps presales communicate the difference effectively.
NOTE Found a Logseq CLI idea for AI agents.
NOTE Started tracking TODO items: self reflection + CV update, find football dataset for Duc Anh’s teaching.
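The asset dependency lineage from the first learning above lends itself to a quick impact check: given an asset type, list everything downstream that a change could break. A minimal sketch follows, with the chain encoded as a plain dictionary; the node names are simplified from the note, and real lineage would be per asset rather than per asset type.

```python
from collections import deque
from typing import Dict, List

# Asset-type lineage as described in the learning note (simplified node names).
LINEAGE: Dict[str, List[str]] = {
    "dbt model": ["AML model"],
    "AML model": ["dataset"],
    "dataset": ["query report"],
    "query report": ["dashboard widget"],
    "dashboard widget": ["dashboard"],
    "dashboard": ["schedule", "alert", "shareable link", "embed link"],
}


def downstream(asset_type: str) -> List[str]:
    """Breadth-first walk returning every asset type downstream of the given one."""
    seen: List[str] = []
    queue = deque(LINEAGE.get(asset_type, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.append(node)
        queue.extend(LINEAGE.get(node, []))
    return seen


print(downstream("dataset"))
```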
Next Week
P1 — Internal: Transfer MRR project with anh Hieu — due Apr 15. High urgency.
P1 — Internal: Calendly data pipeline (DAT-283) — get PR #853 reviewed by anh Dong. Once merged, run models and validate data. Then fix downstream fct_sales_leads join.
P1 — Internal: Write 1-on-1 report (noted in Apr 15 journal).
P1 — Internal: Create agent skill for fixing dbt data pipelines in #data-ops-bot — carry-over from W13, elevated to P1.
P2 — Internal: DAT-555 — get PR #854 and #69 reviewed. Plan backfill after merge. Clean up unused dashboards.
P2 — Internal: DAT-571 — get Zendesk upgrade plan reviewed and decide on option (Fivetran vs Prefect orgs-only vs Airbyte upgrade).
P2 — Internal: Fix excluding internal testing Zoho accounts (DAT-524) — lingering since W7. Timebox: max 2h.
P2 — Docs: Polish embedding documentation — clarify scope with anh Tai and anh Huy.
P2 — Docs: Add demo video for local development docs — target Wed or Fri.
P2 — Teaching: Find a football dataset for Duc Anh.
P3 — Add2Cart: Inactive — wait for Simon/Anurag to trigger again, then follow up with anh Huy on D4 review. Project retro with anh Dong deferred.
P3 — Internal: Lead Funnel by Sales Motion (DAT-560) — review and make a plan after Calendly pipeline stabilizes.
P3 — Internal: Contribute to Holistics skills and internal-skills repos.
P3 — Presales: Read Modeling Patterns docs.
P3 — Personal: Self reflection and update CV.
Career & Personal Consulting
Progress Review (Start/Stop/Keep):
Start: Building reusable demo tooling for presales — the holistics-embed-demo app is already serving multiple customers (Showbie, Jonas Chorum). This is high-leverage work. Formalize it as a shared team resource and get buy-in from anh Huy.
Start: Proactively documenting decisions on dedicated pages (e.g., [[Lead Funnel by Sales Motion]]) — this consolidation pattern keeps project context accessible instead of buried in daily journals.
Stop: Letting carry-over items linger without resolution. DAT-524 has been open since W7 (8 weeks) and the agent skill since W13 (2 weeks). Either timebox and do them or formally deprioritize with a note to manager. Backlog hygiene matters for credibility.
Keep: The “schema-first” approach to data pipelines — finding Calendly’s OpenAPI spec, mapping to ERD, then designing dbt layers. This is thorough and prevents rework.
Keep: Vibe-coding demo apps instead of only writing docs — the embed demo app communicates value faster than any document. Several customers are already benefiting.
Observations:
This week’s work pattern shows a healthy mix of closing items (DAT-569 deployed, Zendesk interim fix, Jonas Chorum call done, countries ingestion done) and advancing strategic work (Calendly pipeline schema + modeling plan, embed demo app).
The Calendly pipeline work demonstrates strong engineering maturity: traced from broken Zapier pipeline → identified limitations → chose Fivetran as interim → mapped full API schema → designed 6-phase dbt plan. The disciplined approach will pay off.
Presales embedding is solidifying as your specialty. The embed demo app is now serving real customer conversations (Showbie embed portal questions answered same-day). Consider proposing this as an official presales tool.
The DAT-555 investigation shows good judgment: initial instinct was to remove the model, but after discussing with DevOps (Triet), pivoted to keeping it with a fix. Listening to stakeholder needs > defaulting to deletion.
MRR transfer (due Apr 15) is the highest urgency item for next week. Don’t let it get crowded out by carry-over work.
Workload is spread across 5 projects this week (Internal, Add2Cart, Presales, Docs, Personal). The Internal project dominates with multiple parallel tracks (DAT-569, DAT-571, DAT-555, DAT-283, DAT-560). Consider flagging to manager if context-switching cost is high.
Recommended resources to learn
dbt Incremental Models & BigQuery (directly relevant to DAT-555 lookback cap and Calendly pipeline):
Article: How to Use Incremental Models in dbt for Efficient BigQuery Data Processing — covers all 3 strategies (merge, insert_overwrite, append), lookback windows for late-arriving data, incremental predicates for cost optimization, and testing patterns. Directly applicable to the fct_job_queue_performance fix.
Discussion: dbt incremental models with insert_overwrite: backfill data causing duplicates — r/dataengineering thread on exactly the partitioning + backfill pattern you’re dealing with. Community solutions for handling _dbt_max_partition gaps.
Data Pipeline & API Ingestion (directly relevant to Calendly pipeline and Fivetran setup):
Article: How Fivetran, dbt, and genAI Can Supercharge Data Workloads — covers the Fivetran → dbt pipeline pattern you’re using for Calendly. Includes pre-built dbt packages for common sources (HubSpot, Zendesk, Jira) and MCP server integration.
Guide: 10 Best Data Ingestion Tools — evaluation criteria for ingestion tools (Fivetran vs Airbyte vs Hevo). Relevant for the Zendesk DAT-571 decision (Option 1 vs 3).
Embedded Analytics Architecture (your emerging specialty — Jonas Chorum, Showbie, Basata):
Guide: The Complete Guide to Embedded Analytics for SaaS Products (2026) — covers the 4-layer architecture (experience, data, security/governance, action), multi-tenancy patterns (JWT + RLS), and the 8 capabilities every SaaS product needs. Directly maps to the embed demo app you built.
Guide: Multi-Tenant Deployment: 2026 Complete Guide — deep dive on tenant isolation models, JWT-driven security contexts, and hybrid datasets. Addresses the exact Jonas Chorum single-tenant vs multi-tenant architecture question.
Comparison: Embedded Analytics vs Traditional BI: Complete Comparison (2026) — covers embed portal vs single dashboard tradeoffs (the exact question Erin/Showbie asked), white-labeling, and when to use each approach.
Career Growth & Presales (continuing from W14):
Book: Staff Engineer by Will Larson — carry-over. The Calendly pipeline project (tracing root cause → designing 6-phase plan → documenting for handoff) is textbook staff-level work.
Reading list: The Ultimate Presales Reading List for 2026 — 34 curated books for sales engineers. Highlights: The Trusted Advisor Sales Engineer by John Care (directly relevant as you transition from demo-runner to strategic advisor), The Six Habits of Highly Effective Sales Engineers by Chris White (practical habits for demo prep and discovery).
/ Weekly Report - 2026-W15
Created Mon, 25 May 2026 00:00:00 +0000
Modified Mon, 25 May 2026 06:02:25 +0000